I finally got an Nvidia developer account a few days ago which gave me access to a very useful library to use with CUDA.
cuPrintf allows printf equivalent statements to be placed inside CUDA kernels without the need for -deviceemu.
The following example demonstrates a simple use for cuPrintf and displays the current thread ID.

#include <cuda.h>
#include "cuPrintf.cu"

__global__ void cuPrintfExample()
{
    int tid;
    tid = blockIdx.x * blockDim.x + threadIdx.x;
    cuPrintf("%d\n", tid);
}

int main()
{
    cudaPrintfInit();
    cuPrintfExample <<< 5, 2 >>> ();
    cudaPrintfDisplay(stdout, true);
    cudaPrintfEnd();
    return 0;
}

cudaPrintfInit and cudaPrintfEnd only need to be called once in your entire project.
Output is not automatically displayed on the screen, but stored in a buffer which is cleared and displayed when cudaPrintfDisplay is called. The size of the buffer can be specified with the optional argument cudaPrintfInit(size_t bufferLen).
cudaPrintfEnd simply frees the memory allocated by cudaPrintfInit.
When cudaPrintfDisplay is called, output stored in the buffer is displayed to the console. The second argument controls whether each line is prefixed with the ID of the thread that printed it (true) or not (false). The first argument, stdout in this example, defines the stream to which the cuPrintf log is sent.
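To make the two optional arguments concrete, here is a minimal sketch that requests a larger buffer and sends the log to a file instead of the console. The 4 MB size and the file name cuprintf.log are arbitrary choices for illustration, and the error check assumes that cudaPrintfInit returns a cudaError_t, as I recall from cuPrintf.cuh; check your copy of the header before relying on it.

```cuda
#include <cstdio>
#include <cuda.h>
#include "cuPrintf.cu"

__global__ void reportThread()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    cuPrintf("hello from thread %d\n", tid);
}

int main()
{
    // Ask for a 4 MB buffer (arbitrary size chosen for this sketch).
    // Assumption: cudaPrintfInit returns cudaError_t.
    if (cudaPrintfInit(4 * 1024 * 1024) != cudaSuccess) {
        fprintf(stderr, "cudaPrintfInit failed\n");
        return 1;
    }

    reportThread<<<5, 2>>>();

    // Send the log to a file (hypothetical name) rather than stdout,
    // and pass false so lines are not prefixed with the thread ID.
    FILE *log = fopen("cuprintf.log", "w");
    if (log == NULL) {
        cudaPrintfEnd();
        return 1;
    }
    cudaPrintfDisplay(log, false);
    fclose(log);

    // Free the buffer allocated by cudaPrintfInit.
    cudaPrintfEnd();
    return 0;
}
```

If a kernel prints more than the buffer can hold, some output may be lost, so a larger buffer is worth trying when lines appear to be missing.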
On another note, I’ve found that using cuPrintf hurts the performance of my kernels, presumably due to the data transfer performed every time cudaPrintfDisplay() is called.
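One way to keep that overhead out of timed builds is to compile the print calls away unless a debug flag is set. The sketch below is only a suggestion, not part of cuPrintf: the macro KERNEL_PRINT, the flag ENABLE_KERNEL_PRINT and the kernel scale are hypothetical names made up for this example.

```cuda
#include <cstdio>
#include <cuda.h>
#include "cuPrintf.cu"

// Hypothetical build flag: compile with -DENABLE_KERNEL_PRINT to keep the prints.
#ifdef ENABLE_KERNEL_PRINT
#define KERNEL_PRINT(...) cuPrintf(__VA_ARGS__)
#else
#define KERNEL_PRINT(...) ((void)0)   // expands to nothing in normal builds
#endif

__global__ void scale(float *data, float factor, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        data[tid] *= factor;
        KERNEL_PRINT("%d: %f\n", tid, data[tid]);
    }
}

int main()
{
    const int n = 8;
    float *d_data;
    cudaMalloc((void **)&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

#ifdef ENABLE_KERNEL_PRINT
    cudaPrintfInit();
#endif

    scale<<<2, 4>>>(d_data, 2.0f, n);

#ifdef ENABLE_KERNEL_PRINT
    // The buffer is only allocated, copied back and freed in debug builds.
    cudaPrintfDisplay(stdout, true);
    cudaPrintfEnd();
#endif

    cudaFree(d_data);
    return 0;
}
```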
Hey,
So I know this thread is a little old, but it is no less relevant.
I’m trying to get print statements to work with cuda 3.1 (on Ubuntu linux 9.10) and having very little success. I tried copying your code into a toy project, and it compiled fine, but it didn’t print anything. I’ve done a little digging into the source code, but I can’t find anything obvious. Any idea what might be going on?
Do we need any other files besides cuPrintf.cuh and cuPrintf.cu?
Nope, these are the only files that you need.
Hi there,
It’s strange that the code doesn’t work. What GPU are you using? Fermi GPUs now support the standard printf statement, without the need for cuPrintf.
Hi,
Thanks for getting back to me. Unfortunately, we don’t have Fermi architecture. We’re using a Tesla C1060 with a CPU running Ubuntu 9.10, and CUDA 3.1. This is the file that we’re trying to test. Ideally, we want it to print out “42”.
———————————————–
#include
#include "cuPrintf.cu"
__global__ void printTest() {
    int tid = 42;
    cuPrintf("%d\n", tid);
}
int main() {
    cudaPrintfInit();
    printTest<<>>();
    cudaPrintfDisplay(stdout, true);
    fflush(stdout);
    cudaPrintfEnd();
    return 0;
}
———————————————–
Incidentally, we tried adding the cuPrintf files to our Makefile (standard NVIDIA make) which gave us redefinition errors, so our makefile looks like this:
———————————————–
# Add source files here
EXECUTABLE := printTest
# Cuda source files (compiled with cudacc)
CUFILES := printTest.cu
# C/C++ source files (compiled with gcc / c++)
CCFILES :=
include ../../common/common.mk
———————————————–
Do you see anything obvious?
Hey, sorry to spam. We tracked down the problem. Our system uses SM20 architecture, so it was getting into an empty else case. We compiled again with SM11 and got output.
Do you know if there are updates out there for newer architectures?
I’m glad that you managed to fix the problem. I’m not sure why you had to re-compile for a lower architecture. I was under the impression that it worked better with newer architectures!
As far as I’m aware, there hasn’t been an update since its initial release.
It works for me, but I get this in the compile log:
cuPrintf.cu(283): warning : conversion from pointer to smaller integer
cuPrintf.cu(283): warning : conversion from pointer to smaller integer
cuPrintf.cu: In function `_ZN66_INTERNAL_44_tmpxft_00001188_00000000_3_Karatsuba_cpp1_ii_3fd995e015cuPrintfStrncpyEPcPKciS0_':
4> cuPrintf.cu:283: warning: cast from pointer to integer of different size
Should I worry?
It looks like that is just a warning coming from the cuPrintf.cu file itself, so I would imagine that there is nothing to worry about unless you have modified that file yourself.
Yep, I thought the same since I never touched it :p
GPUs with higher compute versions now support standard “printf” functions inside kernels without the need for cuPrintf.
cuPrintf is still valid though if you’re stuck on lower compute versions!
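For reference, a minimal sketch of the same thread ID example using the built-in printf. This assumes a GPU of compute capability 2.0 or higher and a build that targets it (for example nvcc -arch=sm_20). It also assumes a toolkit recent enough to provide cudaDeviceSynchronize; older toolkits used cudaThreadSynchronize instead.

```cuda
#include <cstdio>

__global__ void printThreadId()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    // Device-side printf: no cuPrintf files or init/end calls needed.
    printf("%d\n", tid);
}

int main()
{
    printThreadId<<<5, 2>>>();

    // Kernel launches are asynchronous; synchronizing waits for the kernel
    // to finish and flushes the device-side printf buffer to stdout.
    cudaDeviceSynchronize();
    return 0;
}
```

The order in which the ten lines appear is not guaranteed, since it depends on how the blocks are scheduled.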
Hi.. I managed to get it to work. There were 2 problems. 1. If I wanted to call from main, I used .cuh there, and .cu in the CUDA file that I needed to call cuPrintf from..
second..
dim3 block(16,16,1);
dim3 grid(width/block.x,height/block.y, 1);
testKernel<<>>(10);
testKernel<<>>(10);
the first testKernel works.. the second doesn’t.. that’s why I had a lot of trouble.
do you know what’s the max amount of data that can be copied with cuPrintf.. I can see that it’s not a lot.. :)
hmm.. for some reason it doesn’t show like I wrote it..
but what I wanted to say was that calling
testKernel with grid, (16, 16, 1) works.
calling it with grid, block doesn’t in the above setup. don’t know why
Can you give an example of how to make it work in a multiple file setup..
when I use 2 files, and have the CUDA calculation in a kernel.cu, I get
1>main.cu.obj : error LNK2005: cudaPrintfEnd already defined in kernel.cu.obj
1>main.cu.obj : error LNK2005: cudaPrintfInit already defined in kernel.cu.obj
1>main.cu.obj : error LNK2005: cudaPrintfDisplay already defined in kernel.cu.obj
I can’t seem to figure out why it gives me this..
I’ve not used this in a multiple file setup before so I’m not sure how to fix the problem.
However, you make it seem like kernel.cu is the only file with any CUDA in it. Therefore, if you just have #include “cuPrintf.cu” in kernel.cu, it should work just fine.
Just make sure there are no references to cuPrintf in the other file.
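A rough sketch of that layout, assuming kernel.cu is the only file that includes cuPrintf.cu and that main reaches it through a small host wrapper. The wrapper name runTestKernel and the launch configuration are made up for this example. The LNK2005 errors most likely come from cuPrintf.cu being included in both files, which defines the cudaPrintf functions twice.

```cuda
// kernel.cu : the ONLY translation unit that includes cuPrintf.cu
#include <cstdio>
#include <cuda.h>
#include "cuPrintf.cu"

__global__ void testKernel(int value)
{
    cuPrintf("value = %d\n", value);
}

// Hypothetical host wrapper: keeps every cuPrintf-related call in this file,
// so no other file needs to include cuPrintf.cu.
extern "C" void runTestKernel(int value)
{
    cudaPrintfInit();
    testKernel<<<1, 4>>>(value);
    cudaPrintfDisplay(stdout, true);
    cudaPrintfEnd();
}

// The other file (main.cu or main.cpp) would then contain only:
//
//   extern "C" void runTestKernel(int value);
//
//   int main()
//   {
//       runTestKernel(10);
//       return 0;
//   }
```

If main really does need to call cudaPrintfInit or cudaPrintfDisplay directly, including only cuPrintf.cuh there (as mentioned in an earlier comment) gives it the declarations without a second copy of the definitions.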
This is totally amazing, I didn’t realize CUDA had this at all!
Yeah it’s great. Saved me a lot of time debugging :p