<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jeremy Kemp &#187; -deviceemu</title>
	<atom:link href="http://www.jeremykemp.co.uk/tag/deviceemu/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jeremykemp.co.uk</link>
	<description>//TODO</description>
	<lastBuildDate>Sun, 15 Jan 2012 15:32:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>CUDA cuPrintf</title>
		<link>http://www.jeremykemp.co.uk/08/02/2010/cuda-cuprintf/</link>
		<comments>http://www.jeremykemp.co.uk/08/02/2010/cuda-cuprintf/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 12:06:08 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[CUDA]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[-deviceemu]]></category>
		<category><![CDATA[cuPrintf]]></category>

		<guid isPermaLink="false">http://www.jeremykemp.co.uk/?p=124</guid>
		<description><![CDATA[I finally got an NVIDIA developer account a few days ago, which gave me access to a very useful library for CUDA. cuPrintf lets you place printf-style statements inside CUDA kernels without having to compile in device emulation mode (-deviceemu). The following example demonstrates a simple use of cuPrintf: each thread prints its global thread ID. [...]]]></description>
			<content:encoded><![CDATA[<p>I finally got an NVIDIA developer account a few days ago, which gave me access to a very useful library for CUDA.</p>
<p>cuPrintf lets you place printf-style statements inside CUDA kernels without having to compile in device emulation mode (-deviceemu).</p>
<p>The following example demonstrates a simple use of cuPrintf: each thread prints its global thread ID.</p>
<div class="geshi no cpp">
<ol>
<li class="li1">
<div class="de1"><span class="co2">#include &lt;cuda.h&gt;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co2">#include &quot;cuPrintf.cu&quot;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">__global__ <span class="kw4">void</span> cuPrintfExample<span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw4">int</span> tid;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;tid <span class="sy1">=</span> blockIdx.<span class="me1">x</span> <span class="sy2">*</span> blockDim.<span class="me1">x</span> <span class="sy2">+</span> threadIdx.<span class="me1">x</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;cuPrintf<span class="br0">&#40;</span><span class="st0">&quot;%d<span class="es0">\n</span>&quot;</span>, tid<span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw4">int</span> main<span class="br0">&#40;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;cudaPrintfInit<span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;cuPrintfExample <span class="sy1">&lt;&lt;&lt;</span> <span class="nu0">5</span>, <span class="nu0">2</span> <span class="sy1">&gt;&gt;&gt;</span> <span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;cudaPrintfDisplay<span class="br0">&#40;</span><span class="kw2">stdout</span>, <span class="kw2">true</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;cudaPrintfEnd<span class="br0">&#40;</span><span class="br0">&#41;</span>;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="nu0">0</span>;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
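<p>cudaPrintfInit and cudaPrintfEnd only need to be called once each in your entire project: cudaPrintfInit before the first kernel that uses cuPrintf, and cudaPrintfEnd after the last call to cudaPrintfDisplay.</p>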
<p>Output is not displayed on the screen automatically. It is stored in a buffer, which is displayed and then cleared when cudaPrintfDisplay is called. The size of the buffer, in bytes, can be set with the optional bufferLen argument of cudaPrintfInit(size_t bufferLen).</p>
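<p>As a minimal sketch of the optional argument, the program below requests a larger buffer before launching a kernel that prints many lines. The 4 MB size, the kernel name printTid and the launch configuration are arbitrary values chosen for illustration, and the error check assumes that cudaPrintfInit returns a cudaError_t, as it does in the copy of cuPrintf.cu this was written against.</p>

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include "cuPrintf.cu"

// Hypothetical kernel: every thread prints its global thread ID.
__global__ void printTid()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    cuPrintf("%d\n", tid);
}

int main()
{
    // Request a 4 MB buffer (illustrative value, not a recommendation).
    const size_t bufferLen = 4 * 1024 * 1024;

    // Assumption: failure is reported through a cudaError_t return value.
    if (cudaPrintfInit(bufferLen) != cudaSuccess) {
        fprintf(stderr, "cudaPrintfInit failed\n");
        return 1;
    }

    printTid<<<64, 128>>>();

    // Copy the buffered output back to the host and print it.
    cudaPrintfDisplay(stdout, true);

    // Free the buffer allocated by cudaPrintfInit.
    cudaPrintfEnd();
    return 0;
}
```

<p>If some of the expected lines are missing from the output, the buffer is probably too small for the number of threads that print between two calls to cudaPrintfDisplay.</p>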
<p>cudaPrintfEnd simply frees the memory allocated by cudaPrintfInit.</p>
<p>When cudaPrintfDisplay is called, the output stored in the buffer is written out. The second argument controls whether each line is prefixed with the ID of the thread that printed it (true) or not (false). The first argument, stdout in this example, is the file stream that the cuPrintf log is sent to.</p>
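<p>To illustrate both arguments, here is a sketch that sends the log to a file instead of the console and leaves out the thread prefix. The file name cuprintf.log and the kernel name printTid are made up for this example.</p>

```cuda
#include <cstdio>
#include "cuPrintf.cu"

// Hypothetical kernel: every thread prints its global thread ID.
__global__ void printTid()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    cuPrintf("%d\n", tid);
}

int main()
{
    // Hypothetical log file; any stream opened for writing should do.
    FILE *log = fopen("cuprintf.log", "w");
    if (log == NULL) {
        perror("fopen");
        return 1;
    }

    cudaPrintfInit();
    printTid<<<5, 2>>>();

    // First argument: where the log goes. Second argument: false means
    // the lines are written without the thread ID prefix.
    cudaPrintfDisplay(log, false);

    cudaPrintfEnd();
    fclose(log);
    return 0;
}
```

<p>Passing true as the second argument instead makes it easier to tell which thread produced which line when several threads print similar values.</p>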
<p>On another note, I&#8217;ve found that using cuPrintf hurts the performance of my kernels, presumably due to the data transfer performed every time cudaPrintfDisplay() is called.</p>
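<p>One way to keep that cost out of timed builds is to compile the print calls away unless they are explicitly requested, and to flush the buffer once after a batch of launches rather than after every launch. The sketch below is only an illustration of that idea: the USE_CUPRINTF switch, the KERNEL_PRINT macro and the step kernel are hypothetical names, and flushing once at the end assumes the buffer is large enough to hold everything printed in between.</p>

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical switch: build with "nvcc -DUSE_CUPRINTF" to enable printing.
#ifdef USE_CUPRINTF
#include "cuPrintf.cu"
#define KERNEL_PRINT(...) cuPrintf(__VA_ARGS__)
#else
#define KERNEL_PRINT(...) ((void)0)
#endif

// Hypothetical kernel: adds the iteration number to each element.
__global__ void step(int *data, int iteration)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    data[tid] += iteration;
    KERNEL_PRINT("%d %d\n", iteration, data[tid]);
}

int main()
{
    const int numBlocks = 5;
    const int threadsPerBlock = 2;
    const int n = numBlocks * threadsPerBlock;

    int *data = NULL;
    cudaMalloc((void **)&data, n * sizeof(int));
    cudaMemset(data, 0, n * sizeof(int));

#ifdef USE_CUPRINTF
    cudaPrintfInit();
#endif

    for (int i = 0; i < 10; ++i) {
        step<<<numBlocks, threadsPerBlock>>>(data, i);
    }

#ifdef USE_CUPRINTF
    // Flush once after all launches instead of once per launch.
    cudaPrintfDisplay(stdout, true);
    cudaPrintfEnd();
#endif

    cudaFree(data);
    return 0;
}
```

<p>Built without the switch, the kernel contains no cuPrintf calls at all, so timing the two builds against each other gives a rough idea of how much the printing costs.</p>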
]]></content:encoded>
			<wfw:commentRss>http://www.jeremykemp.co.uk/08/02/2010/cuda-cuprintf/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
	</channel>
</rss>

