<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Not A Number &#187; Programming</title>
	<atom:link href="http://notanumber.net/archives/category/programming/feed" rel="self" type="application/rss+xml" />
	<link>http://notanumber.net</link>
	<description>Programming, Theory, and Math</description>
	<lastBuildDate>Sat, 21 Nov 2009 00:07:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Underhanded C: The Leaky Redaction</title>
		<link>http://notanumber.net/archives/54/underhanded-c-the-leaky-redaction</link>
		<comments>http://notanumber.net/archives/54/underhanded-c-the-leaky-redaction#comments</comments>
		<pubDate>Sat, 21 Nov 2009 00:03:25 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[C]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://notanumber.net/?p=54</guid>
		<description><![CDATA[So, it turns out I am the winner of the 2008 Underhanded C Contest. The goal of the contest is to write some straightforward C code to solve a simple task, incorrectly. In particular, you had to introduce a hidden security flaw that would stand up to code review and not stand out at all. [...]]]></description>
			<content:encoded><![CDATA[<p>So, it turns out I am the winner of the <a href="http://underhanded.xcott.com/">2008 Underhanded C Contest</a>. The goal of the contest is to write some straightforward C code to solve a simple task, incorrectly. In particular, you had to introduce a hidden security flaw that would stand up to code review and not stand out at all. This differs from the Obfuscated C contest in that you want your program to look straightforward and appear to do one thing when in fact it does another.</p>
<p>The goal this year was to write a leaky image redaction program. Given an input image in PPM format and a rectangle, it would spit out the image with the rectangle blacked out, perhaps hiding sensitive information. The tricky part was that you had to leak the redacted information. There are more details in the <a href="http://underhanded.xcott.com/?p=8">problem specification</a>.</p>
<p>So, before I go on, here is my complete entry. If you like, see if you can figure out how the information is leaked before reading further.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">/*
 * This is a simple redactor, it accepts a plain text ppm file, a set of
 * coordinates defining a rectangle, and produces a ppm file with said
 * rectangle blacked out.
 *
 * Usage: redact in.ppm x y width height &gt; out.ppm
 */</span>
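&nbsp;
<span style="color: #339933;">#include &lt;ctype.h&gt;</span>
<span style="color: #339933;">#include &lt;stdio.h&gt;</span>
<span style="color: #339933;">#include &lt;stdlib.h&gt;</span>
&nbsp;
<span style="color: #808080; font-style: italic;">/* assumed value; the post does not show the original definition */</span>
<span style="color: #339933;">#define BUFSIZE 1024</span>
&nbsp;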
<span style="color: #993333;">int</span>
main<span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span> argc<span style="color: #339933;">,</span> <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>argv<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span>
    <span style="color: #009900;">&#123;</span>
        <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span>argc <span style="color: #339933;">!=</span> <span style="color: #0000dd;">6</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            fprintf<span style="color: #009900;">&#40;</span>stderr<span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;usage: redact in.ppm x y width height &gt; out.ppm<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            exit<span style="color: #009900;">&#40;</span><span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// process command line arguments</span>
    <span style="color: #993333;">int</span> rx <span style="color: #339933;">=</span> atoi<span style="color: #009900;">&#40;</span>argv<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> ry <span style="color: #339933;">=</span> atoi<span style="color: #009900;">&#40;</span>argv<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> rwidth <span style="color: #339933;">=</span> atoi<span style="color: #009900;">&#40;</span>argv<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> rheight <span style="color: #339933;">=</span> atoi<span style="color: #009900;">&#40;</span>argv<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">5</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    FILE <span style="color: #339933;">*</span>ppm <span style="color: #339933;">=</span> fopen<span style="color: #009900;">&#40;</span>argv<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span><span style="color: #ff0000;">&quot;r&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span>ppm<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        perror<span style="color: #009900;">&#40;</span>argv<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> exit<span style="color: #009900;">&#40;</span><span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">//read the ppm header</span>
    <span style="color: #993333;">unsigned</span> width<span style="color: #339933;">,</span>height<span style="color: #339933;">,</span>maxdepth<span style="color: #339933;">;</span>
    fscanf<span style="color: #009900;">&#40;</span>ppm<span style="color: #339933;">,</span><span style="color: #ff0000;">&quot;P3<span style="color: #000099; font-weight: bold;">\n</span>%u %u<span style="color: #000099; font-weight: bold;">\n</span>%u<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>width<span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>height<span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>maxdepth<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000066;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;P3<span style="color: #000099; font-weight: bold;">\n</span>%u %u<span style="color: #000099; font-weight: bold;">\n</span>%u<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> width<span style="color: #339933;">,</span> height<span style="color: #339933;">,</span> maxdepth<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">//current locations</span>
    <span style="color: #993333;">int</span> x <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">,</span> y <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">,</span> ws <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">//fixed buffer size to avoid overflow</span>
    <span style="color: #993333;">char</span> buf<span style="color: #009900;">&#91;</span>BUFSIZE<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>c<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #b1b100;">while</span><span style="color: #009900;">&#40;</span>fgets<span style="color: #009900;">&#40;</span>buf<span style="color: #339933;">,</span>BUFSIZE<span style="color: #339933;">,</span>ppm<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span>c <span style="color: #339933;">=</span> buf<span style="color: #339933;">;*</span>c<span style="color: #339933;">;</span>c<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span>isdigit<span style="color: #009900;">&#40;</span><span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span>ws<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>   <span style="color: #666666; font-style: italic;">// new number, increment location.</span>
                    ws <span style="color: #339933;">=</span> <span style="color: #0000dd;">1</span><span style="color: #339933;">;</span> x<span style="color: #339933;">++;</span>
                    <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span>x <span style="color: #339933;">&gt;=</span> width <span style="color: #339933;">*</span> <span style="color: #0000dd;">3</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                        y<span style="color: #339933;">++;</span> x <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
                    <span style="color: #009900;">&#125;</span>
                <span style="color: #009900;">&#125;</span>
                <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span>x <span style="color: #339933;">&gt;</span> rx <span style="color: #339933;">*</span> <span style="color: #0000dd;">3</span> <span style="color: #339933;">&amp;&amp;</span> x <span style="color: #339933;">&lt;=</span> <span style="color: #009900;">&#40;</span>rx <span style="color: #339933;">+</span> rwidth<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #0000dd;">3</span> <span style="color: #339933;">&amp;&amp;</span> y <span style="color: #339933;">&gt;</span> ry <span style="color: #339933;">&amp;&amp;</span> y <span style="color: #339933;">&lt;</span> ry <span style="color: #339933;">+</span> rheight<span style="color: #009900;">&#41;</span>
                    putchar<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">'0'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #b1b100;">else</span>
                    putchar<span style="color: #009900;">&#40;</span><span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>  <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
               ws <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
               putchar<span style="color: #009900;">&#40;</span><span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #b1b100;">return</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p><span id="more-54"></span><br />
The trick involves the format of the P3-style PPM file. It is a plain text format: some basic header info, then a list of whitespace-separated numbers, such as <code>234 2 0 83 255 255 2 43 255</code>, where the numbers represent the magnitude of the red, green, and blue components of each pixel in order. The redactor simply replaces values within the target rectangle with zero. However, because I process the file character by character, I leak how many digits each value had to begin with. For example, the above would be redacted to <code>000 0 0 00 000 000 0 00 000</code>. This is completely invisible when viewing the image, since all the values count as zero as far as the format is concerned, but by looking at the raw text of the output file you can recover some information about what was in the blanked-out area. It is particularly effective on black-on-white text, the most common thing needing to be redacted, where each pixel is 0 0 0 or 255 255 255, allowing perfect reconstruction of the original.</p>
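<p>To make the leak concrete, here is a minimal sketch of a decoder for the black-on-white case. It is not part of the contest entry: <code>recover_bw</code> is a hypothetical helper, and it assumes every original sample was either 0 or 255, so a run of three zeros in the redacted text means 255 and a single zero means 0.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Recover black-on-white samples from the redacted text of a P3 file.
 * Assumption: every original sample was either 0 (one digit) or 255
 * (three digits), so the length of each run of digits tells us which
 * one it was. Hypothetical helper, not part of the contest entry.
 *
 * Writes up to max recovered samples into out and returns how many
 * samples were seen in the text.
 */
size_t recover_bw(const char *redacted, int *out, size_t max)
{
    size_t count = 0;

    assert(redacted != NULL && out != NULL);
    while (*redacted) {
        if (isdigit((unsigned char)*redacted)) {
            size_t digits = 0;
            /* measure the length of this run of digits */
            while (isdigit((unsigned char)*redacted)) {
                digits++;
                redacted++;
            }
            if (count < max)
                out[count] = (digits == 3) ? 255 : 0;
            count++;
        } else {
            redacted++;   /* skip whitespace between samples */
        }
    }
    return count;
}
```

<p>Calling it on <code>000 000 000 0 0 0</code> yields three samples of 255 followed by three of 0, that is, one white pixel and one black pixel.</p>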
<p>One of my favorite parts of my entry that isn&#8217;t mentioned on the prize page is that it has great plausible deniability as the leak was introduced by properly working around a commonly known and particularly insidious C bug, the improper use of gets and (more subtly) fgets. I can imagine a code review going somewhat like the following:</p>
<blockquote><p>Spook: &#8220;So why did you process the file character by character, rather than doing the more obvious <code>scanf(&quot;%i %i %i&quot;,&amp;r,&amp;g,&amp;b)</code> to read in the values?&#8221;</p>
<p>Me: &#8220;Well, in order to do that I&#8217;d have to read in entire lines of the file. Now there is the gets function in C which does that, but it has a well-known buffer overflow bug if the line length exceeds your buffer size, so I naturally used the safe fgets variant of the function. Of course, with fgets, you can just assume your buffer size is greater than the maximum line length, but that introduces a subtle bug if it isn&#8217;t: you may end up splitting a number across two buffers, so scanf will read something like 234 as the two numbers 23 and 4 if it is split after the second character. Hence the need to consider each character independently.&#8221;</p>
<p>Spook: &#8220;Ah, of course. Good job spotting that.&#8221;</p>
<p>Me: *snicker*</p></blockquote>
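<p>The splitting problem from the dialogue can be reproduced in a few lines. This is a sketch, not code from the entry: <code>parse_chunked</code> is a hypothetical function, it uses <code>strtol</code> rather than <code>scanf</code> to parse each chunk, and the buffer size is made deliberately tiny to force a split.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Naive tokenizer: read chunks with fgets and parse the numbers found
 * in each chunk independently. Hypothetical illustration only.
 * Stores up to max parsed values in out and returns how many numbers
 * were parsed in total.
 */
size_t parse_chunked(FILE *in, int bufsize, long *out, size_t max)
{
    char buf[64];
    size_t n = 0;

    assert(in != NULL && out != NULL);
    assert(bufsize >= 2 && bufsize <= (int)sizeof buf);

    /* fgets reads at most bufsize - 1 characters per call */
    while (fgets(buf, bufsize, in)) {
        char *p = buf;
        for (;;) {
            char *end;
            long v = strtol(p, &end, 10);
            if (end == p)
                break;            /* no more numbers in this chunk */
            if (n < max)
                out[n] = v;
            n++;
            p = end;
        }
    }
    return n;
}
```

<p>With a buffer size of 3, fgets hands back at most two characters at a time, so the input <code>234 2</code> is parsed as 23, 4, and 2: three numbers where the file contained two.</p>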
<p>It is also a great example of the principle that you can&#8217;t protect against intending to write the wrong thing. The code will stand up to any buffer overflow check, code style check, or lint program. The code is correct and proper C code; the bug was not introduced in the code, but much earlier, in my head when I conceived the algorithm. No matter how smart your tools are, if you ultimately intend to write the wrong thing or solve the wrong problem, they can&#8217;t protect against that.</p>
]]></content:encoded>
			<wfw:commentRss>http://notanumber.net/archives/54/underhanded-c-the-leaky-redaction/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>newtype in C, a touch of strong typing using compound literals.</title>
		<link>http://notanumber.net/archives/33/newtype-in-c-a-touch-of-strong-typing-using-compound-literals</link>
		<comments>http://notanumber.net/archives/33/newtype-in-c-a-touch-of-strong-typing-using-compound-literals#comments</comments>
		<pubDate>Sat, 18 Apr 2009 04:21:47 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[C]]></category>
		<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[c99]]></category>
		<category><![CDATA[types]]></category>

		<guid isPermaLink="false">http://notanumber.net/?p=33</guid>
		<description><![CDATA[The ISO C 99 standard is a great thing. In addition to desperately needed things like a dedicated bool type and codifying a lot of universally implemented extensions to the language, it added some more subtle things such as compound literals. A compound literal allows you to use a C struct or union as an [...]]]></description>
			<content:encoded><![CDATA[<p>The ISO C 99 standard is a great thing. In addition to desperately needed things like a dedicated <strong>bool</strong> type and codifying a lot of universally implemented extensions to the language, it added some more subtle things such as compound literals.  A compound literal allows you to use a C <strong>struct</strong> or <strong>union</strong> as an initialized literal value. This makes declared types more on par with built in ones, such as numbers, characters, and strings. Here I will present just about the simplest but quite useful application of this.</p>
<p>Many modern languages such as Haskell let you declare a new type that wraps an existing one. It is called a <em>newtype</em> in Haskell and I will borrow that terminology here. A <em>newtype</em> is a type that is fully equivalent at run-time and in generated code to an existing type, but nevertheless is distinct to the type system at compile time. They are quite useful in enforcing abstraction of APIs and catching a wide variety of bugs without incurring any run-time penalty. In fact, depending on the compiler, they may actually help optimization. Imagine you represent open files as an index into a table, much as the Unix API does. Naturally, you would represent it by an <strong>int</strong>. You may have something like this, declaring <strong>fd_t</strong> as a handy synonym to show when you are working with file descriptors.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">typedef</span> <span style="color: #993333;">int</span> fd_t<span style="color: #339933;">;</span>
<span style="color: #808080; font-style: italic;">/* write an int out to a file */</span>
<span style="color: #993333;">void</span> put_int<span style="color: #009900;">&#40;</span>fd_t fd<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Now, what happens if someone forgets the order of the arguments to <strong>put_int</strong>? Since <strong>fd_t</strong> is a <em>synonym</em> for <strong>int</strong>, the compiler has no idea you did anything wrong and happily writes garbage to a random file. Not what we wanted at all. If <strong>fd_t</strong> were a <em>newtype</em> rather than a typedef synonym then the program would be rejected, because <strong>fd_t</strong> and <strong>int</strong> would be distinct types.</p>
<p>This brings us to the following bit of code you can place in a header file <em>newtype.h</em>. Using compound literals, it allows the declaration of newtypes that can be used almost anywhere you can use built in types.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#ifndef NEWTYPE_H</span>
<span style="color: #339933;">#define NEWTYPE_H</span>
<span style="color: #808080; font-style: italic;">/* this can be used for type safety, to avoid accidental casting of values from one type to another and
 * allowing alias analysis by the compiler to distinguish otherwise identical types
 *
 * NEWTYPE(new_type,old_type); declares new_type to be a distinct type wrapping the already existing old_type
 * TO_NT(new_type,val)  converts a value to its newtype representation
 * FROM_NT(new_val)  opens up a newtyped value to get at its internal representation
 */</span>
&nbsp;
<span style="color: #339933;">#define NEWTYPE(nty,oty) typedef struct { oty v; } nty</span>
<span style="color: #339933;">#define FROM_NT(ntv)       ((ntv).v)</span>
<span style="color: #339933;">#define TO_NT(nty,val)     ((nty){ .v = (val) })</span>
&nbsp;
<span style="color: #339933;">#endif</span></pre></div></div>

<p>Now we can modify the above example, instead of</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">typedef</span> <span style="color: #993333;">int</span> fd_t<span style="color: #339933;">;</span></pre></div></div>

<p>we use</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">NEWTYPE<span style="color: #009900;">&#40;</span>fd_t<span style="color: #339933;">,</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
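<p>As a rough sketch of how this plays out at a call site, the following repeats the macros from <em>newtype.h</em> so it stands alone. The body of <strong>put_int</strong> is a hypothetical stand-in that only records its arguments in two illustrative globals, <code>last_fd</code> and <code>last_value</code>; a real version would write to the file.</p>

```c
#include <assert.h>

/* Macros repeated from newtype.h so the sketch is self-contained. */
#define NEWTYPE(nty,oty) typedef struct { oty v; } nty
#define FROM_NT(ntv)       ((ntv).v)
#define TO_NT(nty,val)     ((nty){ .v = (val) })

NEWTYPE(fd_t, int);

/* Illustrative globals: they only record what put_int was called with. */
int last_fd = -1;
int last_value = 0;

/* Hypothetical stand-in for the real put_int. */
void put_int(fd_t fd, int c)
{
    assert(FROM_NT(fd) >= 0);   /* unwrap to inspect the descriptor */
    last_fd = FROM_NT(fd);
    last_value = c;
}

/*
 * Correct call:  put_int(TO_NT(fd_t, 3), 42);
 * Swapped call:  put_int(42, TO_NT(fd_t, 3));
 * The swapped call no longer compiles, because an int cannot be
 * passed where the struct type fd_t is expected, and vice versa.
 */
```

<p>The wrapping and unwrapping are explicit at the API boundary, which is the point: the places where a raw <strong>int</strong> becomes a file descriptor are easy to find and audit.</p>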

<p>Another example would be the traditional <strong>lseek</strong> routine from the Unix C library. It is generally declared as something like</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#define SEEK_SET 0</span>
<span style="color: #339933;">#define SEEK_CUR 1</span>
<span style="color: #339933;">#define SEEK_END 2</span>
<span style="color: #993333;">long</span> lseek<span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span> fd<span style="color: #339933;">,</span><span style="color: #993333;">long</span> offset<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> whence<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Now, whence is supposed to be one of the SEEK_* defined terms, fd is supposed to be an open file descriptor, and offset is supposed to be an offset into the file. However, to the compiler on many architectures <em>all the argument types are indistinguishable</em>. This means that if you mix up any of them, the compiler will happily go along. In addition, you can pass bogus values in for &#8216;whence&#8217; like 5 or 6, and nothing will complain. Using newtypes, you might declare the API like so.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">NEWTYPE<span style="color: #009900;">&#40;</span>fd_t<span style="color: #339933;">,</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
NEWTYPE<span style="color: #009900;">&#40;</span>whence_t<span style="color: #339933;">,</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #339933;">#define SEEK_SET TO_NT(whence_t,0)</span>
<span style="color: #339933;">#define SEEK_CUR TO_NT(whence_t,1)</span>
<span style="color: #339933;">#define SEEK_END TO_NT(whence_t,2)</span>
<span style="color: #993333;">long</span> lseek<span style="color: #009900;">&#40;</span>fd_t fd<span style="color: #339933;">,</span><span style="color: #993333;">long</span> offset<span style="color: #339933;">,</span> whence_t whence<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Now, not only are you protected from mixing up any of the arguments, you are also protected from bare integers such as 5 or 6 being passed as the whence argument. As long as callers only obtain <strong>whence_t</strong> values from the SEEK_* macros, you can elide the run-time check for valid values, since the compiler will check it for you.</p>
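<p>Here is a minimal sketch of such a typed seek that can be compiled with nothing but the standard library. To avoid POSIX headers it wraps <strong>fseek</strong> instead of <strong>lseek</strong>, and to avoid clashing with the SEEK_* macros in <em>stdio.h</em> it uses the hypothetical names <code>NT_SEEK_SET</code>, <code>NT_SEEK_CUR</code>, and <code>NT_SEEK_END</code>; <code>nt_seek</code> is likewise an invented name.</p>

```c
#include <assert.h>
#include <stdio.h>

/* Macros repeated from newtype.h so the sketch is self-contained. */
#define NEWTYPE(nty,oty) typedef struct { oty v; } nty
#define FROM_NT(ntv)       ((ntv).v)
#define TO_NT(nty,val)     ((nty){ .v = (val) })

NEWTYPE(whence_t, int);

/* Hypothetical names wrapping the standard SEEK_* values. */
#define NT_SEEK_SET TO_NT(whence_t, SEEK_SET)
#define NT_SEEK_CUR TO_NT(whence_t, SEEK_CUR)
#define NT_SEEK_END TO_NT(whence_t, SEEK_END)

/*
 * Typed wrapper around fseek. Passing a bare int such as 5 for
 * whence is a compile-time error. Returns the new position, or -1
 * if the seek failed.
 */
long nt_seek(FILE *f, long offset, whence_t whence)
{
    assert(f != NULL);
    if (fseek(f, offset, FROM_NT(whence)) != 0)
        return -1L;
    return ftell(f);
}
```

<p>Note that nothing stops a determined caller from writing <code>TO_NT(whence_t, 5)</code> by hand; the newtype guards against accidents, not against someone deliberately bypassing the macros.</p>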
<p>Although this is just the simplest use of compound literals, it is already proving to be quite useful. When combined with other C99 features such as variable length arrays you can do clever things like non-conservative garbage collection in a clean way, or just make your code that much easier to read by not having to declare temporary structures everywhere.</p>
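<p>As a small illustration of the readability point, the sketch below shows two functions whose arguments can be built in place with compound literals instead of named temporaries. The names <code>struct rect</code>, <code>rect_area</code>, and <code>sum</code> are made up for this example.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types and functions, used only for illustration. */
struct rect { int x, y, width, height; };

int rect_area(struct rect r)
{
    return r.width * r.height;
}

long sum(const int *values, size_t n)
{
    long total = 0;

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}

/*
 * Without compound literals:
 *     struct rect tmp = { 0, 0, 4, 3 };
 *     int a = rect_area(tmp);
 *
 * With compound literals, no temporary needs a name:
 *     int a  = rect_area((struct rect){ .width = 4, .height = 3 });
 *     long s = sum((int[]){ 1, 2, 3 }, 3);
 */
```

<p>In the commented calls, the rectangle has area 12 and the array sums to 6; members not named in the designated initializer are set to zero.</p>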
]]></content:encoded>
			<wfw:commentRss>http://notanumber.net/archives/33/newtype-in-c-a-touch-of-strong-typing-using-compound-literals/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>