Underhanded C: The Leaky Redaction

So, it turns out I am the winner of the 2008 Underhanded C Contest. The goal of the contest is to write some straightforward C code to solve a simple task, incorrectly. In particular, you had to introduce a hidden security flaw that would stand up to code review and not stand out at all. This is different than the Obfuscated C contest in that you want your program to look straightforward and that it does one thing, when in fact it does another.

The goal this year was to write a leaky image redaction program. Given an input image in PPM format and a rectangle, it would spit out the image with the rectangle blacked out, perhaps hiding sensitive information. The tricky part was that you had to leak the redacted information. There are more details in the problem specification.

So, before I go on, here is my complete entry. See if you can figure out how the information is leaked before reading further if you like.

 * This is a simple redactor, it accepts a plain text ppm file, a set of
 * coordinates defining a rectangle, and produces a ppm file with said
 * rectangle blacked out.
 * Usage: redact in.ppm x y width height > out.ppm
main(int argc, char *argv[])
        if(argc != 6) {
            fprintf(stderr, "usage: redact in.ppm x y width height > out.ppm\n");
    // process command line arguments
    int rx = atoi(argv[2]), ry = atoi(argv[3]), rwidth = atoi(argv[4]), rheight = atoi(argv[5]);
    FILE *ppm = fopen(argv[1],"r");
    if(!ppm) {
        perror(argv[1]); exit(1);
    //read the ppm header
    unsigned width,height,maxdepth;
    fscanf(ppm,"P3\n%u %u\n%u\n", &width, &height, &maxdepth);
    printf("P3\n%u %u\n%u\n", width, height, maxdepth);
    //current locations
    int x = 0, y = 0, ws = 0;
    //fixed buffer size to avoid overflow
    char buf[BUFSIZE], *c;
    while(fgets(buf,BUFSIZE,ppm)) {
        for(c = buf;*c;c++) {
            if(isdigit(*c)) {
                if(!ws) {   // new number, increment location.
                    ws = 1; x++;
                    if(x >= width * 3) {
                        y++; x = 0;
                if(x > rx * 3 && x < = (rx + rwidth) * 3 && y > ry && y < ry + rheight)
            }  else {
               ws = 0;
    return 0;

The trick involves the format of the P3 style PPM file. The format is a plain text format, it has some basic header info, then a list of whitespace separated numbers, such as 234 2 0 83 255 255 2 43 255 where the numbers represent the magnitude of the red, green, and blue component for each pixel in order. The redactor simply replaced values within the target rectangle with zero. However, due to the way I process the file, character by character, I leak how many digits each value had to begin with. i.e., the above would be redacted to 000 0 0 00 000 000 0 00 000. This is completely invisible when viewing the PPM file, all the values count as zero as far as the format is concerned, but by looking at the original file, you can recover some information about what was in the blanked out area. It is particular effective on black on white text, the most common thing needing to be redacted, where each value is 0 0 0 or 255 255 255, allowing perfect reconstruction of the original.

One of my favorite parts of my entry that isn’t mentioned on the prize page is that it has great plausible deniability as the leak was introduced by properly working around a commonly known and particularly insidious C bug, the improper use of gets and (more subtly) fgets. I can imagine a code review going somewhat like the following:

Spook: “So why did you process the file character by character, rather than doing the more obvious scanf(“%i %i %i”,&r,&g,&b) to read in the values?”

Me: “Well, in order to do that I’d have to read in entire lines of the file. Now there is the gets function in C which does that, but has a well known buffer overflow bug if the line length exceeds your buffer size, so I naturally used the safe fgets variant of the function. Of course, with fgets, you can just assume your buffer size is greater than the maximum line length, but that introduces a subtle bug if it isn’t, you may end up splitting a number across two buffers, so scanf will read something like 234 as the two numbers 23 and 4 if it is split after the second character, hence the need to consider each character independently.”

Spook: “Ah, of course. good job at spotting that.”

Me: *snicker*

It is also a great example of the principle that you can’t protect against intending to write the wrong thing. The code will stand up to any buffer overflow check, code style check, or lint program. The code is correct and proper C code; the bug was not introduced in the code, but much earlier, in my head when I conceived the algorithm. No matter how smart your tools are, if you ultimately intend to write the wrong thing or solve the wrong problem, they can’t protect against that.

Comments 32

  1. MadKat wrote:

    That is truly a beautiful solution. Thank you for the extended description.

    Posted 30 Dec 2009 at 11:07 am
  2. Anonymous wrote:

    Hmm. If I were Spook, I would still be suspicious and/or worried as your answer lacks any _real_ justification as to why the obvious option of using scanf() (or rather fscanf()) would be wrong here. Your explanation seems to rely on the (correct) behavior of fgets(), which is different from fscanf().

    It seems you assume fscanf() might work incorrectly (by not assigning all of the matching data within a line – you seem to imply fscanf() uses fgets() with a limited buffer internally, which may or may not be true) and because of this you implement the logic which uses fgets() directly.

    Such behavior would be a bug. I am not aware of such a (known) bug in any fscanf() implementation. Are you?

    Posted 30 Dec 2009 at 12:09 pm
  3. ThatsGobbles wrote:

    Very clever. I really dig the fact that, like you said, one of the most common types of images to block out are black-and-white text images, and this program works *best* on those types.

    Posted 30 Dec 2009 at 1:12 pm
  4. Thomas Beaucourt wrote:

    Congratulations, beautiful idea ! Plausible deniability is very high indeed.

    Posted 30 Dec 2009 at 3:39 pm
  5. Aaron wrote:

    My code review would start with “Where is BUFSIZE declared?”. I’m assuming there are header files being #include’d which have been omitted, or perhaps you are defining it on the command line?

    Posted 31 Dec 2009 at 12:29 pm
  6. Lalit Mishra wrote:


    Posted 01 Jan 2010 at 7:10 am
  7. Craig wrote:


    Posted 01 Jan 2010 at 7:47 pm
  8. Simon Strandgaard wrote:

    Pure evil. Fantastic well put together piece of code that looks fairly innocent. Thank you for opening my eyes for this kind of dangerous code.

    Keep up the good (or should I say evil) work!

    Posted 11 Jan 2010 at 9:27 am
  9. Deepak Jangid wrote:

    Awesome concept man, you really are a winner !!!

    Posted 04 Sep 2010 at 3:18 am
  10. jonas wrote:

    To Aaron: it’s an effing macro in the C99 standard, defined in by stdio.h.

    Posted 10 Oct 2010 at 7:14 am
  11. ExplicatioExMachina wrote:

    Nice. OpenSource wouldn’t help here to find this backdoor.

    Is it a bug or a feature of PPM to read ‘000’s as ‘0’?

    Posted 15 Dec 2010 at 4:24 am
  12. Peter da Silva wrote:

    Should have parsed the PPM parameters without using fscanf. I hardly ever use *scanf, it makes error handling and recovery too hard.

    Posted 23 Dec 2010 at 4:48 am
  13. Jake wrote:

    You misdirected me with this line:

    if(x > rx * 3 && x ry && y < ry + rheight)

    Is there a reason y isn't paralleling x in this case?

    y < ry + rheight


    x <= (rx + rwidth)

    Posted 23 Dec 2010 at 10:23 am
  14. Craig wrote:

    Interesting. I actually saw the bug, but I suspect it was because between knowing you were supposed to be hiding something (so writing 0 and 000 is inconsistent) combined with ignorance (I had to lookup the PPM file format and refresh myself on C-programming [what was that loop doing :-)] ) that I didn’t just gloss over ‘obvious’ behavior. Definitely interesting in that for most purposes… the program ‘works’

    Posted 23 Dec 2010 at 11:12 am
  15. Jesus of Nazareth wrote:

    I don’t get it

    Posted 23 Dec 2010 at 11:39 am
  16. Witold Baryluk wrote:

    I also have been thinking about this problem, but eventually give it up, as I have no idea how I can hide the fact that I do something evil (any binary tricks, or permutations, will be very susicious).

    Your solution is very good, brillant, and well deserves winning. Congratulations.

    Posted 23 Dec 2010 at 1:52 pm
  17. Witold Baryluk wrote:

    Ah I forgot one question to ask. What is BUFSIZE and where it is defined (and what are includes)?

    Posted 23 Dec 2010 at 1:53 pm
  18. Robert Robertson wrote:

    Nobody actually uses scanf in production code, and if they do, they wouldn’t get where the leak is anyways.

    Posted 23 Dec 2010 at 3:06 pm
  19. JohnH wrote:

    This would only work if I didn’t know that the PPM specification state that no line in the file should be longer than 70 characters, which removes the excuse of not using fgets and a line length of appropriate size.

    Posted 23 Dec 2010 at 4:55 pm
  20. Janne wrote:

    I think you can justify this kind of a coding style to a spook also in performance terms: you don’t perform the costly ascii-to-integer parsing, just a very cheap isdigit() call.

    Very commendable.

    Posted 24 Dec 2010 at 3:45 am
  21. brian wrote:

    If I was the Spook would reject the code as not having enough comments on whats going on. Plus the code layout is setup to hide problems. Code would not get through a normal code review here, let alone a secure one!

    Sorry – its just bad code!

    The penality here for code llike that is a round of drinks!

    Posted 24 Dec 2010 at 4:32 am
  22. Ralph wrote:


    Posted 25 Dec 2010 at 10:25 am
  23. Ranju V wrote:

    Very slick indeed!

    Posted 25 Dec 2010 at 11:10 pm
  24. Anonymous wrote:

    a short comment about scanf:

    actually nobody uses scanf, even though the PPM line should not be larger than 70 characters because you don’t whether some stupid user uses a misplaced file 😉
    -> nobody likes programms with security lacks especially not if they’re obvious to any programmer 😉

    on the other side, you could have used fgets and sscanf in combination.

    But still, just gread work.
    You deserve it!

    Posted 26 Dec 2010 at 9:59 am
  25. Rilhas wrote:

    The usage of fgets does not excuse the programmer. The fact that fgets takes a max buffer size parameter should alert the programmer that he/she should think a bit about whether or not the file could contain lines longer than that buffer size and, if so, what the consequences would be. So the programmer shouldn’t be able to simply not consider cases where the files contain lines larger than the buffer.

    The mere act of selecting a buffer size should not be random and, so, should cause the programmer to stop and think about it (using an already defined symbol tucked away in some standard include is no better).

    So the spook either dismisses it as incompetence or believes it was deliberate. Depending on the programmer’s history and persuasion skills the incompetence scenario may be hard to swallow by the spook.

    The fact that it can be missed when reading the code is another matter altogether. I missed it, sure. But if I was writing the code I wouldn’t make such a bug. It would be my job not to make such a mistake. I’m not being paid just to press keys on the keyboard, my bosses expect me to be using my brain well and keep all my senses on high alert.

    And if I did write the bug (perhaps while drunk or under the influence of hallucinogenic drugs) I would blame myself for being stupid, I would never think of it as “oops… there’s no way a fine-and-dandy programmer like me could ever catch that one… moving on to the next application…”.

    I always thought that medical doctors, lawyers, psychologists, and possibly others, should all be necessarily good. Maybe even excellent. These professions have no room for the “sufficient”, “not-so-good”, etc., all those have to be weeded out, simply because people can get really hurt if a certain minimum level of quality in their work cannot be guaranteed. Mediocre is not enough, because those, on average, fail a number of times comparable to the number of times they succeed. And that is just not good enough if you happen to be their patient/client on one of the bad days, so all days have to be good days. Good professions have an extremely high ratio of good to bad days.

    Back in the 80’s I didn’t think programming would end up being one of those professions. But, in fact, today people can get really hurt from bad code, either because some company goes bankrupt after some security flaw is exploited and all its 150 workers lose their jobs, or because users trusted some application implicitly and lost a lot of money when their credit card number was stolen by some malware or virus. And many other scenarios in between.

    So, looking at how things are made today, I think the IT industry is moving in the wrong direction. Well, at least the bulk of it. Hiring cheap novice or unexperienced programmers, or even flat out bad programmers, and then getting them to learn on the job doing final production code can endup being very damaging to third parties (the users of the software, for example). Even worse, keeping bad programmers just because they are cheap makes someone else (probably a third party, like a user) end up paying for the rest of the cost.

    In this case, hiring a rookie to do the code of this entry would, of course, be a bad idea (assuming it was supposed to be used in very critically important situations). A rookie could (and probably would) have made such a mistake, but a more seasoned programmer should have not. And if he/she did? Not seasoned enough then, his/her boss should have known that.

    We may be reaching a time when programmers should work like medical doctors. While they are rookies (for the first 10 years of their career) they are supervised, and do nothing without close observation by their superiors. They are, in fact, trainees.

    For medical doctors this minimizes loss of lives (and other negative consequences), but it is expensive. True. Somewhere along the way people started thinking that programming was easy and started developing all kinds of crappy software.

    At any given time I have something crashing on me, or not doing what it was supposed to do, caused by crappy software. My car’s automatic windshield wiper cannot be used in its most sensitive position, otherwise it sometimes enters a loop of max activity even if it is not raining. The programmer forgot IIR filters sometimes do that and didn’t take the appropriate measures. My phone sometimes enters a state where the screen is non-responsive, but calls can still come in (although I’m unable to answer them). This is a known software bug on the phones. My set top box at home sometimes crashes when the right conditions are met (for example, I have a recording, which I keep for fun, that makes the device crash every time I rewinding to the point where the actor was crossing a door).

    We are surrounded by software-based crap, done without quality. Many of us are the culprits, we want to by brand new, unproven, flashy, fast, and cheap, and there is still no real culture for demanding quality and assurances from gadget makers as opposed to an already well established culture of selecting refrigerators and dishwashers based on how many decades their manufacturers guarantee their devices to work correctly.

    … well, I digress. Very nice post, an excellent example of how to make crappy code if we are not careful.



    Posted 27 Dec 2010 at 7:10 am
  26. nike free singapore wrote:

    I’m impressed, I have to say. Really not often do I encounter a weblog that’s each educative and entertaining, and let me tell you, you have hit the nail on the head. Your concept is outstanding; the problem is something that not enough people are talking intelligently about. I am very happy that I stumbled throughout this in my seek for something referring to this.

    Posted 13 May 2011 at 10:01 pm
  27. Seriously People wrote:

    Only a kiddyscripter would dare trust a buffer to not overflow because “I followed the format specification” (good way to get a “verified parser” to perform something unexpected too!)

    Lets say you were reading these values, SPECIFIED as bytes, but came across a line like 102121 244 244, who knows how that happened, maybe a program hiccuped while writing it.. doesn’t matter, but now your parser has to handle it; except it doesn’t expect it. (and I’m not 100% certain on the defined behavior of setting a char to 102121 via scanf %i which does have a potential to go poorly without a runtime check, but the problem is that you expect a line no longer than 3c+1c+3c+1c+3c+1c, hence buffer overflow)

    Saying “Oh, well… I’ll just make the buffer really large” doesn’t prevent a TARGETED attack, especially when the code is open source. Lets say an XML parser in Firefox had a buffer of 4MB. This is easily excessive; except when an attacker creates a webpage with no XML content (so it’s 4MB of pure zeros), parser is looking for a “<" to start, never finds it… and then overflows; and an attacker just gained system level access to your computer.

    But what makes this leak even more ingenious is that writing the "good code" would add in more eye openers because it would involve multiple continues and state variables. People are really bad at thinking in states, again why this code works so well, and when you start adding them in it makes "ugly code that looks like it is hiding something."

    And what is really good is that open source code DOES contain stuff like this where people miss the slightest of things… for YEARS on end. This is a patch that fixes a well known problem in an "okay but not great" way; and visually works. 'Many eyes' be damned, no one is going to look at that code again except to fix compatibility issues…

    Posted 25 Mar 2015 at 8:43 pm
  28. wahoo bar wrote:

    Howdy this is kind of of off topic but I was wanting to know if blogs use WYSIWYG editors or if you have to manually code with HTML. I’m starting a blog soon but have no coding knowledge so I wanted to get advice from someone with experience. Any help would be greatly appreciated!

    Posted 24 May 2023 at 12:42 am
  29. Emanuel Fugle wrote:

    Toller Ort, um Rolladen zu kaufen. Tag-Nacht mit Führern, bezeichnend Moskitonetze. Gute Verarbeitungsqualität, alles passt, heutzutage passiert nichts Schlimmes. Dazu eine große Auswahl an Materialien. Ich empfehle! Professioneller Service und Konsultation, ich habe alle Dienstleistungen des Unternehmens (Rollläden) viele Male in Anspruch angewendet, sowohl für mich als auch für meine Familie, und es war immer alles perfekt.

    Posted 25 May 2023 at 12:14 pm
  30. onwin giriş wrote:

    Thank you for content. Area rugs and online home decor store. Hello Administ . Onwin , Onwin Giriş , Onwin Güncel Giriş , Tiklayin. Onwin

    Posted 25 May 2023 at 1:36 pm
  31. ufa คาสิโน wrote:

    สล็อต168 เว็บตรงไม่ผ่านเอเย่นต์ ฝาก-ถอนไม่มีอย่างต่ำ เว็บสล็อตอันดับ 1 เว็บไซต์สล็อตฝากถอนไม่มีขั้นต่ำ เปิดบริการเกี่ยวกับเกมสล็อตออนไลน์มาอย่างช้านาน มีค่ายเกมส์ให้เลือกเล่นล้นหลามหลายกหลายค่ายเกมส์ เรียกง่ายๆว่า รวมสล็อตทุกค่าย เอาไว้ภายในเว็บไซต์เดียวแล้ว ลงทะเบียนเป็นสมาชิกกล้วยๆไม่กี่ขั้นตอนพร้อมรับโปรโมชั่นฝากแรกได้ทันทร มีทีมงานมีอาชีพคอยให้คำปรึกษาตลอด 24 ชม. สล็อตออนไลน์ ฝาก-ถอนเร็วด้วยระบบออโต้จำต้อง แค่นั้นทำรายการไม่ถึง 10 เร็วโคตรๆแถมยังโปรโมชั่นเยอะ แอ ป ทุจริต เกม แตกง่าย จ่ายจริง ไม่มีกลั้น ไม่มีอย่างน้อย 1 บาทก็เล่นได้แล้ว เว็บเดียวจบครบทุกบริการ มีเจ้าหน้าที่รอดูแลตลอด 24 ชั่วโมง สล็อต168 ผู้ให้บริการสล็อตออนไลน์อันดับ 1 ในไทย ให้บริการมายาวนานมากกว่า 10 ปี เนื่องจากว่าพวกเราคือ สล็อตเว็บตรงไม่ผ่านเอเย่นต์ มีใบเซอร์ยืนยันจากต่างประเทศ เล่นง่ายได้เงินจริง เว็บตรง มั่นคง ไม่มีอันตราย เปิดให้บริการแต่ละวัน ไม่มีหยุด ตลอด 24 ชั่วโมง

    Posted 27 May 2023 at 4:15 am
  32. Raymondmaw wrote:

    Healy company offers innovation in the world of medical care – Healy wave device, which cures sicknesses at all levels – energy, mental and physical. Based on the fundamentals of quantum physics, using progressive knowledge and developments in the field of psychology and physiotherapy. Portative home doctor who painlessly effectively treats and counteracts onset of diseases – best diverse spectrum, more than 5 thousand, and catalogue is growing. Partial list of treatment apps – digital su-Jock, digital homeopathy, digital herbal medicine, digital flower therapy, digital health nutrition and diet, i-ging, Schuessler salts, Alaskan gem elixirs, Australian bush flowers, macrobiotics, manual therapy elements, to in -depth meditation and relaxation programs.

    Know More

    Posted 28 May 2023 at 7:26 am

Trackbacks & Pingbacks 5

  1. From Underhanded C: The Leaky Redaction | Sole Genius on 23 Dec 2010 at 5:47 pm

    […] by sanitybit to programming [link] [32 comments] reddit: the voice of the internet — news before it happens @ […]

  2. From Tomatensuppe on 27 Dec 2010 at 4:59 am

    Underhanded C: The Leaky Redaction | Not A Number…

  3. From Linux Flame - Seite 32 on 21 May 2011 at 5:06 am

    […] kann aber den Source lesen und daher ists viel sicherer!!1!" Thematik immer wieder aufkommt: Not A Number – Underhanded C: The Leaky Redaction backdoored Imagebearbeitungsprogramm vom obigem Link    […]

  4. From Internet MyBB-Downloads waren infiziert on 26 Oct 2011 at 4:20 am

    […] […]

  5. From What is noninterference, and how do we enforce it? - The PL Enthusiast on 03 Mar 2015 at 11:00 am

    […] When I learned about the contest, I wondered: Do the human judges have automated tools at their disposal? Automated analysis would seem practically important when trying to detect subtle information leaks (which do occur in the real world, e.g., as with the Lucky Thirteen attack). John Meacham, the winner of the 2008 version of the contest suggests that automated tools will not be of help. He writes: […]

Post a Comment

Your email is never published nor shared. Required fields are marked *