Didier Stevens

Wednesday 1 July 2009

Embedding and Hiding Files in PDF Documents

Filed under: My Software,PDF — Didier Stevens @ 6:28

My corrupted PDF quip inspired me to program another steganography trick: embed a file in a PDF document and corrupt the reference, thereby effectively making the embedded file invisible to the PDF reader.

The PDF specification provides ways to embed files in PDF documents. I’m releasing my Python program to create a PDF file with an embedded file (I used make-pdf-embedded.py to create my EICAR.pdf).

Here’s what a PDF document with an embedded file looks like:

[Screenshot: 20090630-220314]

/EmbeddedFiles points to the dictionary with the embedded files:

[Screenshot: 20090630-220228]

As names defined in the PDF specification are case sensitive, changing the case changes the semantics: /Embeddedfiles has no meaning, and thus the PDF reader ignores it and doesn’t find the embedded file.

[Screenshot: 20090630-220137]

[Screenshot: 20090630-215901]
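
The case change described above can be sketched in a few lines of Python. This is only an illustration, not the code of make-pdf-embedded.py: the function name `hide_embedded_files` and the sample bytes are hypothetical, and the sketch assumes the name appears literally in an uncompressed object. Because both spellings have the same length, no byte offsets shift and the cross-reference table stays valid.

```python
def hide_embedded_files(pdf_bytes: bytes) -> bytes:
    """Change the case of /EmbeddedFiles so a reader no longer recognizes it."""
    original = b"/EmbeddedFiles"
    altered = b"/Embeddedfiles"
    # Same length on purpose: object offsets in the xref table are unchanged.
    assert len(original) == len(altered)
    return pdf_bytes.replace(original, altered)


# Hypothetical fragment of a name dictionary, for illustration only.
sample = b"<< /Names << /EmbeddedFiles 5 0 R >> >>"
hidden = hide_embedded_files(sample)
print(hidden)
```

To try it on a real file, read the file in binary mode, pass the bytes through the function, and write the result to a new file.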

In fact, I used this trick in my Brucon puzzle, via the --stego option of make-pdf-embedded.py:

[Screenshot: 20090630-222453]

Of course, once you know the stego trick, it’s easy to recover the embedded file: edit the PDF document with a hex editor and change the case back to /EmbeddedFiles.
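
Instead of a hex editor, the recovery step could also be scripted. The following is a minimal sketch under the assumption that the altered name is stored as plain text in the file (not hex-escaped and not inside a compressed object stream). The helper names `find_case_variants` and `restore_embedded_files` are made up for this example.

```python
import re

# /EmbeddedFiles in any letter case, not followed by further name characters.
NAME_RE = re.compile(rb"/embeddedfiles(?![A-Za-z0-9])", re.IGNORECASE)
CANONICAL = b"/EmbeddedFiles"


def find_case_variants(pdf_bytes: bytes):
    """Return (offset, name) pairs whose spelling differs from the spec name."""
    return [(m.start(), m.group())
            for m in NAME_RE.finditer(pdf_bytes)
            if m.group() != CANONICAL]


def restore_embedded_files(pdf_bytes: bytes) -> bytes:
    """Rewrite every case variant back to /EmbeddedFiles (same length)."""
    return NAME_RE.sub(CANONICAL, pdf_bytes)


# Hypothetical fragment with the name already altered.
sample = b"<< /Names << /Embeddedfiles 5 0 R >> >>"
variants = find_case_variants(sample)
restored = restore_embedded_files(sample)
print(variants)
print(restored)
```

A non-empty result from `find_case_variants` is a reasonable hint that this trick was used, although it proves nothing by itself.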

But if you want to make it harder to detect, use PDF obfuscation techniques. Or embed the file twice with incremental updates. First version is the file you want to hide, second version is a decoy…
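
To look for an older version hidden behind an incremental update, one rough approach is to cut the file at each %%EOF marker, since every update is appended after the previous end-of-file marker. The sketch below assumes exactly that; the function name `split_revisions` is hypothetical. Note that %%EOF can also occur inside stream data, and that linearized PDFs contain two markers without any update, so the result needs manual checking.

```python
def split_revisions(pdf_bytes: bytes):
    """Return one candidate revision per %%EOF marker, oldest first."""
    marker = b"%%EOF"
    revisions = []
    pos = pdf_bytes.find(marker)
    while pos != -1:
        end = pos + len(marker)
        # Everything up to and including this marker is a candidate revision.
        revisions.append(pdf_bytes[:end])
        pos = pdf_bytes.find(marker, end)
    return revisions


# Hypothetical, heavily abbreviated file with one incremental update.
sample = b"%PDF-1.1\n(first version)\n%%EOF\n(update)\n%%EOF\n"
revisions = split_revisions(sample)
print(len(revisions))
```

Saving the first element of the list to a new file gives the document as it was before the update, which is where the hidden file would be in the decoy scenario.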

The PDF language offers so many features to hide and obfuscate data!

Download:

make-pdf_V0_1_2.zip (https)

MD5: 305D57692C27DD3CD91D8C85A3932948

SHA256: A030BBCB8B54137D8047A4CB5C350725599383A4B113CABBA8871AC221378C5B

51 Comments »

  1. Great work thanx y0!

    Comment by Anarchy Angel — Wednesday 1 July 2009 @ 15:11

  2. […] Data leakage anybody? Didier is at it again. Embedding and Hiding Files in PDF Documents << Didier Stevens Tags: ( pdf […]

    Pingback by Interesting Information Security Bits for 07/01/2009 | Infosec Ramblings — Wednesday 1 July 2009 @ 20:34

  3. That is impressive. Great job.

    Comment by jamessmith — Thursday 2 July 2009 @ 2:09

  4. Nice post and many thanks for the tool. I hope to integrate this into a toolkit I’ve used on pen tests. Currently, to move files from client to server, I XOR an exe/dll and prepend a JPG header. That’s an ugly approach, but some AV injects into IE and looks at file transfers and JPGs (even using SSL). If it sees the PE header it will stop the transfer. Using a PDF like this is much more elegant in my mind. 🙂

    Comment by Matthew Wollenweber — Thursday 2 July 2009 @ 14:45

  5. You’re welcome! If you want to change the encoding, use a different filter option.

    But be aware that most PDF readers (that support embedded files) don’t allow the extraction (saving/opening) of executables. Several file extensions are blacklisted: .exe, .dll, .js, .vbs, …. So use .txt instead (like I did for eicar.pdf).

    Interestingly, .py is not blacklisted by Adobe or Foxit.

    Comment by Didier Stevens — Thursday 2 July 2009 @ 15:33

  6. […] rapidement comment corrompre la référence d’un fichier intégré dans un document pdf et ainsi le rendre invisible. Outil à télécharger et autres conseils à lire… Lire aussi PRECEDENTArticle […]

    Pingback by - CNIS mag — Thursday 2 July 2009 @ 15:57

  7. […] @ 20:27 Today, I’m showing you how you can patch your PDF reader (Foxit or Adobe) to handle PDF documents with hidden embedded files. And for Foxit, there’s a bonus: Foxit Reader can also embed files into existing PDF […]

    Pingback by Patching PDF Readers to Support Hidden Embedded Files « Didier Stevens — Monday 6 July 2009 @ 20:28

  8. […] security blogger and hacker extraordinaire Didier Stevens recently posted this entry all about hiding data in PDF files. My corrupted PDF quip inspired me to program another steganography trick: embed a file in a PDF […]

    Pingback by Abusing PDFs « Security For All — Wednesday 8 July 2009 @ 21:03

  9. Isn’t this more ingenuity than you need? Open the PDF file in a text editor and paste as much text as you want between the end of any object and the start of the next. No code needed.
    Of course, it’s easier to detect something like this, but Acrobat Reader doesn’t care about it at all.

    Comment by Skott Klebe — Tuesday 14 July 2009 @ 20:38

  10. @Skott Klebe
    No, you’ll break the XREF table. And not all PDF readers will handle a broken XREF table.

    Comment by Didier Stevens — Tuesday 14 July 2009 @ 20:40

  11. […] was reading Didier Stevens’ posts on the creation of malicious PDF files and embedding other files within PDF files.  He mentions that he ran all his tests using Adobe Acrobat Reader 8.1.2 and Foxit Reader 2.2.  I […]

    Pingback by Chirashi Security » Malicious PDF files and embedding — Wednesday 15 July 2009 @ 5:35

  12. There are many ways to put data into a PDF file so that it will not be observed by “reader” apps. But this doesn’t mean that the data is hidden.

    Comment by Joel — Wednesday 15 July 2009 @ 7:46

  13. @Joel

    I wrote: >embed a file in a PDF document and corrupt the reference, thereby effectively making the embedded file invisible to the PDF reader.

    It’s hidden for the PDF reader, because it has no way to render or extract the embedded file when the /EmbeddedFiles name has been changed.

    Comment by Didier Stevens — Wednesday 15 July 2009 @ 8:52

  14. […] Embedding and Hiding Files in PDF Documents […]

    Pingback by Links for 15th July 2009 | Velcro City Tourist Board — Wednesday 15 July 2009 @ 22:03

  15. Hello,

    wouldn’t another way to embed hidden data in a PDF file be to add a new key/value pair to a dictionary object of the PDF file?
    The key would be something known, used to recover the data stored in the value.
    This thought came from a post found here: http://forums.adobe.com/message/2157533

    Thnx in advance,

    Tony

    Comment by Tony — Sunday 8 November 2009 @ 19:34

  16. Yes, adding a new key/value pair to an existing dictionary works too. To avoid unwanted side-effects, use a key with no meaning.

    Comment by Didier Stevens — Tuesday 10 November 2009 @ 16:30

  17. This trick is really cool!!!

    Thank you!!!

    Simon (from aero mexico)

    Comment by simon — Friday 4 December 2009 @ 21:18

  18. I would like some help as I learn to dissect malicious PDF files. In my example below I am having issues deciphering the string…

    obj 4 0
    Type:
    Referencing:
    Contains stream

    <>

    Immediately following is the compressed JavaScript, which I was able to decompress. However, I have had no luck with the above. Also, I am not able to find an IP/name that the malware could be calling home to. I am hoping it is in the string I have attached.

    Thank you for any advice during this learning exercise for me.

    Comment by Jim P — Thursday 28 January 2010 @ 19:58

  19. @Jim P: it looks like that was lost with the copy/paste…

    Comment by Didier Stevens — Thursday 28 January 2010 @ 20:36

  20. Take 2

    /Length 2901
    /#46#69#6cter [ /AS#43I#49#48#65#78D#65co#64#65 /#4c#5aW#44eco#64e /AS#43I#4985#44e#63o#64#65 /R#75#6eL#65#6e#67#74h#44ecode /#46#6c#61#74#65D#65co#64#65 ]

    /Length 2901

    /Filter [
    /ASCIIHexDecode /LZWDecode
    /ASCII85Decode /RunLengthDecode
    /FlateDecode ]

    Comment by Jim P — Friday 29 January 2010 @ 21:02

  21. @Jim P

    This stream is encoded with a cascade of 5 filters: /ASCIIHexDecode /LZWDecode /ASCII85Decode /RunLengthDecode /FlateDecode
    The names are obfuscated with hex escapes, like this: /#46#69#6cter -> /Filter

    Comment by Didier Stevens — Sunday 31 January 2010 @ 22:23

  22. /#46#69#6cter [ /AS#43I#49#48#65#78D#65co#64#65 /#4c#5aW#44eco#64e /AS#43I#4985#44e#63o#64#65 /R#75#6eL#65#6e#67#74h#44ecode /#46#6c#61#74#65D#65co#64#65

    So I convert the hex to…

    /Filter [ /ASCIIHexDecode /LZWDecode /ASCII85Decode /RunLengthDecode /FlateDecode ]

    So looking at this, I assume your pdf-parser.py did it for me, since in my example it is already decoded in the lines below.

    Also, I assume the filter string tells the application how to decode the blob of code that is associated with the same object within the PDF? I have been able to take the JavaScript blob and decode it. I can see the exploit it is using, etc. I cannot see what the phone home address is. My goal is to find the address the malware talks back to.

    Comment by Jim P — Sunday 31 January 2010 @ 23:39

  23. @Jim P: Take a look at my video: https://blog.didierstevens.com/2008/10/20/analyzing-a-malicious-pdf-file/

    Comment by Didier Stevens — Monday 1 February 2010 @ 9:12

  24. […] that’s a somewhat sophisticated step in itself. I’m not saying it can’t be done, in fact Didier has a method detailed here, but my resources relay that it’s somewhat […]

    Pingback by PDFs Exploitable?!? I’m shocked… | ESET ThreatBlog — Tuesday 6 April 2010 @ 19:30

  25. is there a way to run an exe file through pdf

    Comment by Anonymous — Wednesday 7 April 2010 @ 18:09

  26. @Anonymous Yes, read my last posts.

    Comment by Didier Stevens — Wednesday 7 April 2010 @ 20:26

  27. […] that’s a somewhat sophisticated step in itself. I’m not saying it can’t be done, in fact Didier has a method detailed here, but my resources relay that it’s somewhat […]

    Pingback by Triflex Enterprise | PDFs Exploitable?!? I’m shocked… — Thursday 22 April 2010 @ 5:20

  28. Hi man, great trick!! I tried to make a PDF file with the -a option, but I get no output file and no embedded exe in my source PDF. Where did I go wrong? Do I have to install Adobe Acrobat, or is the Reader enough?
    Can you help me, please?

    Comment by Probbe — Thursday 6 May 2010 @ 10:40

  29. @Probbe This doesn’t work with a source PDF, it creates a new PDF, it doesn’t update an existing PDF.

    Comment by Didier Stevens — Thursday 6 May 2010 @ 20:50

  30. thnnnnnnxxxxxx a lot

    Comment by Tarun — Monday 7 June 2010 @ 6:26

  31. Hi Didier, I’m trying to add a text message to the embedded file, but with no success.
    This is the command:
    make-pdf-embedded.py -a -s -m hola a todos nano.exe nano.pdf but there is no output
    Is something wrong with the options?

    Comment by pig — Monday 19 July 2010 @ 12:20

  32. @pig Try make-pdf-embedded.py -a -s -m “hola a todos” nano.exe nano.pdf
    But Adobe Reader & Foxit Reader do not allow you to extract executables.

    Comment by Didier Stevens — Monday 19 July 2010 @ 16:28

  33. So what can we do to extract exe files? I read all of your articles and I know the results, but I don’t really understand how to do it… Is there anything to change in the command line?
    make-pdf-embedded.py -b type.exe output.pdf… ?

    Does the exe have to be already located on the computer, like calc.exe or notepad.exe, or could it be our own exe?

    Because I’ve tried: I have the PDF, it says “this pdf embeds type.exe”, I have the button, but when I click, nothing happens…

    Thanks for your answer

    Comment by Charles12 — Thursday 21 October 2010 @ 21:24

  34. Hi all,
    I tried to use the make-pdf-embedded.py script…
    I copied the pdf and exe in python folder… BUT!!!!
    when i use the script as shown in the screenshot i get a pdf output with one liner that “this pdf embeds….”
    the exe does not get embedded and the pdf content get overwritten with this one liner..
    am i missing something????
    help is appreciated..
    thanks

    Comment by syano — Monday 13 December 2010 @ 10:45

  35. @syano This has already been discussed in the comments, what you want to do doesn’t work. See for example #6 and #30.

    Comment by Didier Stevens — Monday 13 December 2010 @ 17:37

  36. I tried to use this but it results to an error:
    File “C:\Embed\make-pdf-embedded.py”, line 99
    print ”
    ^
    SyntaxError: invalid syntax

    what should I do?

    Comment by Air Force — Thursday 19 May 2011 @ 5:01

  37. Use Python 2.6 and not 3.x.

    Comment by Didier Stevens — Sunday 22 May 2011 @ 10:49

  38. […] possible to create a pure ASCII PDF file that embeds a binary file. Here are the steps to drop a binary […]

    Pingback by Teensy PDF Dropper Part 1 « Didier Stevens — Wednesday 13 July 2011 @ 21:40

  39. Is it possible to modify this script to attach multiple files to the same newly created pdf? I would also want to be able to use the -b (button) option and have a button made for each file? Would also like to know if you have a script for encrypting a pdf (or again a way to modify this script to add the feature). Basically I’m looking to automate the process of encrypting one or more files in a pdf as a more secure way of emailing them to someone. Thanks for any help/suggestions in advance.

    Comment by Eric Smith — Tuesday 19 February 2013 @ 21:35

  40. @Eric Yes, you can modify the script to achieve this.
    And qpdf does encryption transformations.

    Comment by Didier Stevens — Wednesday 20 February 2013 @ 20:17

  41. Any chance you would be willing to point me in the right direction? I know nothing of Python. Also, is qpdf another of your scripts I could download elsewhere, or is it already part of this one? Although I am very computer literate and not so bad at Windows bat file/command line scripting, I would be lying if I said I could follow along with more than a third of the content in this script. Sorry to push the matter, and thank you in advance for any reply.

    Comment by Eric Smith — Wednesday 20 February 2013 @ 20:26

  42. @Eric qpdf is software you find on Sourceforge. Look at the lines oPDF.indirectobject(7… and oPDF.stream2(8… in the script, they create the embedded file. You’ll have to copy and adapt these for a second file.

    Comment by Didier Stevens — Wednesday 20 February 2013 @ 21:36

  43. Slightly off topic but…. we have PDF documents that we wish to issue to users with a note showing an expiry date for the document (e.g. “This document is subject to frequent revision. Do not use after 12/31/2013”).

    We cannot modify the actual pdf document per se but wonder if there is a way of perhaps embedding the original PDF in another pdf wrapper that we create that contains the expiry message.

    Any ideas?

    Comment by Anonymous — Wednesday 31 July 2013 @ 19:02

  44. @Anonymous Yes, I know how to do this. Contact me at didier.stevens@gmail.com.

    Comment by Didier Stevens — Wednesday 31 July 2013 @ 19:40

  45. Didier, does this still work?

    Comment by Anonymous — Wednesday 16 October 2013 @ 15:53

  46. @Anonymous Yes.

    Comment by Didier Stevens — Thursday 17 October 2013 @ 17:47

  47. Thanks!

    Comment by Anonymous — Thursday 17 October 2013 @ 18:23

  48. … Thanks for sharing! This is really useful…

    Comment by Ligeti78 — Saturday 14 December 2013 @ 17:31

  49. When I want to hide a file, the contents of my PDF disappear and are replaced with “This PDF document embeds file”. Why?

    Comment by Shervin — Tuesday 2 July 2019 @ 15:39

  50. That’s by design.

    Comment by Didier Stevens — Thursday 4 July 2019 @ 23:09
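
The hex-escaped names discussed in comments 20 to 22 can be decoded mechanically: in a PDF name, a # followed by two hex digits stands for the character with that code. The sketch below shows the idea on the filter line quoted in the thread. It is not the code of pdf-parser.py, and `decode_pdf_name` is a hypothetical helper name.

```python
import re

# A '#' followed by exactly two hex digits encodes one byte of a PDF name.
HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_pdf_name(data: bytes) -> bytes:
    """Replace every #xx escape with the byte it encodes."""
    return HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


print(decode_pdf_name(b"/#46#69#6cter"))  # b'/Filter'

# The obfuscated filter array quoted in comment 20.
obfuscated = (b"/#46#69#6cter [ /AS#43I#49#48#65#78D#65co#64#65 "
              b"/#4c#5aW#44eco#64e /AS#43I#4985#44e#63o#64#65 "
              b"/R#75#6eL#65#6e#67#74h#44ecode "
              b"/#46#6c#61#74#65D#65co#64#65 ]")
decoded = decode_pdf_name(obfuscated)
print(decoded)
```

Applying the function to raw file contents is only safe on dictionaries; stream data may contain # bytes that are not name escapes.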

