Didier Stevens

Thursday 31 May 2018

PDFiD: GoToE and GoToR Detection (“NTLM Credential Theft”)

Filed under: My Software,PDF — Didier Stevens @ 0:00

The article “NTLM Credentials Theft via PDF Files” explains how PDF documents can refer to a resource via UNC paths. This is done using  PDF names /GoToE or /GoToR.

My tool pdfid.py can now be extended to report /GoToE and /GoToR usage in a PDF file, without having to change the source code. You just have to edit the pdfid.ini file (or create it) to include these names, like this:


Using pdfid configured like this on a “credential stealing PDF” gives the following result:

pdfid.ini has to be located in the same directory as pdfid.py. And remember that names in the PDF language are case-sensitive.


Friday 29 December 2017

Cracking Encrypted PDFs – Conclusion

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

TL;DR: PDFs protected with 40-bit keys can not guarantee confidentiality, even with strong passwords. When you protect your PDFs with a password, you have to encrypt your PDFs with strong passwords and use long enough keys. The PDF specification has evolved over time, and with it, the encryption options you have. There are many encryption options today, you are no longer restricted to 40-bit keys. You can use 128-bit or 256-bit keys too.

There is a trade-off too: the more advanced encryption option you use, the more recent the PDF reader must be to support the encryption option you selected. Older PDF readers are not able to handle 256-bit AES for example.

Since each application capable of creating PDFs will have different options and descriptions for encryption, I can not tell you what options to use for your particular application. There are just too many different applications and versions. But if you are not sure if you selected an encryption option that will use long enough keys, you can always check the /Encrypt dictionary of the PDF you created, for example with my pdf-parser (in this example /Length 128 tells us a 128-bit key is used):

Or you can use QPDF to encrypt an existing PDF (I’ll publish a blog post later with encryption examples for QPDF).

But don’t use 40-bit keys, unless confidentiality is not important to you:

I first showed (almost 4 years ago) how PDFs with 40-bit keys can be decrypted in minutes, using a commercial tool with rainbow tables. This video illustrates this.

Later I showed how this can be done with free, open source tools: Hashcat and John the Ripper. But although I could recover the encryption key using Hashcat, I still had to use a commercial tool to do the actual decryption with the key recovered by Hashcat.

Today, this is no longer the case: in this series of blog posts, I show how to recover the password, how to recover the key and how to decrypt with the key, all with free, open source tools.

Overview of the complete blog post series:


Thursday 28 December 2017

Cracking Encrypted PDFs – Part 3

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

I performed a brute-force attack on the password of an encrypted PDF and a brute-force attack on the key of (another) encrypted PDF, both PDFs are part of a challenge published by John August.

The encryption key is derived from the password. it’s not just based on the password only, but also on metadata. This implies that different PDFs encrypted with the same user password, will have different encryption keys.

When you recover the user password of an encrypted PDF, you can just use it with PDF readers like Adobe Reader: they will ask you for the password, you provide it and the PDF will be decrypted and rendered.

But when you recover the key of an encrypted PDF, you can not use it with PDF reader: there is no feature that will allow you to input a key in stead of a password. The only method I knew to decrypt a PDF document with its encryption key, was to use Elcomsoft’s PDF cracking tool:

Now I worked out a second method: I modified the source code of QPDF so that it will accept encryption keys too. It’s a quick and dirty hack, I did not add a new option to QPDF but I “hijacked” the –password option. If the value to the option –password starts with string “key:”, then QPDF will not derive the key from the provided password, but it will use the key provided as hexadecimal characters. Here is how I use it to decrypt the “tough” PDF:

I also made a small modification to the –show-encryption option, to display the encryption key:

Update: I had an email exchange with Jay Berkenbilt, the author of QPDF, and he will look into this patch and possibly add a new key option to QPDF.

If you are interested in my modified version of QPDF, you can find the modified source code files and Windows binaries here:

qpdf-patched.zip (https)
MD5: 57E1A5A232E12B45D0A927181A1E8C3B
SHA256: 6F17E095B38AE72F229A6662216DDCE86057D2BA1C567B07FEF78B8A93413495

Update: this is the complete blog post series:

Wednesday 27 December 2017

Cracking Encrypted PDFs – Part 2

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

After cracking the “easy” PDF of John’s challenge, I’m cracking the “tough” PDF (harder_encryption).

Using the same steps as for the “easy” PDF, I confirm the PDF is encrypted with a user password using 40-bit encryption, and I extract the hash.

Since the password is a long random password, a brute-force attack on the password like I did in the first part will take too long. That’s why I’m going to perform a brute-force attack on the key: using 40-bit encryption means that the key is just 5 bytes long, and that will take about 2 hours on my machine. The key is derived from the password.

I’m using hashcat again, but this time with hash mode 10410 in stead of 10400.
This is the command I’m using:

hashcat-4.0.0\hashcat64.exe --potfile-path=harder_encryption.pot -m 10410 -a 3 -w 3 "harder_encryption - CONFIDENTIAL.hash" ?b?b?b?b?b

I’m using the following options:

  • –potfile-path=harder_encryption.pot : I prefer using a dedicated pot file, but this is optional
  • -m 10410 : this hash mode is suitable to crack the key used for 40-bit PDF encryption
  • -a 3 : I perform a brute force attack (since it’s a key, not a password)
  • -w 3 : I’m using a workload profile that is supposed to speed up cracking on my machine
  • ?b?b?b?b?b : I’m providing a mask for 5 bytes (I want to brute-force keys that are 40 bits long, i.e. 5 bytes)

And here is the result:

The recovered key is 27ce78c81a. I was lucky, it took about 15 minutes to recover this key (again, using GPU GeForce GTX 980M, 2048/8192 MB allocatable, 12MCU). Checking the complete keyspace whould take a bit more than 2 hours.

Now, how can we decrypt a PDF with the key (in stead of the password)? I’ll explain that in the next blog post.

Want a hint? Take a look at my Tweet!

Update: this is the complete blog post series:

Tuesday 26 December 2017

Cracking Encrypted PDFs – Part 1

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 17:15

In this series of blog posts, I’ll explain how I decrypted the encrypted PDFs shared by John August (John wanted to know how easy it is to crack encrypted PDFs, and started a challenge).

Here is how I decrypted the “easy” PDF (encryption_test).

From John’s blog post, I know the password is random and short. So first, let’s check out how the PDF is encrypted.

pdfid.py confirms the PDF is encrypted (name /Encrypt):

pdf-parser.py can tell us more:

The encryption info is in object 26:

From this I can conclude that the standard encryption filter was used. This encryption method uses a 40-bit key (usually indicated by a dictionary entry: /Length 40, but this is missing here).

PDFs can be encrypted for confidentiality (requiring a so-called user password /U) or for DRM (using a so-called owner password /O). PDFs encrypted with a user password can only be opened by providing this password. PDFs encrypted with a owner password can be opened without providing a password, but some restrictions will apply (for example, printing could be disabled).

QPDF can be used to determine if the PDF is protected with a user password or an owner password:

This output (invalid password) tells us the PDF document is encrypted with a user password.

I’ve written some blog posts about decrypting PDFs, but because we need to perform a brute-force attack here (it’s a short random password), this time I’m going to use hashcat to crack the password.

First we need to extract the hash to crack from the PDF. I’m using pdf2john.py to do this. Remark that John the Ripper (Jumbo version) is now using pdf2john.pl (a Perl program), because there were some issues with the Python program (pdf2john.py). For example, it would not properly generate a hash for 40-bit keys when the /Length name was not specified (like is the case here). However, I use a patched version of pdf2john.py that properly handles default 40-bit keys.

Here’s how we extract the hash:

This format is suitable for John the Ripper, but not for hashcat. For hashcat, just the hash is needed (field 2), and no other fields.

Let’s extract field 2 (you can use awk instead of csv-cut.py):

I’m storing the output in file “encryption_test – CONFIDENTIAL.hash”.

And now we can finally use hashcat. This is the command I’m using:

hashcat-4.0.0\hashcat64.exe --potfile-path=encryption_test.pot -m 10400 -a 3 -i "encryption_test - CONFIDENTIAL.hash" ?a?a?a?a?a?a

I’m using the following options:

  • –potfile-path=encryption_test.pot : I prefer using a dedicated pot file, but this is optional
  • -m 10400 : this hash mode is suitable to crack the password used for 40-bit PDF encryption
  • -a 3 : I perform a brute force attack (since it’s a random password)
  • ?a?a?a?a?a?a : I’m providing a mask for 6 alphanumeric characters (I want to brute-force passwords up to 6 alphanumeric characters, I’m assuming when John mentions a short password, it’s not longer than 6 characters)
  • -i : this incremental option makes that the set of generated password is not only 6 characters long, but also 1, 2, 3, 4 and 5 characters long

And here is the result:

The recovered password is 1806. We can confirm this with QPDF:

Conclusion: PDFs protected with a 4 character user password using 40-bit encryption can be cracked in a couple of seconds using free, open-source tools.

FYI, I used the following GPU: GeForce GTX 980M, 2048/8192 MB allocatable, 12MCU

Update: this is the complete blog post series:

Monday 27 November 2017

Update: pdfid.py Version 0.2.3

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

In this new version of pdfid.py, a new option was added: -n.

With this option, you can suppress output for names with a count of zero:

pdfid_v0_2_3.zip (https)
MD5: 65966E8BBF932D3C0830B755FDE094FE
SHA256: 9482176D173EFA6F2F33EE409B091BFA45685FC285B87F7219A4E9418B47F739

Monday 30 October 2017

Update: pdfid.py Version 0.2.2

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

I regularly get ideas to improve my tools when I give (private) training, and last week was not different.

This new version of pdfid.py adds a /URI counter, to help identify PDF documents with embedded URLs, used for phishing or social-engineering users into clicking on links.

I did not hardcode this new counter into the source code of pdfid.py, but it is listed in a new config file: pdfid.ini. You too can add your own identifiers to this configuration file.

pdfid_v0_2_2.zip (https)
MD5: 20614B44D97D48813D867AA8F1C87D4E
SHA256: FBF668779A946C70E6C303417AFA91B1F8A672C0293F855EF85B0E347D3F3259

Sunday 29 October 2017

Update: pdf-parser.py Version 0.6.8

Filed under: My Software,PDF,Update — Didier Stevens @ 15:32

This is a bugfix version.

pdf-parser_V0_6_8.zip (https)
MD5: 7702EEA1C6173CB2E91AB88C5013FAF1
SHA256: 3424E6939E79CB597D32F405E2D75B2E42EF7629750D5DFB39927D5C132446EF

Monday 24 April 2017

Bash Bunny PDF Dropper

Filed under: Hardware,My Software,PDF — Didier Stevens @ 0:00

More than 5 years ago, I worked out a technique to drop any file on a machine which has removable storage disabled. The technique used a Teensy to simulate a keyboard and type out a pure ASCII PDF to notepad. The PDF, containing an embedded executable, can then be saved and opened with a PDF reader to extract the embedded file.

I recently re-visited this technique with my Bash Bunny (it can also be done with a Rubber Ducky):

First I create a pure ASCII PDF file with an embedded executable using my make-pdf-embedded.py tool:

make-pdf-embedded.py -f fi80 -t -n Dialog42.exe.txt Dialog42.exe Dialog42.pdf

Option -f select the filters to use: f to deflate (zlib compress) and i80 to use hexadecimal lines of 80 characters to encode the compressed executable file in pure ASCII.

Option -t for pure text.

Option -n to choose the name used in the PDF document for the embedded file (files with extension .exe can not be extracted with Adobe Reader).

And then I create a Ducky Script script from the PDF with my python-per-line.py tool:

python-per-line.py "Duckify({})" -o payload.duck Dialog42.pdf

The payload.duck file can then be installed on my Bash Bunny, referenced from a payload.txt bash script like this:




QUACK STRING notepad.exe

QUACK switch1/payload.duck

Here is a video showing my Bash Bunny dropping this PDF file:

Thursday 20 April 2017

Malicious Documents: The Matryoshka Edition

Filed under: maldoc,Malware,PDF — Didier Stevens @ 0:02

I must admit that I was (patiently) waiting for the type of malicious document I’m about to describe now. First I’m going to analyze this document with my tools, and after that I’m going to show you some of the mitigations put in place by Adobe and Microsoft.

Malicious document 123-148752488-reg-invoice.pdf is a PDF with an embedded file and JavaScript. Here is pdfid’s report:

As we can notice from this report, the PDF document contains /JavaScript and an /OpenAction to launch this JavaScript upon opening of the PDF file, and also an /EmbeddedFile.

pdf-parser.py searching for JavaScript (option -s javascript) reveals that the JavaScript is in object 5:

Object 5 contains JavaScript (option -o 5 to select object 5, and option -f to decompress the stream with JavaScript):

This script (this.exportDataObject) will save the embedded file (996502.docm) to a temporary file and launch the associated application (if MS Office is installed, Word will be launched). A .docm file is a Word document with macros.

So let’s search for this embedded file:

The embedded file is stored in object 3, as a compressed stream (/FlateDecode).

So let’s decompress and extract the file with pdf-parser: option -f to filter (decompress) and option -d to dump the content. Since I expect the embedded file to be a Word document with macros, I’m going to analyze it with oledump. So in stead of writing the embedded file to disk, I’m going to extract it to stdout (-d -) and pipe it into oledump:

oledump‘s report confirms that it is a Word document with macros. I’m not going to spend much time on the analysis of the VBA code, because the intent of the code becomes clear when we extract all the strings found in the VBA code. First we select and extract all VBA code (options -s a -v) and then we pipe this into re-search to produce a list of unique strings (enclosed in double quotes) with these options: -n str -u

One of the extracted strings contains 3 URLs separated by character V. The macros will download an XOR encoded EXE file from these sites, decode it and execute it.


The first mitigation is in Adobe Reader: the embedded .docm file will not be extracted and launched without user interaction. First the user is presented a dialog box:

Only when clicking OK (the default option), will the .docm file be extracted and launched. Remark that the maldoc authors use some weak social engineering to entice the user to click OK: see in 996502.

When opened in Word, macros will be disabled:

This next mitigation is put into place by Microsoft Word: macros are detected, and by default, they are not executed. Here we see a better attempt at social engineering the user into executing the macros.

You might have expected that this document would be opened in Protected View first. After all, the PDF document was e-mailed to the victims, and Outlook will mark the PDF with a mark-of-web when it is saved to disk:

But Adobe Reader will not propagate that mark-of-web of the PDF document to the extracted Word document (at least the version I tested, version XI). Without mark-of-web, Word will open the document without Protected View.

Another simple mitigation for this type of malicious document that you can put into place but that is not enabled by default, is to disable JavaScript in Adobe Reader.

Remark that these documents do not contain exploits: they just use scripting.

Next Page »

Blog at WordPress.com.