Didier Stevens

Saturday 26 January 2019

Update: msoffcrypto-crack.py Version 0.0.3

Filed under: Encryption,My Software,Update — Didier Stevens @ 13:44

This is a bug fix update: for agile encryption, Python module msoffcrypto does not throw an exception in method load_key when an invalid password is provided. It throws an exception when an attempt is made to decrypt the file.

I added a call to method decrypt to handle this case.

msoffcrypto-crack_V0_0_3.zip (https)
MD5: 45BAB81D744DA62182EC58A8F2E05BFE
SHA256: CF9DE02C72C07C07786BE09551CD17F6DBB83BCEF2A1C5435E06A695D7C6770E

Monday 7 January 2019

Update: msoffcrypto-crack.py Version 0.0.2

Filed under: Encryption,My Software,Update — Didier Stevens @ 0:00

In this update of msoffcrypto-crack.py, two new options were added:

-e takes a text file and extracts all words from this text file to be used in the dictionary attack. Words are strings delimited by space characters. Words between single or double quotes, and words after string “password” are put at the beginning of the list for the dictionary attack.

The idea for option -e, is that you give it the content of an email message that contains the password of the encrypted attachment(s).

-c takes the password to decrypt the document. You use this option after the password was recovered (with option -p or -e for example), and need to run the tool again to decrypt the document. You can run the password cracking each time when you need to decrypt the document, but if this takes too long, then you just run it once and from then on provide the recovered password with option -c.

Password VelvetSweatshop was added to the embedded password list.

msoffcrypto-crack_V0_0_2.zip (https)
MD5: 010B7FA68FCF9CE84427815EFDFE1C42
SHA256: 6B368E40EEE8A907D444A49963B37F456A3645991201CE06F0E46A0F2E188A74

Monday 31 December 2018

New Tool: msoffcrypto-crack.py

Filed under: Encryption,maldoc,My Software — Didier Stevens @ 0:00

This is a new tool to recover the password of encrypted MS Office documents. I quickly put together this script to help with the analysis of encrypted, malicious documents.

This tool relies completely on Python module msoffcrypto to decrypt MS Office documents.

Since this is a Python tool based on a Python library, don’t except fast password recovery. This is more a convenience program.

It can recover passwords using a build-in password list, or you can provide your own list via option -p.

The tool can also decrypt the encrypted MS Office document if the password is recovered: used option -o to achieve this. Otherwise, the tool just displays the recovered password.

Like many of my tools, it can take its input from stdin and provide the decrypted document via stdout.

It’s developed with Python 2, and also tested on Python 3.

Read the man page for all the details: option -m.

msoffcrypto-crack_V0_0_1.zip (https)
MD5: F67060E0DE62727A1A69D0FD6F39013A
SHA256: 1466B94B56595BA0B91F0A2606F699E1D737E964F3F1A4DFDF7EAA47843DD063

Wednesday 10 October 2018

KEIHash: Fingerprinting SSH

Filed under: Encryption,My Software,Networking — Didier Stevens @ 0:00

keihash.py is a program to parse pcap files and calculate the KEIHash of SSH connections.

The KEIHash is the MD5 hash of the Key Exchange Init (KEI) data (strings). For obvious reasons, I could not call this an SSH fingerprint. This is inspired by JA3 SSL fingerprinting.

It can be used to profile SSH clients and servers. For example, the hash for the latest version of PuTTY (SSH-2.0-PuTTY_Release_0.70) is 1c5eaa56f3e4569385ae5f82a54715ee.

This is the MD5 hash of:


These are all the strings found in the Key Exchange Init packet, prefixed by their length and concatenated with separator ;.

With this, I’ve been able to identify SSH clients with spoofed banners attempting to connect to my servers.

keihash_V0_0_1.zip (https)
MD5: 674D019A739679D9659D2D512A60BDD8
SHA256: DB7471F1253E3AEA6BFD0BA38C154AF3E1D1967F13980AC3F42BB61BBB750490

Thursday 7 June 2018

Encrypted OOXML Documents

Filed under: Encryption,maldoc — Didier Stevens @ 0:00

The Office Open XML format introduced with MS Office 2007, is essentially composed of XML files stored inside a ZIP container.

When an OOXML file (like a .docx file) is protected with a password for reading, it is encrypted. The encrypted OOXML file is stored inside a Compound File Binary Format file, or what I like to call an OLE file. This is the “old” MS Office file format (like .doc), the default file format used before MS Office 2007.

This is how an encrypted .docx file looks like, when analyzed with oledump:

Stream EncryptedPackage contains the encrypted document, and stream EncryptionInfo contains information necessary to help with the decryption of stream EncryptedPackage.

The structure of stream EncryptedPackage is simple:

First there’s an integer with the size of the encrypted document, followed by the encrypted document. If we decode the binary data for the integer with format-bytes.py, we get the size 11841:

The EncryptionInfo stream starts with binary data, the version format, and is then followed by more binary data, or XML data, depending on the version:

The first bytes specify the major and minor version used for the EncryptionInfo stream. This example is mostly XML:

Which can be further parsed with xmldump.py:

To help identifying what version is used, I developed an oledump plugin named plugin_office_crypto:

Depending on the version, different tools can be used to decrypt office documents.

Python program msoffcrypto-tool can only decrypt agile encryption (for the moment, it’s a work in progress).

C program msoffice-crypt can decrypt standard, extended and agile encryption.


Sometimes, malicious documents will be encrypted to try to avoid detection. The victim will have to enter the password to open the document. There is one exception though: Excel documents encrypted with password VelvetSweatshop.


Friday 29 December 2017

Cracking Encrypted PDFs – Conclusion

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

TL;DR: PDFs protected with 40-bit keys can not guarantee confidentiality, even with strong passwords. When you protect your PDFs with a password, you have to encrypt your PDFs with strong passwords and use long enough keys. The PDF specification has evolved over time, and with it, the encryption options you have. There are many encryption options today, you are no longer restricted to 40-bit keys. You can use 128-bit or 256-bit keys too.

There is a trade-off too: the more advanced encryption option you use, the more recent the PDF reader must be to support the encryption option you selected. Older PDF readers are not able to handle 256-bit AES for example.

Since each application capable of creating PDFs will have different options and descriptions for encryption, I can not tell you what options to use for your particular application. There are just too many different applications and versions. But if you are not sure if you selected an encryption option that will use long enough keys, you can always check the /Encrypt dictionary of the PDF you created, for example with my pdf-parser (in this example /Length 128 tells us a 128-bit key is used):

Or you can use QPDF to encrypt an existing PDF (I’ll publish a blog post later with encryption examples for QPDF).

But don’t use 40-bit keys, unless confidentiality is not important to you:

I first showed (almost 4 years ago) how PDFs with 40-bit keys can be decrypted in minutes, using a commercial tool with rainbow tables. This video illustrates this.

Later I showed how this can be done with free, open source tools: Hashcat and John the Ripper. But although I could recover the encryption key using Hashcat, I still had to use a commercial tool to do the actual decryption with the key recovered by Hashcat.

Today, this is no longer the case: in this series of blog posts, I show how to recover the password, how to recover the key and how to decrypt with the key, all with free, open source tools.

Overview of the complete blog post series:


Thursday 28 December 2017

Cracking Encrypted PDFs – Part 3

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

I performed a brute-force attack on the password of an encrypted PDF and a brute-force attack on the key of (another) encrypted PDF, both PDFs are part of a challenge published by John August.

The encryption key is derived from the password. it’s not just based on the password only, but also on metadata. This implies that different PDFs encrypted with the same user password, will have different encryption keys.

When you recover the user password of an encrypted PDF, you can just use it with PDF readers like Adobe Reader: they will ask you for the password, you provide it and the PDF will be decrypted and rendered.

But when you recover the key of an encrypted PDF, you can not use it with PDF reader: there is no feature that will allow you to input a key in stead of a password. The only method I knew to decrypt a PDF document with its encryption key, was to use Elcomsoft’s PDF cracking tool:

Now I worked out a second method: I modified the source code of QPDF so that it will accept encryption keys too. It’s a quick and dirty hack, I did not add a new option to QPDF but I “hijacked” the –password option. If the value to the option –password starts with string “key:”, then QPDF will not derive the key from the provided password, but it will use the key provided as hexadecimal characters. Here is how I use it to decrypt the “tough” PDF:

I also made a small modification to the –show-encryption option, to display the encryption key:

Update: I had an email exchange with Jay Berkenbilt, the author of QPDF, and he will look into this patch and possibly add a new key option to QPDF.

If you are interested in my modified version of QPDF, you can find the modified source code files and Windows binaries here:

qpdf-patched.zip (https)
MD5: 57E1A5A232E12B45D0A927181A1E8C3B
SHA256: 6F17E095B38AE72F229A6662216DDCE86057D2BA1C567B07FEF78B8A93413495

Update: this is the complete blog post series:

Wednesday 27 December 2017

Cracking Encrypted PDFs – Part 2

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

After cracking the “easy” PDF of John’s challenge, I’m cracking the “tough” PDF (harder_encryption).

Using the same steps as for the “easy” PDF, I confirm the PDF is encrypted with a user password using 40-bit encryption, and I extract the hash.

Since the password is a long random password, a brute-force attack on the password like I did in the first part will take too long. That’s why I’m going to perform a brute-force attack on the key: using 40-bit encryption means that the key is just 5 bytes long, and that will take about 2 hours on my machine. The key is derived from the password.

I’m using hashcat again, but this time with hash mode 10410 in stead of 10400.
This is the command I’m using:

hashcat-4.0.0\hashcat64.exe --potfile-path=harder_encryption.pot -m 10410 -a 3 -w 3 "harder_encryption - CONFIDENTIAL.hash" ?b?b?b?b?b

I’m using the following options:

  • –potfile-path=harder_encryption.pot : I prefer using a dedicated pot file, but this is optional
  • -m 10410 : this hash mode is suitable to crack the key used for 40-bit PDF encryption
  • -a 3 : I perform a brute force attack (since it’s a key, not a password)
  • -w 3 : I’m using a workload profile that is supposed to speed up cracking on my machine
  • ?b?b?b?b?b : I’m providing a mask for 5 bytes (I want to brute-force keys that are 40 bits long, i.e. 5 bytes)

And here is the result:

The recovered key is 27ce78c81a. I was lucky, it took about 15 minutes to recover this key (again, using GPU GeForce GTX 980M, 2048/8192 MB allocatable, 12MCU). Checking the complete keyspace whould take a bit more than 2 hours.

Now, how can we decrypt a PDF with the key (in stead of the password)? I’ll explain that in the next blog post.

Want a hint? Take a look at my Tweet!

Update: this is the complete blog post series:

Tuesday 26 December 2017

Cracking Encrypted PDFs – Part 1

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 17:15

In this series of blog posts, I’ll explain how I decrypted the encrypted PDFs shared by John August (John wanted to know how easy it is to crack encrypted PDFs, and started a challenge).

Here is how I decrypted the “easy” PDF (encryption_test).

From John’s blog post, I know the password is random and short. So first, let’s check out how the PDF is encrypted.

pdfid.py confirms the PDF is encrypted (name /Encrypt):

pdf-parser.py can tell us more:

The encryption info is in object 26:

From this I can conclude that the standard encryption filter was used. This encryption method uses a 40-bit key (usually indicated by a dictionary entry: /Length 40, but this is missing here).

PDFs can be encrypted for confidentiality (requiring a so-called user password /U) or for DRM (using a so-called owner password /O). PDFs encrypted with a user password can only be opened by providing this password. PDFs encrypted with a owner password can be opened without providing a password, but some restrictions will apply (for example, printing could be disabled).

QPDF can be used to determine if the PDF is protected with a user password or an owner password:

This output (invalid password) tells us the PDF document is encrypted with a user password.

I’ve written some blog posts about decrypting PDFs, but because we need to perform a brute-force attack here (it’s a short random password), this time I’m going to use hashcat to crack the password.

First we need to extract the hash to crack from the PDF. I’m using pdf2john.py to do this. Remark that John the Ripper (Jumbo version) is now using pdf2john.pl (a Perl program), because there were some issues with the Python program (pdf2john.py). For example, it would not properly generate a hash for 40-bit keys when the /Length name was not specified (like is the case here). However, I use a patched version of pdf2john.py that properly handles default 40-bit keys.

Here’s how we extract the hash:

This format is suitable for John the Ripper, but not for hashcat. For hashcat, just the hash is needed (field 2), and no other fields.

Let’s extract field 2 (you can use awk instead of csv-cut.py):

I’m storing the output in file “encryption_test – CONFIDENTIAL.hash”.

And now we can finally use hashcat. This is the command I’m using:

hashcat-4.0.0\hashcat64.exe --potfile-path=encryption_test.pot -m 10400 -a 3 -i "encryption_test - CONFIDENTIAL.hash" ?a?a?a?a?a?a

I’m using the following options:

  • –potfile-path=encryption_test.pot : I prefer using a dedicated pot file, but this is optional
  • -m 10400 : this hash mode is suitable to crack the password used for 40-bit PDF encryption
  • -a 3 : I perform a brute force attack (since it’s a random password)
  • ?a?a?a?a?a?a : I’m providing a mask for 6 alphanumeric characters (I want to brute-force passwords up to 6 alphanumeric characters, I’m assuming when John mentions a short password, it’s not longer than 6 characters)
  • -i : this incremental option makes that the set of generated password is not only 6 characters long, but also 1, 2, 3, 4 and 5 characters long

And here is the result:

The recovered password is 1806. We can confirm this with QPDF:

Conclusion: PDFs protected with a 4 character user password using 40-bit encryption can be cracked in a couple of seconds using free, open-source tools.

FYI, I used the following GPU: GeForce GTX 980M, 2048/8192 MB allocatable, 12MCU

Update: this is the complete blog post series:

Tuesday 6 June 2017

Update: xor-kpa.py Version 0.0.5

Filed under: Encryption,My Software,Update — Didier Stevens @ 0:00

Some small changes to my XOR known plaintext attack tool (xor-kpa), which will be detailed in an ISC Diary entry.

xor-kpa_V0_0_5.zip (https)
MD5: 023D8E3725E0EF7CEC449085AA96BB3A
SHA256: 7517DD44AFBFA11122FD940D76878482F50B7A2A2BCD1D7A2AF030F6CAC4F4E3

Next Page »

Blog at WordPress.com.