Didier Stevens

Thursday 30 April 2020

Update: zipdump.py Version 0.0.19

Filed under: Encryption,My Software,Update — Didier Stevens @ 0:00

This new version of zipdump uses module pyzipper instead of the built-in module zipfile.

pyzipper supports AES encryption. It is not a built-in module, and needs to be installed (with pip for example). pyzipper does not support Python 2.

If module pyzipper is not installed, zipdump will fall back to module zipfile.
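The fallback described above can be sketched as an optional import. This is a minimal illustration, not necessarily how zipdump implements it; the names zipmodule, HAVE_AES and list_names are assumptions made for this example.

```python
# Prefer pyzipper (which supports AES) and fall back to the standard library.
try:
    import pyzipper as zipmodule   # third-party: pip install pyzipper
    HAVE_AES = True
except ImportError:
    import zipfile as zipmodule    # built-in module, no AES support
    HAVE_AES = False


def list_names(path_or_file):
    """Return the names of the members of a ZIP archive (hypothetical helper)."""
    with zipmodule.ZipFile(path_or_file) as archive:
        return archive.namelist()
```

Both modules expose a ZipFile class, so code that only lists or reads members can stay the same whichever module was imported.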

zipdump_v0_0_19.zip (https)
MD5: 6DDE072811D4B44B15D0B8EE4E7B4C03
SHA256: EB38D57E63B12EFAC531B4F0BA866BF47CAEC7F64E0C3CCF4557476FFF1C6226

Monday 27 April 2020

NVISO Innovation Coin

Filed under: Announcement,Hacking — Didier Stevens @ 0:00

I received an Innovation Coin for the research I conduct at NVISO.

An important element in research, that doesn’t get much (public) attention, is failure.

When you perform research, know that many of the things you will try, will fail: they will not lead to the desired outcome. This is inherent to research.

Publishing failed research is useful, if only to avoid others taking the same, dead-end path. And maybe to inspire future researchers to find other paths.


I would like to show an example of some simple research I did recently, that didn’t produce the desired outcome.


While adding a new feature to my zipdump.py tool, I got the idea to bypass anti-virus detection of a payload by putting it inside the comment of a ZIP archive.

The last record in a ZIP file is the end-of-central-directory (EOCD) record. In normal situations, this record marks the end of the ZIP file: there is no data beyond this record. One of the last fields in this record is the comment-length field. If there is no comment (most ZIP files have no comment), the comment-length field is zero and it is the last field in the record, so it marks the end of the ZIP file.

If there is a comment, the comment-length contains the length (in bytes) of the comment, and the comment itself is the last field in the record (right after the comment-length field).

Here is a binary view of the EOCD record of a ZIP file without comment. The comment-length field (2 bytes, little-endian) is equal to zero:

And here is an EOCD record with a comment: 18 bytes long (0x12). The comment-length field (2 bytes, little-endian) is equal to 0x12, and the comment itself is right after this field:
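To make the layout concrete, here is a small sketch that builds a ZIP file in memory and reads the comment-length field and the comment from the EOCD record. The helper names (build_zip, read_eocd_comment) and the sample comment are made up for this example. Searching for the last EOCD signature is a simplification: it can pick the wrong position if the comment itself contains that signature.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b'PK\x05\x06'
# signature, disk number, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset, comment length
EOCD_FORMAT = '<4sHHHHIIH'
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes


def build_zip(comment=b''):
    """Create a small ZIP file in memory, optionally with an archive comment."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        archive.writestr('test.txt', 'hello')
        archive.comment = comment
    return buffer.getvalue()


def read_eocd_comment(data):
    """Return (comment_length, comment) taken from the EOCD record."""
    position = data.rfind(EOCD_SIGNATURE)
    if position < 0:
        raise ValueError('no EOCD record found')
    fields = struct.unpack_from(EOCD_FORMAT, data, position)
    comment_length = fields[-1]  # 2 bytes, little-endian
    start = position + EOCD_SIZE
    return comment_length, data[start:start + comment_length]


print(read_eocd_comment(build_zip()))
print(read_eocd_comment(build_zip(b'This is a comment!')))
```

The second call uses an 18-byte comment, so the comment-length field holds 0x12.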

I created a ZIP file with the mimikatz driver as comment. Since the comment-length field is 2 bytes long, a comment cannot be longer than 65535 bytes (0xFFFF). Hence I couldn’t use mimikatz.exe (it’s larger than 64KB) and had to use mimikatz.sys (33KB).

The version of mimikatz.sys I used has 55/70 detections on VirusTotal at time of writing, and stored inside a ZIP file, it has 43/62 detections.

A ZIP file containing a simple text file has 0 detections.

And the same ZIP file with mimikatz.sys as a comment, has 13/60 detections.

Here is a binary view of that file:

From these results, I could conclude that this is indeed a valid method to bypass static detection by several anti-virus products, and that my research yielded a useful bypass method.

However, I also created a file where mimikatz.sys is just appended to that ZIP file containing a text file. Not as a comment, just appending one file to another. And here the detection rate on VT is just 4/61.

This is a simpler and better method, one that is already known and used by many actors on the Internet.
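From an analysis point of view, data appended after the EOCD record can be spotted by comparing the end of the record (including its comment) with the end of the file. The following is a rough sketch under the assumption that the last EOCD signature in the file is the real one; bytes_after_eocd is a name invented for this example.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b'PK\x05\x06'
EOCD_FIXED_SIZE = 22  # size of the EOCD record without the comment


def bytes_after_eocd(data):
    """Return the number of bytes found after the EOCD record and its comment."""
    position = data.rfind(EOCD_SIGNATURE)  # assumption: last signature is the real one
    if position < 0:
        raise ValueError('no EOCD record found')
    # The comment-length field is the last 2 bytes of the fixed part of the record.
    (comment_length,) = struct.unpack_from('<H', data, position + 20)
    end_of_record = position + EOCD_FIXED_SIZE + comment_length
    return len(data) - end_of_record


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('test.txt', 'hello')
    archive.comment = b'note'
clean = buffer.getvalue()

print(bytes_after_eocd(clean))               # comment only, nothing appended
print(bytes_after_eocd(clean + b'A' * 100))  # 100 bytes appended after the record
```

A comment is accounted for by the comment-length field, so only data that is simply appended shows up as a non-zero result.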


Remark that I used VirusTotal here for quick results, but that the anti-virus products on VirusTotal are limited in their detection capability, compared to the same AVs deployed on endpoints.

Sunday 26 April 2020

Quickpost: My SpiderMonkey’s Cheat Sheet

Filed under: My Software,Quickpost — Didier Stevens @ 8:27

I have a modified version of SpiderMonkey, Mozilla’s (old) JavaScript parser, that helps me with JavaScript analysis.

Details here.

js.exe -e "document.output('x');" sample.js
zipdump.py -s 1 -d sample.js.zip | js.exe -e "document.output('a');" -
zipdump.py -s 1 -d sample.js.zip | js.exe -e "document.output('d');" -
zipdump.py -s 1 -d sample.js.zip | js.exe -e "document.output('X');" -
zipdump.py -s 1 -d sample.js.zip | js.exe -e "document.output('A');" -
zipdump.py -s 1 -d sample.js.zip | js.exe -e "document.output('D');" -
zipdump.py -s 1 -d sample.js.zip | js.exe -e "document.output('f');" -
zipdump.py -s 1 -d sample.js.zip | js.exe -

Tuesday 21 April 2020

Handling Diacritics

Filed under: My Software — Didier Stevens @ 0:00

In many languages, letters (basic glyphs) can have accents (diacritics).

Take the common French given name André. It is written with a letter e with an acute accent.

A colleague had to create a list of email addresses from a list of names (given name + surname). Some of the names had letters with accents: these accents had to be removed to keep the basic letter, in order to form a list of email addresses. For example, “andré” had to be converted to “andre”.

I found the Python module unidecode, and told my colleague he could use that module together with my python-per-line.py to generate his list. It turned out I had to make a change to my python-per-line.py tool first, so that it would handle Unicode input properly.

It works as follows. Take this Unicode text file:

Using function unidecode of module unidecode with python-per-line.py is done like this:

Remark that “é” has been converted to “e”.

Here is a list of names:

And here is the command to convert this list to email addresses:

c:\python37\python python-per-line.py --encoding utf-16 -e "import unidecode" "'.'.join(unidecode.unidecode(line).lower().split(' '))+'@target.tld'" unicode-names.txt

Remark that personal names might be more complex than the simple case of “given name + surname”, and that the Python expression might have to be adapted accordingly.
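As an illustration of adapting the expression, here is a sketch that uses only the standard library. Note that it swaps unidecode for Unicode decomposition (unicodedata), which is a different technique: it removes combining accents, but unlike unidecode it does not transliterate letters that have no decomposition (for example ø). The function names and the sample name "André Dupont" are made up for this example.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose characters (NFKD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_to_email(name, domain='target.tld'):
    """Turn 'Given Name Surname' into 'given.name.surname@domain'."""
    # split() without argument also copes with repeated spaces and tabs
    parts = strip_diacritics(name).lower().split()
    return '.'.join(parts) + '@' + domain


print(strip_diacritics('André'))
print(name_to_email('André Dupont'))
```

Names with more than two parts end up with one dot per separator, which may or may not match the address convention of the target organization.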

python-per-line_V0_0_7.zip (https)
MD5: 1353108BE499E07745A409568940977F
SHA256: 0086B3780C768717072AC705A0FFEFFA5DD74565B36D4795813BF89E10F88240

Monday 20 April 2020

Update: python-per-line.py Version 0.0.7

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of python-per-line.py, a utility to execute a Python expression for every line in its input text file(s), adds option --encoding to handle encodings like Unicode (Python 3.7 required).
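The idea of evaluating an expression for every line of an encoded text file can be sketched as follows. This is only a simplified model of the concept, not the tool's actual code; per_line and the sample data are assumptions for this example.

```python
import io


def per_line(expression, lines):
    """Evaluate a Python expression for each line; the line is available as 'line'."""
    code = compile(expression, '<expression>', 'eval')
    results = []
    for line in lines:
        line = line.rstrip('\r\n')
        results.append(eval(code, {}, {'line': line}))
    return results


# Simulate a UTF-16 encoded input file in memory.
raw = 'Andr\u00e9\nZo\u00eb\n'.encode('utf-16')
text_file = io.TextIOWrapper(io.BytesIO(raw), encoding='utf-16')
print(per_line('line.upper()', text_file))
```

With a real file, open(filename, encoding='utf-16') would take the place of the in-memory wrapper.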

python-per-line_V0_0_7.zip (https)
MD5: 1AF491C2AD45E7ADB83F121B40F60BFB
SHA256: 5CB1E7C17EE359090E9E7168692CF00347E9815DC47CCCA14A2B4C974832510B

Sunday 19 April 2020

Update: hex-to-bin.py Version 0.0.5

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of hex-to-bin.py, a tool to convert hexadecimal data to binary data, has a new option to ignore all characters/bytes that are not hexadecimal digits: -H --hexonly.

This option can be used to parse obfuscated, hexadecimal dumps of PE files, for example:

And there are also options if you want to take only lowercase hexadecimal digits into account (--loweronly) or uppercase hexadecimal digits (--upperonly).
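The filtering idea behind these options can be sketched in a few lines. This is an illustration of the concept only, not the tool's implementation; hex_to_bin and its allowed parameter are names chosen for this example.

```python
import string


def hex_to_bin(text, allowed=string.hexdigits):
    """Keep only the allowed hexadecimal digits and convert them to bytes."""
    digits = ''.join(ch for ch in text if ch in allowed)
    # bytes.fromhex raises ValueError if an odd number of digits remains
    return bytes.fromhex(digits)


# All hexadecimal digits count; the other characters are ignored.
print(hex_to_bin('4D~5A^90*00'))
# Only lowercase digits count: the uppercase 'AB' is treated as noise.
print(hex_to_bin('4dAB5a', allowed='0123456789abcdef'))
```

Both calls produce data starting with MZ, the first two bytes of a PE file.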

hex-to-bin_V0_0_5.zip (https)
MD5: 6247279785AB80F4B0A91E0316D8695C
SHA256: C55246D653F1804DFB2C2EBEC0471AF42A89E9F080DCC87DC673BC9FEAD1949D

Saturday 18 April 2020

Update: xmldump.py Version 0.0.6

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of xmldump.py, a tool to analyze XML files, has a new command to extract cells from an .xlsx/.xlsm spreadsheet: celltext.

And also an option to provide the encoding of input files, like utf8 (Python 3.7 and later): --encoding.

xmldump_V0_0_6.zip (https)
MD5: 74BE27A8F45F1814341DCB7AEF6AE8BC
SHA256: 1767C27D9907FDDF88015D938EFF47782C06547CEEF0493F67D85FF4A06656DA

Wednesday 15 April 2020

Analyzing Malformed ZIP Files

Filed under: Forensics,maldoc,My Software — Didier Stevens @ 0:00

With version 0.0.16 (we are now at version 0.0.18), I updated my zipdump.py tool to handle (deliberately) malformed ZIP files. My zipdump tool uses Python’s ZIP module to analyze ZIP files.

Now, zipdump has an option (-f) to scan arbitrary binary files for ZIP records.

I will show here how this feature can be used, by analyzing a sample Xavier Mertens wrote a diary entry about. This sample is a Word document with macros, an OOXML (Office Open XML format) file (.docm). It is malformed, because 1) there’s an extra byte at the beginning and 2) there’s a byte missing at the end.

When you use my zipdump tool to look at the file, you get an error:

Using option -f l (list), we can find all PKZIP records inside arbitrary, binary files:

When using option -f with value l, a listing will be created of all PKZIP records found in the file, plus extra data. Some of these entries in this report will have an index, that can be used to select the entry.

In this example, 2 entries can be selected:

p: extra bytes at the beginning of the file (prefix)

1: an end-of-central-directory record (PK0506 end)
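Scanning an arbitrary binary file for PKZIP records can be approximated by searching for the record signatures. The sketch below is a simplified model and does not reproduce zipdump's output; it only looks for three record types, and a signature match inside compressed data would be a false positive. The function names are assumptions for this example.

```python
import io
import re
import zipfile

# Local file header (PK0304), central directory header (PK0102), EOCD (PK0506)
PK_RECORD = re.compile(rb'PK(\x03\x04|\x01\x02|\x05\x06)')


def find_pk_records(data):
    """Return a list of (offset, name) for each PKZIP signature found."""
    records = []
    for match in PK_RECORD.finditer(data):
        signature = match.group(1)
        name = 'PK%02d%02d' % (signature[0], signature[1])
        records.append((match.start(), name))
    return records


def find_prefix(data):
    """Return the bytes that precede the first PKZIP record."""
    records = find_pk_records(data)
    return data[:records[0][0]] if records else data


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('test.txt', 'hello')
sample = b'\n' + buffer.getvalue()  # simulate one extra byte at the beginning

for offset, name in find_pk_records(sample):
    print('0x%08x %s' % (offset, name))
print(find_prefix(sample))
```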

Using option -f p, we can select the prefix (extra data at the beginning of the file) for further analysis:

And from this hex/ascii dump, we learn that there is one extra byte at the beginning of the ZIP file, and that it is a newline character (0x0A).

Using option -f 1, we can select the EOCD record to analyze the ZIP file:

As this generates an error, we need to take a closer look at the EOCD record by adding option -i (info):

With this info, we understand that, because of the missing byte, the comment-length field is one byte short, and this causes the error seen in the previous image.

ZIP files can contain comments (for the ZIP container, and also for individual files): these are stored at the end of the PKZIP records, preceded by a 2-byte long, little-endian integer. This integer is the length of the comment. If there is no comment, this integer is zero (0x00).

Hence, the byte we are missing here is a NULL (0x00) byte. We can append a NULL byte to the sample, and then we should be able to analyze the ZIP file. Instead of modifying the sample, I use my tool cut-bytes.py to add a single NULL byte to the file (suffix option: -s #h#00) and then pipe this into zipdump:
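The same repair can be tried with Python's zipfile module in place of cut-bytes.py and zipdump. This sketch assumes a file malformed in the same way as the sample (one extra byte at the beginning, the final NULL byte missing) and simulates it with a small test archive; open_repaired is a name made up for this example.

```python
import io
import zipfile


def open_repaired(sample):
    """Append the missing NULL byte in memory and open the result as a ZIP file."""
    repaired = sample + b'\x00'  # the sample on disk is left untouched
    return zipfile.ZipFile(io.BytesIO(repaired))


# Build a normal ZIP file, then malform it like the sample.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('test.txt', 'hello')
original = buffer.getvalue()
malformed = b'\n' + original[:-1]

with open_repaired(malformed) as archive:
    print(archive.namelist())
    print(archive.read('test.txt'))
```

The extra byte at the beginning does not need to be removed here: zipfile tolerates data prepended to an archive and adjusts the offsets accordingly.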

File 5 (vbaProject.bin) contains the VBA macros, and can be piped into oledump.py:

I also created a video:

zipdump_v0_0_18.zip (https)
MD5: 34DC469E8CD4E5D3E9520517DEFED888
SHA256: 270B26217755D7ECBCB6D642FBB349856FAA1AE668DB37D8D106B37D062FADBB

Tuesday 14 April 2020

Update: zipdump.py Version 0.0.18

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of zipdump.py adds option -i (info), to be used to obtain more info on PKZIP records.


In the next blog post, I’ll explain how to use zipdump to analyze malformed ZIP files.

zipdump_v0_0_18.zip (https)
MD5: 34DC469E8CD4E5D3E9520517DEFED888
SHA256: 270B26217755D7ECBCB6D642FBB349856FAA1AE668DB37D8D106B37D062FADBB

Monday 13 April 2020

Update: zipdump.py Version 0.0.17

Filed under: My Software,Update — Didier Stevens @ 0:00

This version includes a couple of bug fixes.

zipdump_v0_0_17.zip (https)
MD5: E61843BC5B42F4129A4664CD0A5FF93C
SHA256: 72C8AA31F143575E7F77027A7C186484E810F8E400285B6D3785C33C0408F4BF
