Didier Stevens

Tuesday 23 October 2018

Update: pdf-parser.py Version 0.6.9

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

This new version of pdf-parser.py brings 2 new features; the idea came to me during private & public trainings I gave on malicious documents (if you are interested in a training, please get in touch).

The statistics option (-a --stats) has been enhanced with a "search for keywords" section:

In this section, the result of searches for particular keywords (that might indicate a malicious PDF) is displayed: you get the number of hits followed by the indices of the objects that contain this keyword.

In the example above, we see that object 11 contains JavaScript.

Note that this section is the result of a search command (-s): search in pdf-parser is case-insensitive and matches partial names (unlike PDFiD). That explains why /AA is reported for object 37, although that object actually contains /Aacute.
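The difference between the two search styles can be illustrated with a small sketch. This is not pdf-parser's or PDFiD's code: the object contents, the function names and the exact-match rule are assumptions made up for the illustration.

```python
import re

# Hypothetical object contents, keyed by object index (made up for this example).
objects = {
    11: '<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>',
    37: '<< /Type /Encoding /Differences [ 193 /Aacute ] >>',
}

def partial_search(objects, keyword):
    """Case-insensitive substring search (the pdf-parser -s style described above)."""
    needle = keyword.lower()
    return [index for index, content in sorted(objects.items())
            if needle in content.lower()]

def exact_name_search(objects, keyword):
    """Case-sensitive search for a complete name (assumed stricter style, for contrast)."""
    pattern = re.compile(re.escape(keyword) + r'(?![A-Za-z0-9])')
    return [index for index, content in sorted(objects.items())
            if pattern.search(content)]

print(partial_search(objects, '/AA'))             # object 37 matches because of /Aacute
print(exact_name_search(objects, '/AA'))          # no object contains the exact name /AA
print(partial_search(objects, '/JavaScript'))     # object 11
```

With a partial, case-insensitive match, /AA hits the /Aacute glyph name, which is why a hit in this section deserves a second look before you conclude that the object has an additional action.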

pdf-parser will also read file pdfid.ini (if present) so that the personal keywords you added to PDFiD are also used by pdf-parser.

--overridingfilters is a new option: it allows for the processing of streams with a different filter (or filter chain) than the one specified in the object's dictionary. Use value raw to obtain the raw stream, without filtering.
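To show what overriding a filter chain means, here is a minimal sketch that decodes a stream with a chain chosen by the analyst instead of the one declared in the dictionary. It is not taken from pdf-parser: the function names and the way the chain is passed (a Python list) are assumptions, and only two standard PDF filters are handled.

```python
import zlib

def ascii_hex_decode(data):
    """Decode /ASCIIHexDecode data: hex digits, whitespace ignored, '>' ends the data."""
    text = data.decode('ascii').split('>', 1)[0]
    digits = ''.join(text.split())
    if len(digits) % 2:
        digits += '0'  # an odd number of digits is padded with 0
    return bytes.fromhex(digits)

# Hypothetical mapping from filter name to decoder function.
DECODERS = {
    '/FlateDecode': zlib.decompress,
    '/ASCIIHexDecode': ascii_hex_decode,
}

def decode_stream(raw, filters):
    """Apply the given filters in order; the value 'raw' returns the stream untouched."""
    if filters == ['raw']:
        return raw
    data = raw
    for name in filters:
        data = DECODERS[name](data)
    return data

# A stream that was deflated and then hex-encoded.
stream = zlib.compress(b'hello').hex().encode('ascii') + b'>'
print(decode_stream(stream, ['/ASCIIHexDecode', '/FlateDecode']))
print(decode_stream(stream, ['raw']) == stream)
```

This is useful when the dictionary lies about the filters, or when you want to look at the bytes before any decoding takes place.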

pdf-parser_V0_6_9.zip (https)
MD5: 27D65A96FEAF157360ACBBAAB9748D27
SHA256: 3F102595B9EAE5842A1B4723EF965344AE3AB01F90D85ECA96E9678A6C7092B7

Monday 22 October 2018

New tool: decompress_rtf.py

Filed under: My Software — Didier Stevens @ 0:00

A reader over at the Internet Storm Center asked how to analyze a particular email file (.msg) with my oledump.py tool. MSG files are OLE files, and can be analyzed with oledump. In this particular email, one stream contains compressed RTF.

I quickly wrote a tool to decompress compressed RTF using Python module compressed_rtf, using my binary file Python template.

Template process-binary-file.py is a Python program that reads binary files (normal files, stdin, contained in a ZIP file, …) and comes with various options to process these files. The class cBinaryFile can read a complete or partial file into memory for further processing.

Just a few lines of code of the template need to be added/changed to create this new decompression tool.

First I need to import module compressed_rtf. A single line “import compressed_rtf” would be sufficient, but I’m adding some error handling in case the module is not installed:

try:
    import compressed_rtf
except ImportError:
    print('Module compressed_rtf missing, please install with command: pip install compressed_rtf')

Next, I search for "# ----- Put your data processing code here -----" (line 1315 in the current version 0.0.1 of the template):

Lines 1316 and 1317 (starting with oOutput.Line) are just demo lines, to be replaced by this line:

        oOutput.Line(compressed_rtf.decompress(data), eol='')

Variable data contains the complete binary content of the processed file (e.g. the compressed RTF), and a call to function compressed_rtf.decompress will decompress the data. Then I output the result with method Line of object oOutput. I use this method instead of a print statement, because then I have more control over the output format and destination by using command-line option -o. eol='' directs the Line method not to append a newline after outputting the decompressed RTF file.

That’s essentially all that needs to be done to create this new tool with my template.

Documentation is also important, so I also updated the description (line 5), date (line 8) and the manual (starting line 67).

And now with this tool, I can decompress compressed RTF streams found inside a .msg file:

decompress_rtf_V0_0_1.zip (https)
MD5: 41127F62897479FB5135D36675C396F5
SHA256: 581F2E1B2B508C3941EC22040FB0C76999E5DF293C8AD0DC1FDE921D121F3A26

Sunday 21 October 2018

Release: Python Tool Templates

Filed under: Announcement,My Software — Didier Stevens @ 0:00

I’m releasing the templates for Python tools I shared and used during my BruCON and Hack.lu 2018 workshops.

There’s a template for text files and one for binary files.

python-templates_V0_0_1.zip (https)
MD5: 99E9D87681470F1BAE020B68F2853F49
SHA256: 2CA24AD6928FA2FE2DE894FEFBD1B41238B723D46ADED4064D26374A805BA1C4

Wednesday 10 October 2018

KEIHash: Fingerprinting SSH

Filed under: Encryption,My Software,Networking — Didier Stevens @ 0:00

keihash.py is a program to parse pcap files and calculate the KEIHash of SSH connections.

The KEIHash is the MD5 hash of the Key Exchange Init (KEI) data (strings). For obvious reasons, I could not call this an SSH fingerprint. This is inspired by JA3 SSL fingerprinting.

It can be used to profile SSH clients and servers. For example, the hash for the latest version of PuTTY (SSH-2.0-PuTTY_Release_0.70) is 1c5eaa56f3e4569385ae5f82a54715ee.

This is the MD5 hash of:


These are all the strings found in the Key Exchange Init packet, prefixed by their length and concatenated with separator ;.
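The construction can be sketched in a few lines of Python. This is not keihash.py: the exact encoding of the length prefix is not shown here, so the sketch assumes a decimal length joined to its string (and to the next field) with the ; separator. It will therefore not necessarily reproduce the PuTTY hash given above, and the sample strings are made up.

```python
import hashlib

def keihash_sketch(strings):
    """MD5 over length-prefixed strings joined with ';' (assumed encoding, see lead-in)."""
    fields = []
    for value in strings:
        fields.append(str(len(value)).encode('ascii'))  # decimal length prefix (assumption)
        fields.append(value)
    return hashlib.md5(b';'.join(fields)).hexdigest()

# Hypothetical name-lists as they could appear in a Key Exchange Init packet.
sample = [
    b'curve25519-sha256,diffie-hellman-group14-sha1',
    b'ssh-ed25519,ssh-rsa',
    b'aes256-ctr,aes128-ctr',
]
print(keihash_sketch(sample))
```

Because the order and content of the algorithm lists are decided by the SSH implementation, the hash stays the same when only the banner is changed, which is what makes spoofed banners stand out.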

With this, I’ve been able to identify SSH clients with spoofed banners attempting to connect to my servers.

keihash_V0_0_1.zip (https)
MD5: 674D019A739679D9659D2D512A60BDD8
SHA256: DB7471F1253E3AEA6BFD0BA38C154AF3E1D1967F13980AC3F42BB61BBB750490

Sunday 23 September 2018

Update: pecheck.py Version 0.7.4

Filed under: My Software,Update — Didier Stevens @ 0:00

This update improves digital signature handling.

pecheck-v0_7_4.zip (https)
MD5: E0F90B85576F7BC42BB8601E650134FB
SHA256: E011CD82F5E3244553FBA52DDF3F0D3076E88A6F35E50AA18AC0DAAC6ED91389

Saturday 25 August 2018

Update: numbers-to-string.py Version 0.0.5

Filed under: My Software,Update — Didier Stevens @ 16:34

This new version of numbers-to-string.py has a new option: -S (--statistics).

Statistics can help identify malicious scripts (text files in general) with numbers:

numbers-to-string_v0_0_5.zip (https)
MD5: 02119AFAC1942A3C97B8E554C03B2DB6
SHA256: 36A5C346063C93B45C50ACF82C317379496A815F166E25F969168DDAB561F92D

Tuesday 14 August 2018

Update: format-bytes Version 0.0.5

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version has many new features and options.

First there is the remainder (*) when using option -f to specify a parsing format.

For example, -f "<i25s" directs format-bytes to interpret the provided data as a little-endian integer followed by a 25-byte long string:

With the remainder (-f "<i25s*"), format-bytes will provide info for the remaining bytes (if any) after parsing (e.g. after the 25-byte long string):
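The format string reads like a format for Python's struct module, so the idea of the remainder can be sketched with struct directly. This assumes the part before the * follows struct syntax; the helper name and the sample data are made up, and this is not format-bytes' own code.

```python
import struct

def parse_with_remainder(fmt, data):
    """Unpack the fields described by fmt and return them with the leftover bytes."""
    size = struct.calcsize(fmt)              # 4 + 25 = 29 bytes for '<i25s'
    fields = struct.unpack_from(fmt, data)   # parses only the first `size` bytes
    return fields, data[size:]

# Sample data: an integer, a 25-byte string field, and 2 extra bytes.
data = struct.pack('<i25s', 1234, b'hello') + b'\x01\x02'
fields, remainder = parse_with_remainder('<i25s', data)
print(fields[0])      # the little-endian integer
print(fields[1])      # the 25-byte string, padded with null bytes
print(remainder)      # whatever follows the parsed fields
```

Without the remainder, bytes after the last field go unreported; with it, you can see at a glance whether the format covered all the data.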

Options -c and -s changed to -C and -S, so that option -s can be used to select items (to be consistent across my tools).

Option -s can be used to select an item, like a string, to be dumped (options -a, -x and -d). If no dump option is provided, a hex-ASCII dump (-a) is the default.

And option --jsoninput can be used to process JSON output produced by oledump.py or zipdump.py, for example.


format-bytes_V0_0_5.zip (https)
SHA256: AD43756F69C8C2ABF0F5778BC466AD480630727FA7B03A6D4DEC80743549845A

Monday 13 August 2018

Update: oledump.py Version 0.0.37

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of oledump.py adds option --vbadecompressskipattributes to decompress VBA code while skipping the initial attribute definitions (those that are hidden in the MS Office VBA Editor).

Here is an example of output with option -v you are familiar with:

When replacing option -v with option --vbadecompressskipattributes, the initial attributes are no longer displayed:

These attributes are actually hidden in the MS Office VBA Editor:

I added this option because lately, I've analyzed several samples where I had to extract all strings for further decoding, and the strings in the attribute definitions were interfering with the decoding. With this new option, I can prevent these strings from appearing in the output.
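The effect on the output can be imitated with a short sketch that drops the leading attribute lines from already decompressed VBA source. This is not oledump's implementation: the function name is made up, and it assumes the attribute definitions are the initial lines starting with "Attribute VB_".

```python
def skip_initial_attributes(vba_code):
    """Return the VBA source without its leading 'Attribute VB_...' lines."""
    lines = vba_code.splitlines()
    index = 0
    # Only the attribute lines at the very start are skipped; later ones are kept.
    while index < len(lines) and lines[index].startswith('Attribute VB_'):
        index += 1
    return '\n'.join(lines[index:])

# Made-up sample of decompressed VBA code.
code = '\n'.join([
    'Attribute VB_Name = "ThisDocument"',
    'Attribute VB_Base = "1Normal.ThisDocument"',
    'Sub AutoOpen()',
    '    MsgBox "Hello"',
    'End Sub',
])
print(skip_initial_attributes(code))
```

With the attribute lines gone, a strings extraction only sees string literals from the macro code itself, such as "Hello" in this sample.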


plugin_msg.py was updated to version 0.0.3 to include plugin option -k, to display only known MSG streams.


oledump_V0_0_37.zip (https)
MD5: BBC2F3B57266B557307E12E8BC950F98
SHA256: 573C73110CA35EE6451FD14EE7B7DCA3B53FF624ECCFF824799DA59F7767DA68

Friday 3 August 2018

Update: PDFiD.py Version 0.2.5

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

It’s the second time now that a friend reports to me that PDFiD produces no output at all when a pdf is analyzed.

In both cases, the filename was something like sample[1].pdf (a file you could find in Internet Explorer’s cache, for example).

PDFiD can process multiple files, and accepts UNIX shell-style wildcards. Not only * and ?, but also []. So with a filename like sample[1].pdf, PDFiD is actually looking for a file with filename sample1.pdf. Which it doesn’t find, and thus produces no output.

About two years ago, when a friend first reported this, I added option -l --literal. If you use this option, then PDFiD will do no wildcard expansion, and will thus find file sample[1].pdf.
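The wildcard behavior is easy to reproduce with Python's standard fnmatch and glob modules. This is a sketch of the matching rules only, not PDFiD's code.

```python
import fnmatch
import glob

pattern = 'sample[1].pdf'

# As a wildcard pattern, [1] is a character class that matches the single character 1.
matches_sample1 = fnmatch.fnmatchcase('sample1.pdf', pattern)
matches_itself = fnmatch.fnmatchcase('sample[1].pdf', pattern)

# glob.escape neutralizes the wildcard characters, so the name is taken literally.
literal_pattern = glob.escape(pattern)
matches_literal = fnmatch.fnmatchcase('sample[1].pdf', literal_pattern)

print(matches_sample1)   # True: the pattern matches sample1.pdf
print(matches_itself)    # False: the pattern does not match its own filename
print(literal_pattern)   # sample[[]1].pdf
print(matches_literal)   # True: the escaped pattern matches sample[1].pdf
```

So a filename containing [] silently turns into a pattern that does not match the file itself, and treating the name literally avoids that.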

Recently, another friend had the same problem. And was not aware of the existence of option -l.

This new version of PDFiD will display a warning when you use wildcard characters in filenames (without option -l) and when no files match. Like this:

I also renamed option --literal to --literalfilenames, to be consistent across my tools.

pdfid_v0_2_5.zip (https)
MD5: 9B835D9E934A7AA7E68C3649A7AA5DAF
SHA256: 4DD43D7BDA885C5A579FC1F797E93A536E1DB5A4AB52A9337759A69D3B0250E0

Tuesday 31 July 2018

Update: python-per-line.py Version 0.0.5

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of python-per-line.py adds new options: --grep, --grepoptions, --begingrep, --begingrepoptions, --endgrep and --endgrepoptions.

With --grep and --grepoptions, you can select the lines to be processed by python-per-line.py.

If you want to skip lines at the beginning of the file, use option --begingrep to grep for the first line where processing should start.

And if you want to skip lines at the end of the file, use option --endgrep to grep for the last line to be processed.
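The selection logic can be sketched as a small generator. This is not the tool's code: the function and parameter names are made up, the grep options are left out, and it is an assumption of the sketch that the begin and end lines themselves also have to pass the grep filter.

```python
import re

def select_lines(lines, grep=None, begin=None, end=None):
    """Yield lines from the first match of `begin` up to the first match of `end`."""
    started = begin is None          # without a begin pattern, start at the first line
    for line in lines:
        if not started:
            if re.search(begin, line):
                started = True
            else:
                continue             # still before the begin line: skip
        if grep is None or re.search(grep, line):
            yield line
        if end is not None and re.search(end, line):
            break                    # the end line was the last one to process

lines = ['header', 'BEGIN', 'a1', 'b2', 'END', 'footer']
print(list(select_lines(lines, begin='BEGIN', end='END')))
print(list(select_lines(lines, grep=r'\d', begin='BEGIN', end='END')))
```

The first call keeps everything from BEGIN through END, the second one narrows that range down to the lines containing a digit.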

python-per-line_V0_0_5.zip (https)
MD5: 1CED1F84FD44E64BF448558BA02E0978
SHA256: 8E6845006BD3463135CE7AA0AA05FA596AC10E6E2ACC4B45C5909B624A20D6A5
