Didier Stevens

Monday 12 November 2018

Update: cut-bytes.py Version 0.0.8

Filed under: My Software,Update — Didier Stevens @ 0:00

cut-bytes.py is a tool I use to select (cut) a sequence of bytes out of a file, using a cut-expression. This expression specifies the start of the sequence and the end of the sequence.

In this example, I use a cut-expression to find the first occurrence of MZ (i.e. [‘MZ’]) and select 8 bytes (8l) starting at the position of that occurrence (-a is ASCII dump):

I realized that with a few changes, I could add a binary grep feature to cut-bytes. Option -g activates this binary grep:

In stead of one occurrence (the first), with option -g, all occurrences are selected.

JSON output is now also available with option –jsonoutput:

This JSON output contains all the selected byte sequences (BASE64 encoded and with metadata), and it can be piped into tools that accept this format, like file-magic.py:

file-magic will then identify each byte sequence. As you can guess, I’m looking for PE files embedded in file update.bin. But the byte sequences are too short (8 bytes) for file-magic.py to properly identify file types. By increasing the length to 512 bytes, file-magic.py has enough data to locate 2 PE files (a 32-bit DLL and a 64-bit DLL) inside update.bin:

Option -G is identical to -g, except that the selected byte sequences will not overlap.

And I also added a “run length encoded” ASCII dump (-A). If 2 or more consecutive output lines are identical, the duplicates are suppressed:

cut-bytes_V0_0_8.zip (https)
MD5: 1A69542E7E9D7348101B7E91884674B7
SHA256: 15BC253323FF162F26BEF784172A502383970E63514DF6B88A09952A19DAE826

Wednesday 7 November 2018

Update: hash.py Version 0.0.6

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version adds CSV output via option -C:

hash_V0_0_6.zip (https)
MD5: DE0AC3F7809E55E1577EB049A5F34EDF
SHA256: D66FF1D5173E3DDAFC842087B9E4E8447C18EF0AA8C03E02A365E3F9028BA8D9

Tuesday 30 October 2018

Update: format-bytes.py Version 0.0.6

Filed under: My Software,Update — Didier Stevens @ 0:00

When using option -f to specify struct members, you can now also use new option -n (annotations) to annotate members.

Like in this example:

format-bytes_V0_0_6.zip (https)
MD5: D73C88AB15B8AE3B30BA2C5EBE8CC77E
SHA256: 3FB480B52F5BF535A54B66CABBD853666B3E306EFAE4BD9247B45255F223E0B6

Sunday 28 October 2018

Update: file-magic.py Version 0.0.4

Filed under: My Software,Update — Didier Stevens @ 0:00

I added a new option to file-magic.py to limit identification to the custom definitions: -C.

file-magic_V0_0_4.zip (https)
MD5: CCF170F09B1442D27AE6519A0BB0CBAB
SHA256: F240BAEE78C8AE4DB29724D8A8F2A5DEDEFE47570219D700FB3BB9A6707432BB

Saturday 27 October 2018

Update: file-magic.py Version 0.0.3

Filed under: My Software,Update — Didier Stevens @ 0:00

This is an update with a custom definition to recognize compressed RTF.

file-magic_V0_0_3.zip (https)
MD5: C46EBA4BC6BC63E097A86E30E6DE5432
SHA256: 3F3012B06182925C1A42678977089184B9C97C37CD025F9D71757B4227E7BE09

Thursday 25 October 2018

Analyzing PowerPoint Maldocs with oledump Plugin plugin_ppt

Filed under: maldoc,My Software — Didier Stevens @ 0:00

VBA macros inside a PowerPoint document are not stored directly inside streams, but as records in the “PowerPoint Document” stream. I have a plugin to parse the records of the “PowerPoint Document” stream, but I failed to extract the embedded, compressed OLE file with the macros. Until a recent tweet by @AngeAlbertini brought this up again. On his sample too I failed to extract the compressed OLE file, but then I remembered I had fixed a problem with zlib extraction in pdf-parser.py. Taking this code into plugin_ppt.py fixed the decompression problems.

VBA macros in a PowerPoint document do not appear directly in streams:

Plugin plugin_ppt parses records found in stream “PowerPoint Document”:

Each line represents a record, prefixed by an index generated by the plugin (to easily reference records). Records with a C indicator (like 1 and 435) contain sub-records. Records prefixed with ! contain an embedded object.

Record 441 (RT_ExternalOleObjectStg) interests us because it contains an OLE file with VBA macros.

Plugin option -s can be used to select this record:

Plugin option -a can then be used to do an hex/ascii dump:

The first four bytes are the size, and then follows the zlib compressed OLE file (as indicated by 0x78).

This OLE file can be decompressed and extracted with option -e, but pay attention to use option -q (quiet) so that oledump will only report the output of the plugin, and nothing else. This can then be piped into a second instance of oledump:

And now we can extract the VBA macros:

oledump_V0_0_38.zip (https)
MD5: C1D7F71A390497A516F67D798BA25128
SHA256: 4CADEE69D024E9242CDA0CE3A9C22BCB1CAFF9D5BA2D946519C6B7C18F895B81

Wednesday 24 October 2018

Update: oledump.py Version 0.0.38

Filed under: maldoc,My Software,Update — Didier Stevens @ 0:00

This new version of oledump.py includes a new plugin to extract VBA code from PowerPoint files and an update to plugin plugin_http_heuristics.

plugin_http_heuristics was updated to increase the chance of success for the XOR dictionary attack, triggered by a maldoc sample I analyzed.

Two new options were added: -e and -k.

By default, plugin_http_heuristics searchers for keywords http: and https:. Using option -e, this list is extended with keywords msxml, adodb, shell, c:\, cmd and powershell.

With option -k, the default keyword list is replaced by your own list (using , as separator). Here I look for ftp (which is not present), remark that http is no longer detected:

oledump_V0_0_38.zip (https)
MD5: C1D7F71A390497A516F67D798BA25128
SHA256: 4CADEE69D024E9242CDA0CE3A9C22BCB1CAFF9D5BA2D946519C6B7C18F895B81

Tuesday 23 October 2018

Update: pdf-parser.py Version 0.6.9

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

This new version of pdf-parser.py brings 2 new features; the idea came to me during private & public trainings I gave on malicious documents (if you are interested in a training, please get in touch).

The statistics option (-a –stats) has been enhanced with a search for keywords section:

In this section, the result of searches for particular keywords (that might indicate a malicious PDF) is displayed: you get the number of hits followed by the indices of the objects that contain this keyword.

In the example above, we see that object 11 contains JavaScript.

Remark that this section is the result of a search command (-s): search in pdf-parser is not case-senstive and partial (unlike PDFiD). That explains why /AA is found in object 37, while it’s actually /Aacute:

pdf-parser will also read file pdfid.ini (if present) so that the personal keywords you added to PDFiD are also used by pdf-parser.

–overridingfilters is a new option: it allows for the processing of streams with a different filter (or filter chain) than the one specified in the object’s dictionary. Use value raw to obtain the raw stream, without filtering.

pdf-parser_V0_6_9.zip (https)
MD5: 27D65A96FEAF157360ACBBAAB9748D27
SHA256: 3F102595B9EAE5842A1B4723EF965344AE3AB01F90D85ECA96E9678A6C7092B7

Monday 22 October 2018

New tool: decompress_rtf.py

Filed under: My Software — Didier Stevens @ 0:00

A reader over at the Internet Storm Center asked how to analyze a particular email file (.msg) with my oledump.py tool. MSG files are ole files, and can be analyzed with oledump. In this particular email, one stream contains compressed RTF.

I quickly wrote a tool to decompress compressed RTF using Python module compressed_rtf, using my binary file Python template.

Template process-binary-file.py is a Python program that reads binary files (normal files, stdin, contained in a ZIP file, …) and comes with various options to process these files. The class cBinaryFile can read a complete or partial file into memory for further processing.

Just a few lines of code of the template need to be added/changed to create this new decompression tool.

First I need to import module compressed_rtf. A single line “import compressed_rtf” would be sufficient, but I’m adding some error handling in case the module is not installed:


try:
    import compressed_rtf
except ImportError:
    print('Module compressed_rtf missing, please install with command: pip install compressed_rtf')
    sys.exit(-1)

Next, I search for “# —– Put your data processing code here —–” (line 1315 in the current version 0.0.1 of the template):

Lines 1316 and 1317 (starting with oOutput.Line) are just demo lines, to be replaced by this line:


        oOutput.Line(compressed_rtf.decompress(data), eol='')

Variable data contains the complete binary content of the processed file (e.g. the compressed RTF), and a call to method compressed_rtf.decompress will decompress the data. Then I output the result with method Line of object oOutput. I use this method in stead of a print statement, because then I have more control over the output format and destination by using command-line option -o. eol=” directs the Line method not to append a new-line after outputting the decompressed RTF file.

That’s essentially all that needs to be done to create this new tool with my template.

Documentation is also important, so I also updated the description (line 5), date (line 8) and the manual (starting line 67).

And now with this tool, I can decompress compressed RTF streams found inside a .msg file:

decompress_rtf_V0_0_1.zip (https)
MD5: 41127F62897479FB5135D36675C396F5
SHA256: 581F2E1B2B508C3941EC22040FB0C76999E5DF293C8AD0DC1FDE921D121F3A26

Sunday 21 October 2018

Release: Python Tool Templates

Filed under: Announcement,My Software — Didier Stevens @ 0:00

I’m releasing the templates for Python tools I shared and used during my BruCON and Hack.lu 2018 workshops.

There’s a template for text files and one for binary files.

python-templates_V0_0_1.zip (https)
MD5: 99E9D87681470F1BAE020B68F2853F49
SHA256: 2CA24AD6928FA2FE2DE894FEFBD1B41238B723D46ADED4064D26374A805BA1C4

Next Page »

Blog at WordPress.com.