Didier Stevens

Tuesday 31 December 2019

YARA “Ad Hoc Rules”

Filed under: My Software — Didier Stevens @ 14:42

Several of my tools support YARA rules.

And of those tools, many support what I like to call “Ad Hoc rules” (or Here rules).

An Ad Hoc YARA rule is a rule that isn’t stored in a file, but is passed via the command line, and is generated ad hoc by the tool for you.

Take for example oledump.py.

When you issue the command “oledump.py -y trojan.yara sample.vir”, oledump will load all the rules found inside file trojan.yara, and scan the streams of document sample.vir with these rules.

But if you want to search for a simple string, say “virus.exe”, then you have to create a YARA rule to search for this string, store it inside a file, and pass this file to oledump via option -y.

Ad hoc rules make this process simpler. Ad hoc rules start with #.

To generate an ad hoc rule for a string, use prefix #s#. Like this:

oledump.py -y "#s#virus.exe" sample.vir

This will generate the following YARA rule:

rule string {strings: $a = "virus.exe" ascii wide nocase condition: $a}

You can also use #x# for hexadecimal: oledump.py -y "#x#D0 CF 11 E0" sample.vir generates this rule:

rule hexadecimal {strings: $a = { D0 CF 11 E0 } condition: $a}

And #r# for regular expressions: oledump.py -y "#r#[a-z]+" sample.vir generates this rule:

rule regex {strings: $a = /[a-z]+/ ascii wide nocase condition: $a}

And you can also pass YARA rules literally (#), hexadecimal encoded (#h#) and base64 encoded (#b#).

And finally, for passing rules literally with double-quotes (“), you can use #q#: this will replace every single quote (‘) with a double quote (“).
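The prefix handling described above can be sketched in a few lines of Python. This is only an illustration of the mapping from prefix to rule text, not the code used inside oledump.py; the function name build_ad_hoc_rule is hypothetical, and the sketch assumes the payload needs no escaping.

```python
import base64


def build_ad_hoc_rule(argument):
    """Return YARA rule text for an ad hoc argument (one that starts with #).

    Hypothetical helper for illustration; payloads are inserted as-is,
    without escaping quotes or slashes.
    """
    if not argument.startswith('#'):
        raise ValueError('not an ad hoc rule: expected a leading #')
    prefix, payload = argument[:3], argument[3:]
    if prefix == '#s#':
        # String search: ASCII and UTF-16, case-insensitive.
        return 'rule string {strings: $a = "%s" ascii wide nocase condition: $a}' % payload
    if prefix == '#x#':
        # Hexadecimal byte sequence.
        return 'rule hexadecimal {strings: $a = { %s } condition: $a}' % payload
    if prefix == '#r#':
        # Regular expression.
        return 'rule regex {strings: $a = /%s/ ascii wide nocase condition: $a}' % payload
    if prefix == '#h#':
        # Complete rule, hexadecimal encoded.
        return bytes.fromhex(payload).decode('utf-8')
    if prefix == '#b#':
        # Complete rule, base64 encoded.
        return base64.b64decode(payload).decode('utf-8')
    if prefix == '#q#':
        # Complete rule written with single quotes instead of double quotes.
        return payload.replace("'", '"')
    # A single # means: the rest is a literal rule.
    return argument[1:]
```

Calling build_ad_hoc_rule('#s#virus.exe') returns the string rule shown earlier; the resulting text could then be handed to a YARA compiler.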


Sunday 29 December 2019

Update: pdf-parser.py Version 0.7.4 and pdfid.py Version 0.2.7

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

This is a bug fix version.

pdf-parser_V0_7_4.zip (https)
MD5: 51C6925243B91931E7FCC1E39A7209CF
SHA256: FC318841952190D51EB70DAFB0666D7D19652C8839829CC0C3871BBF7E155B6A

pdfid_v0_2_7.zip (https)
MD5: F1852F238386681C2DC40752669B455B
SHA256: FE2B59FE458ECBC1F91A40095FB1536E036BDD4B7B480907AC4E387D9ADB6E60

Saturday 28 December 2019

Update: zipdump.py Version 0.0.16

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of zipdump.py, a tool to analyze ZIP files, adds option -f to scan for PK records and adds support for Python 3.
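To give an idea of what scanning for PK records means: ZIP structures start with the two bytes "PK" followed by a two-byte record type. The Python sketch below searches a byte string for three common record signatures. It is an illustration only, not the code of zipdump.py; the function name find_pk_records is hypothetical, and which record types option -f actually reports is an assumption.

```python
import re

# Signatures of three common ZIP records (from the ZIP file format).
PK_RECORDS = {
    b'PK\x03\x04': 'local file header',
    b'PK\x01\x02': 'central directory file header',
    b'PK\x05\x06': 'end of central directory record',
}


def find_pk_records(data):
    """Return a list of (offset, record name) for each signature found in data."""
    pattern = re.compile(b'|'.join(re.escape(signature) for signature in PK_RECORDS))
    return [(match.start(), PK_RECORDS[match.group()]) for match in pattern.finditer(data)]
```

A scan like this also finds records in data that is not a valid ZIP file as a whole, for example a ZIP file embedded inside another file. Signature bytes that happen to occur inside file content are reported too, so hits are candidates to inspect, not proof.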

More details in an upcoming blog post.

zipdump_v0_0_16.zip (https)
MD5: 616654BDAFFDA1DDE074E6D1A41E8A42
SHA256: F3B6D52BA32D6BA3836D0919F2BBC262F043EF6E26D173DD0965735D4F3B5598

Wednesday 25 December 2019


Filed under: My Software — Didier Stevens @ 13:52

I regularly want to test the behavior of applications opening files downloaded from the Internet.

On Windows, files downloaded from the Internet (with Internet Explorer or Edge, for example) have metadata in an Alternate Data Stream to indicate their origin. This is the Zone.Identifier ADS.
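For reference, the Zone.Identifier ADS is a small INI-style text stream containing a [ZoneTransfer] section with a ZoneId value. The Python sketch below builds that content and shows how it could be written to, or removed from, a file on an NTFS volume. It is a sketch under assumptions, not a port of any existing tool: the function names are hypothetical, and the two functions that touch the stream only work on Windows.

```python
import os

ADS_NAME = 'Zone.Identifier'

# Zone IDs used by Windows: 0 = Local machine, 1 = Local intranet,
# 2 = Trusted sites, 3 = Internet, 4 = Restricted sites.


def zone_identifier_content(zone_id=3):
    """Return the text stored in the Zone.Identifier stream for a zone ID."""
    if zone_id not in range(5):
        raise ValueError('zone ID must be between 0 and 4')
    return '[ZoneTransfer]\r\nZoneId=%d\r\n' % zone_id


def add_zone_identifier(filename, zone_id=3):
    """Write the ADS (Windows/NTFS only: 'file:stream' addresses a stream)."""
    with open(filename + ':' + ADS_NAME, 'w', newline='') as stream:
        stream.write(zone_identifier_content(zone_id))


def remove_zone_identifier(filename):
    """Delete the ADS, leaving the file itself untouched (Windows/NTFS only)."""
    os.remove(filename + ':' + ADS_NAME)
```

Calling add_zone_identifier('test.docx') on Windows would mark test.docx as coming from the Internet zone (the filename is just an example).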

To simulate a download, I will add the ADS myself, and I often refer to my own blog posts here and here, as I don’t remember the exact syntax and numbers.

Until recently.

Now, I wrote a small Go program that helps me create (and remove) the appropriate ADS for a mark-of-web (Zone.Identifier).

Just running zoneidentifier with a filename will add a Zone.Identifier ADS for zone 3 (Internet) to the file. Like this:

Option -id is used to specify a different zone ID, like this:

And option -remove is used to remove a Zone.Identifier ADS:

zoneidentifier_V0_0_1.zip (https)
MD5: CB1EB21013C6124CB3C1320F6A12207F
SHA256: E867AE693CB5EEA8CF0D252421E347B1309D7F36C9C6A427F7361CD5DD619839

Thursday 19 December 2019

Update: oledump.py Version 0.0.44

Filed under: maldoc,My Software,Update — Didier Stevens @ 0:00

This new version of oledump adds option -f to find embedded OLE files, making the analysis of .DWG files with embedded VBA macros (for example) easier.

And there is a new plugin: plugin_version_vba.py. This helps with determining the VBA version.

Here is a video showing the analysis of .DWG files with option -f:

oledump_V0_0_44.zip (https)
MD5: 2BB2CD027327FFD8857CDADC1C988133
SHA256: 1A9C951E95E2FE0FDF3A3DC8E331205BC65C617953F0E30ED3E6AC045F4DD0C0

Monday 16 December 2019

Analyzing .DWG Files With Embedded VBA Macros

Filed under: maldoc,Malware — Didier Stevens @ 0:00

AutoCAD’s drawing files (.dwg) can contain VBA macros. The .dwg format is a proprietary file format. There is some documentation, for example here.

When VBA macros are stored inside a .dwg file, an OLE file is embedded inside the .dwg file. There’s a quick-and-dirty way to find this embedded file inside the .dwg file: search for magic sequence D0CF11E0.
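In Python, the quick-and-dirty search could be sketched as follows. This is a minimal illustration, not the code of any of my tools; the function name carve_from_ole_magic is hypothetical.

```python
OLE_MAGIC = bytes.fromhex('D0CF11E0')


def carve_from_ole_magic(data):
    """Return all bytes from the first D0CF11E0 to the end of data.

    Returns None when the magic sequence does not occur.
    """
    position = data.find(OLE_MAGIC)
    if position == -1:
        return None
    return data[position:]
```

The returned bytes run to the end of the input, so they include whatever follows the embedded OLE file inside the .dwg file.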

My tool cut-bytes.py can be used to search for the first occurrence of byte sequence D0CF11E0 and extract all bytes starting from this sequence until the end of the .dwg file. This can be done with cut-expression [D0CF11E0]: and by piping the result into oledump.py, like this:

Next, oledump can be used to conduct the analysis as usual, for example by extracting the VBA macro source code:

There is also a more structured approach to locate the embedded OLE file inside a .dwg file. When one looks at a .dwg file with a hexadecimal editor, the following can be seen:

First there is a magic sequence identifying this as a .dwg file: AC1032. This sequence varies with the file format version, but for many years now, it has started with AC10. You can find more details regarding this magic sequence here and here.

At position 0x24 (36 decimal), there is a 32-bit little-endian integer. This is a pointer to the embedded OLE file (this pointer is NULL when no OLE file with VBA macros is embedded).

In our example, this pointer value is 0x00008080. And here is what can be found at this position inside the .dwg file:

First there is a 16-byte long header. At position 8 inside this header, there is a 32-bit little-endian integer that represents the length of the embedded file. 0x00001C00 in our example. And after the header one can find the embedded OLE file (notice magic sequence D0CF11E0).
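The structured approach can be sketched in Python like this. It follows only the layout described above (pointer at position 0x24, 16-byte header, length at position 8 inside the header) and does no further validation; the function name extract_embedded_ole is hypothetical, and this is not the code of any of my tools.

```python
import struct


def extract_embedded_ole(data):
    """Return the OLE file embedded in .dwg data, or None when there is none."""
    # The magic sequence is ASCII text at the start of the file.
    if not data.startswith(b'AC10'):
        raise ValueError('not a .dwg file (magic does not start with AC10)')
    # 32-bit little-endian pointer to the embedded file at position 0x24.
    (pointer,) = struct.unpack_from('<I', data, 0x24)
    if pointer == 0:
        return None  # NULL pointer: no embedded OLE file
    # 32-bit little-endian length at position 8 inside the 16-byte header.
    (length,) = struct.unpack_from('<I', data, pointer + 8)
    start = pointer + 16  # the embedded file follows the header
    return data[start:start + length]
```

With the values from the example, the pointer is 0x00008080 and the length is 0x00001C00, so the embedded OLE file occupies the 0x1C00 bytes starting at position 0x8090.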

This information can then be used to extract the OLE file from the .dwg file, like this:

This achieves exactly the same result as the quick-and-dirty method. The reason we don’t have to figure out the length of the embedded OLE file when using the quick-and-dirty method is that oledump ignores all bytes appended to an OLE file.

I will adapt my oledump.py tool to extract macros directly from .dwg files, without the need of a tool like cut-bytes.py, but I will probably implement something like the quick-and-dirty method, as this method would potentially work for other file formats with embedded OLE files, not only .dwg files.


Monday 9 December 2019

Update: oledump.py Version 0.0.43

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of oledump.py adds support for Python 3. Several plugins and decoders were also updated for Python 3.

There’s a new option to include storages in the overview: --storages.

And option --decompress now also performs VBA decompression (it was zlib only). This helps to decompress the dir stream of documents with VBA macros:

And I added type 1009 to plugin_msg.py: Compressed RTF.

oledump_V0_0_43.zip (https)
MD5: F98A06CED73C4FC2CA153B7E751746B5
SHA256: 4FE1DBAB822CEC2489328CE3D4D272400F23F1FAD266C9D89B49D9F83F3AA27F

Sunday 8 December 2019

Update: numbers-to-string.py Version 0.0.9

Filed under: My Software,Update — Didier Stevens @ 19:34

This is just a bugfix version (Python 3).

numbers-to-string_v0_0_9.zip (https)
MD5: C5629F102FCF58E5CFF24472D35AFF22
SHA256: 5B1CA43EDFD7BA66CF44FB552BD7882AEB13A8765017F9F865071E187410EE63

Overview of Content Published in November

Filed under: Announcement — Didier Stevens @ 9:36

Here is an overview of content I published in November:

Blog posts:

SANS ISC Diary entries:
