Didier Stevens

Thursday 22 January 2015

Converting PEiD Signatures To YARA Rules

Filed under: Forensics,Malware,My Software — Didier Stevens @ 0:56

I converted Jim Clausing’s PEiD rules to YARA rules so that I can use them to detect executable code in suspect Microsoft Office Documents with my oledump tool.

Of course, I wrote a program to do this automatically: peid-userdb-to-yara-rules.py

This program converts PEiD signatures to YARA rules. These signatures are typically found in file userdb.txt. Since PEiD signature names don’t need to be unique, and can contain characters that are not allowed in YARA rules, the name of the YARA rule is prefixed with PEiD_ and a running counter, and non-alphanumeric characters are converted to underscores (_).
Signatures that cannot be parsed are ignored.

Here is an example:
PEiD signature:

 [!EP (ExE Pack) V1.0 -> Elite Coding Group]
 signature = 60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10
 ep_only = true

Generated YARA rule:

 rule PEiD_00001__EP__ExE_Pack__V1_0____Elite_Coding_Group_
 {
     meta:
         description = "[!EP (ExE Pack) V1.0 -> Elite Coding Group]"
         ep_only = "true"
     strings:
         $a = {60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10}
     condition:
         $a
 }
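The naming scheme can be sketched in a few lines of Python. This is not the code of peid-userdb-to-yara-rules.py: the function name peid_to_yara and its parameters are assumptions made for illustration, chosen so that the output reproduces the example above.

```python
import re

def peid_to_yara(name, signature, ep_only, counter):
    """Build a YARA rule from one PEiD entry (hypothetical helper)."""
    # Prefix with PEiD_ and a zero-padded running counter, then replace
    # every non-alphanumeric character of the name with an underscore.
    rule_name = 'PEiD_%05d' % counter + re.sub(r'[^0-9A-Za-z]', '_', name)
    # Escape backslashes and double quotes so the description stays a
    # valid YARA string.
    description = name.replace('\\', '\\\\').replace('"', '\\"')
    lines = [
        'rule %s' % rule_name,
        '{',
        '    meta:',
        '        description = "%s"' % description,
        '        ep_only = "%s"' % ep_only,
        '    strings:',
        '        $a = {%s}' % signature,
        '    condition:',
        '        $a',
        '}',
    ]
    return '\n'.join(lines)

rule = peid_to_yara('[!EP (ExE Pack) V1.0 -> Elite Coding Group]',
                    '60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10', 'true', 1)
print(rule)
```

The running counter is what keeps rule names unique when two PEiD signatures share the same name.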

PEiD signatures have an ep_only property that can be true or false. This property specifies if the signature has to be found at the PE file’s entry point (true) or can be found anywhere (false).
By default, this program converts all signatures, regardless of the value of the ep_only property. Use option -e to convert only the signatures whose ep_only property has a given value (true or false).

Option -p generates rules that use YARA’s pe module. If a signature has ep_only property equal to true, then the YARA rule’s condition becomes $a at pe.entry_point instead of just $a.

Example:

 import "pe"

 rule PEiD_00001__EP__ExE_Pack__V1_0____Elite_Coding_Group_
 {
     meta:
         description = "[!EP (ExE Pack) V1.0 -> Elite Coding Group]"
         ep_only = "true"
     strings:
         $a = {60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10}
     condition:
         $a at pe.entry_point
 }
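The choice of condition comes down to one small decision. The sketch below is an illustration only; yara_condition and its two boolean parameters are hypothetical names, not taken from the program.

```python
def yara_condition(ep_only, use_pe_module):
    """Return the condition for a generated rule (hypothetical helper)."""
    # Only anchor the match at the entry point when the pe module is
    # requested AND the PEiD signature has ep_only = true.
    if use_pe_module and ep_only:
        return '$a at pe.entry_point'
    return '$a'

print(yara_condition(True, True))
print(yara_condition(False, True))
print(yara_condition(True, False))
```

A rule file that uses pe.entry_point needs the import "pe" statement at the top, as in the example above.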

Specific signatures can be excluded with option -x. This option takes a file that contains signatures to ignore (signatures like 60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10, not names like [!EP (ExE Pack) V1.0 -> Elite Coding Group]).

Download my YARA Rules.

peid-userdb-to-yara-rules_V0_0_1.zip (https)
MD5: D5B9B6FA7EC50A107A70419D30FEC9ED
SHA256: F8A12B5522B92AE7E3EDF11ACFAEEA7FDCC7FBDA8DC827D288A2D92B2B2CA5E2

Friday 16 January 2015

Update: oledump.py Version 0.0.6

Filed under: Malware,My Software,Update — Didier Stevens @ 16:11

My last software release for 2014 was oledump.py V0.0.6 with support for the “ZIP/XML” Microsoft Office fileformat and YARA.

In this post I will highlight support for the “new” Microsoft Office fileformat (.docx, .docm, .xlsx, .xlsm, …), which is mainly composed of XML files stored inside a ZIP container. Macros are the exception: they are still stored in OLE files (inside the ZIP container).

When oledump.py detects that the file is actually a ZIP file, it searches through all the files stored inside the ZIP container for OLE files, and analyses these.
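The search for OLE files inside the container can be sketched as follows. This is not oledump's code; find_ole_members is a hypothetical helper, and recognizing OLE files by the 8-byte signature at the start of the Compound File Binary header is an assumption about how the detection could be done.

```python
import io
import zipfile

# Header signature of the Compound File Binary (OLE) format.
OLE_MAGIC = b'\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1'

def find_ole_members(data):
    """Return the names of ZIP members that start with the OLE signature."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return []
    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as container:
        for info in container.infolist():
            with container.open(info) as member:
                if member.read(len(OLE_MAGIC)) == OLE_MAGIC:
                    found.append(info.filename)
    return found

# Build a tiny in-memory container that mimics an .xlsm file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as container:
    container.writestr('[Content_Types].xml', '<Types/>')
    container.writestr('xl/vbaProject.bin', OLE_MAGIC + b'\x00' * 24)

ole_members = find_ole_members(buffer.getvalue())
print(ole_members)
```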

Here is an example of a simple spreadsheet with macros. The xlsm file contains one OLE file: xl/vbaProject.bin. oledump gives it the identifier A. All the streams inside the OLE file are reported, and their index is prefixed with the identifier (A in this example).

20150112-232122

If you want to select the stream with the macros, you use A6, like this: oledump.py -s A6

oledump also supports the analysis of an OLE file stored in a password protected ZIP file (typically, malware samples are stored inside ZIP files with password infected). When oledump.py analyses a ZIP file with extension .zip, it assumes that the file is NOT using the “new” Microsoft Office fileformat. Only when the file is a ZIP file but the extension is not .zip does oledump assume that the file is using the “new” Microsoft Office fileformat.

I have another example in my Internet Storm Center Guest Diary Entry.

oledump_V0_0_6.zip (https)
MD5: E32069589FEB7B53707D00D7E0256F79
SHA256: 8FCEFAEF5E6A2779FC8755ED96FB1A8DACDBE037B98EE419DBB974B5F18E578B

Thursday 8 January 2015

Didier Stevens Suite

Filed under: My Software — Didier Stevens @ 20:14

I bundled most of my software in a ZIP file. In all modesty, I call it Didier Stevens Suite.

Wednesday 24 December 2014

Update: oledump.py Version 0.0.5

Filed under: Malware,My Software,Update — Didier Stevens @ 18:10

A quick bugfix and a new feature.

oledump will now correctly handle OLE files with an empty storage. Here is an example with a malicious sample that blog readers reported to me:

20141224-185748

And when the OLE file contains a stream with VBA code, but this code is just a set of Attribute statements and nothing else, then the indicator will be a lowercase letter m instead of an uppercase letter M.

20141224-190354

This way, you can quickly identify interesting VBA streams to analyze.

oledump_V0_0_5.zip (https)
MD5: A712DCF508C2A0184F751B74FE7F513D
SHA256: E9106A87386CF8512467FDD8BB8B280210F6A52FCBACEEECB405425EFE5532D9

Tuesday 23 December 2014

oledump: Extracting Embedded EXE From DOC

Filed under: Malware,My Software — Didier Stevens @ 0:00

RECHNUNG_vom_18122014.doc (6a574342b3e4e44ae624f7606bd60efa) is a malicious Word document with VBA macros that extract and launch an embedded EXE.

This is nothing new, but I want to show you how you can analyze this document with oledump.py. I also have a video on my video blog.

First we have a look at the streams (I put the Word document inside a password-protected ZIP file (password infected) to avoid AV interference; oledump can handle such files):

20141221-131242

Stream 7 contains VBA macros, let’s have a look:

20141221-131457

Subroutine v45 is automatically executed when the document is opened. It creates a temporary file, searches for string “1234” inside the text of the Word document (ActiveDocument.Range.Text), writes the encoded bytes following it to disk, and then executes it.

If you take a look at the content of the Word document (stream 14), you’ll see this:

20141221-131551

Following string “1234” you’ll see &H4d&H5a&h90…

&Hxx is the hexadecimal notation for a byte in VBA. It can be converted with function CByte. We can also convert this sequence of hexadecimally encoded bytes using a decoder specially written for this. The decoder (written in Python) searches for strings &Hxx with a regular expression, converts the xx hex values to characters and concatenates them into a string, which is returned to oledump.

#!/usr/bin/env python

__description__ = '&H decoder for oledump.py'
__author__ = 'Didier Stevens'
__version__ = '0.0.1'
__date__ = '2014/12/19'

"""

Source code put in public domain by Didier Stevens, no Copyright

https://DidierStevens.com

Use at your own risk

History:
  2014/12/19: start

Todo:
"""

import re

class cAmpersandHexDecoder(cDecoderParent):
    name = '&H decoder'

    def __init__(self, stream, options):
        self.stream = stream
        self.options = options
        self.done = False

    def Available(self):
        return not self.done

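    def Decode(self):
        # Find every &Hxx token (case-insensitive), convert the two hex digits
        # of each token to a character, and join the characters into one string.
        decoded = ''.join([chr(int(s[2:], 16)) for s in re.compile('&H[0-9a-f]{2}', re.IGNORECASE).findall(self.stream)])
        self.name = '&H decoder'
        self.done = True
        return decoded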

    def Name(self):
        return self.name

AddDecoder(cAmpersandHexDecoder)
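
The regular expression can also be tried on a short string outside oledump. The following standalone Python 3 sketch is not part of oledump; decode_ampersand_hex is a name made up for illustration, and it returns bytes rather than a string.

```python
import re

def decode_ampersand_hex(text):
    """Decode every &Hxx token found in text into one byte."""
    # re.IGNORECASE lets both &H4d and &h90 match.
    tokens = re.findall(r'&H([0-9a-f]{2})', text, re.IGNORECASE)
    return bytes(int(token, 16) for token in tokens)

sample = '1234&H4d&H5a&h90'
decoded = decode_ampersand_hex(sample)
print(decoded)
```

The first two decoded bytes are MZ, the start of an MZ header. The plugin listed before this sketch (saved as decoder_ah.py) is the decoder that is loaded with option -D.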

This decoder allows us to analyze the embedded file with the following command: oledump.py -s 14 -D decoder_ah.py RECHNUNG_vom_18122014.doc.zip

20141221-131712

From the MZ and PE headers, you can identify it as a PE file. We can check this with pecheck like this:

oledump.py -s 14 -D decoder_ah.py -d RECHNUNG_vom_18122014.doc.zip | pecheck.py

20141221-131759

20141221-131833

oledump_V0_0_4.zip (https)
MD5: 8AD542ED672E45C45222E0A934033852
SHA256: F7B8E094F5A5B31280E0CDF11E394803A6DD932A74EDD3F2FF5EC6DF99CBA6EF

Wednesday 17 December 2014

Introducing oledump.py

Filed under: Forensics,Malware,My Software — Didier Stevens @ 0:07

If you follow my video blog, you’ve seen my oledump videos and downloaded the preview version. Here is the “official” release.

oledump.py is a program to analyze OLE files (Compound File Binary Format). These files contain streams of data. oledump allows you to analyze these streams.

Many applications use this file format; the best known is MS Office. .doc, .xls, .ppt, … are OLE files (docx, xlsx, … use the new file format: XML inside ZIP).

Run oledump on an .xls file and it will show you the streams:

20141216-223150

The letter M next to streams 7, 8, 9 and 10 indicates that the stream contains VBA macros.

You can select a stream to dump its content:

20141216-223233

The source code of VBA macros is compressed when stored inside a stream. Use option -v to decompress the VBA macros:

20141216-223705

You can write plugins (in Python) to analyze streams. I developed 3 plugins. Plugin plugin_http_heuristics.py uses a couple of tricks to extract URLs from malicious, obfuscated VBA macros, like this:

20141216-224228

You might have noticed that the file analyzed in the above screenshot is a zip file. Like many of my analysis programs, oledump.py can analyze a file inside a (password protected) zip file. This allows you to store your malware samples in password protected zip files (password infected), and then analyze them without having to extract them.

If you install the YARA Python module, you can scan the streams with YARA rules:

20141216-224952

And if you suspect that the content of a stream is encoded, for example with XOR, you can try to brute-force the XOR key with a simple decoder I provide (or you can develop your own decoder in Python):

20141216-225911
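
The idea behind brute-forcing an XOR key can be sketched like this. It is not the decoder shipped with oledump: it assumes a single-byte key and a known piece of plaintext to look for, and the function name xor_bruteforce and the sample URL are made up for illustration.

```python
def xor_bruteforce(data, needle):
    """Yield (key, decoded) for each single-byte key whose output contains needle."""
    for key in range(1, 256):
        decoded = bytes(byte ^ key for byte in data)
        if needle in decoded:
            yield key, decoded

# Encode a made-up URL with key 0x41, then recover the key.
encoded = bytes(byte ^ 0x41 for byte in b'http://example.com/payload.exe')
hits = list(xor_bruteforce(encoded, b'http'))
for key, decoded in hits:
    print('0x%02x' % key, decoded)
```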

This program requires Python module OleFileIO_PL: http://www.decalage.info/python/olefileio

oledump_V0_0_3.zip (https)
MD5: 9D5AA950C9BFDB16D63D394D622C6767
SHA256: 44D8C675881245D3336D6AB6F9D7DAF152B14D7313A77CB8F84A71B62E619A70

Friday 12 December 2014

XORSelection.1sc

Filed under: My Software,Update — Didier Stevens @ 16:09

This is an update to my XORSelection 010 Editor script. You can select a sequence of bytes in 010 Editor (or the whole file) and then run this script to encode the sequence with the XOR key you provide. The XOR key can be a string or a hexadecimal value. Prefix the hexadecimal value with 0x.
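
The script itself is written for 010 Editor, but the operation is easy to sketch in Python. The helper names parse_key and xor_bytes are hypothetical, and repeating a multi-byte key over the selection is an assumption about how such a key is applied.

```python
def parse_key(key):
    """Interpret the key as hexadecimal when prefixed with 0x, else as a string."""
    if key.lower().startswith('0x'):
        key_bytes = bytes.fromhex(key[2:])
    else:
        key_bytes = key.encode('latin-1')
    if not key_bytes:
        raise ValueError('empty XOR key')
    return key_bytes

def xor_bytes(data, key):
    """XOR data with the key, repeating the key as often as needed."""
    key_bytes = parse_key(key)
    return bytes(byte ^ key_bytes[index % len(key_bytes)]
                 for index, byte in enumerate(data))

encoded = xor_bytes(b'http://', '0x4142')
print(encoded)
# 0x4142 and the string AB are the same key, so this restores the input.
print(xor_bytes(encoded, 'AB'))
```

Because XOR is its own inverse, running the same operation twice with the same key returns the original bytes.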

Here is an example of an XOR encoded malicious URL found in a Word document with malicious VBA code.

20141212-164241

20141212-164325

Although this is an update, it turns out I never released it on my site here, but it has been released on the 010 Editor script repository.

XORSelection_V3_0.zip (https)
MD5: EAF49C31C20F52DDEF74C1B50DC4EFA1
SHA256: 755913C46F8620E6865337F621FC46EA416893E28A4193E42228767D9BD7804A

Tuesday 25 November 2014

Update: find-file-in-file.py Version 0.0.4

Filed under: Forensics,My Software,Update — Didier Stevens @ 22:05

Here is the version I talked about in my Bitcoin virus posts.

It also has an embedded man page (use option --man).

find-file-in-file_v0_0_4.zip (https)
MD5: CD381616158BD233D94B368554B824C6
SHA256: FD5C4E3EC99371754E58B93D3D96CBA7A86C230C47FC9C27C9B871ED8BFB9149

Man page:

Usage: find-file-in-file.py [options] file-contained file-containing […]
Find if a file is present in another file

Arguments:
file-containing can be a single file, several files, and/or @file
@file: run the command on each file listed in the text file specified
wildcards are supported
batch mode is enabled when more than one file is specified

Source code put in the public domain by Didier Stevens, no Copyright
Use at your own risk

https://DidierStevens.com

Options:
  --version             show program’s version number and exit
  -h, --help            show this help message and exit
  -m MINIMUM, --minimum=MINIMUM
                        Minimum length of byte-sequence to find (default 10)
  -o, --overlap         Found sequences may overlap
  -v, --verbose         Be verbose in batch mode
  -p, --partial         Perform partial search of contained file
  -O OUTPUT, --output=OUTPUT
                        Output to file
  -b RANGEBEGIN, --rangebegin=RANGEBEGIN
                        Select the beginning of the contained file (by default
                        byte 0)
  -e RANGEEND, --rangeend=RANGEEND
                        Select the end of the contained file (by default last
                        byte)
  -x, --hexdump         Hexdump of found bytes
  -q, --quiet           Do not output to standard output
  --man                 Print manual

Manual:

find-file-in-file is a program to test if one file (the contained
file) can be found inside another file (the containing file).

Here is an example.
We have a file called contained-1.txt with the following content:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
and have a file called containing-1.txt with the following content:
0000ABCDEFGHIJKLM1111NOPQRSTUVWXYZ2222

When we execute the following command:
find-file-in-file.py contained-1.txt containing-1.txt

We get this output:
0x00000004 0x0000000d (50%)
0x00000015 0x0000000d (50%)
Finished

This means that the file contained-1.txt was completely found inside
file containing-1.txt. At position 0x00000004 we found a first part
(0x0000000d bytes) and at position 0x00000015 we found a second part
(0x0000000d bytes).

We can use option hexdump (-x) to see which bytes were found:
find-file-in-file.py -x contained-1.txt containing-1.txt
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000015 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished

The containing file may contain the contained file in an arbitrary
order, like file containing-2.txt:
0000NOPQRSTUVWXYZ1111ABCDEFGHIJKLM2222

Example:
find-file-in-file.py -x contained-1.txt containing-2.txt
0x00000015 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000004 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished

The containing file does not need to contain the complete contained
file, like file containing-3.txt:
0000ABCDEFGHIJKLM1111

Example:
find-file-in-file.py -x contained-1.txt containing-3.txt
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
Remaining 13 (50%)

The message “Remaining 13 (50%)” means that the last 13 bytes of the
contained file were not found in the containing file (that’s 50% of
the contained file).

If the contained file starts with a byte sequence not present in the
containing file, nothing will be found. Example with file
contained-2.txt:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

Nothing is found:
find-file-in-file.py -x contained-2.txt containing-1.txt
Remaining 36 (100%)

If you know how long that initial byte sequence is, you can skip it.
Use option rangebegin (-b) to specify the position in the contained
file from where you want to start searching.
Example:

find-file-in-file.py -x -b 10 contained-2.txt containing-1.txt
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000015 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished

If you want to skip bytes at the end of the contained file, use option
rangeend (-e).

If you don’t know how long that initial byte sequence is, you can
instruct find-file-in-file to “brute-force” it. With option partial
(-p), one byte at a time will be removed from the beginning of the
contained file until a match is found.
Example:

find-file-in-file.py -x -p contained-2.txt containing-1.txt
File: containing-1.txt (partial 0x0a)
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000015 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished

“(partial 0x0a)” tells you that the first 10 bytes of the contained
file were skipped before a match was found.

There are some other options:
-m minimum: find-file-in-file will search for byte sequences of 10
bytes long minimum. If you want to change this minimum, use option -m
minimum.
-o overlap: find-file-in-file will not let byte sequences overlap. Use
option -o overlap to remove this restriction.
-v verbose: be verbose in batch mode (more than one containing file).
-O output: besides writing output to stdout, write the output also to
the given file.
-q quiet: do not output to stdout.
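
The behavior described in the manual can be approximated with a greedy search. The sketch below is not the program's actual algorithm: find_parts is a hypothetical function that repeatedly looks for the longest prefix of the not-yet-found part of the contained file, and it ignores the overlap restriction and the rangebegin, rangeend and partial options.

```python
def find_parts(contained, containing, minimum=10):
    """Greedily locate consecutive parts of contained inside containing.

    Returns (parts, remaining): parts is a list of (offset, length) tuples,
    remaining is the number of trailing bytes of contained that were not found.
    """
    minimum = max(1, minimum)  # a zero-length match would never advance
    parts = []
    position = 0
    while position < len(contained):
        # Try the longest possible prefix first, then shrink it byte by byte.
        length = len(contained) - position
        offset = -1
        while length >= minimum:
            offset = containing.find(contained[position:position + length])
            if offset != -1:
                break
            length -= 1
        if offset == -1:
            break
        parts.append((offset, length))
        position += length
    return parts, len(contained) - position

contained_1 = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
containing_1 = b'0000ABCDEFGHIJKLM1111NOPQRSTUVWXYZ2222'
parts, remaining = find_parts(contained_1, containing_1)
for offset, length in parts:
    print('0x%08x 0x%08x (%d%%)' % (offset, length, 100 * length // len(contained_1)))
print('Finished' if remaining == 0 else 'Remaining %d' % remaining)
```

With the example files from the manual, this sketch reports the same two parts at 0x00000004 and 0x00000015.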

Tuesday 18 November 2014

Update: pecheck.py Version 0.4.0

Filed under: My Software,Update — Didier Stevens @ 21:15

pecheck.py is a wrapper for pefile, and this update has a couple of new features:

  • accept input from stdin (for pipes)
  • load PEiD userdb.txt by default from the same directory as pecheck.py
  • extra entry point info

pecheck-v0_4_0.zip (https)
MD5: 27041C56B80B097436076B7366A6F3B2
SHA256: F9C73ED054AE4D5E9F495916D1B028FD8D6E9B2800DCE1993E568E2A2BFD9A71

Wednesday 5 November 2014

XORSearch: Hexdump Support

Filed under: My Software,Update — Didier Stevens @ 22:04

Sometimes I want to check a malware sample with XORSearch, but I can’t because my AV will delete it. My solution is to work with a hexdump of the file.

Option -x allows XORSearch to work with a hexdump.
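
Converting a sample to a hexdump and back takes only a few lines of Python. This sketch assumes a plain hexdump made of hex digit pairs separated by whitespace; the exact format that XORSearch accepts is not described here, and to_hexdump and from_hexdump are hypothetical helper names.

```python
import binascii

def to_hexdump(data, width=16):
    """Render bytes as lines of space-separated hex digit pairs."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        lines.append(' '.join('%02x' % byte for byte in chunk))
    return '\n'.join(lines)

def from_hexdump(text):
    """Convert such a hexdump back to bytes, ignoring all whitespace."""
    return binascii.unhexlify(''.join(text.split()))

sample = b'MZ\x90\x00' + bytes(range(20))
dump = to_hexdump(sample)
print(dump)
restored = from_hexdump(dump)
```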

XORSearch_V1_11_1.zip (https)
MD5: D5EA1E30B2C2C7FEBE7AE7AD6E826BF5
SHA256: 15E9AAE87E7F25CF7966CDF0F8DFCB2648099585D08EAD522737E72C5FACA50A
