Didier Stevens

Tuesday 23 December 2014

oledump: Extracting Embedded EXE From DOC

Filed under: Malware,My Software — Didier Stevens @ 0:00

RECHNUNG_vom_18122014.doc (6a574342b3e4e44ae624f7606bd60efa) is a malicious Word document with VBA macros that extract and launch an embedded EXE.

This is nothing new, but I want to show you how you can analyze this document with oledump.py. I also have a video on my video blog.

First we have a look at the streams (I put the Word document inside a password (= infected) protected ZIP file to avoid AV interference, oledump can handle such files):

20141221-131242

Stream 7 contains VBA macros, let’s have a look:

20141221-131457

Subroutine v45 is automatically executed when the document is opened. It creates a temporary file, searches for string “1234” inside the text of the Word document (ActiveDocument.Range.Text), writes the encoded bytes following it to disk, and then executes it.

If you take a look at the content of the Word document (stream 14), you’ll see this:

20141221-131551

Following string “1234” you’ll see &H4d&H5a&h90…

&Hxx is the hexadecimal notation for a byte in VBA. It can be converted with function cbyte. We can also convert this sequence of hexadecimally encoded bytes using a decoder specially written for this. The decoder (written in Python) searchers for strings &Hxx with a regular expression, converts the xx hex values to characters and concatenates them into a string, which is returned to oledump.

#!/usr/bin/env python

__description__ = '&H decoder for oledump.py'
__author__ = 'Didier Stevens'
__version__ = '0.0.1'
__date__ = '2014/12/19'

"""

Source code put in public domain by Didier Stevens, no Copyright
https://DidierStevens.com
Use at your own risk

History:
  2014/12/19: start

Todo:
"""

import re

class cAmpersandHexDecoder(cDecoderParent):
    name = '&H decoder'

    def __init__(self, stream, options):
        self.stream = stream
        self.options = options
        self.done = False

    def Available(self):
        return not self.done

    def Decode(self):
        decoded = ''.join([chr(int(s[2:], 16)) for s in re.compile('&H[0-9a-f]{2}', re.IGNORECASE).findall(self.stream)])
        self.name = '&H decoder'
        self.done = True
        return decoded

    def Name(self):
        return self.name

AddDecoder(cAmpersandHexDecoder)

This decoder allows us to analyze the embedded file with the following command: oledump.py -s 14 -D decoder_ah.py RECHNUNG_vom_18122014.doc.zip

20141221-131712

From the MZ and PE headers, you can identify it as a PE file. We can check this with pecheck like this:

oledump.py -s 14 -D decoder_ah.py -d RECHNUNG_vom_18122014.doc.zip | pecheck.py

20141221-131759

20141221-131833
oledump_V0_0_4.zip (https)
MD5: 8AD542ED672E45C45222E0A934033852
SHA256: F7B8E094F5A5B31280E0CDF11E394803A6DD932A74EDD3F2FF5EC6DF99CBA6EF

Blog at WordPress.com.