Didier Stevens

Tuesday 23 December 2014

oledump: Extracting Embedded EXE From DOC

Filed under: Malware, My Software — Didier Stevens @ 0:00

RECHNUNG_vom_18122014.doc (6a574342b3e4e44ae624f7606bd60efa) is a malicious Word document with VBA macros that extract and launch an embedded EXE.

This is nothing new, but I want to show you how you can analyze this document with oledump.py. I also have a video on my video blog.

First we have a look at the streams (I put the Word document inside a password-protected ZIP file (password: infected) to avoid AV interference; oledump can handle such files):

[Screenshot: 20141221-131242]
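For readers who want to handle such a password-protected ZIP file from their own script, here is a minimal sketch using only Python's standard zipfile module. It is not how oledump.py itself is implemented; the function name read_first_member is hypothetical, and the sketch assumes the archive uses classic ZIP encryption (the zipfile module cannot decrypt AES-encrypted archives).

```python
import zipfile

def read_first_member(zip_source, password=b"infected"):
    """Return (name, data) for the first file in a ZIP archive.

    zip_source may be a path or a binary file-like object.
    The password is only used when the member is actually encrypted.
    """
    with zipfile.ZipFile(zip_source) as archive:
        name = archive.namelist()[0]
        return name, archive.read(name, pwd=password)

# Hypothetical usage, assuming the sample sits in the current directory:
# name, data = read_first_member("RECHNUNG_vom_18122014.doc.zip")
```

The document is read into memory and never written to disk in decrypted form, which is the point of keeping the sample inside the ZIP file.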

Stream 7 contains VBA macros, let’s have a look:

[Screenshot: 20141221-131457]

Subroutine v45 is automatically executed when the document is opened. It creates a temporary file, searches for string “1234” inside the text of the Word document (ActiveDocument.Range.Text), writes the encoded bytes following it to disk, and then executes it.
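The marker search performed by the macro can be imitated in a few lines of Python. This is only an illustrative sketch of the logic described above, not a translation of the actual VBA code; the function name text_after_marker and the sample string are made up for the example.

```python
def text_after_marker(document_text, marker="1234"):
    """Return everything that follows the first occurrence of marker,
    or None when the marker is not present."""
    position = document_text.find(marker)
    if position == -1:
        return None
    return document_text[position + len(marker):]

# Made-up document text: some visible text, the marker, then the encoded bytes.
sample = "Some visible text 1234&H4d&H5a&h90"
print(text_after_marker(sample))  # prints &H4d&H5a&h90
```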

If you take a look at the content of the Word document (stream 14), you’ll see this:

[Screenshot: 20141221-131551]

Following string “1234” you’ll see &H4d&H5a&h90…

&Hxx is the hexadecimal notation for a byte in VBA. It can be converted with the function CByte. We can also convert this sequence of hexadecimally encoded bytes using a decoder written specially for this purpose. The decoder (written in Python) searches for &Hxx strings with a regular expression, converts the xx hex values to characters and concatenates them into a string, which is returned to oledump.

#!/usr/bin/env python

__description__ = '&H decoder for oledump.py'
__author__ = 'Didier Stevens'
__version__ = '0.0.1'
__date__ = '2014/12/19'

"""

Source code put in public domain by Didier Stevens, no Copyright
https://DidierStevens.com
Use at your own risk

History:
  2014/12/19: start

Todo:
"""

import re

class cAmpersandHexDecoder(cDecoderParent):
    name = '&H decoder'

    def __init__(self, stream, options):
        self.stream = stream
        self.options = options
        self.done = False

    def Available(self):
        return not self.done

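    def Decode(self):
        # Find every &Hxx token (case-insensitive), drop the 2-character "&H" prefix,
        # and turn each remaining hex pair into the character with that byte value.
        decoded = ''.join([chr(int(s[2:], 16)) for s in re.compile(r'&H[0-9a-f]{2}', re.IGNORECASE).findall(self.stream)])
        self.name = '&H decoder'
        self.done = True
        return decoded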

    def Name(self):
        return self.name

AddDecoder(cAmpersandHexDecoder)
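The decoder above only runs inside oledump.py, which provides cDecoderParent and AddDecoder. To try the same regular-expression idea on its own, a standalone sketch could look like this. It is written for Python 3 and returns bytes rather than a string; the function name decode_ampersand_hex is hypothetical and not part of oledump.

```python
import re

# One capture group, so findall() returns just the two hex digits of each token.
AMPERSAND_HEX = re.compile(r"&H([0-9a-f]{2})", re.IGNORECASE)

def decode_ampersand_hex(text):
    """Convert every &Hxx token found in text to one byte, in order of appearance."""
    return bytes(int(pair, 16) for pair in AMPERSAND_HEX.findall(text))

print(decode_ampersand_hex("1234&H4d&H5a&h90"))  # prints b'MZ\x90'
```

Anything that is not an &Hxx token is simply skipped, which is why the decoder can be pointed at the whole stream without first cutting away the text in front of the marker.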

This decoder allows us to analyze the embedded file with the following command: oledump.py -s 14 -D decoder_ah.py RECHNUNG_vom_18122014.doc.zip

[Screenshot: 20141221-131712]

From the MZ and PE headers, you can identify it as a PE file. We can check this with pecheck like this:

oledump.py -s 14 -D decoder_ah.py -d RECHNUNG_vom_18122014.doc.zip | pecheck.py
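The MZ and PE headers mentioned above can also be checked programmatically. The following is a minimal sketch of that check, not a replacement for pecheck.py: it relies on the documented PE layout, where the DOS header starts with MZ and the 4-byte little-endian value at offset 0x3C points to the PE\0\0 signature. The function name looks_like_pe is made up for this example.

```python
import struct

def looks_like_pe(data):
    """Return True when data starts with an MZ header that points to a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: offset of the PE header, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

A positive result only says that the two signatures are in place; it does not validate the rest of the file.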

[Screenshot: 20141221-131759]

[Screenshot: 20141221-131833]

oledump_V0_0_4.zip (https)
MD5: 8AD542ED672E45C45222E0A934033852
SHA256: F7B8E094F5A5B31280E0CDF11E394803A6DD932A74EDD3F2FF5EC6DF99CBA6EF
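
To compare a downloaded file against the hashes listed above, something along these lines should work. It is a small sketch using Python's hashlib; the function name file_hashes is hypothetical, and the results are returned in uppercase only to match the notation used here.

```python
import hashlib

def file_hashes(path, chunk_size=65536):
    """Return (md5, sha256) of a file as uppercase hex strings."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest().upper(), sha256.hexdigest().upper()

# Hypothetical usage, assuming the download is in the current directory:
# md5, sha256 = file_hashes("oledump_V0_0_4.zip")
# print(md5 == "8AD542ED672E45C45222E0A934033852")
```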

29 Comments »

  1. 1: 106 ‘\x01CompObj’
    2: 4096 ‘\x05DocumentSummaryInformation’
    3: 4096 ‘\x05SummaryInformation’
    4: 14321 ‘1Table’
    5: 4096 ‘Data’
    6: 369 ‘Macros/PROJECT’
    7: 41 ‘Macros/PROJECTwm’
    8: M 21630 ‘Macros/VBA/ThisDocument’
    9: 11044 ‘Macros/VBA/_VBA_PROJECT’
    10: 3402 ‘Macros/VBA/__SRP_0’
    11: 102 ‘Macros/VBA/__SRP_1’
    12: 4500 ‘Macros/VBA/__SRP_2’
    13: 103 ‘Macros/VBA/__SRP_3’
    14: 507 ‘Macros/VBA/dir’
    Traceback (most recent call last):
    File "oledump.py", line 522, in <module>
    Main()
    File "oledump.py", line 519, in Main
    OLEDump(args[0], options)
    File "oledump.py", line 427, in OLEDump
    stream = ole.openstream(fname).read()
    File "C:\Python27\lib\site-packages\olefile\olefile.py", line 1894, in openstream
    raise IOError("this file is not a stream")
    IOError: this file is not a stream

    Comment by Yogesh — Tuesday 23 December 2014 @ 14:00

  2. Interesting. What file is this?

    Comment by Didier Stevens — Tuesday 23 December 2014 @ 14:51

  3. 1: 113 ‘\x01CompObj’
    2: 4096 ‘\x05DocumentSummaryInformation’
    3: 4096 ‘\x05SummaryInformation’
    4: 5890 ‘1Table’
    5: 260142 ‘WordDocument’
    raise IOError("this file is not a stream")
    IOError: this file is not a stream

    There are different versions of “Rechnung vom 1812…”.
    I started my analysis on 20 December -> https://www.virustotal.com/de/file/62d721f15c25fa8da96f1dc2446209730fbeb96feda4d49ca36e5534678207c4/analysis/1419373825/

    regards
    skippie

    Comment by stefan — Tuesday 23 December 2014 @ 22:32

  4. Ah, thanks for posting this sample. It has no macro streams, but a _acros storage. I fixed this in version 0.0.5: https://blog.didierstevens.com/programs/oledump-py/

    Comment by Didier Stevens — Wednesday 24 December 2014 @ 13:37

  5. […] first idea was to use the oledump tool developed by Didier Stevens. Without any command line option, this nice tool lists the streams […]

    Pingback by Searching for Microsoft Office Files Containing Macro | /dev/random — Thursday 8 January 2015 @ 21:25

  6. Can’t you just open the Word docx in Winzip or 7Zip and look through the folders? All the new Office formats ending in X are just zip files.

    Comment by sunk818 — Wednesday 11 February 2015 @ 16:59

  7. oledump.py can handle Office 2007+ files.

    Comment by Didier Stevens — Thursday 12 February 2015 @ 21:50

  8. I have a docm file which shows the same kind of behavior; however, oledump says that it’s not a valid OLE file. I got to the VBA macro using officeparser. It is a valid VBA macro which launches an .exe, and I need to analyse this exe file. decoder_ah.py doesn’t work either, since the file isn’t recognised as a valid OLE file. Am I missing something?

    Comment by Priyank — Wednesday 25 February 2015 @ 5:42

  9. Can you share the md5 of your sample?

    Comment by Didier Stevens — Wednesday 25 February 2015 @ 8:11

  10. 515bbebbeda2d36780798cb29cf9c54f – For the docm file and 6766e4e76303243720e2c64e21153105 for the vbaproject.bin file which I managed to extract. This file contains a call to an exe, which I need to analyse

    Comment by Priyank — Wednesday 25 February 2015 @ 19:03

  11. @Priyank

    1) This sample is not on VirusTotal. If you can share this sample, please upload it to VirusTotal.

    2) Make sure you use the latest version of oledump, because it also supports the new file format directly: https://blog.didierstevens.com/programs/oledump-py/

    Comment by Didier Stevens — Wednesday 25 February 2015 @ 19:20

  12. Thank you for your reply. I believe this is not a virus, just part of a lesson in reversing. I used version 0.0.9 and now it recognizes it as a valid OLE file. However, I am still looking for a way to decode the bin file. I tried ./oledump.py -s 14 -D decoder_ah.py -d <file>, but got no output.
    A: word/vbaProject.bin
    A1: 416 'PROJECT'
    A2: 71 'PROJECTwm'
    A3: M 5582 'VBA/NewMacros'
    A4: m 940 'VBA/ThisDocument'
    A5: 4923 'VBA/_VBA_PROJECT'
    A6: 1724 'VBA/__SRP_0'
    A7: 85 'VBA/__SRP_1'
    A8: 1288 'VBA/__SRP_2'
    A9: 260 'VBA/__SRP_3'
    A10: 579 'VBA/dir'

    Comment by Priyank — Wednesday 25 February 2015 @ 20:54

  13. If it’s not malware, why do you think it contains a bin file?

    Comment by Didier Stevens — Wednesday 25 February 2015 @ 22:56

  14. Because it contains a valid VBA macro in the .bin file. Maybe it contains a secret (as I said, it’s part of a lesson), and the behavior is the same as described in this post, hence my interest. If you are willing, you may have a look; I just need some understanding of this topic.

    Comment by Priyank — Thursday 26 February 2015 @ 2:26

  15. OK. But there is a difference, your file is .docm, my example is .doc.
    So if you are looking for an embedded file, you must look in the files of the zip container.

    Comment by Didier Stevens — Thursday 26 February 2015 @ 7:50

  16. Great job, I’d like to see that turned into a Volatility/Rekall plugin. Also, is there any guidance on how to use oledump from your own Python script (import oledump, OLEDump(“”, options))?

    Comment by Anonymous — Monday 23 March 2015 @ 16:36

  17. I cannot seem to get it to work, it returns blank. I have 0.0.14, the latest version on your site is 0.0.9

    david@Messenger:~$ sudo python oledump.py -s 1 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 2 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 3 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 4 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 5 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 6 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 7 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 8 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 9 -v infected.zip
    david@Messenger:~$ sudo python oledump.py infected.zip
    A: word/vbaProject.bin
    A1: 533 ‘PROJECT’
    A2: 95 ‘PROJECTwm’
    A3: 97 ‘UserForm1/\x01CompObj’
    A4: 290 ‘UserForm1/\x03VBFrame’
    A5: 131 ‘UserForm1/f’
    A6: 180 ‘UserForm1/o’
    A7: M 27252 ‘VBA/Module1’
    A8: M 1289 ‘VBA/ThisDocument’
    A9: m 1160 ‘VBA/UserForm1’
    A10: 6017 ‘VBA/_VBA_PROJECT’
    A11: 1391 ‘VBA/__SRP_0’
    A12: 110 ‘VBA/__SRP_1’
    A13: 292 ‘VBA/__SRP_2’
    A14: 103 ‘VBA/__SRP_3’
    A15: 789 ‘VBA/dir’
    david@Messenger:~$ sudo python oledump.py -s 10 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 11 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 12 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 13 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 14 -v infected.zip
    david@Messenger:~$ sudo python oledump.py -s 15 -v infected.zip

    Comment by brad — Wednesday 17 February 2016 @ 18:56

  18. I don’t know where you get your info, but the latest version on my site is 0.0.22: https://blog.didierstevens.com/programs/oledump-py/

    Did you try this: oledump.py -s A7 -v infected.zip

    Comment by Didier Stevens — Wednesday 17 February 2016 @ 19:01

  19. My apologies. I must have missed that version on your archived versions page.

    -A7 did work, it dumped a lot of code for (what looks like) a Snake game. It appears harmless, however my AV and Gmail still classify it as a virus .docm file. And it came from a spam message, so I’m sure it is…

    Could there be other code it’s not dumping? The code it has dumped does not look malicious

    This is a really fascinating tool. Thanks in advance!

    Comment by Watchman1 — Wednesday 17 February 2016 @ 20:20

  20. Yes, every stream with a M/m indicator contains macros. If you can share the md5 hash, I can have a look.

    Comment by Didier Stevens — Wednesday 17 February 2016 @ 20:46

  21. sure.
    david@Messenger:~$ md5sum infected.zip
    2dc34382644e336e288ce161c8ed03d6 infected.zip

    Comment by Watchman1 — Wednesday 17 February 2016 @ 20:54

  22. It is malicious. Search for Array in the VBA code of A7. It contains an encoded URL like this one: hxxp://feestineendoos[.]nl/…..[.]exe

    Comment by Didier Stevens — Wednesday 17 February 2016 @ 21:03

  23. Very interesting! Thanks for the help.

    Comment by Watchman1 — Wednesday 17 February 2016 @ 21:22

  24. The script is using a string in the label on the form. Do you know how to get this string?
    onopridet = Split(UserForm1.Label1.Caption, “/”)

    Comment by Watchman1 — Wednesday 17 February 2016 @ 21:39

  25. Yes, in the stream named UserForm1. UserForm1/f contains the names and UserForm1/o the values.

    Comment by Didier Stevens — Wednesday 17 February 2016 @ 21:56

  26. I decompiled and reported over 2,000 examples of this batch of malware over the last two days.
    I use linux and got olevba to work, but it doesn’t unpack streams containing forms.
    I couldn’t get oledump to work on linux:
    print(‘Error – %s is not a valid OLE file.’ % infile)
    NameError: global name ‘infile’ is not defined

    Could you update olevba to unpack these streams to make parsing easier?

    Comment by Andy Lee Robinson — Friday 19 February 2016 @ 21:25

  27. 1) I can’t update olevba: I don’t develop olevba, I develop oledump
    2) The error you get means that you are using an old version of oledump
    3) Your malware samples are MIME files, you have to use my tool emldump

    Comment by Didier Stevens — Friday 19 February 2016 @ 21:31

  28. Thanks for the info Didier, I’ve only just delved into ole malware through the recent spate. I’ve written to Philippe and asked him if he could update olevba.

    Yes, those particular doc files were quite heavily mangled mime files quarantined from the emails by MailScanner and also archived with sendmail queuefiles so I can reconstruct the original mail for reporting.

    I had another look at your oledump – would be good to get a complete extraction for further parsing, and only way I could find was to do this:

    for i in $(seq 1 14); do /z/oledump.py -s$i -e FACTURA_385E59.doc > out$i.dat; done;

    Then I can use perl to parse and extract suspicious urls or other code not in the macros.

    “-e all” might be useful!

    Comment by Andy Lee Robinson — Tuesday 23 February 2016 @ 22:40

  29. All of my (Python) tools have a help function (-h, --help), and many of my Python tools have a man page (-m, --man).
    Option -s (select) also accepts value a to select all streams.

    Comment by Didier Stevens — Sunday 28 February 2016 @ 9:50

