Didier Stevens

Saturday 27 April 2019

Update: format-bytes.py Version 0.0.8

Filed under: My Software,Reverse Engineering,Update — Didier Stevens @ 9:42

This new version of format-bytes.py (a tool to decompose structured binary data with format strings) brings a couple of new features.

Format strings can now be stored in libraries: you can store often used format strings (option -f) in text files and refer to them for using with format-bytes.py. A library file has the name of the program (format-bytes) and extension .library. Library files can be placed in the same directory as the program, and/or the current directory.
A library file is a text file. Each format string has a name and takes one line: name=formatstring.

Example:
eqn=<HIHIIIIIBBBBBBBBBB40sIIBB*:XXXXXXXXXXXXXXXXXXsXXXX

This defines format string eqn. It can be retrieved with option -f name=eqn.
This format string can be followed by annotations (use a space character to separate the format string and the annotations):

Example:
eqn=<HIHIIIIIBBBBBBBBBB40sIIBB*:XXXXXXXXXXXXXXXXXXsXXXX 1: size of EQNOLEFILEHDR 9: Start MTEF header 14: Full size record 15: Line record 16: Font record 19: Shellcode (fontname)

A line in a library file that starts with # is a comment and is ignored.

Format strings inside a library can be used with option -f. For example, to use format string eqn1, you use option -f name=eqn1. You prefix the format string name with “name=”, like in this example:

Option -s can also take value r now, to select the remainder: -s r. Like this:

The FILETIME format has been added. To use it explicitly, use representation format T.

And finally, with option -F (Find), you can search for values inside a binary file. For the moment, only integers can be searched. Start the option value with #i# followed by the decimal number to search for.

Example:

format-bytes_V0_0_8.zip (https)
MD5: 22F216C2304434A302B0904A9D4AF1FE
SHA256: A38D9B57DDB23543E2D462CD0AF51A4DCEDA1814CF9EAD315716D471EAACEF19

Saturday 20 April 2019

Extracting “Stack Strings” from Shellcode

Filed under: Malware,My Software,Reverse Engineering — Didier Stevens @ 0:00

A couple of years ago, I wrote a Python script to enhance Radare2 listings: the script extract strings from stack frame instructions.

Recently, I combined my tools to achieve the same without a 32-bit disassembler: I extract the strings directly from the binary shellcode.

What I’m looking for is sequences of instructions like this: mov dword [ebp – 0x10], 0x61626364. In 32-bit code, that’s C7 45 followed by one byte (offset operand) and 4 bytes (value operand).

Or: C7 45 10 64 63 62 61. I can write a regular expression for this instruction, and use my tool re-search.py to extract it from the binary shellcode. I want at least 2 consecutive mov … instructions: {2,}.

I’m using option -f because I want to process a binary file (re-search.py expects text files by default).

And I’m using option -x to produce hexadecimal output (to simplify further processing).

I want to get rid of the bytes for the instruction and the offset operand. I do this with sed:

I could convert this back to text with my tool hex-to-bin.py:

But that’s not ideal, because now all characters are merged into a single line.

My tool python-per-line.py gives a better result by processing this hexadecimal input line per line:

Remark that I also use function repr to escape unprintable characters like 00.

This output provides a good overview of all API functions called by this shellcode.

If you take a close look, you’ll notice that the last strings are incomplete: that’s because they are missing one or two characters, and these are put on the stack with another mov instruction for single or double bytes. I can accommodate my regular expression to take these instructions into account:

This is the complete command:

re-search.py -x -f "(?:\xC7\x45.....){2,}(?:(?:\xC6\x45..)|(?:\x66\xC7\x45...))?" shellcode.bin.vir | sed "s/66c745..//g" | sed "s/c[67]45..//g" | python-per-line.py -e "import binascii" "repr(binascii.a2b_hex(line))"

Tuesday 17 July 2018

!exploitable Crash Analyzer – Statically Linked CRT

Filed under: Reverse Engineering — Didier Stevens @ 0:00

Regularly when I use Microsoft MSEC’s !exploitable WinDbg extension, it doesn’t load because the correct VC runtime is not installed (vcredist 2012) on the machine I’m debugging on.

Since it’s open-source, I decided to recompile it with a statically linked C runtime, making it independent of the installed runtime(s). I used Visual Studio 2017 and let it do the default upgrade of the Visual Studio 2012 solution (default implies Windows XP is no longer supported). The only change I made was option /MT to link the runtime into the DLL.

To load the extension, type command “.load” with the full path to the DLL.
Or you can copy the DLL into a folder of the “extension dll search path”. You can view this search path with command “.chain” or “.extpath”:


Then you can just type “.load msec” to load the extension. If you use folders like x86\winext and x64\winext, you can copy the respective x86 and x64 versions without having to rename the DLL.

You can also load the extension and execute the command with one line (!msec.exploitable), like this:

One downside of statically linking the C runtime, is that I will have to recompile the DLLs if the C runtime gets patched to fix a vulnerability.

You can download the recompiled plugins here:
MSECWinDbgExtensions.zip (https)
MD5: 090D9E4BE43B7272AA54673C366695E3
SHA256: 39AB11FDF9F80608235CE26833F57A850DD2C36C513EB92C97E28714BA0076FA

Monday 28 May 2018

Quickpost: Windows Debugger as Post Mortem Debugger – 32-bit & 64-bit

Filed under: Quickpost,Reverse Engineering — Didier Stevens @ 0:00

I was following Microsoft’s advice to install WinDbg as a post mortem debugger, but didn’t get the expected results.

It turns out that WinDbg x64 version will register itself as the post mortem debugger for 64-bit and 32-bit processes, and not just for 64-bit processes:

Of course, WinDbg x86 version will register itself only for 32-bit processes:

So to make sure that WinDbg x64 version will debug only 64-bit processes and WinDbg x86 version will debug 32-bit processes, run the post mortem registration commands in this order:

"c:\Program Files (x86)\Windows Kits\10\Debuggers\x64\windbg.exe" -I
"c:\Program Files (x86)\Windows Kits\10\Debuggers\x86\windbg.exe" -I

And of course, run the commands from an elevated command prompt, as you’ll need to write to the HKLM hive. Otherwise you’ll get a reminder:

 


Quickpost info


Tuesday 23 May 2017

WannaCry Simple File Analysis

Filed under: Malware,My Software,Reverse Engineering — Didier Stevens @ 7:32

In this video, I show how to get started with my tools and a WannaCry sample.

Tools: pecheck.py, zipdump.py, strings.py

Sample: 84c82835a5d21bbcf75a61706d8ab549

Monday 30 January 2017

Quickpost: Dropbox & Alternate Data Streams

Filed under: Forensics,Quickpost,Reverse Engineering — Didier Stevens @ 0:00

When I got this popup while moving a file from a Dropbox folder, I immediately thought Alternate Data Stream:

20170124-221042

I ran my filescanner on the file, and found an ADS with name com.dropbox.attributes:

20170128-094319

From the Magic HEX value, we can see that the content of the stream (frozen-sea-foam.mp4:com.dropbox.attributes) starts with 0x78 (and the streamsize is 83 bytes). 0x78 hints at zlib deflated data.

If you are not that familiar with magic values, you can use my file-magic tool:

20170128-224649

Trying to decompress the ADS with translate.py gives us JSON data {“dropbox_fileid_local”: {“machineid_attr”: {“data”: “aa4xliox7z5n0qewxOlT3Q==”}}}:

20170128-102645

The data field looks like BASE64, so let’s try to decode it with base64dump.py:

20170128-104110

It decodes with BASE64 to data that looks random. From the names in the JSON data, we can deduce that this is probably a machine ID.

Remark 1: as it could well be my unique machine ID, I altered the value of the ID.

Remark 2: my file-magic.py tool is beta.

Remark 3: if you wonder what the video frozen-sea-foam is, I have it on Instragram.

 


Quickpost info


Saturday 7 November 2015

Analysis Of An Office Maldoc With Encrypted Payload: oledump plugin

Filed under: maldoc,Malware,My Software,Reverse Engineering — Didier Stevens @ 0:00

After a quick and dirty analysis and a “slow and clean” analysis of a malicious document, we can integrate the Python decoder function into a plugin: the plugin_dridex.py

First we add function IpkfHKQ2Sd to the plugin. The function uses the array module, so we need to import it (line 30):

20151106-222710

Then we can add the IpkfHKQ2Sd function (line 152):

20151106-222928

And then we can add function IpkfHKQ2Sd to the list in line 217:

20151106-223132

This is the code that tries different decoding functions that take 2 arguments: a secret and a key.

I also added code (from plugin_http_heuristics) to support Chr concatenations:

20151106-223608

The result is that the plugin can now extract the URLs from this sample:

20151106-222050

Download:
oledump_V0_0_19.zip (https)
MD5: DBE32C21C564DB8467D0064A7D4D92BC
SHA256: 7F8DCAA2DE9BB525FB967B7AEB2F9B06AEB5F9D60357D7B3D14DEFCB12FD3F94

Friday 6 November 2015

Analysis Of An Office Maldoc With Encrypted Payload (Slow And Clean)

Filed under: maldoc,Malware,My Software,Reverse Engineering — Didier Stevens @ 0:00

In my previous post we used VBA and Excel to decode the URL and the PE file.

In this  post we will use Python. I translated the VBA decoding function IpkfHKQ2Sd to Python:

20151105-223017

Now we can decode the URL using Python:

20151105-223901

And also decode the downloaded file with my translate program and the IpkfHKQ2Sd function:

20151105-224328

20151105-224636

 

Thursday 5 November 2015

Analysis Of An Office Maldoc With Encrypted Payload (Quick And Dirty)

Filed under: maldoc,Malware,My Software,Reverse Engineering — Didier Stevens @ 0:00

The malicious office document we’re analyzing is a downloader: 0e73d64fbdf6c87935c0cff9e65fa3be

oledump reveals VBA macros in the document, but the plugins are not able to extract a URL:

20151104-194727

Let’s use a new plugin that I wrote: plugin_vba_dco. This plugin searches for Declare statements and CreateObject calls:

20151104-194827

In the first half of the output (1) we see all lines containing the Declare or CreateObject keyword. In the second half of the output (2) we see all lines containing calls to declared functions or created objects.

Although the code is obfuscated (obfuscation of strings and variable names), the output of this plugin allows us to guess that Ci8J27hf2 is probably a XMLHTTP object, because of the .Open, .send, .Status, … methods and properties.

The Open method of the XMLHTTP object takes 3 parameters: the HTTP method, the URL and a boolean (asynchronous or synchronous call):

20151104-195006

As we can see, the third parameter is False and the first 2 parameters are the return value of a function called IpkfHKQ2Sd. This function takes 2 parameters: 2 strings. The first string is the result of concatenated Chr functions, and the second string is a literal string. Since the Open method requires the HTTP method and URL as strings, is very likely that function IpkfHKQ2Sd is a decoding function that takes 2 strings as input (meaningless to us) and returns a meaningful string.

Here is the original IpkfHKQ2Sd function. It’s heavily obfuscated:

20151104-195102

Here is the same function that I deobfuscated. I didn’t change the function name, but I removed all useless code, renamed variables and added indentation:

20151104-195144

We can now see that this function uses a key (sKey) and XOR operations to decode a secret string (sSecret). And now we can also see that this is just a string manipulation function. It does not contain malicious or dangerous statements or function calls. So it is safe to use in a VBA interpreter, we don’t need to translate it into another language like Python.

We are going to use this deobfuscated function in a new spreadsheet to decode the URL parameter:

20151104-195359

In the VBA editor of this new spreadsheet, we have the deobfuscated IpkfHKQ2Sd function and a test subroutine that calls the IpkfHKQ2Sd function with strings that we found in the .Open method for the URL argument. The decoded string returned by function IpkfHKQ2Sd is displayed via MsgBox. Executing this test subroutine reveals the URL:

20151104-195410

Downloading this file, we see it’s not a JPEG file, but contrary to what we could expect, it’s neither an EXE file:

20151104-195912

Searching for .responseBody in the VBA code, we see that the downloaded file (present in .responseBody) is passed as an argument to function IpkfHKQ2Sd:

20151104-195823

This means that the downloaded file is also encoded. It needs to be decoded with the same function as we used for the URL: function IpkfHKQ2Sd (but with another key).

To convert this file with the deobfuscated function in our spreadsheet, we need to load the file in the spreadsheet, decode it, and save the decoded file to disk. This can be done with my FileContainer.xls tool (to be released). First we load the encoded file in the FileContainer:

20151104-200044

20151104-200105

FileContainer supports file conversion: we have to use command C and push the Process Files button:

20151104-200125

Here is the default conversion function Convert. This default function doesn’t change the file: the output is equal to the input:

20151104-200214

To decode the file, we need to update the Convert function to call the decoding function IpkfHKQ2Sd with the right key. Like this:

20151104-200424

And then, when we convert the file, we obtain an EXE file:

20151104-200952

This EXE turns out to be Dridex malware: 50E3407557500FCD0D81BB6E3B026404

Remark: reusing code from malware is dangerous unless we know exactly what the code does. To decode the downloaded file quickly, we reused the decoding VBA function IpkfHKQ2Sd (I did not translate it into another language like Python). But to be sure it was not malicious, I deobfuscated it first. The deobfuscation process gave me the opportunity to look at each individual statement, thereby giving me insight into the code and come to the conclusion that this function is not dangerous. We could also have used the obfuscated function, but then we ran the risk that malware would execute because we did not fully understand what the obfuscated function did.

Translating the obfuscating function to another language doesn’t make it less dangerous, but it allows us to execute it in a non-Windows environment (like Linux), thereby preventing Windows malware from executing.

Monday 13 July 2015

Extracting Dyre Configuration From A Process Dump

Filed under: Forensics,My Software,Reverse Engineering — Didier Stevens @ 0:00

There are a couple of scripts and programs available on the Internet to extract the configuration of the Dyre banking malware from a memory dump. What I’m showing here is a method using a generic regular expression tool I developed (re-search).

Here is the Dyre configuration extracted from the strings found inside the memory dump:

2015-07-12_14-47-24

I want to produce a list of the domains found as first item in an <litem> element. re-search is a bit like grep -o, it doesn’t select lines but it selects matches of the provided regular expression. Here I’m looking for tag <litem>:

2015-07-12_15-10-39

By default, re-search will process text files line-by-line, like grep. But since the process memory dump is not a text file but a binary file, it’s best not to try to process it line-by-line, but process it in one go. This is done with option -f (fullread).

Next I’m extending my regular expression to include the newline characters following <litem>:

2015-07-12_15-17-35

And now I extend it with the domain (remark that the Dyre configuration supports asterisks (*) in the domain names):

2015-07-12_15-19-58

If you include a group () in your regular expression, re-search will only output the matched group, and not the complete regex match. So by surrounding the regex for the domain with parentheses, I extract the domains:

2015-07-12_15-24-44

This gives me 1632 domains, but many domains appear more than once in the list. I use option -u (unique) to produce a list of unique domain names (683 domains):

2015-07-12_15-28-06

Producing a sorted list of domain names is not simple when they have subdomains:

2015-07-12_15-30-09

That’s why I have a tool to sort domains by tld first, then domain, then subdomain, …

2015-07-12_15-32-28
re-search_V0_0_1.zip (https)
MD5: 5700D814CE5DD5B47F9C09CD819256BD
SHA256: 8CCF0117444A2F28BAEA6281200805A07445E9A061D301CC385965F3D0E8B1AF

Next Page »

Blog at WordPress.com.