Didier Stevens

Sunday 23 August 2020

New Tool: XORSearch.py

Filed under: Announcement,My Software — Didier Stevens @ 19:42

XORSearch, written in C, is a tool of mine I started 10+ years ago. But more and more security tools don’t like it.

So I decided to stop adding new features to XORSeach in C, and start programming a Python version to implement new features. This is a work in progress.

For the moment, the Python version only supports XOR-encoding with a one-byte key, and can only search for printable content.

Take a look at my SANS ISC diary entry to see how I use it.

I will still maintain the C version: perform bug fixes and add new features that require the speed of compiled C.

But features like detecting printable content will normally be used on small files, and then speed is not an issue.

Saturday 18 July 2020

Update XORSearch Version 1.11.4

Filed under: My Software,Update — Didier Stevens @ 10:08

This is a small bug fix version of XORSearch: fixing some printf format strings for Linux, thanks to Lenny Zeltser for reporting.

Because of Google, I can no longer host this tool on my website.

You have to get it from my FalsePositives GitHub repository.

XORSearch_V1_11_4.zip
MD5: E66290D1EB15D9394C8D1264A09ECFE6
SHA256: BF20A1D76AAD83FC3AABEDC6DDC7F96B655DC94BEC3FA276A50AF6046EBB554C

Thursday 9 April 2020

Update XORSearch Version 1.11.3

Filed under: My Software,Update — Didier Stevens @ 0:00

A small change in this new version of XORSearch: option -n now also takes a negative value (output characters left of keyword) or an explicit positive value (output characters right of keyword).

XORSearch_V1_11_3.zip (https)
MD5: 39A5799EC4C77E894A56B215A7E20409
SHA256: 50D1CDF5FE93E29E1D7FCB3CF2256CEAC0034CBD887E4DAC1CB897E14B28BC16

Thursday 27 December 2018

Update: XORSearch Version 1.11.2

Filed under: My Software,Update — Didier Stevens @ 0:00

This update for XORSearch brings new features and bug fixes.

Starting with this version, XORSearch accepts input from stdin. Use filename – to read data from stdin:

Option -S will print out all strings found using all decoders supported by XORSearch. Strings are sequences of printable characters, ASCII and UNICODE, at least 4 characters long.

As option -S brings many of the functionalities of XORStrings to XORSearch, I’m no longer developing XORStrings.

Last new option is -r. You can use option -r to reverse the file before searching.

I’m also including more compiled versions (look inside the ZIP file).

XORSearch_V1_11_2.zip (https)
MD5: 2B76F6C730BAC6324E92A731F42FEB74
SHA256: 4206B843AC2B9417A85A4B5381023EC4613C5B5095A6A0A19A072C21C66DE93F

Wednesday 5 November 2014

XORSearch: Hexdump Support

Filed under: My Software,Update — Didier Stevens @ 22:04

Sometimes I want to check a malware sample with XORSearch, but I can’t because my AV will delete it. My solution is to work with a hexdump of the file.

Option -x allows XORSearch to work with a hexdump.

XORSearch_V1_11_1.zip (https)
MD5: D5EA1E30B2C2C7FEBE7AE7AD6E826BF5
SHA256: 15E9AAE87E7F25CF7966CDF0F8DFCB2648099585D08EAD522737E72C5FACA50A

Monday 29 September 2014

Update: XORSearch With Shellcode Detector

Filed under: My Software,Update — Didier Stevens @ 0:00

XORSearch allows you to search for strings and embedded PE-files brute-forcing different encodings. Now I added shellcode detection.

This new version of XORSearch integrates Frank Boldewin’s shellcode detector. In his Hack.lu 2009 presentation, Frank explains how he detects shellcode in Microsoft Office documents by searching for byte sequences often used in shellcode.

I integrated Frank’s methods in XORSearch, so that you can use it for any file type, not only Microsoft Office files.

20140928-135402

Frank was kind enough to give me his source code for the detection engine. However, I did not integrated is source code as-is. I developed my own engine that uses rules to detect shellcode artifacts. These rules are not hard-coded, but can be externalized, so that you can define your own rules.

Wildcard rule syntax

A wildcard rule is composed of 3 parts: a rule name, a score and a pattern. These are separated by a : character.

Example of a rule:

Find kernel32 base method 1bis:10:64A130000000

The name of this rule is “Find kernel32 base method 1bis”, it has a score of 10, and the pattern is 64A130000000. When XORSearch finds byte pattern 64A130000000, it will report it mentioning rule name “Find kernel32 base method 1bis” and add 10 to the total score. This byte pattern is the following assembly instruction:

MOV EAX, dword [fs:0x30]

This is an instruction often found in shellcode that looks for the base of kernel32.

When assembly instructions reference a register, the register is encoded as bits in the bytes that make up the instruction. For example, pop eax is just one byte: 58. pop ecx is 59, pop edx is 5a, … If you look at the bits of this instruction, they have the following value: 01011RRR. The last 3 bits (RRR) encode the register to use for the pop instruction.

To deal with this, my rule definition language supports wildcards. This is how you encode a pop reg instruction:

(B;01011???)

The B indicates that we want to define a byte using bits and wildcards. 0 and 1 are fixed bit values, and ? is the wildcard: the bit value can be 0 or 1. Thus the pattern (B;01011???) matches bytes 58, 59, 5A, 5B, 5C, 5D, 5E and 5F.

This wildcard allows us to encode patterns for shellcode instructions that use registers. For example , here is an often used set of instructions to determine the EIP with shellcode:

	call label
label:
	pop eax

This pattern is encoded for all possible registers with the following rule:

GetEIP method 1:10:E800000000(B;01011???)

Another instruction often found in shellcode is xor reg1, reg1, like xor eax, eax.

You could represent this with the following pattern:

31(B;11??????)

But this pattern matches more instructions than you want. It matches xor eax, eax, xor ecx, ecx, … but also xor eax, ecx, xor eax, edx, … You want this pattern to match the xor instruction for the same register, and not different registers. That is why you can use the following syntax:

31(B;11A??A??)

By using a letter like A, B, …, as a wildcard, you assign a variable name to the wildcard bit pattern. ??? matches 3 bits. A?? also matches 3 bits, and assigns the variable name A to these 3 bits. When you use this bit pattern again, you make sure that the pattern will only be matched if the bit pattern is identical. Pattern ?????? matches 6 bits regardless of their value. Pattern A??A?? also matches 6 bits, but the first 3 bits must have the same value as the last 3 bits.

Here is another example:

Find kernel32 base method 3:10:6830000000(B;01011A??)648B(B;00B??A??)

This pattern matches the following set of assembly instructions:

	push 0x30
	pop reg1
	mov reg2, dword [fs:reg1]

By using bit pattern A?? for the register of the second instruction, and B??A?? for the registers of the third instruction, you make sure that the third instruction use the same register for indexing as the second instruction.

Up til now, we looked at sequential assembly instructions. But you can also have shellcode patterns with jumps, e.g. non-sequential instructions. Here is an example:

	jmp LABEL1
LABEL2:
	pop eax
	...
	...
LABEL1:
	call LABEL2

To enable to match assembly code patterns with jumps, I introduced the (J;*) pattern in my rule definitions. J stands for a jump, and * represent the numbers of bytes that make up the displacement of the jump instruction (normally 1 byte or 4 bytes). Here is the rule that encodes the above assembly code pattern:

GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)

Finally, Frank’s detector also looks for suspicious strings, like UrlDownloadToFile, WinExec, … You can define rules using a hex pattern to detect these strings, but to facilitate the encoding of these rules, I added the str= keyword, like this:

Suspicious strings:2:str=UrlDownloadToFile
Suspicious strings:2:str=WinExec

Using wildcard rules

To use these shellcode wildcard rules with XORSearch, you use options -w or -W. -w allows you to specify your own rule(s), -W uses the build-in rules.

With -w, you can specify your rule as the search argument, or together with option -f, you provide a text file with rules.

Example: XORSearch.exe -w olimpikge.xls “GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)”

With -W, you don’t have to provide the rules, XORSearch will use the build-in rules.

Example: XORSearch.exe -W olimpikge.xls

You can view the build-in rules with option -L:

Function prolog signature:10:558BEC83C4
Function prolog signature:10:558BEC81EC
Function prolog signature:10:558BECEB
Function prolog signature:10:558BECE8
Function prolog signature:10:558BECE9
Indirect function call tris:10:FFB7(B;????????)(B;????????)(B;????????)(B;????????)FF57(B;????????)
GetEIP method 4 FLDZ/FSTENV [esp-12]:10:D9EED97424F4(B;01011???)
GetEIP method 1:10:E800000000(B;01011???)
GetEIP method 2:10:EB(J;1)E8(J;4)(B;01011???)
GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)
GetEIP method 4:10:D9EE9BD97424F4(B;01011???)
Find kernel32 base method 1:10:648B(B;00???101)30000000
Find kernel32 base method 1bis:10:64A130000000
Find kernel32 base method 2:10:31(B;11A??A??)(B;10100A??)30648B(B;00B??A??)
Find kernel32 base method 3:10:6830000000(B;01011A??)648B(B;00B??A??)
Structured exception handling :10:648B(B;00???101)00000000
Structured exception handling bis:10:64A100000000
API Hashing:10:AC84C07407C1CF0D01C7EBF481FF
API Hashing bis:10:AC84C07407C1CF0701C7EBF481FF
Indirect function call:10:FF75(B;A???????)FF55(B;A???????)
Indirect function call bis:10:FFB5(B;A???????)(B;B???????)(B;C???????)(B;D???????)FF95(B;A???????)(B;B???????)(B;C???????)(B;D???????)
OLE file magic number:10:D0CF11E0
Suspicious strings:2:str=UrlDownloadToFile
Suspicious strings:2:str=GetTempPath
Suspicious strings:2:str=GetWindowsDirectory
Suspicious strings:2:str=GetSystemDirectory
Suspicious strings:2:str=WinExec
Suspicious strings:2:str=ShellExecute
Suspicious strings:2:str=IsBadReadPtr
Suspicious strings:2:str=IsBadWritePtr
Suspicious strings:2:str=CreateFile
Suspicious strings:2:str=CloseHandle
Suspicious strings:2:str=ReadFile
Suspicious strings:2:str=WriteFile
Suspicious strings:2:str=SetFilePointer
Suspicious strings:2:str=VirtualAlloc
Suspicious strings:2:str=GetProcAddr
Suspicious strings:2:str=LoadLibrary

I derived these rules from the source code Frank gave me. Testing these rules on different benign and malicious files revealed 2 things: a couple of rules generated a lot of false positives, and brute-forcing the ROT encoding also generated a lot of false positives. So I removed these rules, and I added an option to disable encodings (option -d). For example, with option -d 3 I disable the brute-forcing of the ROT encoding (1: XOR 2: ROL 3: ROT 4: SHIFT 5: ADD).

When looking for shellcode, you want several rules to trigger. If just one or two rules trigger, they are likely false positives.
XORSearch_V1_11_0.zip (https)
MD5: 7313A198033C0A1F69B79F96894462C7
SHA256: 1700D037D7A9902108F3986D75A9BA250ACBD96E38CC43C5B4BC1FB90761B320

Thursday 20 March 2014

XORSearch: Finding Embedded Executables

Filed under: My Software,Update — Didier Stevens @ 10:58

Someone mentioned on a forum that he found a picture with an embedded, XORed executable. You can easily identify such embedded executables by xorsearching for the string “This program must be run under Win32”. But if the author or compiler modifies this DOS-stub string, you will not find it.

That’s how I got the idea to add an option to search for PE-files: search for string MZ, read the offset to the IMAGE_NT_HEADER structure (e_lfanew), and check if it starts with string PE.

Example: XORSearch.exe -p test.jpg

Found XOR A2 position 00005D1D: 000000E8 ........!..L.!This program cannot be r
Found XOR A2 position 0001221D: 00000108 ........!..L.!This program cannot be r

We found 2 embedded executables in test.jpg (XOR key A2). Remark we didn’t provide a search string, only option -p.

XORSearch also reports the value of e_lfanew and the string found in the DOS-stub. This allows you to inspect the results for false positives.

This can also be used on unencoded files, like this installation file:

XORSearch.exe -p c8400.msi
Found XOR 00 position 00236400: 000000E8 ........!..L.!This program cannot be r
Found XOR 00 position 00286000: 00000100 ........!..L.!This program cannot be r
Found XOR 00 position 00346800: 000000F8 ........!..L.!This program cannot be r
Found XOR 00 position 003A7200: 00000080 ........!..L.!This program cannot be r
Found XOR 00 position 003AD200: 00000080 ........!..L.!This program cannot be r
Found XOR 00 position 004B4800: 00000108 ........!..L.!This program cannot be r
Found XOR 00 position 004DE600: 000000F8 ........!..L.!This program cannot be r
Found XOR 00 position 004FE200: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00520C00: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00542000: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00562400: 00000100 ........!..L.!This program cannot be r
Found XOR 00 position 0058F800: 000000E0 ........!..L.!This program cannot be r

Finally, I added option -e (exclude). This excludes a particular byte-value from encoding. If you suspect a file is XOR encoded, but that byte 0x00 is not encoded, you use option -e 0x00.

XORSearch_V1_10_0.zip (https)
MD5: 23809A03C63914B0742B7F75B73E1597
SHA256: 97BFBC5E8C59F60E10ABDA2D65DF4200B10BE14662D4A447797B341C9AAE17D8

Monday 14 October 2013

Update: XORSearch Version 1.9.2

Filed under: Forensics,My Software,Update — Didier Stevens @ 5:00

I’ve been asked many times to support 32-bit keys with my XORSearch tool. But the problem is that a 32-bit bruteforce attack would take too much time.

Now I found a solution that doesn’t take months or years: a 32-bit dictionary attack.

I assume that the 32-bit XOR key is inside the file as a sequence of 4 consecutive bytes (MSB or LSB).

If you use the new option -k, XORSearch will perform a 32-bit dictionary attack to find the XOR key. The standard bruteforce attacks are disabled when you choose option -k.

XORSearch will extract a list of keys from the file: all unique sequences of 4 consecutive bytes (MSB and LSB order). Key 0x00000000 is excluded. Then it will use this list of keys to perform an XOR dictionary attack on the file, searching for the string you provided. Each key will be tested with an offset of 0, 1, 2 and 3.

It is not unusual to find the 32-bit XOR key inside the file itself. If it is a self-decoding executable, it can contain an XOR x86 instruction with the 32-bit key as operand. Or if the original file contains a sequence of 0x00 bytes (4 consecutive 0x00 bytes at least), then the encoded file will also contain the 32-bit XOR key.

Here is a test where XORSearch.exe searches a 0xDEADBEEF XOR encoded copy of itself. With only 74KB, there are still 100000+ keys to test, taking almost 10 minutes on my machine:

20131013-233829

XORSearch_V1_9_2.zip (https)
MD5: BF1AC6CAA325B6D1AF339B45782B8623
SHA256: 90793BEB9D429EF40458AE224117A90E6C4282DD1C9B0456E7E7148165B8EF32

Wednesday 20 February 2013

Update XORSearch V1.8.0: Shifting

Filed under: My Software,OSX,Reverse Engineering,Update — Didier Stevens @ 21:32

This new version of XORSearch comes with a new operation: shifting left.

It comes in handy to reverse engineer protocols like TeamViewer’s remote access protocol.

Here’s an example. When you run TeamViewer, your machine gets an ID:

20-02-2013 22-11-39

We capture some TeamViewer traffic with Wireshark, and then we use XORSearch to search for TeamViewer ID 441055893 in this traffic:

20130216-231230

And as you can see, XORSearch finds this ID by left-shifting the content of the pcap file with one bit.

Thursday 8 November 2012

XORSearch for OSX

Filed under: Forensics,Malware,My Software,OSX — Didier Stevens @ 21:58

I made a very small change to XORSearch’s source code (dropped malloc.h) so that it compiles on OSX.

You can find the new version on XORSearch’s page.

Next Page »

Blog at WordPress.com.