Didier Stevens

Thursday 27 December 2018

Update: XORSearch Version 1.11.2

Filed under: My Software,Update — Didier Stevens @ 0:00

This update for XORSearch brings new features and bug fixes.

Starting with this version, XORSearch accepts input from stdin. Use filename – to read data from stdin:

Option -S will print out all strings found using all decoders supported by XORSearch. Strings are sequences of printable characters, ASCII and UNICODE, at least 4 characters long.

As option -S brings many of the functionalities of XORStrings to XORSearch, I’m no longer developing XORStrings.

Last new option is -r. You can use option -r to reverse the file before searching.

I’m also including more compiled versions (look inside the ZIP file).

XORSearch_V1_11_2.zip (https)
MD5: 2B76F6C730BAC6324E92A731F42FEB74
SHA256: 4206B843AC2B9417A85A4B5381023EC4613C5B5095A6A0A19A072C21C66DE93F

Wednesday 5 November 2014

XORSearch: Hexdump Support

Filed under: My Software,Update — Didier Stevens @ 22:04

Sometimes I want to check a malware sample with XORSearch, but I can’t because my AV will delete it. My solution is to work with a hexdump of the file.

Option -x allows XORSearch to work with a hexdump.

XORSearch_V1_11_1.zip (https)
SHA256: 15E9AAE87E7F25CF7966CDF0F8DFCB2648099585D08EAD522737E72C5FACA50A

Monday 29 September 2014

Update: XORSearch With Shellcode Detector

Filed under: My Software,Update — Didier Stevens @ 0:00

XORSearch allows you to search for strings and embedded PE-files brute-forcing different encodings. Now I added shellcode detection.

This new version of XORSearch integrates Frank Boldewin’s shellcode detector. In his Hack.lu 2009 presentation, Frank explains how he detects shellcode in Microsoft Office documents by searching for byte sequences often used in shellcode.

I integrated Frank’s methods in XORSearch, so that you can use it for any file type, not only Microsoft Office files.


Frank was kind enough to give me his source code for the detection engine. However, I did not integrated is source code as-is. I developed my own engine that uses rules to detect shellcode artifacts. These rules are not hard-coded, but can be externalized, so that you can define your own rules.

Wildcard rule syntax

A wildcard rule is composed of 3 parts: a rule name, a score and a pattern. These are separated by a : character.

Example of a rule:

Find kernel32 base method 1bis:10:64A130000000

The name of this rule is “Find kernel32 base method 1bis”, it has a score of 10, and the pattern is 64A130000000. When XORSearch finds byte pattern 64A130000000, it will report it mentioning rule name “Find kernel32 base method 1bis” and add 10 to the total score. This byte pattern is the following assembly instruction:

MOV EAX, dword [fs:0x30]

This is an instruction often found in shellcode that looks for the base of kernel32.

When assembly instructions reference a register, the register is encoded as bits in the bytes that make up the instruction. For example, pop eax is just one byte: 58. pop ecx is 59, pop edx is 5a, … If you look at the bits of this instruction, they have the following value: 01011RRR. The last 3 bits (RRR) encode the register to use for the pop instruction.

To deal with this, my rule definition language supports wildcards. This is how you encode a pop reg instruction:


The B indicates that we want to define a byte using bits and wildcards. 0 and 1 are fixed bit values, and ? is the wildcard: the bit value can be 0 or 1. Thus the pattern (B;01011???) matches bytes 58, 59, 5A, 5B, 5C, 5D, 5E and 5F.

This wildcard allows us to encode patterns for shellcode instructions that use registers. For example , here is an often used set of instructions to determine the EIP with shellcode:

	call label
	pop eax

This pattern is encoded for all possible registers with the following rule:

GetEIP method 1:10:E800000000(B;01011???)

Another instruction often found in shellcode is xor reg1, reg1, like xor eax, eax.

You could represent this with the following pattern:


But this pattern matches more instructions than you want. It matches xor eax, eax, xor ecx, ecx, … but also xor eax, ecx, xor eax, edx, … You want this pattern to match the xor instruction for the same register, and not different registers. That is why you can use the following syntax:


By using a letter like A, B, …, as a wildcard, you assign a variable name to the wildcard bit pattern. ??? matches 3 bits. A?? also matches 3 bits, and assigns the variable name A to these 3 bits. When you use this bit pattern again, you make sure that the pattern will only be matched if the bit pattern is identical. Pattern ?????? matches 6 bits regardless of their value. Pattern A??A?? also matches 6 bits, but the first 3 bits must have the same value as the last 3 bits.

Here is another example:

Find kernel32 base method 3:10:6830000000(B;01011A??)648B(B;00B??A??)

This pattern matches the following set of assembly instructions:

	push 0x30
	pop reg1
	mov reg2, dword [fs:reg1]

By using bit pattern A?? for the register of the second instruction, and B??A?? for the registers of the third instruction, you make sure that the third instruction use the same register for indexing as the second instruction.

Up til now, we looked at sequential assembly instructions. But you can also have shellcode patterns with jumps, e.g. non-sequential instructions. Here is an example:

	jmp LABEL1
	pop eax
	call LABEL2

To enable to match assembly code patterns with jumps, I introduced the (J;*) pattern in my rule definitions. J stands for a jump, and * represent the numbers of bytes that make up the displacement of the jump instruction (normally 1 byte or 4 bytes). Here is the rule that encodes the above assembly code pattern:

GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)

Finally, Frank’s detector also looks for suspicious strings, like UrlDownloadToFile, WinExec, … You can define rules using a hex pattern to detect these strings, but to facilitate the encoding of these rules, I added the str= keyword, like this:

Suspicious strings:2:str=UrlDownloadToFile
Suspicious strings:2:str=WinExec

Using wildcard rules

To use these shellcode wildcard rules with XORSearch, you use options -w or -W. -w allows you to specify your own rule(s), -W uses the build-in rules.

With -w, you can specify your rule as the search argument, or together with option -f, you provide a text file with rules.

Example: XORSearch.exe -w olimpikge.xls “GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)”

With -W, you don’t have to provide the rules, XORSearch will use the build-in rules.

Example: XORSearch.exe -W olimpikge.xls

You can view the build-in rules with option -L:

Function prolog signature:10:558BEC83C4
Function prolog signature:10:558BEC81EC
Function prolog signature:10:558BECEB
Function prolog signature:10:558BECE8
Function prolog signature:10:558BECE9
Indirect function call tris:10:FFB7(B;????????)(B;????????)(B;????????)(B;????????)FF57(B;????????)
GetEIP method 4 FLDZ/FSTENV [esp-12]:10:D9EED97424F4(B;01011???)
GetEIP method 1:10:E800000000(B;01011???)
GetEIP method 2:10:EB(J;1)E8(J;4)(B;01011???)
GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)
GetEIP method 4:10:D9EE9BD97424F4(B;01011???)
Find kernel32 base method 1:10:648B(B;00???101)30000000
Find kernel32 base method 1bis:10:64A130000000
Find kernel32 base method 2:10:31(B;11A??A??)(B;10100A??)30648B(B;00B??A??)
Find kernel32 base method 3:10:6830000000(B;01011A??)648B(B;00B??A??)
Structured exception handling :10:648B(B;00???101)00000000
Structured exception handling bis:10:64A100000000
API Hashing:10:AC84C07407C1CF0D01C7EBF481FF
API Hashing bis:10:AC84C07407C1CF0701C7EBF481FF
Indirect function call:10:FF75(B;A???????)FF55(B;A???????)
Indirect function call bis:10:FFB5(B;A???????)(B;B???????)(B;C???????)(B;D???????)FF95(B;A???????)(B;B???????)(B;C???????)(B;D???????)
OLE file magic number:10:D0CF11E0
Suspicious strings:2:str=UrlDownloadToFile
Suspicious strings:2:str=GetTempPath
Suspicious strings:2:str=GetWindowsDirectory
Suspicious strings:2:str=GetSystemDirectory
Suspicious strings:2:str=WinExec
Suspicious strings:2:str=ShellExecute
Suspicious strings:2:str=IsBadReadPtr
Suspicious strings:2:str=IsBadWritePtr
Suspicious strings:2:str=CreateFile
Suspicious strings:2:str=CloseHandle
Suspicious strings:2:str=ReadFile
Suspicious strings:2:str=WriteFile
Suspicious strings:2:str=SetFilePointer
Suspicious strings:2:str=VirtualAlloc
Suspicious strings:2:str=GetProcAddr
Suspicious strings:2:str=LoadLibrary

I derived these rules from the source code Frank gave me. Testing these rules on different benign and malicious files revealed 2 things: a couple of rules generated a lot of false positives, and brute-forcing the ROT encoding also generated a lot of false positives. So I removed these rules, and I added an option to disable encodings (option -d). For example, with option -d 3 I disable the brute-forcing of the ROT encoding (1: XOR 2: ROL 3: ROT 4: SHIFT 5: ADD).

When looking for shellcode, you want several rules to trigger. If just one or two rules trigger, they are likely false positives.
XORSearch_V1_11_0.zip (https)
MD5: 7313A198033C0A1F69B79F96894462C7
SHA256: 1700D037D7A9902108F3986D75A9BA250ACBD96E38CC43C5B4BC1FB90761B320

Thursday 20 March 2014

XORSearch: Finding Embedded Executables

Filed under: My Software,Update — Didier Stevens @ 10:58

Someone mentioned on a forum that he found a picture with an embedded, XORed executable. You can easily identify such embedded executables by xorsearching for the string “This program must be run under Win32”. But if the author or compiler modifies this DOS-stub string, you will not find it.

That’s how I got the idea to add an option to search for PE-files: search for string MZ, read the offset to the IMAGE_NT_HEADER structure (e_lfanew), and check if it starts with string PE.

Example: XORSearch.exe -p test.jpg

Found XOR A2 position 00005D1D: 000000E8 ........!..L.!This program cannot be r
Found XOR A2 position 0001221D: 00000108 ........!..L.!This program cannot be r

We found 2 embedded executables in test.jpg (XOR key A2). Remark we didn’t provide a search string, only option -p.

XORSearch also reports the value of e_lfanew and the string found in the DOS-stub. This allows you to inspect the results for false positives.

This can also be used on unencoded files, like this installation file:

XORSearch.exe -p c8400.msi
Found XOR 00 position 00236400: 000000E8 ........!..L.!This program cannot be r
Found XOR 00 position 00286000: 00000100 ........!..L.!This program cannot be r
Found XOR 00 position 00346800: 000000F8 ........!..L.!This program cannot be r
Found XOR 00 position 003A7200: 00000080 ........!..L.!This program cannot be r
Found XOR 00 position 003AD200: 00000080 ........!..L.!This program cannot be r
Found XOR 00 position 004B4800: 00000108 ........!..L.!This program cannot be r
Found XOR 00 position 004DE600: 000000F8 ........!..L.!This program cannot be r
Found XOR 00 position 004FE200: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00520C00: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00542000: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00562400: 00000100 ........!..L.!This program cannot be r
Found XOR 00 position 0058F800: 000000E0 ........!..L.!This program cannot be r

Finally, I added option -e (exclude). This excludes a particular byte-value from encoding. If you suspect a file is XOR encoded, but that byte 0x00 is not encoded, you use option -e 0x00.

XORSearch_V1_10_0.zip (https)
MD5: 23809A03C63914B0742B7F75B73E1597
SHA256: 97BFBC5E8C59F60E10ABDA2D65DF4200B10BE14662D4A447797B341C9AAE17D8

Monday 14 October 2013

Update: XORSearch Version 1.9.2

Filed under: Forensics,My Software,Update — Didier Stevens @ 5:00

I’ve been asked many times to support 32-bit keys with my XORSearch tool. But the problem is that a 32-bit bruteforce attack would take too much time.

Now I found a solution that doesn’t take months or years: a 32-bit dictionary attack.

I assume that the 32-bit XOR key is inside the file as a sequence of 4 consecutive bytes (MSB or LSB).

If you use the new option -k, XORSearch will perform a 32-bit dictionary attack to find the XOR key. The standard bruteforce attacks are disabled when you choose option -k.

XORSearch will extract a list of keys from the file: all unique sequences of 4 consecutive bytes (MSB and LSB order). Key 0x00000000 is excluded. Then it will use this list of keys to perform an XOR dictionary attack on the file, searching for the string you provided. Each key will be tested with an offset of 0, 1, 2 and 3.

It is not unusual to find the 32-bit XOR key inside the file itself. If it is a self-decoding executable, it can contain an XOR x86 instruction with the 32-bit key as operand. Or if the original file contains a sequence of 0x00 bytes (4 consecutive 0x00 bytes at least), then the encoded file will also contain the 32-bit XOR key.

Here is a test where XORSearch.exe searches a 0xDEADBEEF XOR encoded copy of itself. With only 74KB, there are still 100000+ keys to test, taking almost 10 minutes on my machine:


XORSearch_V1_9_2.zip (https)
MD5: BF1AC6CAA325B6D1AF339B45782B8623
SHA256: 90793BEB9D429EF40458AE224117A90E6C4282DD1C9B0456E7E7148165B8EF32

Wednesday 20 February 2013

Update XORSearch V1.8.0: Shifting

Filed under: My Software,OSX,Reverse Engineering,Update — Didier Stevens @ 21:32

This new version of XORSearch comes with a new operation: shifting left.

It comes in handy to reverse engineer protocols like TeamViewer’s remote access protocol.

Here’s an example. When you run TeamViewer, your machine gets an ID:

20-02-2013 22-11-39

We capture some TeamViewer traffic with Wireshark, and then we use XORSearch to search for TeamViewer ID 441055893 in this traffic:


And as you can see, XORSearch finds this ID by left-shifting the content of the pcap file with one bit.

Thursday 8 November 2012

XORSearch for OSX

Filed under: Forensics,Malware,My Software,OSX — Didier Stevens @ 21:58

I made a very small change to XORSearch’s source code (dropped malloc.h) so that it compiles on OSX.

You can find the new version on XORSearch’s page.

Wednesday 10 October 2012

XORSearch Video

Filed under: Announcement,My Software — Didier Stevens @ 17:41

I will release free stuff on my company’s website Didier Stevens Labs. Like this new XORSearch video.

XORSearch is one of my popular tools, but I hadn’t made a video for it yet:

Monday 18 January 2010

Update: XORSearch Version 1.6.0

Filed under: My Software,Update — Didier Stevens @ 1:26

A couple of new features:

  • searching for Unicode
  • searching for Hex code
  • printing of neighbouring bytes

Unicode support is rather simple: I consider Unicode as ASCII with 2 bytes per character, last byte always equals 0.

Usage case of hexcode search: search for embedded and encoded PE-file by searching for the PE-magic bytes MZ:

XORSearch -h malware.exe 50450000

Remember that XORSearch is not limited to win32, you can compile it on *nix too: cc -o XORSearch XORSearch.c

Download here.

Sunday 19 April 2009

Update: XORSearch V1.4.0

Filed under: My Software,Update — Didier Stevens @ 16:43

Miles Wolbe was looking for some strings in a Dell BIOS update; it took him some time to figure out they are ROT-1 encoded.

I updated my XORSearch tool to support ROT encoding.

Next Page »

Blog at WordPress.com.