Didier Stevens

Monday 20 October 2014

Update: PDFiD With Plugins Part 1

Filed under: My Software,PDF,Update — Didier Stevens @ 8:51

Almost from the beginning when I released PDFiD, people asked me for anti-virus like feature: that PDFiD would tell you if a PDF was malicious or not. Some people even patched PDFiD with a scoring feature.

But I didn’t want to develop an “anti-virus” for PDFs; PDFiD is a triage tool.

Now you can develop your own scoring system with plugins.

Plugins are loaded with option -p, like this:

20141020-102902

I provide 3 plugins: plugin_triage.py, plugin_nameobfuscation.py and plugin_embeddedfile.py. You can run more than one plugin by separating their names with a comma: pdfid.py -p plugin_triage,plugin_embeddedfile js.pdf

Or you can use an @-file: a text file with the names of the plugins you want to run.

To output the result as CSV file, use option -c, and to write the output to a file, use option -o. With option -m, you can provide a minimum score the plugin has to produce for its output to be displayed.

Plugins are Python classes, I’ll explain how to make your own in a later post.

plugin_triage.py produces a score of 1.0 when the PDF requires further analysis, and 0.0 if not.

plugin_nameobfuscation.py produces a score of 1.0 when name obfuscation is used in the PDF.

plugin_embeddedfile.py produces a score of 0.9 when an embedded file is present, and 1.0 when name obfuscation is also used.
pdfid_v0_2_1.zip (https)
MD5: 7463412536678B321276F8720F52DE81
SHA256: F1B4728DD2CE455B863B930E12C6DEC952CB95C0BB3D6924136A6E49ACA877C2

Monday 29 September 2014

Update: XORSearch With Shellcode Detector

Filed under: My Software,Update — Didier Stevens @ 0:00

XORSearch allows you to search for strings and embedded PE-files brute-forcing different encodings. Now I added shellcode detection.

This new version of XORSearch integrates Frank Boldewin’s shellcode detector. In his Hack.lu 2009 presentation, Frank explains how he detects shellcode in Microsoft Office documents by searching for byte sequences often used in shellcode.

I integrated Frank’s methods in XORSearch, so that you can use it for any file type, not only Microsoft Office files.

20140928-135402

Frank was kind enough to give me his source code for the detection engine. However, I did not integrated is source code as-is. I developed my own engine that uses rules to detect shellcode artifacts. These rules are not hard-coded, but can be externalized, so that you can define your own rules.

Wildcard rule syntax

A wildcard rule is composed of 3 parts: a rule name, a score and a pattern. These are separated by a : character.

Example of a rule:

Find kernel32 base method 1bis:10:64A130000000

The name of this rule is “Find kernel32 base method 1bis”, it has a score of 10, and the pattern is 64A130000000. When XORSearch finds byte pattern 64A130000000, it will report it mentioning rule name “Find kernel32 base method 1bis” and add 10 to the total score. This byte pattern is the following assembly instruction:

MOV EAX, dword [fs:0x30]

This is an instruction often found in shellcode that looks for the base of kernel32.

When assembly instructions reference a register, the register is encoded as bits in the bytes that make up the instruction. For example, pop eax is just one byte: 58. pop ecx is 59, pop edx is 5a, … If you look at the bits of this instruction, they have the following value: 01011RRR. The last 3 bits (RRR) encode the register to use for the pop instruction.

To deal with this, my rule definition language supports wildcards. This is how you encode a pop reg instruction:

(B;01011???)

The B indicates that we want to define a byte using bits and wildcards. 0 and 1 are fixed bit values, and ? is the wildcard: the bit value can be 0 or 1. Thus the pattern (B;01011???) matches bytes 58, 59, 5A, 5B, 5C, 5D, 5E and 5F.

This wildcard allows us to encode patterns for shellcode instructions that use registers. For example , here is an often used set of instructions to determine the EIP with shellcode:

	call label
label:
	pop eax

This pattern is encoded for all possible registers with the following rule:

GetEIP method 1:10:E800000000(B;01011???)

Another instruction often found in shellcode is xor reg1, reg1, like xor eax, eax.

You could represent this with the following pattern:

31(B;11??????)

But this pattern matches more instructions than you want. It matches xor eax, eax, xor ecx, ecx, … but also xor eax, ecx, xor eax, edx, … You want this pattern to match the xor instruction for the same register, and not different registers. That is why you can use the following syntax:

31(B;11A??A??)

By using a letter like A, B, …, as a wildcard, you assign a variable name to the wildcard bit pattern. ??? matches 3 bits. A?? also matches 3 bits, and assigns the variable name A to these 3 bits. When you use this bit pattern again, you make sure that the pattern will only be matched if the bit pattern is identical. Pattern ?????? matches 6 bits regardless of their value. Pattern A??A?? also matches 6 bits, but the first 3 bits must have the same value as the last 3 bits.

Here is another example:

Find kernel32 base method 3:10:6830000000(B;01011A??)648B(B;00B??A??)

This pattern matches the following set of assembly instructions:

	push 0x30
	pop reg1
	mov reg2, dword [fs:reg1]

By using bit pattern A?? for the register of the second instruction, and B??A?? for the registers of the third instruction, you make sure that the third instruction use the same register for indexing as the second instruction.

Up til now, we looked at sequential assembly instructions. But you can also have shellcode patterns with jumps, e.g. non-sequential instructions. Here is an example:

	jmp LABEL1
LABEL2:
	pop eax
	...
	...
LABEL1:
	call LABEL2

To enable to match assembly code patterns with jumps, I introduced the (J;*) pattern in my rule definitions. J stands for a jump, and * represent the numbers of bytes that make up the displacement of the jump instruction (normally 1 byte or 4 bytes). Here is the rule that encodes the above assembly code pattern:

GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)

Finally, Frank’s detector also looks for suspicious strings, like UrlDownloadToFile, WinExec, … You can define rules using a hex pattern to detect these strings, but to facilitate the encoding of these rules, I added the str= keyword, like this:

Suspicious strings:2:str=UrlDownloadToFile
Suspicious strings:2:str=WinExec

Using wildcard rules

To use these shellcode wildcard rules with XORSearch, you use options -w or -W. -w allows you to specify your own rule(s), -W uses the build-in rules.

With -w, you can specify your rule as the search argument, or together with option -f, you provide a text file with rules.

Example: XORSearch.exe -w olimpikge.xls “GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)”

With -W, you don’t have to provide the rules, XORSearch will use the build-in rules.

Example: XORSearch.exe -W olimpikge.xls

You can view the build-in rules with option -L:

Function prolog signature:10:558BEC83C4
Function prolog signature:10:558BEC81EC
Function prolog signature:10:558BECEB
Function prolog signature:10:558BECE8
Function prolog signature:10:558BECE9
Indirect function call tris:10:FFB7(B;????????)(B;????????)(B;????????)(B;????????)FF57(B;????????)
GetEIP method 4 FLDZ/FSTENV [esp-12]:10:D9EED97424F4(B;01011???)
GetEIP method 1:10:E800000000(B;01011???)
GetEIP method 2:10:EB(J;1)E8(J;4)(B;01011???)
GetEIP method 3:10:E9(J;4)E8(J;4)(B;01011???)
GetEIP method 4:10:D9EE9BD97424F4(B;01011???)
Find kernel32 base method 1:10:648B(B;00???101)30000000
Find kernel32 base method 1bis:10:64A130000000
Find kernel32 base method 2:10:31(B;11A??A??)(B;10100A??)30648B(B;00B??A??)
Find kernel32 base method 3:10:6830000000(B;01011A??)648B(B;00B??A??)
Structured exception handling :10:648B(B;00???101)00000000
Structured exception handling bis:10:64A100000000
API Hashing:10:AC84C07407C1CF0D01C7EBF481FF
API Hashing bis:10:AC84C07407C1CF0701C7EBF481FF
Indirect function call:10:FF75(B;A???????)FF55(B;A???????)
Indirect function call bis:10:FFB5(B;A???????)(B;B???????)(B;C???????)(B;D???????)FF95(B;A???????)(B;B???????)(B;C???????)(B;D???????)
OLE file magic number:10:D0CF11E0
Suspicious strings:2:str=UrlDownloadToFile
Suspicious strings:2:str=GetTempPath
Suspicious strings:2:str=GetWindowsDirectory
Suspicious strings:2:str=GetSystemDirectory
Suspicious strings:2:str=WinExec
Suspicious strings:2:str=ShellExecute
Suspicious strings:2:str=IsBadReadPtr
Suspicious strings:2:str=IsBadWritePtr
Suspicious strings:2:str=CreateFile
Suspicious strings:2:str=CloseHandle
Suspicious strings:2:str=ReadFile
Suspicious strings:2:str=WriteFile
Suspicious strings:2:str=SetFilePointer
Suspicious strings:2:str=VirtualAlloc
Suspicious strings:2:str=GetProcAddr
Suspicious strings:2:str=LoadLibrary

I derived these rules from the source code Frank gave me. Testing these rules on different benign and malicious files revealed 2 things: a couple of rules generated a lot of false positives, and brute-forcing the ROT encoding also generated a lot of false positives. So I removed these rules, and I added an option to disable encodings (option -d). For example, with option -d 3 I disable the brute-forcing of the ROT encoding (1: XOR 2: ROL 3: ROT 4: SHIFT 5: ADD).

When looking for shellcode, you want several rules to trigger. If just one or two rules trigger, they are likely false positives.
XORSearch_V1_11_0.zip (https)
MD5: 7313A198033C0A1F69B79F96894462C7
SHA256: 1700D037D7A9902108F3986D75A9BA250ACBD96E38CC43C5B4BC1FB90761B320

Sunday 14 September 2014

Update: SpiderMonkey

Filed under: My Software,Update — Didier Stevens @ 15:00

During my PDF training at 44CON I got the idea for a simple modification: now with document.write(), a third file is created. The file is write.bin.log and contains the pure UNICODE data, e.g. without 0xFFFE header.

To extract shellcode now, you no longer need to edit write.uc.log to remove the 0xFFFE header.

I also included binaries for Windows and Linux (compiled on CentOS 6.0) in the ZIP file.

js-1.7.0-mod-b.zip (https)
MD5: 85B369B5650D4C041D21E8574CF09B9A
SHA256: D3827DF7B2EA81EEE91181B2DE045320E1CFEC46EED33F7CD84CA63C3A36BC38

Monday 1 September 2014

Update: Calculating a SSH Fingerprint From a (Cisco) Public Key

Filed under: Encryption,My Software,Update — Didier Stevens @ 20:17

I think there’s more interest for my program to calculate the SSH fingerprint for Cisco IOS since Snowden started with his revelations.

I fixed a bug with 2048 bit (and more) keys.

20111221-225407

cisco-calculate-ssh-fingerprint_V0_0_2.zip (https)
MD5: C304299624F12341F9935263304F725B
SHA256: 2F2BF65E6903BE3D9ED99D06F0F38B599079CCE920222D55CC5C3D7350BD20FB

Wednesday 16 July 2014

Update: translate.py

Filed under: My Software,Update — Didier Stevens @ 19:37

Some time ago, Chris John Riley reminded me of a program I had written, published … and forgotten: translate.py. Apparently, it is used in SANS classes.

Looking at this program from 2007, I though: my Python coding style has changed since then, I need to rewrite this.

So here is the new version. It’s backward compatible with the old version (same arguments), but it offers more flexibility, like input/output redirection, allowing it to be used in pipes.

And from now on, I’m going to try to add a man page to all new Python program releases. It’s embedded in the source code, and you view it like this: translate.py –man

20140716-212254

Monday 30 June 2014

Update: Stoned Bitcoin

Filed under: Encryption,Forensics,Malware,Update — Didier Stevens @ 0:04

kurt wismer pointed me to this post on pastebin after he read my Stoned Bitcoin blogpost. The author of this pastebin post works out a method to spam the Bitcoin blockchain to cause anti-virus (false) positives.

I scanned through all the Bitcoin transactions (until 24/06/2014) for the addresses listed in this pastebin post (the addresses represent antivirus signatures for 400+ malwares).

All these “malicious” Bitcoin addresses, designed to generate anti-virus false positives,  have been exclusively used in the 8 Bitcoin transactions I mentioned in my previous post.

The pastebin entry was posted on 2014/04/02 19:01:08 UTC.

And here are the 8 transactions with the UTC timestamp of the block in which they appear:

Block: 2014/04/03 23:12:48
Transaction: edb83f04e68bfe78bbfe7ce80d33e85acb9335c96ead5712517b8c70d1f27b38
Block: 2014/04/04 01:10:45
Transaction: 7e49504c7cecea7ea95d78ff14687878ba581a21dc0772805d2925c617514129
Block: 2014/04/04 01:43:25
Transaction: f65895220f04aa0084d9abae938d3f517893e3afbffe25fc9e7073e02331b9ed
Block: 2014/04/04 02:58:13
Transaction: 8a445d12f225a21d36bb78da747efd2e74861fcd033757da572c0434d423acd1
Block: 2014/04/04 04:32:24
Transaction: fcf5cf9893a142897598edfc753bd6162e3638e138fc2feaf4a3477c0cfb65eb
Block: 2014/04/04 04:32:24
Transaction: 2814673f0952b936d578d73197bfd371cefbd73c6294bab16de1575a4c3f6e80
Block: 2014/04/04 09:36:29
Transaction: f09904aaa4fa4a8ec7da06f5e3d318a9b6a218e1a215f9307416fbbadf5a1c8e
Block: 2014/04/04 09:36:29
Transaction: 5dbb9df056c36457228a841d6cc98ac90967bc88411c95372d3c2d92c18060f8

So it took a bit more than 24 hours before someone spammed the Bitcoin blockchain with these transactions designed to trigger false positives.

Thursday 20 March 2014

XORSearch: Finding Embedded Executables

Filed under: My Software,Update — Didier Stevens @ 10:58

Someone mentioned on a forum that he found a picture with an embedded, XORed executable. You can easily identify such embedded executables by xorsearching for the string “This program must be run under Win32″. But if the author or compiler modifies this DOS-stub string, you will not find it.

That’s how I got the idea to add an option to search for PE-files: search for string MZ, read the offset to the IMAGE_NT_HEADER structure (e_lfanew), and check if it starts with string PE.

Example: XORSearch.exe -p test.jpg

Found XOR A2 position 00005D1D: 000000E8 ........!..L.!This program cannot be r
Found XOR A2 position 0001221D: 00000108 ........!..L.!This program cannot be r

We found 2 embedded executables in test.jpg (XOR key A2). Remark we didn’t provide a search string, only option -p.

XORSearch also reports the value of e_lfanew and the string found in the DOS-stub. This allows you to inspect the results for false positives.

This can also be used on unencoded files, like this installation file:

XORSearch.exe -p c8400.msi
Found XOR 00 position 00236400: 000000E8 ........!..L.!This program cannot be r
Found XOR 00 position 00286000: 00000100 ........!..L.!This program cannot be r
Found XOR 00 position 00346800: 000000F8 ........!..L.!This program cannot be r
Found XOR 00 position 003A7200: 00000080 ........!..L.!This program cannot be r
Found XOR 00 position 003AD200: 00000080 ........!..L.!This program cannot be r
Found XOR 00 position 004B4800: 00000108 ........!..L.!This program cannot be r
Found XOR 00 position 004DE600: 000000F8 ........!..L.!This program cannot be r
Found XOR 00 position 004FE200: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00520C00: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00542000: 000000E0 ........!..L.!This program cannot be r
Found XOR 00 position 00562400: 00000100 ........!..L.!This program cannot be r
Found XOR 00 position 0058F800: 000000E0 ........!..L.!This program cannot be r

Finally, I added option -e (exclude). This excludes a particular byte-value from encoding. If you suspect a file is XOR encoded, but that byte 0x00 is not encoded, you use option -e 0x00.

XORSearch_V1_10_0.zip (https)
MD5: 23809A03C63914B0742B7F75B73E1597
SHA256: 97BFBC5E8C59F60E10ABDA2D65DF4200B10BE14662D4A447797B341C9AAE17D8

Monday 23 December 2013

Update: Prefetch File 010 Template

Filed under: Forensics,My Software,Update — Didier Stevens @ 22:01

This update to my Prefetch File 010 Template adds Sections A through D.

20131223-225916
PFTemplate_V0_0_2.zip (https)
MD5: 56A98A78BD4E8D1AED88385AF1DD8446
SHA256: E15D721E46FFB8158C6D14C9A38DE4E3DD5DCD0972896441DF17590C540DBCC3

Saturday 14 December 2013

Update: virustotal-submit.py V0.0.3

Filed under: Malware,My Software,Update — Didier Stevens @ 0:21

There is extra error handling in this new version.

virustotal-search and virustotal-submit have their own page now: VirusTotal Tools.

virustotal-submit_V0_0_3.zip (https)
MD5: 3F9F5421F711E2930AB6F80D87DF9E2B
SHA256: 37CCE3E8469DE097912CB23BAC6B909C9C7F5A5CEE09C9279D32BDB9D6E23BCC

Monday 2 December 2013

4 Times Faster virustotal-search.py

Filed under: Malware,My Software,Update — Didier Stevens @ 0:26

This is an important update to virustotal-search.py.

Rereading the VT API, I noticed I missed the fact that the search query accepts up to 4 search terms.

This new version submits 4 hashes at a time, making it up to 4 times faster than previous versions.

virustotal-search_V0_1_0.zip (https)
MD5: 0141D3677F759317034C416EBF9FF30D
SHA256: FE07859C3FA09DA120D3104FF982AF0D78ADFCF099A10E46E254823502DF4EE4

Next Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 234 other followers