Didier Stevens

Friday 15 November 2013

Update: find-file-in-file.py Version 0.0.3

Filed under: Forensics,My Software,Update — Didier Stevens @ 12:55

shinnai made an interesting comment when I released my tool to find contained files: he wanted to know if I could add a batch mode.

I guess batch mode is useful when you want to check whether any file in a large set contains a particular file. So I added this feature and I’m releasing it here.

Now you can provide more than one containing-file to find-file-in-file.py: you can just type several files, use wildcards and/or use at-files (@file). When you specify @filename, find-file-in-file.py will search in all the files listed in textfile filename (each file on a separate line).
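To make the argument handling concrete, here is a minimal sketch of how wildcards and at-files could be expanded into one list of containing files. It is not the code from find-file-in-file.py: the function name `expand_arguments` and the choice to keep a pattern that matches nothing are assumptions made for illustration.

```python
import glob


def expand_arguments(arguments):
    """Expand wildcards and @files into a flat list of filenames (illustrative helper)."""
    filenames = []
    for argument in arguments:
        if argument.startswith('@'):
            # @file: the text file lists one filename per line; skip empty lines
            with open(argument[1:], 'r') as listfile:
                filenames.extend(line.strip() for line in listfile if line.strip())
        else:
            matches = sorted(glob.glob(argument))
            # Assumption: keep the literal name when nothing matches, so a
            # missing file can be reported later instead of silently dropped
            filenames.extend(matches if matches else [argument])
    return filenames
```

Calling `expand_arguments(['*.msi', '@list.txt'])` would return the matching MSI files followed by the files named in list.txt.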

When you provide only one file to search, then this new version will just work like the previous version.

But if you provide more than one file, then batch mode is enabled. In batch mode, the contained file is searched for in each containing file. If a (partial) match is found, it will be included in the report. If no match is found, no output is produced. If you want output even when no match is found, then use option verbose (-v).

Example for a bunch of MSI files:

find-file-in-file.py msi49.tmp *.msi

File: c8400.msi
003a7200 00005600 (100%)
File: Cisco_Jabber.msi
00295600 00001000 (18%)
00294a00 00000c00 (13%)
00296600 00003a00 (67%)

File msi49.tmp was found in only 2 MSI files.
find-file-in-file_v0_0_3.zip (https)
MD5: 8691158700079C786F6905F0CA0F32BC
SHA256: 84506CED140F309503E723831A9EFB99A8CC213532BEB56E00BC4BA5FE235797

Monday 4 November 2013

Update: naft-gfe.py

Filed under: Forensics,My Software,Networking,Update — Didier Stevens @ 20:49

This new version of the generic frame extraction tool (naft-gfe) can handle files (RAM dumps) that are too large to fit into memory.


Use option -b for buffered reads. By default, the file will be read and analyzed in blocks of 101MB (100MB buffer + 1MB overlap buffer).

Since the file is not read completely into memory, a frame/packet can straddle two blocks. For example, a frame can start in the first block of 100MB and end in the second block of 100MB. The analysis routines would miss this frame.

To avoid this, the program reads the first block of 100MB (block A) plus an extra block of 1MB (block B). This block of 101MB (A + B) is analyzed. Then, the second block of 100MB (block C) is read, and the extra block B is prepended to block C for analysis (B + C). Hence the overlap buffer is analyzed twice, but packets are only extracted once from this buffer. This procedure is repeated for the complete file.

It is important that the overlap buffer is large enough to accommodate the largest possible frame or packet. That’s why by default, it is 1MB.
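The overlapped reading described above can be sketched as follows. This is an illustration of the technique, not the code from naft-gfe.py: the names `read_overlapping_blocks` and `find_all` are hypothetical, and a simple byte-string search stands in for the real frame analysis. Matches are stored by absolute file offset in a set, so data seen twice in the overlap buffer is only reported once.

```python
def read_overlapping_blocks(fileobj, buffer_size, overlap_size):
    """Yield (offset, data) pairs of up to buffer_size + overlap_size bytes.

    Every block after the first starts with the overlap bytes that ended
    the previous block.
    """
    offset = 0
    data = fileobj.read(buffer_size + overlap_size)    # block A + block B
    while data:
        yield offset, data
        next_block = fileobj.read(buffer_size)         # block C
        if not next_block:
            break                                      # end of file
        overlap = data[buffer_size:]                   # block B
        offset += buffer_size
        data = overlap + next_block                    # block B + block C


def find_all(fileobj, needle, buffer_size, overlap_size):
    """Return the sorted absolute offsets of needle (stand-in for frame analysis)."""
    found = set()                                      # set: overlap hits count once
    for offset, data in read_overlapping_blocks(fileobj, buffer_size, overlap_size):
        position = data.find(needle)
        while position != -1:
            found.add(offset + position)
            position = data.find(needle, position + 1)
    return sorted(found)
```

With an overlap of 0, a needle that crosses a block boundary is missed; with an overlap at least as long as the needle, it is found. That is the reason the overlap buffer has to be at least as large as the largest frame or packet.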

Use options -S and -O to choose your own size for buffer and overlap buffer.

Monday 28 October 2013

NAFT: The Movie

Filed under: Forensics,My Software,Networking — Didier Stevens @ 18:46

I made a video of the Network Appliance Forensic Toolkit demo I gave at my local ISSA chapter.

Monday 21 October 2013

Update: Suspender V0.0.0.4

Filed under: Forensics,Malware,My Software,Update — Didier Stevens @ 10:19

Suspender is a DLL that suspends all threads of a process.

This new version adds an option to suspend a process when it exits. Rename the dll to suspenderx.dll to activate this option (x stands for eXit).

When DllMain is called with DLL_PROCESS_DETACH and the reserved argument is not NULL, the process is exiting. So that’s the trigger to suspend it.


Suspender_V0_0_0_4.zip (https)
MD5: 629255337FE0CA9F631B1A7177D158F0
SHA256: 8E63152620541314926878D01469E2E922298C147740BDEAF7FC6B70EB9305EF

Monday 14 October 2013

Update: XORSearch Version 1.9.2

Filed under: Forensics,My Software,Update — Didier Stevens @ 5:00

I’ve been asked many times to support 32-bit keys with my XORSearch tool. But the problem is that a 32-bit bruteforce attack would take too much time.

Now I found a solution that doesn’t take months or years: a 32-bit dictionary attack.

I assume that the 32-bit XOR key is inside the file as a sequence of 4 consecutive bytes (MSB or LSB).

If you use the new option -k, XORSearch will perform a 32-bit dictionary attack to find the XOR key. The standard bruteforce attacks are disabled when you choose option -k.

XORSearch will extract a list of keys from the file: all unique sequences of 4 consecutive bytes (MSB and LSB order). Key 0x00000000 is excluded. Then it will use this list of keys to perform an XOR dictionary attack on the file, searching for the string you provided. Each key will be tested with an offset of 0, 1, 2 and 3.
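A minimal Python 3 sketch of this dictionary attack is given here to make the steps concrete. XORSearch itself is a compiled program, so this is not its code. The function names are hypothetical, and the sketch assumes that "offset" means the position in the 4-byte key that is applied to the first byte of the file.

```python
def extract_keys(data):
    """Collect every unique 4-byte sequence in both byte orders, minus the zero key."""
    keys = set()
    for i in range(len(data) - 3):
        chunk = data[i:i + 4]
        keys.add(chunk)          # bytes as they appear in the file
        keys.add(chunk[::-1])    # same bytes in reversed order
    keys.discard(b'\x00\x00\x00\x00')
    return keys


def xor_with_key(data, key, shift):
    """XOR data with a repeating 4-byte key, starting at key byte `shift`."""
    return bytes(b ^ key[(i + shift) % 4] for i, b in enumerate(data))


def dictionary_attack(data, search):
    """Yield (key, shift, position) for each key and offset that reveals search."""
    for key in sorted(extract_keys(data)):
        for shift in range(4):
            position = xor_with_key(data, key, shift).find(search)
            if position != -1:
                yield key, shift, position
```

The sketch decodes the whole file once per key and offset, which is slow in pure Python on anything but small files. It is meant to show the idea, not to match the speed of the real tool.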

It is not unusual to find the 32-bit XOR key inside the file itself. If it is a self-decoding executable, it can contain an XOR x86 instruction with the 32-bit key as operand. Or if the original file contains a sequence of 0x00 bytes (4 consecutive 0x00 bytes at least), then the encoded file will also contain the 32-bit XOR key.

Here is a test where XORSearch.exe searches a 0xDEADBEEF XOR encoded copy of itself. With only 74KB, there are still 100000+ keys to test, taking almost 10 minutes on my machine:


XORSearch_V1_9_2.zip (https)
MD5: BF1AC6CAA325B6D1AF339B45782B8623
SHA256: 90793BEB9D429EF40458AE224117A90E6C4282DD1C9B0456E7E7148165B8EF32

Monday 7 October 2013

Finding Contained Files

Filed under: Forensics,My Software — Didier Stevens @ 0:00

Some time ago I had to figure out if a file was embedded inside another file.

It’s not a file carving problem. I had both files. I just needed to be sure that file A was contained inside file B.

With a hex editor I could find parts of file A inside file B, but it looked like file A was split up and scattered at different locations in file B.

I Googled a bit for a tool, but nothing came up, so I wrote my own Python program.

With my new tool I was able to confirm that file msi49.tmp was inside file c8400.msi:


You can see that file msi49.tmp is one contiguous sequence inside file c8400.msi starting at position 0x3A7200.

But I was more interested to know if file msi49.tmp was also inside file Cisco_Jabber.msi:


And you can see it is, but not as one contiguous sequence. It’s split in 3 sequences.

This tool can also be used to find a downloaded file inside a pcap/pcapng file. I downloaded AnalyzePESig_V0_0_0_2.zip while taking a Wireshark capture.


Or to find a file opened by an application. Here I look into the process dump:


The only limitation is that both files need to be read into memory. But when I’ve time, I’ll turn this into a plugin for the Volatility framework.

The program looks for sequences at least 10 bytes long (this is an option). If your file is divided into sequences smaller than 10 bytes, then my program will not find the embedded file, unless you lower the minimum length. But don’t go as low as 1 byte, because then you’re likely to find matches in random data.
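The role of the minimum length can be illustrated with a small greedy search. The post does not describe the algorithm used by find-file-in-file.py, so this is only a sketch under assumptions: the function name `find_sequences` is hypothetical, and the search takes the first occurrence of each seed rather than the longest possible match.

```python
def find_sequences(contained, containing, minimum_length=10):
    """Return (position_in_containing, position_in_contained, length) tuples."""
    sequences = []
    index = 0
    while index + minimum_length <= len(contained):
        # Look for a seed of minimum_length bytes from the contained file
        position = containing.find(contained[index:index + minimum_length])
        if position == -1:
            index += 1           # no match starting here, try the next byte
            continue
        # Extend the match for as long as both files agree
        length = minimum_length
        while (index + length < len(contained)
               and position + length < len(containing)
               and contained[index + length] == containing[position + length]):
            length += 1
        sequences.append((position, index, length))
        index += length
    return sequences
```

Both files are held in memory as bytes objects, which matches the limitation mentioned above. A lower `minimum_length` produces more seeds that match by coincidence.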

I’m not 100% sure that my program will find all possible cases of embedded files. No problem if it’s one contiguous sequence, or several sequences in logical order. But I’ve to review my algorithm to be sure it will also find all possible cases of embedded files with sequences in random order. I think it will, but I need to prove it.

find-file-in-file_v0_0_1.zip (https)
MD5: 2984F01404770B92953823D39907B055
SHA256: 1AD124A9A31DACFE1FC9F3B89B3117D3A70D5BC15B712CC1748BEA893612686C

Monday 30 September 2013

Bugfix virustotal-submit.py Version 0.0.2

Filed under: My Software,Update — Didier Stevens @ 13:12

This is a bugfix for my virustotal-submit.py program.

I fixed a bug in the error handling code for unreadable ZIP files.

virustotal-submit_V0_0_2.zip (https)
MD5: 1152A8507FE7A668DCDF5C44DEAD11DF
SHA256: D5A4E5C3E80F98D4A82A128D8C9DBA395C2B9CDFE9F37E2B0882904D47673CE5

Wednesday 18 September 2013

Update: pdf-parser V0.4.3

Filed under: My Software,PDF — Didier Stevens @ 20:20

There’s still time to register for my “Hacking PDF” training at Brucon next week.

I introduced a bug in pdf-parser version 0.3.8 that changed the behavior of the -w option (raw).

This new version is a fix for this bug.

pdf-parser_V0_4_3.zip (https)
MD5: 2220FFE37AEA36FC593AE33440385E76
SHA256: 1416624938359FDD375108D922350D1B7B0E41B3A40A48F778D6D72D8A405DE6

Friday 30 August 2013

Brucon Hacking PDF Training

Filed under: Announcement,Didier Stevens Labs,PDF — Didier Stevens @ 8:56

When you register before September 7th with discount code MC201305 you will get a 5% discount.

What do you want from training? I want to gain knowledge. I designed my “Hacking PDF” training with this goal in mind.

“Hacking PDF” is a 2-day training focusing on the PDF language, not on reversing PDF readers. By attending this training, you will first acquire knowledge about the PDF language. And then we will use this knowledge to analyze malicious PDFs (day 1) and create PDFs for fun and profit (day 2).

Learning to use tools is nice, and learning new skills is interesting. But I want more. I also want to get a deep understanding of the subject. Because with this knowledge, I can develop new tools and invent new techniques.

On day one I explain the fundamentals of the PDF language. We take a look at several features of the language that malware authors use and abuse. And then we start analyzing PDFs. You learn to use my tools pdfid and pdf-parser on 20 simple PDF exercises. The exercise is to find the malicious behavior of the PDF; the goal is to gain an understanding of PDF malware. And then we move on to the real deal: analyzing real, in-the-wild PDF malware.

On day two we use our understanding of the PDF language and PDF malware to create our own PDF files and modify existing PDF files. This is done with pure Python tools and other free tools. Adobe products are not used in this training, except to view PDFs. We will learn to do simple and smart fuzzing of PDFs, create PDFs that exploit vulnerabilities in PDF readers, embed files and PDFs, and a lot of other interesting hacks …

You can find a “Hacking PDF” slideshow here.

There are not many pre-requisites for this training:
1)    You don’t need to know anything about PDF, I will teach you what we need to know.
2)    We use Python scripts, but you don’t need to be a Python programmer. We will modify existing scripts, so a bit of programming knowledge like if statements and loops is enough.
3)    You don’t need to understand assembly or shellcode; we use a shellcode emulator. And I will provide you with the shellcode for day 2, so you do not need to write it yourself.
4)    You need to be at ease with the command-line.
5)    A security mindset is an advantage ;-)


Saturday 24 August 2013

Quickpost: Proxy Cookies

Filed under: Forensics,Networking,Quickpost — Didier Stevens @ 11:20

Cookies set by network proxies can be identified by their name.

BlueCoat proxy cookies start with BCSI-CS-.

Cisco IronPort proxy cookies start with iptac-. The string after iptac is the serial number of the device.

Google for these and you’ll find some examples.
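As a small illustration, the two prefixes can be used to flag proxy cookies in a list of cookie names. The function name `identify_proxy_cookie` is hypothetical, and only the two prefixes mentioned in this post are covered.

```python
# Prefixes taken from this post; other proxies are not covered
PROXY_COOKIE_PREFIXES = {
    'BCSI-CS-': 'BlueCoat',
    'iptac-': 'Cisco IronPort',
}


def identify_proxy_cookie(name):
    """Return (vendor, remainder) for a known proxy cookie name, else None."""
    for prefix, vendor in PROXY_COOKIE_PREFIXES.items():
        if name.startswith(prefix):
            # For iptac- cookies the remainder is the device serial number
            return vendor, name[len(prefix):]
    return None
```

For a made-up cookie name such as `iptac-ABC123`, the function returns the vendor and the string `ABC123`; for an unrelated name it returns `None`.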

More info later.

