Didier Stevens

Wednesday 18 February 2015

Analyzing A Fraudulent Document With Error Level Analysis

Filed under: Forensics,My Software,PDF — Didier Stevens @ 0:00

Some time ago I had the chance to try out an image forensic method (Error Level Analysis) on a PDF. It was a fraudulent document (a form), but with a special characteristic: the criminal converted the original form (a PDF) to JPEG, edited the JPEG with a raster graphics editor, and then inserted the edited JPEG in a PDF document. This gave me the opportunity to try out Error Level Analysis (ELA) on a “text document”.

I can’t share the PDF, but I recreated one to use in this blogpost.

First I search for images in the PDF document:

pdf-parser.py -s image example-edited.pdf

Result:

obj 4 0
 Type: 
 Referencing: 6 0 R

  <<
    /Font
    /XObject
      <<
        /Im4 6 0 R
      >>
    /ProcSet [/PDF/Text/ImageC/ImageI/ImageB]
  >>


obj 6 0
 Type: /XObject
 Referencing: 
 Contains stream

  <<
    /Type /XObject
    /Subtype /Image
    /Width 680
    /Height 965
    /BitsPerComponent 8
    /ColorSpace /DeviceRGB
    /Filter /DCTDecode
    /Length 233133
  >>

The image is in object 6. I extract the image:

pdf-parser.py -o 6 -d example-edited.jpeg example-edited.pdf

Here it is:

example-edited

If you Google for Error Level Analysis, you’ll find a couple of websites that provide online image forensics. But that was not an option for me, I could not share the document.

I found this C program for ELA, and later I wrote my own Python program (what else?), that I’ll use for this example:

image-forensics-ela.py example-edited.jpeg example-edited-ela.png

example-edited-ela

The colored pixels reveal the word I edited. You can see it better when I overlay the 2 images:

image-overlay.py -a 0.6 example-edited.jpeg example-edited-ela.png example-edited-overlay.png

example-edited-overlay

FYI: there is also a GIMP plugin for ELA.

You can download the examples and programs here:

blogpost-ela-files.zip (https)
MD5: 4F3071A9162C5CA8B7B10A41F662093A
SHA256: CBA786368D7BAF65E1E9F854C315BFB60FF89910429106513A0C41C180D8FCAB

Tuesday 17 February 2015

Update: oledump.py Version 0.0.8

Filed under: Malware,My Software,Update — Didier Stevens @ 0:00

This new version brings support for multiple YARA rule files.

The plugin_http_heuristics plugin was updated, and there is a new plugin: plugin_dridex.

oledump_V0_0_8.zip (https)
MD5: 29EBF73F5512B0BC250CD0A0977A2C72
SHA256: 09C451116FCDE7763173E1538C687734D92267A0D192499AFD118D8D923165B9

Monday 16 February 2015

Update EICARgen Version 2.1

Filed under: My Software,Update — Didier Stevens @ 0:00

Version 2.1 of EICARgen can create an Excel spreadsheet (.xls) with the EICAR test file embedded with OLE.

Sunday 15 February 2015

Update: YARA Rule JPEG_EXIF_Contains_eval

Filed under: Forensics,Malware,Update — Didier Stevens @ 11:21

Now that YARA version 3.3.0 supports word boundaries in regular expressions, I’ve updated my YARA Rule for Detecting JPEG Exif With eval().

yara-rules-V0.0.5.zip (https)
MD5: 298EB636B3A3CB6A073815A83A6D1BA6
SHA256: EA00D044A3A0FE29265817407E382034593E0DAAD9887416E7FC128DA24B8830

Tuesday 10 February 2015

Update: oledump.py Version 0.0.7

Filed under: Malware,My Software,Update — Didier Stevens @ 0:00

This new version adds support for the new office file format (.docx, .xlsx, …) stored inside a ZIP file (so a ZIP inside a ZIP) and an option to print YARA strings.

And the HTTP heuristics plugin has some extra heuristics.

oledump_V0_0_7.zip (https)
MD5: 7A953BAFFA1E5285651699996FA2DF84
SHA256: F5DC5F650F005E530A7D0CF510C33E3A4EF29AD85B1DA2618B237F53A46B86B5

Monday 2 February 2015

AirPcap Channel Hopping With Python

Filed under: Didier Stevens Labs,My Software,WiFi,Wireshark — Didier Stevens @ 0:00

I’m teaching a Wireshark WiFi and Lua 2-day class at Brucon Spring Training 2015. You get an AirPcap packet capture adapter when you attend this class.

I made a modification to my Python program to do channel hopping with the AirPcap adapter. Now you can specify a sequence of channels with option -c.

20150201-142106
apc-channel_v0_2.zip (https)
MD5: 52169F5CB679E6C0DF1F8D47DA38F779
SHA256: 59F4BEE229F5EF5B7AF27BAF6AA972DCDC9E6A6007E8E468AE7BC7C3F1CB89DD

Thursday 22 January 2015

Converting PEiD Signatures To YARA Rules

Filed under: Forensics,Malware,My Software — Didier Stevens @ 0:56

I converted Jim Clausing’s PEiD rules to YARA rules so that I can use them to detect executable code in suspect Microsoft Office Documents with my oledump tool.

Of course, I wrote a program to do this automatically: peid-userdb-to-yara-rules.py

This program converts PEiD signatures to YARA rules. These signatures are typically found in file userdb.txt. Since PEiD signature names don’t need to be unique, and can contain characters that are not allowed in YARA rules, the name of the YARA rule is prefixed with PEiD_ and a running counter, and non-alphanumeric characters are converted to underscores (_).
Signatures that can not be parsed are ignored.

Here is an example:
PEiD signature:

 [!EP (ExE Pack) V1.0 -> Elite Coding Group]
 signature = 60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10
 ep_only = true

Generated YARA rule:

 rule PEiD_00001__EP__ExE_Pack__V1_0____Elite_Coding_Group_
 {
     meta:
         description = "[!EP (ExE Pack) V1.0 -> Elite Coding Group]"
         ep_only = "true"
     strings:
         $a = {60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10}
     condition:
         $a
 }

PEiD signatures have an ep_only property that can be true or false. This property specifies if the signature has to be found at the PE file’s entry point (true) or can be found anywhere (false).
This program will convert all signatures, regardless of the value of the ep_only property. Use option -e to convert only rules with ep_only property equal to true or false.

Option -p generates rules that use YARA’s pe module. If a signature has ep_only property equal to true, then the YARA rule’s condition becomes $a at pe.entry_point instead of just $a.

Example:

 import "pe"

 rule PEiD_00001__EP__ExE_Pack__V1_0____Elite_Coding_Group_
 {
     meta:
         description = "[!EP (ExE Pack) V1.0 -> Elite Coding Group]"
         ep_only = "true"
     strings:
         $a = {60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10}
     condition:
         $a at pe.entry_point
 }

Specific signatures can be excluded with option -x. This option takes a file that contains signatures to ignore (signatures like 60 68 ?? ?? ?? ?? B8 ?? ?? ?? ?? FF 10, not names like [!EP (ExE Pack) V1.0 -> Elite Coding Group]).

Download my YARA Rules.

peid-userdb-to-yara-rules_V0_0_1.zip (https)
MD5: D5B9B6FA7EC50A107A70419D30FEC9ED
SHA256: F8A12B5522B92AE7E3EDF11ACFAEEA7FDCC7FBDA8DC827D288A2D92B2B2CA5E2

Tuesday 20 January 2015

YARA Rule: Detecting JPEG Exif With eval()

Filed under: Forensics,Malware — Didier Stevens @ 20:39

My first release of 2015 was a new YARA rule to detect JPEG images with an eval() function inside their Exif data.

Such images are not new, but I needed an example to develop a complex YARA rule:

rule JPEG_EXIF_Contains_eval
{
    meta:
        author = "Didier Stevens (https://DidierStevens.com)"
        description = "Detect eval function inside JPG EXIF header (http://blog.sucuri.net/2013/07/malware-hidden-inside-jpg-exif-headers.html)"
        method = "Detect JPEG file and EXIF header ($a) and eval function ($b) inside EXIF data"
    strings:
        $a = {FF E1 ?? ?? 45 78 69 66 00}
        $b = /\Weval\s*\(/
    condition:
        uint16be(0x00) == 0xFFD8 and $a and $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) - 0x06)
}

Here is an example of such an image:

20150120-204954

The YARA rule has 3 conditions that must be satisfied:

  1. JPEG magic header FFD8, tested with: uint16be(0x00) == 0xFFD8
  2. Exif structure: FF E1 ?? ?? 45 78 69 66 00
  3. eval function inside Exif data, tested with a regular expression: \Weval\s*\(

Condition 1 is straightforward: the file must start with FFD8. I’m using test uint16be(0x00) == 0xFFD8 instead of searching for {FF D8} at 0x00. FF D8 is a short string, searching for {FF D8} can cause performance problems (you’ll get a warning from YARA when it compiles rules with such short strings).

Condition 2 checks for the presence of the Exif data header. Bytes 3 and 4 (?? ??) encode the length of the Exif Data.

Condition 3 checks for the presence of the eval function. To reduce the number of false positives that would occur when searching for string eval, we use a regular expression that matches string eval, possibly followed by whitespace characters (\s*), and an opening parenthesis: \(. And we don’t want letters or numbers before the string eval (we don’t want to match a string like deval), eval must be the start of a word. To achieve this with regular expressions, you use a word boundary: \b. So our regular expression would be \beval\s*\(. Unfortunately, YARA’s regular expression engine does not support word boundaries, so I had to come up with something else. I match any character that is not alphanumeric: \W. Be warned that there is a small difference between \W and \b. \b also matches the beginning of a string (like $), while \W has to match a character. So the regular expression I use is \Weval\s*\(.

The eval function must also be found inside the Exif data. We don’t want to trigger on the eval function if it is found somewhere else in the image. That’s where YARA’s in ( .. ) syntax comes in.

The first 18 bytes of the Exif structure are various headers which we ignore, so our eval function $b must start at @a + 0x12 or further.

The total size of the Exif structure is given by expression 0x02 + uint16be(@a + 0x02). We add this to the start of the Exif header (@a): @a + 0x02 + uint16be(@a + 0x02). And finally, we have to subtract the size of the string matched by the regular expression. Unfortunately, YARA has no function to calculate this length. So we will use the minimum length our regular expression can match: 6 characters. So our eval function $b must start no further than @a + 0x02 + uint16be(@a + 0x02) – 0x06. Putting all this together gives: $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) – 0x06)

FYI: Victor told me that he plans to add a string length function to YARA, so our condition will then become: $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) – &b)

You can find all my YARA rules here: YARA Rules.

Friday 16 January 2015

Update: oledump.py Version 0.0.6

Filed under: Malware,My Software,Update — Didier Stevens @ 16:11

My last software release for 2014 was oledump.py V0.0.6 with support for the “ZIP/XML” Microsoft Office fileformat and YARA.

In this post I will highlight support for the “new” Microsoft Office fileformat (.docx, .docm, .xlsx, .xlsm, …), which is mainly composed of XML files stored inside a ZIP container. Except macros which are still stored with OLE files (inside the ZIP container).

When oledump.py detects that the file is actually a ZIP file, it searches through all the files stored inside the ZIP container for OLE files, and analyses these.

Here is an example of a simple spreadsheet with macros. The xlsm file contains one OLE file: xl/vbaProject.bin. oledump gives it the identifier A. All the streams inside the OLE file are reported, and their index is prefixed with the identifier (A in this example).

20150112-232122

If you want to select the stream with the macros, you use A6, like this: oledump.py -s A1

oledump also supports the analysis of an OLE file stored in a password protected ZIP file (typically, malware samples are stored inside ZIP files with password infected). When oledump.py analyses a ZIP file with extension .zip, it assumes that the file is NOT using the “new” Microsoft Office fileformat. Only when the file is a ZIP file but the extension is not .zip does oledump assume that the file is using the “new” Microsoft Office fileformat.

I have another example in my Internet Storm Center Guest Diary Entry.

oledump_V0_0_6.zip (https)
MD5: E32069589FEB7B53707D00D7E0256F79
SHA256: 8FCEFAEF5E6A2779FC8755ED96FB1A8DACDBE037B98EE419DBB974B5F18E578B

Thursday 8 January 2015

Didier Stevens Suite

Filed under: My Software — Didier Stevens @ 20:14

I bundled most of my software in a ZIP file. In all modesty, I call it Didier Stevens Suite.

« Previous PageNext Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 262 other followers