Didier Stevens

Tuesday 20 January 2015

YARA Rule: Detecting JPEG Exif With eval()

Filed under: Forensics,Malware — Didier Stevens @ 20:39

My first release of 2015 was a new YARA rule to detect JPEG images with an eval() function inside their Exif data.

Such images are not new, but I needed an example to develop a complex YARA rule:

rule JPEG_EXIF_Contains_eval
{
    meta:
        author = "Didier Stevens (https://DidierStevens.com)"
        description = "Detect eval function inside JPG EXIF header (http://blog.sucuri.net/2013/07/malware-hidden-inside-jpg-exif-headers.html)"
        method = "Detect JPEG file and EXIF header ($a) and eval function ($b) inside EXIF data"
    strings:
        $a = {FF E1 ?? ?? 45 78 69 66 00}
        $b = /\Weval\s*\(/
    condition:
        uint16be(0x00) == 0xFFD8 and $a and $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) - 0x06)
}

Here is an example of such an image:

20150120-204954

The YARA rule has 3 conditions that must be satisfied:

  1. JPEG magic header FFD8, tested with: uint16be(0x00) == 0xFFD8
  2. Exif structure: FF E1 ?? ?? 45 78 69 66 00
  3. eval function inside Exif data, tested with a regular expression: \Weval\s*\(

Condition 1 is straightforward: the file must start with FFD8. I’m using test uint16be(0x00) == 0xFFD8 instead of searching for {FF D8} at 0x00. FF D8 is a short string, searching for {FF D8} can cause performance problems (you’ll get a warning from YARA when it compiles rules with such short strings).

Condition 2 checks for the presence of the Exif data header. Bytes 3 and 4 (?? ??) encode the length of the Exif Data.

Condition 3 checks for the presence of the eval function. To reduce the number of false positives that would occur when searching for string eval, we use a regular expression that matches string eval, possibly followed by whitespace characters (\s*), and an opening parenthesis: \(. And we don’t want letters or numbers before the string eval (we don’t want to match a string like deval), eval must be the start of a word. To achieve this with regular expressions, you use a word boundary: \b. So our regular expression would be \beval\s*\(. Unfortunately, YARA’s regular expression engine does not support word boundaries, so I had to come up with something else. I match any character that is not alphanumeric: \W. Be warned that there is a small difference between \W and \b. \b also matches the beginning of a string (like $), while \W has to match a character. So the regular expression I use is \Weval\s*\(.

The eval function must also be found inside the Exif data. We don’t want to trigger on the eval function if it is found somewhere else in the image. That’s where YARA’s in ( .. ) syntax comes in.

The first 18 bytes of the Exif structure are various headers which we ignore, so our eval function $b must start at @a + 0x12 or further.

The total size of the Exif structure is given by expression 0x02 + uint16be(@a + 0x02). We add this to the start of the Exif header (@a): @a + 0x02 + uint16be(@a + 0x02). And finally, we have to subtract the size of the string matched by the regular expression. Unfortunately, YARA has no function to calculate this length. So we will use the minimum length our regular expression can match: 6 characters. So our eval function $b must start no further than @a + 0x02 + uint16be(@a + 0x02) – 0x06. Putting all this together gives: $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) – 0x06)

FYI: Victor told me that he plans to add a string length function to YARA, so our condition will then become: $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) – &b)

You can find all my YARA rules here: YARA Rules.

Blog at WordPress.com.