Didier Stevens

Wednesday 18 February 2015

Analyzing A Fraudulent Document With Error Level Analysis

Filed under: Forensics,My Software,PDF — Didier Stevens @ 0:00

Some time ago I had the chance to try out an image forensic method (Error Level Analysis) on a PDF. It was a fraudulent document (a form), but with a special characteristic: the criminal converted the original form (a PDF) to JPEG, edited the JPEG with a raster graphics editor, and then inserted the edited JPEG in a PDF document. This gave me the opportunity to try out Error Level Analysis (ELA) on a “text document”.

I can’t share the PDF, but I recreated one to use in this blogpost.

First I search for images in the PDF document:

pdf-parser.py -s image example-edited.pdf

Result:

obj 4 0
 Type: 
 Referencing: 6 0 R

  <<
    /Font
    /XObject
      <<
        /Im4 6 0 R
      >>
    /ProcSet [/PDF/Text/ImageC/ImageI/ImageB]
  >>


obj 6 0
 Type: /XObject
 Referencing: 
 Contains stream

  <<
    /Type /XObject
    /Subtype /Image
    /Width 680
    /Height 965
    /BitsPerComponent 8
    /ColorSpace /DeviceRGB
    /Filter /DCTDecode
    /Length 233133
  >>

The image is in object 6. I extract the image:

pdf-parser.py -o 6 -d example-edited.jpeg example-edited.pdf

Here it is:

[Image: example-edited.jpeg]
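Because the stream uses the /DCTDecode filter, the dumped bytes should already be a JPEG file. A quick sanity check, sketched here in Python, is to look for the JPEG start-of-image marker (FF D8) at the beginning of the file. The function name `looks_like_jpeg` is made up for this illustration and is not part of pdf-parser.py.

```python
def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG SOI marker (FF D8)."""
    with open(path, "rb") as f:
        return f.read(2) == b"\xff\xd8"

# Usage (file name taken from the extraction command above):
#   looks_like_jpeg("example-edited.jpeg")
```

This only inspects the first two bytes, so it confirms the file type but says nothing about whether the image decodes cleanly.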

If you Google for Error Level Analysis, you’ll find a couple of websites that provide online image forensics. But that was not an option for me, because I could not share the document.

I found a C program for ELA, and later I wrote my own Python program (what else?), which I’ll use for this example:

image-forensics-ela.py example-edited.jpeg example-edited-ela.png

[Image: example-edited-ela.png]
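For readers who want to see the idea behind ELA, here is a minimal sketch using the Pillow library. It is not necessarily how image-forensics-ela.py works: the function names, the resave quality of 95 and the brightness scaling are all assumptions made for this illustration. The image is recompressed as JPEG at a known quality, the difference with the original is taken pixel by pixel, and the result is brightened so that small differences become visible.

```python
import io

def scale_factor(max_diff):
    """Brightness multiplier that stretches the largest difference to 255."""
    return 255.0 / max_diff if max_diff else 1.0

def ela(path_in, path_out, quality=95):
    """Write an error-level image for path_in to path_out (assumed quality)."""
    # Pillow is a third-party dependency (pip install Pillow); it is imported
    # here so that scale_factor() stays usable without it.
    from PIL import Image, ImageChops, ImageEnhance

    original = Image.open(path_in).convert("RGB")

    # Recompress the image in memory at a known JPEG quality.
    buffer = io.BytesIO()
    original.save(buffer, "JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer)

    # Per-pixel absolute difference between original and recompressed copy.
    diff = ImageChops.difference(original, resaved)

    # For an RGB image, getextrema() returns one (min, max) pair per band.
    max_diff = max(band_max for _, band_max in diff.getextrema())
    ImageEnhance.Brightness(diff).enhance(scale_factor(max_diff)).save(path_out)

# Usage (file names from the command above):
#   ela("example-edited.jpeg", "example-edited-ela.png")
```

Regions that were edited after the last JPEG save tend to recompress differently from the rest of the image, which is why they stand out in the difference image.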

The colored pixels reveal the word I edited. You can see it better when I overlay the 2 images:

image-overlay.py -a 0.6 example-edited.jpeg example-edited-ela.png example-edited-overlay.png

[Image: example-edited-overlay.png]
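An overlay like this can be sketched as a simple alpha blend with Pillow. This is an illustration under assumptions, not the code of image-overlay.py: in particular, which of the two images the 0.6 weight applies to in that tool is not stated here, and the function names below are made up. In Pillow's `Image.blend`, the alpha value is the weight of the second image.

```python
def blend_value(a, b, alpha):
    """Blend two channel values: alpha=0 gives a, alpha=1 gives b.

    This is the per-channel formula that Image.blend applies.
    """
    return round(a * (1.0 - alpha) + b * alpha)

def overlay(path_a, path_b, path_out, alpha=0.6):
    """Blend two images of equal size and save the result."""
    # Pillow is a third-party dependency (pip install Pillow).
    from PIL import Image

    first = Image.open(path_a).convert("RGB")
    second = Image.open(path_b).convert("RGB")

    # Image.blend requires both images to have the same size and mode.
    if first.size != second.size:
        raise ValueError("images must have the same dimensions")

    Image.blend(first, second, alpha).save(path_out)

# Usage (file names from the command above):
#   overlay("example-edited.jpeg", "example-edited-ela.png",
#           "example-edited-overlay.png", alpha=0.6)
```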

FYI: there is also a GIMP plugin for ELA.

You can download the examples and programs here:

blogpost-ela-files.zip (https)
MD5: 4F3071A9162C5CA8B7B10A41F662093A
SHA256: CBA786368D7BAF65E1E9F854C315BFB60FF89910429106513A0C41C180D8FCAB
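To check a downloaded copy against the hashes above, something like the following Python sketch could be used. It relies only on the standard hashlib module; the function names and the local file name are assumptions for this example.

```python
import hashlib

# Hashes published above for blogpost-ela-files.zip.
EXPECTED_MD5 = "4F3071A9162C5CA8B7B10A41F662093A"
EXPECTED_SHA256 = "CBA786368D7BAF65E1E9F854C315BFB60FF89910429106513A0C41C180D8FCAB"

def file_digests(path, chunk_size=65536):
    """Return (MD5, SHA256) of a file as uppercase hex strings."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest().upper(), sha256.hexdigest().upper()

def verify(path):
    """Return True if the file matches both published hashes."""
    return file_digests(path) == (EXPECTED_MD5, EXPECTED_SHA256)

# Usage (assumes the ZIP was saved in the current directory):
#   verify("blogpost-ela-files.zip")
```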

4 Comments »

  1. Thank you for your post.

    I’ve been playing with ELA and JPEG, and I’ve also been “playing” with your script. How do you know when something is forged? Sometimes it seems that there is only a change of colour in the document. ELA works OK in some cases, but in other cases it returns false positives. How can I deal with this?

    Thanks in advance,

    Alex.

    Comment by Alejandro — Tuesday 5 May 2015 @ 15:00

  2. I’ve only done ELA on black & white text. All text that was added or changed was indicated by color pixels. The other detections, the false positives, had gray pixels.

    Comment by Didier Stevens — Wednesday 6 May 2015 @ 4:29

  3. Could you explain how to run this Python program and analyze the ELA output? I am asking because I am not an expert in programming, but I want to analyze the image.
    Thank you in advance.

    Comment by ARJUN — Monday 28 December 2015 @ 14:10

  4. @ARJUN If the command line is a problem for you, do it online: there are a couple of websites that will analyze images.

    Comment by Didier Stevens — Monday 28 December 2015 @ 15:38
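The rule of thumb from comment 2 (color pixels mark added or changed text, gray pixels are false positives) could be automated roughly as follows. This is only a sketch for black & white text documents: the function names and the tolerance value are assumptions, not something taken from the programs in this post. A pixel counts as colored when its red, green and blue values differ by more than the tolerance.

```python
def is_colored(pixel, tolerance=16):
    """Return True if the RGB channels differ by more than the tolerance."""
    r, g, b = pixel
    return max(r, g, b) - min(r, g, b) > tolerance

def count_colored(pixels, tolerance=16):
    """Count the colored pixels in an iterable of (r, g, b) tuples."""
    return sum(1 for pixel in pixels if is_colored(pixel, tolerance))

# Usage with Pillow (third-party), on an ELA output image:
#   from PIL import Image
#   image = Image.open("example-edited-ela.png").convert("RGB")
#   count_colored(image.getdata())
```

The tolerance would need tuning per document, since JPEG compression can also introduce slight color casts in untouched areas.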

