Some time ago I had the chance to try out an image forensics method, Error Level Analysis (ELA), on a PDF. It was a fraudulent document (a form), but with a special characteristic: the criminal had converted the original form (a PDF) to JPEG, edited the JPEG with a raster graphics editor, and then inserted the edited JPEG into a new PDF document. This gave me the opportunity to try out ELA on a “text document”.
I can’t share the PDF, but I recreated a similar one to use in this blog post.
First I search for images in the PDF document:
pdf-parser.py -s image example-edited.pdf
obj 4 0
Referencing: 6 0 R
/Im4 6 0 R
obj 6 0
The image is in object 6. I extract the image:
pdf-parser.py -o 6 -d example-edited.jpeg example-edited.pdf
Here it is:
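To illustrate why the extracted file is directly usable as a JPEG: when an image is stored in a PDF with the /DCTDecode filter, the stream content is the JPEG file itself. The following is a crude sketch of that idea, not how pdf-parser.py works. It assumes /DCTDecode is the only filter on the stream and that the image is not hidden inside an object stream; the function name extract_jpegs is made up for this example.

```python
import re

# Assumption: the image dictionary mentions /DCTDecode and is followed
# directly by its stream. Filter chains and object streams are not handled.
DCT_STREAM = re.compile(rb"/DCTDecode.*?stream\r?\n(.*?)endstream", re.DOTALL)


def extract_jpegs(pdf_bytes):
    """Return the raw JPEG data of every plain /DCTDecode stream found."""
    jpegs = []
    for match in DCT_STREAM.finditer(pdf_bytes):
        data = match.group(1)
        # A JPEG starts with the SOI marker (FF D8) and ends with EOI (FF D9).
        end = data.rfind(b"\xff\xd9")
        if data.startswith(b"\xff\xd8") and end != -1:
            # Cut off the end-of-line bytes that precede "endstream".
            jpegs.append(data[:end + 2])
    return jpegs
```

For a real investigation I would still rely on pdf-parser.py, since it actually parses the PDF structure instead of pattern matching on raw bytes.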
If you Google for Error Level Analysis, you’ll find a couple of websites that provide online image forensics. That was not an option for me, because I could not share the document.
I found this C program for ELA, and later I wrote my own Python program (what else?) that I’ll use for this example:
image-forensics-ela.py example-edited.jpeg example-edited-ela.png
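To give an idea of what an ELA program does: it resaves the JPEG at a known quality and looks at the difference between the image and its resaved copy. Regions that were edited after the last save tend to show a different error level than the rest of the image. The following is a minimal sketch of that technique using the Pillow library, and not the code of image-forensics-ela.py. The function name and the default values for quality and scale are assumptions chosen for illustration.

```python
import io

from PIL import Image, ImageChops


def error_level_analysis(image, quality=90, scale=20):
    """Return an image showing the amplified JPEG recompression error."""
    original = image.convert("RGB")

    # Resave the image as JPEG in memory at a known quality (assumed value).
    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")

    # Per-pixel absolute difference between the image and its resaved copy.
    difference = ImageChops.difference(original, resaved)

    # The differences are small, so amplify them to make them visible.
    return difference.point(lambda value: min(255, value * scale))
```

With this sketch, the step above would look like error_level_analysis(Image.open("example-edited.jpeg")).save("example-edited-ela.png"). The result is saved as PNG so that no new JPEG artifacts are added to the ELA image.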
The colored pixels reveal the word I edited. You can see it better when I overlay the two images:
image-overlay.py -a 0.6 example-edited.jpeg example-edited-ela.png example-edited-overlay.png
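An overlay like this can be produced with simple alpha blending. Here is a minimal sketch with Pillow's Image.blend, not the code of image-overlay.py. It assumes that the alpha value is the weight given to the ELA image (0.0 shows only the original, 1.0 only the ELA image); whether the -a option of image-overlay.py has the same meaning is an assumption on my part, and the function name overlay is hypothetical.

```python
from PIL import Image


def overlay(base, ela, alpha=0.6):
    """Blend the ELA image over the base image with the given weight."""
    base_rgb = base.convert("RGB")
    # Image.blend requires both images to have the same mode and size.
    ela_rgb = ela.convert("RGB").resize(base_rgb.size)
    # Result = base * (1 - alpha) + ela * alpha
    return Image.blend(base_rgb, ela_rgb, alpha)
```

For example: overlay(Image.open("example-edited.jpeg"), Image.open("example-edited-ela.png")).save("example-edited-overlay.png").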
FYI: there is also a GIMP plugin for ELA.
You can download the examples and programs here: