Didier Stevens

Monday 27 October 2014

Update: PDFiD With Plugins Part 2

Filed under: My Software,PDF,Update — Didier Stevens @ 8:40

The second feature in this new version of PDFiD is selection. With this, you can select PDFs using criteria you provide.

Example:

pdfid.py -S "pdf.javascript.count > 0" *.pdf

This command will select all files with extension .pdf in the current directory that are PDFs and have a /JavaScript count larger than zero. The selection expression you provide is a Python expression. Here is a list of attributes to use in your selection expressions:

pdf.version
pdf.filename
pdf.errorOccured
pdf.errorMessage
pdf.isPDF
pdf.header

pdf.keywords[keywordname].count
pdf.keywords[keywordname].hexcode

pdf.keywords['/AA'].count
pdf.keywords['/Root'].count # if option -a and if /Root present in PDF

pdf.obj.count
pdf.obj.hexcode
pdf.endobj.count
pdf.endobj.hexcode
pdf.stream.count
pdf.stream.hexcode
pdf.endstream.count
pdf.endstream.hexcode
pdf.xref.count
pdf.xref.hexcode
pdf.trailer.count
pdf.trailer.hexcode
pdf.startxref.count
pdf.startxref.hexcode
pdf.page.count
pdf.page.hexcode
pdf.encrypt.count
pdf.encrypt.hexcode
pdf.objstm.count
pdf.objstm.hexcode
pdf.js.count
pdf.js.hexcode
pdf.javascript.count
pdf.javascript.hexcode
pdf.aa.count
pdf.aa.hexcode
pdf.openaction.count
pdf.openaction.hexcode
pdf.acroform.count
pdf.acroform.hexcode
pdf.jbig2decode.count
pdf.jbig2decode.hexcode
pdf.richmedia.count
pdf.richmedia.hexcode
pdf.launch.count
pdf.launch.hexcode
pdf.embeddedfile.count
pdf.embeddedfile.hexcode
pdf.xfa.count
pdf.xfa.hexcode
pdf.colors_gt_2_24.count
pdf.colors_gt_2_24.hexcode
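To illustrate how such an expression picks files, here is a minimal sketch in Python. It does not use PDFiD itself: make_pdf and select are hypothetical helpers, and the stand-in objects expose only a few of the attributes listed above, with made-up counts. The sketch assumes the expression is evaluated with the name pdf bound to one object per file.

```python
from types import SimpleNamespace


def make_pdf(filename, javascript=0, openaction=0):
    """Build a hypothetical stand-in exposing a few of the listed attributes."""
    return SimpleNamespace(
        filename=filename,
        isPDF=True,
        javascript=SimpleNamespace(count=javascript, hexcode=0),
        openaction=SimpleNamespace(count=openaction, hexcode=0),
    )


def select(expression, pdfs):
    """Return the filenames for which the selection expression is true."""
    # Each object is visible to the expression under the name 'pdf'.
    return [pdf.filename for pdf in pdfs if eval(expression, {}, {"pdf": pdf})]


# Made-up sample data, not real PDFiD output.
samples = [
    make_pdf("clean.pdf"),
    make_pdf("js.pdf", javascript=1),
    make_pdf("auto.pdf", javascript=2, openaction=1),
]

print(select("pdf.javascript.count > 0", samples))
# -> ['js.pdf', 'auto.pdf']
print(select("pdf.javascript.count > 0 and pdf.openaction.count > 0", samples))
# -> ['auto.pdf']
```

Because the expression is plain Python, conditions can be combined with and, or and not, as in the second call.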

Be careful if you are going to use this in an automated scenario where you don’t control the selection expression. This expression is evaluated in Python with the eval function, and there is no input validation.
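
If the selection expression comes from a source you don’t trust, one possible mitigation is to check it before it reaches eval. The sketch below is an assumption about how such a check could look, not something PDFiD provides: is_plain_selection is a hypothetical helper that parses the expression with Python’s ast module and accepts only comparisons, and/or/not, attribute access and subscripts on the name pdf, and constants. It assumes Python 3.9 or later, and it is a starting point rather than a vetted sandbox.

```python
import ast

# Node types a simple selection expression needs. Anything else, such as a
# function call, a lambda or a comprehension, is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Attribute, ast.Subscript, ast.Name, ast.Load, ast.Constant,
)


def is_plain_selection(expression):
    """Return True if the expression only uses the allowed constructs."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        # The only variable the expression may refer to is 'pdf'.
        if isinstance(node, ast.Name) and node.id != "pdf":
            return False
        # Block access to private and dunder attributes such as __class__.
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            return False
    return True


print(is_plain_selection("pdf.javascript.count > 0"))        # -> True
print(is_plain_selection("__import__('os').system('id')"))   # -> False
print(is_plain_selection("pdf.keywords['/AA'].count > 0"))
```

An expression that fails the check would simply be refused instead of being passed to eval.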

 

Monday 20 October 2014

Update: PDFiD With Plugins Part 1

Filed under: My Software,PDF,Update — Didier Stevens @ 8:51

Almost from the moment I released PDFiD, people have asked me for an anti-virus-like feature: that PDFiD would tell you whether a PDF is malicious or not. Some people even patched PDFiD with a scoring feature.

But I didn’t want to develop an “anti-virus” for PDFs; PDFiD is a triage tool.

Now you can develop your own scoring system with plugins.

Plugins are loaded with option -p, like this:

[Screenshot: 20141020-102902]

I provide 3 plugins: plugin_triage.py, plugin_nameobfuscation.py and plugin_embeddedfile.py. You can run more than one plugin by separating their names with a comma:

pdfid.py -p plugin_triage,plugin_embeddedfile js.pdf

Or you can use an @-file: a text file with the names of the plugins you want to run.
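
The two ways of naming plugins can be sketched in Python as follows. This is not PDFiD’s own code: expand_plugin_argument is a hypothetical helper, and the @-file layout of one plugin name per line is an assumption, since the text above only says the file contains the plugin names.

```python
import os
import tempfile


def expand_plugin_argument(argument):
    """Turn the value given to -p into a list of plugin names."""
    if argument.startswith("@"):
        # Assumed @-file layout: one plugin name per line, blank lines ignored.
        with open(argument[1:]) as handle:
            return [line.strip() for line in handle if line.strip()]
    # Otherwise the names are separated by commas.
    return [name.strip() for name in argument.split(",") if name.strip()]


print(expand_plugin_argument("plugin_triage,plugin_embeddedfile"))
# -> ['plugin_triage', 'plugin_embeddedfile']

# Demonstrate the @-file form with a temporary file (hypothetical name).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "plugins.txt")
    with open(path, "w") as handle:
        handle.write("plugin_triage\nplugin_nameobfuscation\n")
    print(expand_plugin_argument("@" + path))
    # -> ['plugin_triage', 'plugin_nameobfuscation']
```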

To output the result as a CSV file, use option -c, and to write the output to a file, use option -o. With option -m, you can provide a minimum score the plugin has to produce for its output to be displayed.

Plugins are Python classes; I’ll explain how to make your own in a later post.

plugin_triage.py produces a score of 1.0 when the PDF requires further analysis, and 0.0 if not.

plugin_nameobfuscation.py produces a score of 1.0 when name obfuscation is used in the PDF.

plugin_embeddedfile.py produces a score of 0.9 when an embedded file is present, and 1.0 when name obfuscation is also used.
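
The scoring rules above can be sketched as plain Python functions. This is not the real plugin interface: Keyword, nameobfuscation_score and embeddedfile_score are hypothetical names, and the sketch assumes that a hexcode value larger than zero means the keyword was seen in obfuscated form, and that plugin_embeddedfile.py scores 0.0 when no embedded file is present.

```python
from collections import namedtuple

# Hypothetical stand-in for a keyword entry with a count and a hexcode value;
# this is not PDFiD's real class.
Keyword = namedtuple("Keyword", ["count", "hexcode"])


def nameobfuscation_score(keywords):
    """1.0 when any keyword shows name obfuscation, else 0.0 (assumed)."""
    return 1.0 if any(k.hexcode > 0 for k in keywords.values()) else 0.0


def embeddedfile_score(keywords):
    """0.9 for an embedded file, 1.0 when its name is also obfuscated."""
    entry = keywords.get("/EmbeddedFile")
    if entry is None or entry.count == 0:
        return 0.0  # assumption: no embedded file means nothing to report
    return 1.0 if entry.hexcode > 0 else 0.9


# Made-up keyword tables, not real PDFiD output.
plain = {"/EmbeddedFile": Keyword(count=1, hexcode=0)}
obfuscated = {"/EmbeddedFile": Keyword(count=1, hexcode=1)}
absent = {"/JavaScript": Keyword(count=2, hexcode=0)}

for label, keywords in [("plain", plain), ("obfuscated", obfuscated), ("absent", absent)]:
    print(label, embeddedfile_score(keywords), nameobfuscation_score(keywords))
# plain 0.9 0.0
# obfuscated 1.0 1.0
# absent 0.0 0.0
```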

pdfid_v0_2_1.zip (https)
MD5: 7463412536678B321276F8720F52DE81
SHA256: F1B4728DD2CE455B863B930E12C6DEC952CB95C0BB3D6924136A6E49ACA877C2
