I’m giving a 2-day training on PDF at Brucon 2013. The early-bird price applies until June 15th.
Sometimes PDFiD will give you false positives for /JS and /AA. This happens with files of a couple of MB or bigger, because it’s statistically very likely that the sequences /AA or /JS (only three bytes long) appear somewhere inside a stream. And since PDFiD, unlike pdf-parser, has no notion of PDF objects and streams, it counts those occurrences too and can produce false positives, like this:
PDFiD 0.1.2 CCNPSecurityFIREWALL642617OfficialCertGuide.pdf
PDF Header: %PDF-1.6
obj 6018
endobj 6017
stream 1897
endstream 1897
xref 1
trailer 1
startxref 1
/Page 773
/Encrypt 1
/ObjStm 0
/JS 3
/JavaScript 0
/AA 1
/OpenAction 0
/AcroForm 0
/JBIG2Decode 0
/RichMedia 0
/Launch 0
/EmbeddedFile 0
/XFA 0
/Colors > 2^24 0
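To get a feel for why a three-byte keyword turns up by chance, here is a rough back-of-the-envelope estimate in Python. It assumes the stream data behaves like uniformly random bytes (compressed or encrypted data comes close) and it ignores how PDFiD actually tokenizes names, so treat it as an upper-bound sketch, not a description of PDFiD's logic. The function name expected_hits is made up for this illustration.

```python
def expected_hits(size_bytes: int, pattern_len: int = 3) -> float:
    """Expected number of times one fixed byte pattern occurs in
    size_bytes of uniformly random data (an assumption, see above)."""
    # Number of positions where the pattern could start.
    positions = max(size_bytes - pattern_len + 1, 0)
    # Each position matches with probability 1 / 256**pattern_len.
    return positions / 256 ** pattern_len


if __name__ == "__main__":
    for mib in (1, 4, 16, 64):
        size = mib * 1024 * 1024
        print(f"{mib:3d} MiB: about {expected_hits(size):.2f} expected hits per keyword")
```

The estimate grows linearly with file size and reaches roughly one expected hit per keyword at 16 MiB of random-looking data. PDFiD looks for several short keywords at once, so the chance that at least one of them shows a spurious count is higher than for a single keyword.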
And when you search for /AA or /JS with pdf-parser, you will not find objects that have /AA or /JS in their dictionary:
pdf-parser.py -s /AA CCNPSecurityFIREWALL642617OfficialCertGuide.pdf
Until now, I advised users who suspected false positives to search the PDF document with a hex editor and check whether /AA or /JS appeared inside a stream. But now that the latest version of pdf-parser supports searching inside streams, you can do it like this:
pdf-parser.py --searchstream /AA --unfiltered CCNPSecurityFIREWALL642617OfficialCertGuide.pdf
obj 1848 0
Type: /XObject
Referencing: 38 0 R
Contains stream
<<
/Length 121194
/Filter /DCTDecode
/Width 800
/Height 600
/BitsPerComponent 8
/ColorSpace 38 0 R
/Intent /RelativeColorimetric
/Type /XObject
/Subtype /Image
>>
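The idea behind this check, telling apart a keyword that sits in raw stream data from one that sits in an object dictionary, can be sketched in a few lines of Python. This is not how pdf-parser is implemented: it is a simplified illustration that finds streams with a regular expression instead of parsing objects and honoring /Length, so it can be fooled by stream data that happens to contain the word endstream. The names STREAM_RE and classify_hits are hypothetical.

```python
import re

# Simplified assumption: a stream starts after "stream" + EOL and runs
# until the next "endstream". A real parser would rely on /Length.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def classify_hits(data: bytes, keyword: bytes):
    """Return (inside, outside): offsets of keyword found inside raw
    (unfiltered) stream data and offsets found everywhere else."""
    spans = [m.span(1) for m in STREAM_RE.finditer(data)]
    inside, outside = [], []
    for m in re.finditer(re.escape(keyword), data):
        pos = m.start()
        if any(start <= pos < end for start, end in spans):
            inside.append(pos)
        else:
            outside.append(pos)
    return inside, outside


if __name__ == "__main__":
    # Tiny made-up sample: one /AA inside a stream, one in a dictionary.
    sample = (
        b"1 0 obj\n<< /Length 9 >>\nstream\nxx/AAyyz\nendstream\nendobj\n"
        b"2 0 obj\n<< /AA 3 0 R >>\nendobj\n"
    )
    inside, outside = classify_hits(sample, b"/AA")
    print("inside streams:", inside)
    print("outside streams:", outside)
```

If every hit for /AA or /JS lands inside a stream and none lands outside, the PDFiD count for that keyword is most likely a false positive.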