I’m giving a 2-day training on PDF at Brucon 2013. The early-bird price applies until June 15th.
Sometimes PDFiD will give you false positives for /JS and /AA. This happens with files of a couple of MB or bigger: the more stream data a file contains, the more likely it is that the byte sequence /AA or /JS (only three bytes long) appears inside a stream by chance. And since PDFiD, contrary to pdf-parser, has no notion of PDF objects and streams, it counts these accidental matches and reports false positives.
And when you search for /AA or /JS with pdf-parser, you will not find objects that have /AA or /JS in their dictionary:
pdf-parser.py -s /AA CCNPSecurityFIREWALL642617OfficialCertGuide.pdf
Until now, I advised users who suspected a false positive to open the PDF document in a hex editor and check whether /AA or /JS appeared inside a stream. With the latest version of pdf-parser, which supports searching inside streams, you can do it like this:
pdf-parser.py --searchstream /AA --unfiltered CCNPSecurityFIREWALL642617OfficialCertGuide.pdf
obj 1848 0
 Type: /XObject
 Referencing: 38 0 R
 Contains stream
 << /Length 121194 /Filter /DCTDecode /Width 800 /Height 600 /BitsPerComponent 8 /ColorSpace 38 0 R /Intent /RelativeColorimetric /Type /XObject /Subtype /Image >>
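For readers who want to see what the manual hex editor check amounts to, here is a rough Python sketch. It is not how PDFiD or pdf-parser work internally: the helper names stream_spans and classify_hits are made up for this illustration, and the stream detection is a naive regular expression rather than a real PDF parser, so it can be fooled by unusual files (for example, stream data that happens to contain the word endstream).

```python
import re

# Naive stream locator (an assumption of this sketch, not a real PDF parser):
# capture the raw bytes between a "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def stream_spans(data: bytes) -> list[tuple[int, int]]:
    """Return (start, end) byte offsets of the raw data of each stream."""
    return [m.span(1) for m in STREAM_RE.finditer(data)]


def classify_hits(data: bytes, keyword: bytes) -> list[tuple[int, bool]]:
    """Return (offset, inside_stream) for every raw occurrence of keyword."""
    spans = stream_spans(data)
    hits = []
    pos = data.find(keyword)
    while pos != -1:
        inside = any(start <= pos < end for start, end in spans)
        hits.append((pos, inside))
        pos = data.find(keyword, pos + 1)
    return hits


# Tiny synthetic object: the only /AA is part of the raw image bytes.
sample = (
    b"1 0 obj\n<< /Length 9 /Filter /DCTDecode >>\n"
    b"stream\n\xff\xd8/AA\x00\x01\x02\x03\nendstream\nendobj\n"
)

for offset, inside in classify_hits(sample, b"/AA"):
    print(offset, "inside stream" if inside else "outside stream")
```

If every hit is reported as inside a stream, as with the image object above, the /AA count is most likely a false positive. A hit outside any stream deserves a closer look with pdf-parser.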