Didier Stevens

Thursday 28 February 2019

Update: pdf-parser.py Version 0.7.0

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

This new version of pdf-parser brings support for analysis of stream objects (/ObjStm). Use new option -O to enable this mode.

Stream objects (/ObjStm) are objects that contain other objects: they have a stream, containing other objects. These contained objects can not have a stream.

pdfid.py detects the presence of stream objects:

But pdfid can not look inside a stream, to figure out what objects are inside. That’s why I always say to use pdf-parser to select and decompress stream objects, and then pipe this through pdfid:

When pdf-parser parses a stream object, it does not parse the content of its stream:

This changes with this new version of pdf-parser. When option -O is used, pdf-parser extracts objects from /ObjStm streams and handles them like normal objects. In the following example, object 2 is contained in object 1:

pdf-parser provides statistics for a PDF’s content with option -a:

Combining option -a with option -O includes objects present inside stream objects (this is an alternative for combining both tools: pdf-parser -s objstm -f a.pdf | pdfid -f):

This output shows that /JavaScript can be found in object 7. We need to use option -O to find object 7 “hiding” in object 1:

If we forget to use option -O, object 7 is not found:

Here is a video showing this new feature:

pdf-parser_V0_7_0.zip (https)
MD5: CDE355BB3FCACE3C4EDBC762E632F9AB
SHA256: 219FF0BB729C4478679A79163CA9942296ACF49E4EC06D128CBC53FBEE25FF05

3 Comments »

  1. […] Update: pdf-parser.py Version 0.7.0 […]

    Pingback by Title: Overview of Content Published in February | Didier Stevens — Saturday 2 March 2019 @ 0:00

  2. […] Update: pdf-parser.py Version 0.7.0  […]

    Pingback by Week 9 – 2019 – This Week In 4n6 — Monday 4 March 2019 @ 7:33

  3. […] pdf-parser.py version 0.7.0, I prefer another method: using option -O to let pdf-parser.py extract and parse the objects inside […]

    Pingback by Analyzing a Phishing PDF with /ObjStm | Didier Stevens — Thursday 7 March 2019 @ 0:00


RSS feed for comments on this post. TrackBack URI

Leave a Reply (comments are moderated)

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Blog at WordPress.com.