Update: pdf-parser.py Version 0.7.0

Thursday 28 February 2019

Update: pdf-parser.py Version 0.7.0

Filed under: My Software,PDF,Update — Didier Stevens @ 0:00

This new version of pdf-parser brings support for analysis of stream objects (/ObjStm). Use new option -O to enable this mode.

Stream objects (/ObjStm) are objects that contain other objects: they have a stream, containing other objects. These contained objects can not have a stream.

pdfid.py detects the presence of stream objects:

But pdfid can not look inside a stream, to figure out what objects are inside. That’s why I always say to use pdf-parser to select and decompress stream objects, and then pipe this through pdfid:

When pdf-parser parses a stream object, it does not parse the content of its stream:

This changes with this new version of pdf-parser. When option -O is used, pdf-parser extracts objects from /ObjStm streams and handles them like normal objects. In the following example, object 2 is contained in object 1:

pdf-parser provides statistics for a PDF’s content with option -a:

Combining option -a with option -O includes objects present inside stream objects (this is an alternative for combining both tools: pdf-parser -s objstm -f a.pdf | pdfid -f):

This output shows that /JavaScript can be found in object 7. We need to use option -O to find object 7 “hiding” in object 1:

If we forget to use option -O, object 7 is not found:

Here is a video showing this new feature:

pdf-parser_V0_7_0.zip (https)
MD5: CDE355BB3FCACE3C4EDBC762E632F9AB
SHA256: 219FF0BB729C4478679A79163CA9942296ACF49E4EC06D128CBC53FBEE25FF05

Comments (4)

4 Comments »

[…] Update: pdf-parser.py Version 0.7.0 […]

Pingback by Title: Overview of Content Published in February | Didier Stevens — Saturday 2 March 2019 @ 0:00
[…] Update: pdf-parser.py Version 0.7.0 […]

Pingback by Week 9 – 2019 – This Week In 4n6 — Monday 4 March 2019 @ 7:33
[…] pdf-parser.py version 0.7.0, I prefer another method: using option -O to let pdf-parser.py extract and parse the objects inside […]

Pingback by Analyzing a Phishing PDF with /ObjStm | Didier Stevens — Thursday 7 March 2019 @ 0:00
[…] There’s a new environment variable, PDFPARSER_OPTIONS, that can be used to provide extra options you want to include with each execution of pdf-parser.py. This is useful for option -O, an option to parse stream objects. […]

Pingback by Update Of My PDF Tools | Didier Stevens — Monday 30 September 2019 @ 19:16

RSS feed for comments on this post. TrackBack URI

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28

Didier Stevens

Thursday 28 February 2019