Didier Stevens

Tuesday 18 November 2008

My ISSA / OWASP Talk “Risky PDF”

Filed under: PDF — Didier Stevens @ 18:34

For those of you who attended my ISSA / OWASP talk Risky PDF, thanks for your interesting and challenging questions! I’m very pleased with the feedback I got.

You can download the presentation and demo files here. All my PDF blog posts can be found under the category PDF.

A recurring remark I received afterward concerns my claim not to be a PDF expert, while my presentation (and research) clearly show otherwise.

I didn’t express myself clearly. When I started my presentation by stating that I’m not a PDF expert, I meant that I don’t know how to produce a PDF document with a nice layout, a content table, an index, captivating graphics, … I don’t even know how to use Adobe Professional to create a PDF document with embedded JavaScript. So don’t ask me questions about producing “benign” PDF documents, because I don’t have a clue.

But I have built up expertise in malicious PDF documents. I’ve become an expert in analyzing PDF malware. I know how to create a PDF document with embedded JavaScript from scratch, just using a text editor (and I’ve built tools to automate this). And I can perform a forensic analysis of PDF documents.

My PDF expertise is limited to malicious usage and forensics. Outside of the IT security field, people with my expertise are not considered PDF experts. It wasn’t intended as false modesty; I just can’t help you troubleshoot “benign” PDFs 😉

Comments (1)

Monday 10 November 2008

Shoulder Surfing a Malicious PDF Author

Filed under: Forensics,Malware,PDF — Didier Stevens @ 21:32

Ever since I read about the incremental updates feature of the PDF file format, I’ve been patiently waiting for a malicious PDF document with incremental updates to come my way. Thanks to Bojan, that day has finally arrived.

The 2 malicious PDF documents I received (data.pdf and info.pdf) both exploit the same Acrobat JavaScript util.printf vulnerability.

data.pdf is very interesting to me: it’s one PDF file containing 5 incremental updates, essentially bringing us an archeological record of the malware author’s trial-and-error session. So let’s start uncovering what the malware writer has been up to.

Looking at the type of objects inside data.pdf (with my PDF parser), we can see many startxref and xref objects:

[Screenshot 20081110-202238: PDF parser output for data.pdf]
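Even without a full parser, a rough count of the keywords that close each revision hints at incremental updates. The following is a minimal sketch, not the PDF parser used above: the function names are hypothetical, and it scans raw bytes, so keywords that happen to appear inside stream data would be miscounted. Note also that linearized PDFs can legitimately contain more than one %%EOF marker.

```python
import re


def count_revision_markers(data: bytes) -> dict:
    """Count the keywords that normally close each PDF revision."""
    return {
        # A classic cross-reference table starts with "xref"; the
        # lookbehind keeps "startxref" from being counted here.
        "xref": len(re.findall(rb"(?<!start)xref\b", data)),
        "startxref": len(re.findall(rb"\bstartxref\b", data)),
        "%%EOF": data.count(b"%%EOF"),
    }


def looks_incrementally_updated(data: bytes) -> bool:
    """Heuristic: more than one %%EOF suggests appended revisions."""
    return data.count(b"%%EOF") > 1
```

To try it on a file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `count_revision_markers`.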

The metadata of data.pdf reveals that the guy (from personal experience, I know that most bad programmers are males 😉 ) used Adobe Acrobat 8.1.0 to create this document in the early hours of Thursday November 6th 2008, and that his machine has timezone setting +01:00.

It took 52 minutes 32 seconds to create the first version of data.pdf. This version contains everything to execute a JavaScript script upon opening of the document, but the script to be executed is empty.
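Durations like this can be derived from the date strings in the document metadata, which use the PDF date format (D:YYYYMMDDHHmmSS followed by a timezone offset such as +01'00'). Below is a minimal sketch of parsing two such strings and subtracting them. The helper name is hypothetical, the parser only accepts the full form with seconds, and treating a missing offset as UTC is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# Full-form PDF date: D:YYYYMMDDHHmmSS, optionally followed by
# Z or a signed offset written as +HH'mm'.
_PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:([+\-Z])(?:(\d{2})'?(\d{2})?'?)?)?"
)


def parse_pdf_date(text: str) -> datetime:
    """Parse a full-form PDF date string into an aware datetime."""
    m = _PDF_DATE.match(text)
    if m is None:
        raise ValueError("not a full-form PDF date: %r" % text)
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    sign, off_h, off_m = m.group(7), m.group(8), m.group(9)
    offset = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
    if sign == "-":
        offset = -offset
    # Assumption: no offset (or "Z") is treated as UTC.
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone(offset))


# Illustrative values only, NOT the timestamps found in data.pdf.
created = parse_pdf_date("D:20080101120000+01'00'")
modified = parse_pdf_date("D:20080101123015+01'00'")
elapsed = modified - created
```

Subtracting two aware datetimes yields a `timedelta`, so the time between two saved versions falls out directly.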

44 seconds later, a second version is created, containing this script:

[Screenshot 20081110-185852: the JavaScript added in the second version]

This script performs a heap spray (the most indented section of function main) of shellcode (contained in variable sccs) and then exploits the util.printf format string bug. This exploit is contained in function main, which should be triggered by app.setTimeOut after 3 seconds. However, the use of setTimeOut in this script is buggy (details can be found in Adobe’s JS API Reference), and main() will never execute.

After 44 seconds, another version is created to try to get this exploit to work. He modified the call to setTimeOut like this:

This is completely wrong, so after 4 minutes and 12 seconds (probably spent Googling for an answer as to why this doesn’t work), he returns to the previous call, but now hopes that 5 seconds will do better than 3 seconds.

Of course, it doesn’t. After one minute and a half, he gives up, and modifies the script to execute his exploit without delay:

[Screenshot 20081110-190045: the script modified to run the exploit without delay]

I can’t say he’s a sharp programmer or tenacious, but at least, he’s result-driven…

Let’s turn our attention to the second malicious PDF (info.pdf) I received. This file contains no incremental updates, but it’s still interesting because it has the same origin as data.pdf. This file was created at exactly the same time, and contains the same identification (/ID[<DD95D438BE408D4FB12AC2FE7ED5E6C6><14FA8F4917ED8449B59BF6CFA41C39BD>]) as data.pdf. Most PDF applications add a unique ID to the trailer of every PDF document they create. info.pdf was saved a day later (about 37 hours later), and contains the same exploit script as data.pdf, but with an extra layer of JavaScript obfuscation.
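Comparing the /ID entries of two files can be sketched in a few lines. This is a rough byte-level scan with hypothetical function names, not a real trailer parser: it only recognizes IDs written as two hex strings, and it compares the first element, which the PDF specification describes as the permanent identifier that survives later saves.

```python
import re

# Matches /ID [ <hex> <hex> ] with optional whitespace.
_ID_RE = re.compile(
    rb"/ID\s*\[\s*<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>\s*\]"
)


def extract_ids(data: bytes) -> list:
    """Return every (permanent, changing) ID pair found in the raw bytes."""
    return [
        (first.decode("ascii").upper(), second.decode("ascii").upper())
        for first, second in _ID_RE.findall(data)
    ]


def share_permanent_id(data_a: bytes, data_b: bytes) -> bool:
    """True when both files carry at least one identical first ID element."""
    ids_a = {first for first, _ in extract_ids(data_a)}
    ids_b = {first for first, _ in extract_ids(data_b)}
    return bool(ids_a & ids_b)
```

An incrementally updated file has one trailer per revision, so `extract_ids` may return several pairs for a single document.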

Bojan confirmed he was the first to submit these files to Virustotal. I calculated the MD5 hashes for the different versions of data.pdf, but none were submitted to VT, so our guy didn’t use VT for QA.
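Because each incremental update is appended to the file, every earlier version is simply a prefix of the final file that ends at one of the %%EOF markers. A minimal sketch of hashing those prefixes follows. The function name is hypothetical, and where exactly each saved version ended (with or without the end-of-line after %%EOF) is an assumption that changes the hash, so treat the results as candidates.

```python
import hashlib


def revision_hashes(data: bytes) -> list:
    """Return the MD5 (upper-case hex) of each prefix ending at a %%EOF."""
    marker = b"%%EOF"
    hashes = []
    pos = 0
    while True:
        idx = data.find(marker, pos)
        if idx == -1:
            break
        end = idx + len(marker)
        # Assumption: the version included one end-of-line after %%EOF.
        if data[end:end + 2] == b"\r\n":
            end += 2
        elif data[end:end + 1] in (b"\r", b"\n"):
            end += 1
        hashes.append(hashlib.md5(data[:end]).hexdigest().upper())
        pos = end
    return hashes
```

The last entry should equal the MD5 of the whole file when nothing follows the final %%EOF line.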

It was an interesting experience, “spying” on this malware author. Let’s hope they don’t stop using incremental updates, and that some of them will be careless enough to leave personal data hidden in their malicious PDF documents.

data.pdf MD5 1A8E5242F21727959683FA8CC7AA94AD

info.pdf MD5 23F31C83EE658BB5C2635BEFDE56199A

Comments (29)

Sunday 9 November 2008

Creating PDF Test-Files

Filed under: My Software,PDF — Didier Stevens @ 12:56

As promised, I’m releasing a couple of my PDF tools as a warm-up to my ISSA Belgium and OWASP Belgium talk.

After having manually created some PDF test-files (just using a text editor), I stepped up to the next level and wrote a quick-and-dirty Python module to generate PDF documents by assembling fundamental PDF elements.

My mPDF.py module contains a class with methods to create headers, indirect objects, stream objects, trailers and XREFs. One of the programs I wrote based on this module is make-pdf-javascript.py. This Python program allows me to create a simple PDF document with embedded JavaScript that will execute upon opening of the PDF document. Program details and download here.
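To give an idea of what assembling a PDF from fundamental elements involves, here is a minimal sketch. It is not the mPDF.py API: every function name below is hypothetical. It writes a header, four indirect objects, a cross-reference table and a trailer, and attaches a harmless JavaScript alert as the document's open action.

```python
def pdf_string(text: str) -> bytes:
    """Encode text as a PDF literal string, escaping backslashes and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return b"(" + escaped.encode("latin-1") + b")"


def build_pdf(objects: list) -> bytes:
    """Assemble header, indirect objects, xref table and trailer.

    objects[i] becomes indirect object i+1; object 1 must be the catalog.
    """
    out = bytearray(b"%PDF-1.1\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset recorded for the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # mandatory free entry for object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def make_js_pdf(script: str) -> bytes:
    """One empty page plus a JavaScript action that runs when the file opens."""
    return build_pdf([
        b"<< /Type /Catalog /Pages 2 0 R /OpenAction 4 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
        b"<< /Type /Action /S /JavaScript /JS " + pdf_string(script) + b" >>",
    ])


pdf_bytes = make_js_pdf("app.alert('Hello from a test file');")
```

Writing `pdf_bytes` to disk in binary mode gives a test file; whether the alert actually appears depends on the reader and its JavaScript settings.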

An example: to create a PDF document exploiting the util.printf Adobe Reader vulnerability in its simplest form (e.g. no shellcode and no heap spray), issue the following command:

Here it crashes Adobe Reader 8.1.2 on Windows XP SP2:

[Screenshot 20081109-130302: Adobe Reader 8.1.2 crashing on Windows XP SP2]

Comments (1)

Picture Puzzle

Filed under: Puzzle — Didier Stevens @ 7:41

As I announced via Twitter, here’s a new puzzle. Find the message I’ve hidden in this picture.

First one to post a comment with the correct answer can get a sticker. For those who don’t know, comments are moderated.

Comments (10)

Monday 3 November 2008

Quickpost: Remember FireOx?

Filed under: Hacking,Quickpost — Didier Stevens @ 17:05

Remember FireOx?

This time, I tested my Excel scripts on a CommNet machine, here at TechEd Barcelona. Worked without problem.

Comments (1)

Saturday 1 November 2008

Quickpost: “An Old IE Trick” Revisited

Filed under: Malware,Quickpost — Didier Stevens @ 22:30

One year ago I blogged about an old IE trick still being used by malware. What can be said now that I resubmitted my test files to Virustotal (VT)? Not much, because VT is not an anti-virus test tool (it’s a virus test tool).

More AV products detect my test files now, and test files with longer zero-byte sequences that weren’t detected a year ago are getting detected now. So I’m not really going out on a limb here when I say that the detection has improved. But there’s no way to quantify this improvement with VT results alone.

My test file with 255 contiguous zero bytes, which wasn’t detected by VT one year ago, is being detected by 6 AV products now. But it must be clear that I can’t conclude from this that only 6 AV products have been improved in the past year.

First of all, we can’t know whether every AV product that has been improved in the past year has also been upgraded on the VT site. It’s very likely that some new engines have not been installed on VT yet.

Second, this improvement might not show up on VT. VT uses command-line scanners, and many AV protection features are not present in the command-line versions.

Third, the improved detection could just be the result of new signatures for the very same test files I submitted. Just out of curiosity, I created a new file with 543 contiguous zero bytes. It gets detected by some AV products.

If you’re interested in the detailed detections, here are the links to the VT results:

Quickpost info

Comments (3)

Quickpost: Fingerprinting PDF Files

Filed under: Malware,PDF,Quickpost — Didier Stevens @ 11:57

Per request, a more detailed post on how I use my pdf-parser stats option.

I have two malicious PDF files with a different title, different size (100K and 700K) and different content. But they share an identical internal PDF structure, because they have exactly the same number and type of fundamental elements:

These statistics were generated with the following command:

pdf-parser.py --stats malware.pdf

As both malicious PDF files produce identical stats (or fingerprint), I can assume they share the same origin.
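The idea of a structural fingerprint can be sketched independently of pdf-parser. The code below is an approximation with hypothetical names, not how the --stats option is implemented: it counts structural keywords and /Type names in the raw bytes, so it does not look inside compressed streams and can be fooled by keywords that occur in stream data.

```python
import re
from collections import Counter


def fingerprint(data: bytes) -> Counter:
    """Count structural keywords and /Type names in raw PDF bytes."""
    counts = Counter()
    counts["obj"] = len(re.findall(rb"\d+\s+\d+\s+obj\b", data))
    counts["endobj"] = len(re.findall(rb"\bendobj\b", data))
    # Lookbehinds keep "endstream" and "startxref" out of these counts.
    counts["stream"] = len(re.findall(rb"(?<!end)stream\b", data))
    counts["xref"] = len(re.findall(rb"(?<!start)xref\b", data))
    counts["startxref"] = len(re.findall(rb"\bstartxref\b", data))
    counts["trailer"] = len(re.findall(rb"\btrailer\b", data))
    for name in re.findall(rb"/Type\s*/(\w+)", data):
        counts["/" + name.decode("ascii")] += 1
    return counts


def same_structure(data_a: bytes, data_b: bytes) -> bool:
    """True when both files yield identical element counts."""
    return fingerprint(data_a) == fingerprint(data_b)
```

Two files with different titles, sizes and stream contents still compare equal here as long as the number and type of their elements match.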

Quickpost info

Thursday 30 October 2008

pdf-parser.py

Filed under: My Software,PDF — Didier Stevens @ 17:19

I’m publishing my pdf-parser tool featured in my last video. Details and download here.

Comments (3)

Thursday 23 October 2008

Excel Exercises in Style

Filed under: Hacking — Didier Stevens @ 10:34

I developed another variant of my “Excel macro injects embedded DLL” script.

Instead of creating and loading a temporary DLL from VBScript, I inject and execute shellcode directly from the VBA application.

Some HIPS would prevent my previous script from running, because it loaded an unapproved DLL. But my new version doesn’t load a DLL.

Of course, writing shellcode is more difficult than developing a PE executable.

Comments (10)

Tuesday 21 October 2008

The Case of the Corrupted Stream Object

Filed under: Malware,PDF,Reverse Engineering — Didier Stevens @ 21:38

A malicious PDF file I analyzed a couple of months ago (the one featured in this video) had a corrupted stream object. It uses a /FlateDecode filter, but I could not find a way to decompress it with the zlib library. Back then, I wrote it off as an error of the malware author.
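The kind of check that exposes such a stream can be sketched with Python's zlib module. This is a minimal illustration with a hypothetical helper name. It assumes the bytes between the stream and endstream keywords have already been extracted, and it only tries the standard zlib-wrapped format that /FlateDecode data normally uses.

```python
import zlib


def try_inflate(raw: bytes):
    """Return the decompressed bytes, or None if raw is not valid zlib data."""
    try:
        return zlib.decompress(raw)
    except zlib.error:
        # Not a valid zlib stream: corrupted, truncated, or never
        # compressed in the first place.
        return None
```

A `None` result on a stream declared as /FlateDecode is a reason to look at the raw bytes directly instead of discarding the object.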

Lately, I’ve been analyzing some shellcode, and while looking at the shellcode in said malicious PDF, I saw it! The second-stage shellcode, an egg-hunt shellcode, is searching through process memory for the 8 bytes at the beginning of the corrupted stream object.

The malware author knows that the PDF reader loads the PDF document in memory, so he just overwrote the stream object with his third-stage shellcode. This way, his third-stage shellcode is already in memory, waiting to be found by his second-stage shellcode. And the size of his third-stage shellcode is not limited by the buffer he is overflowing.
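The search performed by the second stage can be modelled harmlessly in Python. This is only a simulation under simplifying assumptions: "memory" is a plain bytes object rather than a process address space, the names are hypothetical, and the marker values in any test data are made up.

```python
def hunt_egg(memory: bytes, egg: bytes) -> int:
    """Return the offset just past the 8-byte egg, or -1 if it is absent."""
    if len(egg) != 8:
        raise ValueError("this sketch expects an 8-byte egg")
    idx = memory.find(egg)
    if idx == -1:
        return -1
    # In the scenario described above, the bytes from the egg onward are
    # the next stage; here we only report where the data after it begins.
    return idx + len(egg)
```

The model shows why the payload size is not constrained by the overflowed buffer: the hunter only needs the marker, and whatever follows it can be arbitrarily long.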
