Didier Stevens

Thursday 14 May 2009

Malformed PDF Documents

Filed under: Malware,My Software,PDF — Didier Stevens @ 7:55

For the sake of this post, I consider a PDF document malformed when it doesn’t observe the basic structure of a PDF document.

I’ve seen a couple of malicious, malformed PDF documents. The most recent was a malicious swine flu PDF document that contains another, bening, PDF document with information about the swine flu (obtained from the CDC site). This second PDF document is displayed to mislead the user while the exploit runs.

20090513-211945

This second PDF document is XOR-encoded and appended to the end of the malicious PDF document, making the malicious PDF document malformed (FYI: the PDF file format supports embedded files, but this wasn’t used here). A PDF reader like Adobe or Foxit has no problems opening this malformed PDF, because it scans a PDF document for the trailer (%%EOF) starting from the end of the document. Everything that follows this trailer and doesn’t adhere to the PDF syntax is just ignored.

20090513-213940

I’ve added some new features to my PDF tools to handle malformed PDF documents.

PDFiD

The new version of PDFiD has an –extra option. Like it names imply, use it to add extra analysis data to the PDFiD report. The extra option adds entropy calculations to the report:

20090513-220050

For a normal PDF file, expect the total entropy and the entropy of bytes inside stream objects to be close to the maximum value 8.0. This means that the distribution of byte values is close to random, which is characteristic of compressed and encrypted data.

Outside streams objects, the data appears much less random, and the entropy is much lower, usually around 4.0 or 5.0.

However, for malformed PDF documents, where data is added without using stream objects, the entropy outside stream objects is much higher. Here is the report for the malicious swine flu PDF:

20090513-203729

Another datum added to the report by using the –extra option is for the end-of-file marker %%EOF.

The “%%EOF” line mentions the number of times %%EOF appears in the document (more than once usually indicates incremental updates). “After last %%EOF” counts the number of bytes after the last %%EOF. This value will be not be zero when data has been appended.

pdf-parser

The previous versions of pdf-parser output a lot of “todo 10″ data (an indication of malformed PDF data) when they parse a malformed PDF document. I’ve suppresed this behavior, you’ll need to use option –verbose to enable it from now on, should you need it. Since I first use PDFiD to check a PDF document before using pdf-parser, I don’t consider the “todo” output relevant anymore, as PDFiDs entropy and %%EOF report will tell me if a PDF document is malformed.

20090513-223049

But the other new option in pdf-parser, –extract, is more important. Example:

pdf-parser.py –extract payload.bin malformed.pdf

This option will extract all malformed data from malformed.pdf and write it to file payload.bin, giving you easy access to the embedded payload.

Samples

You can download a normal and malformed Hello World PDF file here to get familiarized with my updated tools. 4096 random bytes have been appended to the end of the PDF document to make it malformed.

Here is a last example when the entropy calculation can be handy even if the payload is stored inside a stream object:

20090513-203522

The reason the total entropy and entropy of bytes inside stream objects is very low here, is that this malicious PDF document has a payload with a very long, uncompressed NOP-sled (more than one million times 0×90).

Tuesday 21 April 2009

PDFiD On VirusTotal

Filed under: Malware,My Software,PDF — Didier Stevens @ 16:59

I know my posts here are rather emotionless, and that’s how I prefer them for this blog.

But this time, I’m very proud and I’m not hiding it: my PDFiD tool is now running on VirusTotal!

Thanks for your work Julio!

PDFiD will give you statistics of some very basic elements of the PDF language. This helps you decide if a PDF could be malicious or not.

pdfid-virustotal

Tuesday 31 March 2009

PDFiD

Filed under: Malware,My Software,PDF — Didier Stevens @ 7:08

I’ve developed a new tool to triage PDF documents, PDFiD. It helps you differentiate between PDF documents that could be malicious and those that are most likely not. I’ve kept the design very simple (it’s not a parser, but a string scanner) to be fast and to avoid exploitable bugs.

VirusTotal will included it if Julio Canto is satisfied with the tests :-) .

20090330-214223

Monday 9 March 2009

Quickpost: /JBIG2Decode “Look Mommy, No Hands!”

Filed under: Malware,PDF,Quickpost,Vulnerabilities — Didier Stevens @ 7:11

My previous blogpost showed how minimal user interaction can still get a malicious PDF document to infect a machine. Remembering F-Secure’s misadventure with .WMF and Google Desktop Search, I took some time to look at Windows Indexing Service. The news is not good. This time, I can get a PoC PDF document to trigger the /JBIG2Decode bug without any user interaction whatsoever. And the bug happens in a process running with Local System rights!

On a Windows XP SP2 machine with Windows Indexing Services started and Adobe Acrobat Reader 9.0 installed, there is absolutely no user interaction required to trigger the /JBIG2Decode vulnerability. When the PoC PDF file is on the disk, it will be indexed by Windows Indexing Services and the buggy /JBIG2Decode code will be executed.

When Adobe Acrobat Reader 9.0 is installed, it also installs an IFilter (AcroRdIF.dll). This COM object extends the Windows Indexing Service with the capability to read and index PDF documents. When the Windows Indexing Service encounters a PDF file, it will index it. The content indexing daemon (cidaemon.exe) calls the Acrobat IFilter (AcroRdIF.dll) which loads the Acrobat PDF parser (AcroRD32.dll). If the PDF document contains a malformed /JBIG2Decode stream object, it will result in an access violation in the instruction at 0x01A7D89A.

In other words, if you’ve a malicious PDF document on a machine with Windows Indexing Services, it can infect your machine. And you don’t need a user to open or select the PDF document.

The good news is that Windows Indexing Services is not started on a default Windows XP SP2 install. Update: Although Windows Indexing Services is not on by default, after you’ve executed a search as local admin, you’ll be asked if you want “to make future searches faster”. If you answer yes, Windows Indexing Services will be automatically started.
The bad news is that Windows Indexing Services runs under the local system account on Windows XP SP2. This results in a privilege escalation.

Consider a Windows machine with Windows Indexing Services running, Adobe Acrobat reader installed and a file sharing service (FTP/IIS/P2P/…). Uploading a specially crafted PDF document to this machine will give you a local system shell.

To disable Windows Indexing Services’ capability to index PDF documents, unregister the IFilter: regsvr32 /u AcroRdIf.dll

But IFilters are also used by other software:

  • Microsoft Search Server 2008
  • Windows Desktop Search
  • SharePoint
  • SQL Server (full-text search)

My PoC PDF file also triggers in /JBIG2Decode in Windows Desktop Search (I tested version 4.0). But Windows Desktop Search has a better security architecture than Windows Indexing Service. Although the service runs under the Local System account, the actual calling of the IFilters is done in a separate process that runs under the Local Service account (this account has less privileges and can’t take full control of the machine).

wds

I’ve not analyzed other applications using IFilters. If you use ScarePoint (that’s how my wife, who has to work with it, calls it) or another IFilter supporting application and you want to be safe, unregister the Acrobat IFilter.

And don’t forget that, depending onĀ  your Windows version and CPU, you’re also protected by technologies like DEP and ASLR.

Google Desktop Search doesn’t use IFilters, unless you’ve installed this plugin to add IFilter support to Google Desktop Search.

Wednesday 4 March 2009

Quickpost: /JBIG2Decode Trigger Trio

Filed under: Malware,PDF,Quickpost,Vulnerabilities — Didier Stevens @ 14:35

Sometimes a piece of malware can execute without even opening the file. As this is the case with the /JBIG2Decode vulnerability in PDF documents, I took the time to produce a short video showing 3 ways the vulnerability can trigger without even opening the PDF document.

The first 2 demos use a “classic” /JBIG2Decode PDF exploit, the third demo uses a new PoC /JBIG2Decode PDF exploit I developed. This PDF document has a malformed /JBIG2Decode stream object in the metadata instead of the page. All PDF documents used have just a malformed /JBIG2Decode stream object, they don’t include a payload (shellcode), neither a JavaScript heap spray.

So how is it possible to exploit this vulnerability in a PDF document without having the user open this document? The answer lies in Windows Explorer Shell Extensions. Have you noticed that when you install a program like WinZip, an entry is added to the right-click menu to help you compress and extract files? This is done with a special program (a shell extension) installed by the WinZIp setup program.

When you install Adobe Acrobat Reader, a Column Handler Shell Extension is installed. A column handler is a special program (a COM object) that will provide Windows Explorer with additional data to display (in extra columns) for the file types the column handler supports. The PDF column handler adds a few extra columns, like the Title. When a PDF document is listed in a Windows Explorer windows, the PDF column handler shell extension will be called by Windows Explorer when it needs the additional column info. The PDF column handler will read the PDF document to extract the necessary info, like the Title, Author, …

This explains how the PDF vulnerability can be exploited without you opening the PDF document. Under the right circumstances, a Windows Explorer Shell Extension will read the PDF document to provide extra information, and in doing so, it will execute the buggy code and trigger the vulnerability. Just like it would when you would explicitly open the document. In fact, we could say that the document is opened implictly, because of your actions with Windows Explorer.

So let me demo 3 circumstances under which a PDF Shell Extension will act and thereby trigger the vulnerability. One important detail before I do this: when the exception occurs in the Adobe Acrobat code, it is trapped by Windows Explorer without any alert. That’s why in the demos, I attached a debugger (ODBG) to Windows Explorer to intercept and visualize this exception. So each time the vulnerability triggers, the view switches to the debugger to display the exception.

In the first demo, I just select the PDF document with one click. This is enough to exploit the vulnerability, because the PDF document is implicitly read to gather extra information.

In the second demo, I change the view to Thumbnails view. In a thumbnail view, the first page of a PDF document is rendered to be displayed in a thumbnail. Rendering the first page implies reading the PDF document, and hence triggering the vulnerability.

In the third demo, I use my special PDF document with the malformed stream object in the metadata. When I hover with the mouse cursor over the document (I don’t click), a tooltip will appear with the file properties and metadata. But with my specially crafted PDF document, the vulnerability is triggered because the metadata is read to display the tooltip…

So be very careful when you handle malicious files. You could execute it inadvertently, even without double-clicking the file. That’s why I always change the extension of malware (trojan.exe becomes trojan.exe.virus) and handle them in an isolated virus lab. Outside of that lab, I encrypt the malware.

YouTube, Vimeo and hires Xvid.


Quickpost info


Tuesday 9 December 2008

Updates: bpmtk and Hakin9; PDF and Metasploit

Filed under: Announcement,Hacking,Malware,My Software,PDF,Update — Didier Stevens @ 21:23

Hakin9 has published my bpmtk article. The article mentions bpmtk version 0.1.4.0; however, this new version has no new features. But it comes with extra PoC code, like a LUA-mode keylogger and “rootkit”. New blogposts will explain this new PoC code.

bpmtk12

And upcoming bpmtk version 0.1.5.0 contains a new feature to inject shellcode. Just have to update the documentation.

On the PDF front: I’ve produced my first Ruby code ;-) . I worked together with MC from Metasploit to optimize the PDF generation code in this util.printf exploit module. It uses some obfuscation techniques I described 8 months ago.

Wednesday 26 November 2008

Update: Restoring Safe Mode with a .REG file, and a Live CD

Filed under: Malware,Update — Didier Stevens @ 19:39

As more malware seems to delete the SafeBoot keys nowadays, and even prevents you from restoring these keys, I’m posting this “Enhanced Fix Safe Mode” procedure. In essence, it’s the same as my first procedure, but to avoid interference by the malware, we will boot from a Live CD and then fix the registry. Booting from a Live CD means that we boot a clean OS from the CD, and thus prevent the malware from running and interfering with our rescue operation. In a nutshell: boot from a Live CD, load the HKLM registry hive and merge the missing SafeBoot keys.

Notice that the configuration of the machine you’re fixing might be different from the one I’m describing. The system directory could be on another drive than C, you could need to fix ControlSet002 in stead of ControlSet001, …
So watch out, and update this procedure according to the configuration of the crippled machine.

And since you’re going to modify a critical system file, make a backup first (at least of the CONFIG directory).

Copy the respective reg file to your C:\ drive (for example SafeBoot-for-Windows-XP-SP2.reg for XP SP2).
Shutdown the PC and start from a Windows Live CD, like the Ultimate Boot CD For Windows.

Start RegEdit:

safeboot-0000

Select HKEY_LOCAL_MACHINE, and load the hive file C:\WINDOWS\system32\config\system (File / Load Hive…):

safeboot-0003

Name the loaded hive FixSafeboot:

safeboot-0004

Open the key HKLM\FixSafeboot\ControlSet### which is lacking the Safeboot key (there could be more than one ControlSet key you want to fix):

safeboot-0005

safeboot-0006

If the SafeBoot key is not missing (or the keys beneath it), you’re either looking in the wrong place or you’re not dealing with a corrupted SafeBoot key (in which case applying this procedure is useless).

If you’re not sure which ControlSet### to fix, take a peek at the value of Current in the Select key:

safeboot-0016

Here the value for Current is 1, so it’s ControlSet001 which will be used when the system boots, and that’s the one we want to fix.

Open C:\SafeBoot-for-Windows-XP-SP2.reg (the one you copied on the C:\ drive) with notepad:

safeboot-0007

safeboot-0008

Perform a search and replace: replace SYSTEM\CurrentControlSet with FixSafeboot\ControlSet### (### being the number of the ControlSet you want to fix, like 001). Save the modified reg file:

safeboot-0009

safeboot-0010

Import the reg file C:\SafeBoot-for-Windows-XP-SP2.reg with regedit (File / Import…):

safeboot-0011

safeboot-0012

Check that the SafeBoot key has been added:

safeboot-0013

Select the FixSafeboot key and unload it (File / Unload Hive…):

safeboot-0014

safeboot-0015

Shutdown the PC and start in Safe Mode (F8).

If you still can’t boot into Safe Mode, you’re either facing another problem than a Safe Mode disabling malware, or the malware operates early in the boot process and interferes with Safe Mode booting. If you suspect malware, try scanning with a Live CD with an anti-virus scanner, like the F-Secure Rescue CD.

Monday 10 November 2008

Shoulder Surfing a Malicious PDF Author

Filed under: Forensics,Malware,PDF — Didier Stevens @ 21:32

Ever since I read about the incremental updates feature of the PDF file format, I’ve been patiently waiting for a malicious PDF document with incremental updates to come my way. Thanks to Bojan, that day has finally arrived.

The 2 malicious PDF documents I received (data.pdf and info.pdf) both exploit the same Acrobat JavaScript util.printf vulnerability.

data.pdf is very interesting to me: it’s one PDF file containing 5 incremental updates, essentially bringing us an archeological record of the malware author’s trial-and-error session. So let’s start uncovering what the malware writer has been up to.

Looking at the type of objects inside data.pdf (with my PDF parser), we can see many startxref and xref objects:

20081110-202238

The metadata of data.pdf reveals that the guy (from personal experience, I know that most bad programmers are males ;-) ) used Adobe Acrobat 8.1.0 to create this document in the early hours of Thursday November 6th 2008, and that his machine has timezone setting +01:00.

It took 52 minutes 32 seconds to create the first version of data.pdf. This version contains everything to execute a JavaScript script upon opening of the document, but the script to be executed is empty.

44 seconds later, a second version is created, containing this script:

20081110-185852

This script performs a heap spray (the most indented section of function main) of shellcode (contained in variable sccs) and then exploits the util.printf format string bug. This exploit is contained in function main, which should be triggered by app.setTimeOut after 3 seconds. However, the use of setTimeOut in this script is buggy (details can be found in Adobe’s JS API Reference), and main() will never execute.

After 44 seconds, another version is created to try to get this exploit to work. He modified the call to setTimeOut like this:

20081110-185933

This is completely wrong, so after 4 minutes and 12 seconds (probably spend Googling for an answer as to why this doesn’t work), he returns to the previous call, but now hopes that 5 seconds will do better than 3 seconds.

20081110-190004

Of course, it doesn’t. After one minute and a half, he gives up, and modifies the script to execute his exploit without delay:

20081110-190045

I can’t say he’s a sharp programmer or tenacious, but at least, he’s result-driven…

Let’s turn our attention to the second malicious PDF (info.pdf) I received. This file contains no incremental updates, but it’s still interesting because it has the same origin as data.pdf. This file was created at exactly the same time, and contains the same identification (/ID[<DD95D438BE408D4FB12AC2FE7ED5E6C6><14FA8F4917ED8449B59BF6CFA41C39BD>]) as data.pdf. Most PDF applications add a unique ID to the trailer of every PDF document they create. info.pdf was saved a day later (about 37 hours later), and contains the same exploit script as data.pdf, but with an extra layer of JavaScript obfuscation.

Bojan confirmed he was the first to submit these files to Virustotal. I calculated the MD5 hashes for the different versions of data.pdf, but none were submitted to VT, so our guy didn’t use VT for QA.

It was an interesting experience, “spying” on this malware author. Let’s hope they don’t stop using incremental updates, and that some of them will be careless enough to leave personal data hidden in their malicious PDF documents.

data.pdf MD5 1A8E5242F21727959683FA8CC7AA94AD

info.pdf MD5 23F31C83EE658BB5C2635BEFDE56199A

Saturday 1 November 2008

Quickpost: “An Old IE Trick” Revisited

Filed under: Malware,Quickpost — Didier Stevens @ 22:30

One year ago I blogged about an old IE trick still being used by malware. What can be said now that I resubmitted my test files to Virustotal (VT)? Not much, because VT is not an anti-virus test tool (it’s a virus test tool).

More AV products detect my test files now; and test files with longer zero byte sequences, that weren’t detected a year ago, are getting detected now. So I’m not really going out on a limb here when I say that the detection has improved. But there’s no way to quantify this improvement with VT results alone.

My test file with 255 contiguous zero bytes, which wasn’t detected by VT one year ago, is being detected by 6 AV products now. But it must be clear that I can’t conclude from this that only 6 AV products have been improved in the past year.

First of all, we can’t know if all AV products that have been improved in the past year, have been upgraded on the VT site. It’s very likely that some new engines have not been installed on VT yet.

Second, this improvement might not come to expression on VT. VT uses command-line scanners, and many AV protection features are not present in the command-line versions.

Third, the improved detection could just be the result of new signatures for the very same test files I submitted. Just out of curiosity, I created a new file with 543 contiguous zero bytes. It gets detected by some AV products.

If you’re interested in the detailed detections, here are the links to the VT results:


Quickpost info


Quickpost: Fingerprinting PDF Files

Filed under: Malware,PDF,Quickpost — Didier Stevens @ 11:57

Per request, a more detailed post on how I use my pdf-parser stats option.

I have two malicious PDF files with a different title, different size (100K and 700K) and different content. But they share an identical internal PDF structure, because they have exactly the same number and type of fundamental elements:

These statistics were generated with the following command:

pdf-parser.py --stats malware.pdf

As both malicious PDF files produce identical stats (or fingerprint), I can assume they share the same origin.


Quickpost info


« Previous PageNext Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 93 other followers