Didier Stevens

Wednesday 1 July 2009

Embedding and Hiding Files in PDF Documents

Filed under: My Software, PDF — Didier Stevens @ 6:28

My corrupted PDF quip inspired me to program another steganography trick: embed a file in a PDF document and corrupt the reference, thereby effectively making the embedded file invisible to the PDF reader.

The PDF specification provides ways to embed files in PDF documents. I’m releasing my Python program to create a PDF file with an embedded file (I used make-pdf-embedded.py to create my EICAR.pdf).

Here’s what a PDF document with an embedded file looks like:

20090630-220314

/EmbeddedFiles points to the dictionary with the embedded files:

20090630-220228

As names defined in the PDF specification are case sensitive, changing the case changes the semantics: /Embeddedfiles has no meaning, and thus the PDF reader ignores it and doesn’t find the embedded file.

20090630-220137

20090630-215901

Actually, I used this trick in my Brucon puzzle. I used the --stego option of make-pdf-embedded.py:

20090630-222453

Of course, once you know the stego trick, it’s easy to recover the embedded file: edit the PDF document with a hex editor and change the case back to /EmbeddedFiles.
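
The recovery step can also be scripted. This is a minimal Python 3 sketch, assuming the hidden name appears literally as /Embeddedfiles in the file; the function name is my own and is not part of make-pdf-embedded.py. Both names have the same length, so the byte offsets stored in the cross-reference table are not disturbed.

```python
def restore_embedded_files(pdf_bytes):
    """Change /Embeddedfiles back to /EmbeddedFiles.

    Both names are 14 bytes long, so nothing in the file shifts.
    The name /Embeddedfiles is taken from the example in this post.
    """
    return pdf_bytes.replace(b'/Embeddedfiles', b'/EmbeddedFiles')


# Small in-memory demonstration on a fragment of a catalog dictionary
fragment = b'<< /Names << /Embeddedfiles 5 0 R >> >>'
restored = restore_embedded_files(fragment)
```

To use it on a real document, read the file in binary mode, pass the bytes to the function and write the result to a new file.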

But if you want to make it harder to detect, use PDF obfuscation techniques. Or embed the file twice with incremental updates. First version is the file you want to hide, second version is a decoy…

The PDF language offers so many features to hide and obfuscate data!

Download:

make-pdf_V0_1_2.zip (https)

MD5: 305D57692C27DD3CD91D8C85A3932948

SHA256: A030BBCB8B54137D8047A4CB5C350725599383A4B113CABBA8871AC221378C5B

Tuesday 30 June 2009

MessageBox Shellcode

Filed under: My Software — Didier Stevens @ 5:40

By request, I’m releasing the assembly code I used in my previous blog posts to display a message box when the injected shellcode gets executed. It’s nothing special, but it will save you some time when you need a similar program.

Assemble the code with nasm like this:

nasm -o sc-mba-hello.bin sc-mba-hello.asm

I use the DLL-locating code published in The Shellcoder’s Handbook; you can find it in the include file sc-api-functions.asm. MessageBoxA is located in user32.dll, so this DLL has to be loaded in the process you’re injecting with sc-mba-hello.

sc-ods.asm is a similar program, calling OutputDebugStringA instead of MessageBoxA.

Download:

my-shellcode_v0_0_1.zip (https)

MD5: F215B29BA3C8F24CFBA5C24BED65B68A

SHA256: EA1DB8028954CEB18B8AD2EB37CA6BA0CD7CDC6B9A64F10561382152701C013F

The shellcode:

sc-mba-hello

Monday 29 June 2009

Quickpost: Time Lapse Photography With a Nokia Mobile

Filed under: Hardware, My Software, Quickpost — Didier Stevens @ 2:20

Did you know Nokia mobile phones with the S60 platform can be programmed in Python? During my last holiday, I wrote a small program for time lapse photography with my mobile. Here is the result, showing tidal ebbs and flows in Saint-Vaast-la-Hogue and Cancale:

This is the Python program I wrote to take a picture every minute:

#!/usr/bin/python

__description__ = 'Tool to take pictures with a Nokia phone at regular intervals'
__author__ = 'Didier Stevens'
__version__ = '0.1.1'
__date__ = '2009/06/22'

"""

Source code put in public domain by Didier Stevens, no Copyright
https://DidierStevens.com
Use at your own risk

History:
 2009/06/17: start
 2009/06/22: refactoring

Todo:
 Get Threading to work
"""

import camera
import time
import os

timelapseFolder = 'e:\\timelapse\\'
sleepTime = 57

def TakeAndSavePicture():
    global timelapseFolder

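    # Build a timestamp such as 20090622-153000 to use as the file name
    now = '%04d%02d%02d-%02d%02d%02d' % time.localtime()[0:6]
    pic = camera.take_photo()
    # Append the extension to the file name; passing '.jpeg' as a separate
    # os.path.join component would produce a path ending in <timestamp>\.jpeg
    pic.save(os.path.join(timelapseFolder, now + '.jpeg'))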
    print 'Picture taken: %s' % now

def Main():
    global timelapseFolder
    global sleepTime

    print 'Timelapse photography started'
    if not os.path.isdir(timelapseFolder):
        os.mkdir(timelapseFolder)
        print 'Timelapse folder created: %s' % timelapseFolder
    print 'Wait between pictures %d' % sleepTime
    while True:
        TakeAndSavePicture()
        time.sleep(sleepTime)

if __name__ == '__main__':
    Main()

And then I use Avisynth to combine the JPEG pictures into a movie like this (I join pictures 00001.jpeg through 00197.jpeg, 5 per second, and produce a 25 fps movie):

ImageSource("%05d.jpeg", 1, 197, 5).ChangeFPS(25)

Quickpost info


Thursday 25 June 2009

bpmtk: Injecting VBScript

Filed under: Hacking, My Software, bpmtk — Didier Stevens @ 7:03

Here’s a new trick: injecting VBScript in a process. I’ve developed a DLL that will create a COM instance of the VBScripting engine and let it execute a VBScript. Injecting this DLL in a running program results in execution of the VBScript in the context of the running program. Here’s an example where I wrote a VBScript to search and replace a string in the memory of the notepad process:

Here is part of the VBScript I developed to search and replace inside the memory of a process. It uses custom methods like Peek, Poke and Output that I’ve added to the scripting engine:

20090609-205420

I’ll provide more details in an upcoming blogpost on bpmtk version 0.1.5.0, but you can already download it here.

YouTube, Vimeo and hires Xvid.

Sunday 7 June 2009

Update: Disitool V0.3

Filed under: My Software, Update — Didier Stevens @ 23:15

Last January, I got a little challenge from @hdmoore via my Twitter account: add data to a signed executable without invalidating the Authenticode signature. I updated my Digital signature tool, but I realize now I had only announced the update on Twitter, not on my blog.

The trick is to increase the size of the image data directory for the digital signature and inject the extra data after the digital signature. This way, the Authenticode validation algorithm ignores the extra data, because it considers it to be part of the signature. Use Disitool’s new inject command:

disitool.py inject ms-patch.exe data.bin ms-patch-data.exe

The Authenticode signature of ms-patch.exe will remain valid in ms-patch-data.exe, provided that the length of the injected data (file data.bin) is a multiple of 8.

You can use the --paddata option to make the injected data size a multiple of 8 if it isn’t:

disitool.py inject --paddata ms-patch.exe data.bin ms-patch-data.exe
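
The length requirement itself is easy to illustrate. This is a minimal Python 3 sketch of padding data to a multiple of 8; the function name and the choice of zero bytes as padding are my assumptions for the example, not a description of how Disitool pads internally.

```python
def pad_to_multiple(data, multiple=8, pad_byte=b'\x00'):
    """Return data extended with pad_byte so its length is a multiple of `multiple`.

    Assumption: zero bytes are used as padding; the post does not say
    which byte value the --paddata option writes.
    """
    remainder = len(data) % multiple
    if remainder == 0:
        return data
    return data + pad_byte * (multiple - remainder)


padded = pad_to_multiple(b'hello')   # 5 bytes in, 8 bytes out
```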

Disitool can be downloaded here.

Monday 1 June 2009

Quickpost: Sending WiFi Beacon Frames with an AirPcap Adapter

Filed under: My Software, Quickpost, WiFi — Didier Stevens @ 10:29

While preparing for my OSWP exam, I came across an unpublished Python program for the AirPcap adapter. I cleaned it up a bit and here it is: apc-b

This program allows you to send out beacon frames, a very simple way to spoof WiFi access points.
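
To give an idea of what such a frame contains, here is a minimal Python 3 sketch that builds the bytes of a beacon frame. It is not the code of apc-b: the function name and the field values chosen (beacon interval, capabilities, supported rates) are my assumptions, and handing the frame to the AirPcap adapter is not shown.

```python
import struct


def build_beacon(essid, bssid, channel):
    """Build a minimal 802.11 beacon frame (no radio header, no FCS)."""
    ssid = essid.encode('ascii')
    if len(ssid) > 32:
        raise ValueError('an ESSID is at most 32 bytes long')
    if len(bssid) != 6:
        raise ValueError('a BSSID is 6 bytes long')
    header = (
        b'\x80\x00'       # frame control: management frame, subtype beacon
        + b'\x00\x00'     # duration
        + b'\xff' * 6     # destination address: broadcast
        + bssid           # source address
        + bssid           # BSSID
        + b'\x00\x00'     # sequence control
    )
    # Fixed parameters: timestamp, beacon interval (in time units), capabilities
    fixed = struct.pack('<QHH', 0, 100, 0x0001)
    tagged = (
        bytes([0, len(ssid)]) + ssid               # SSID element
        + bytes([1, 4, 0x82, 0x84, 0x8b, 0x96])    # rates 1, 2, 5.5 and 11 Mbit/s
        + bytes([3, 1, channel])                   # DS parameter set: channel
    )
    return header + fixed + tagged


frame = build_beacon('test', b'\x00\x11\x22\x33\x44\x55', 6)
```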

This is the command to generate beacon frames on channel 6 for a couple of ESSIDs listed in file apc-b-2.txt:

20090601-120518

And here is Kismet on my N800 capturing these beacon frames:

kismet-n800


Thursday 14 May 2009

Malformed PDF Documents

Filed under: Malware, My Software, PDF — Didier Stevens @ 7:55

For the sake of this post, I consider a PDF document malformed when it doesn’t observe the basic structure of a PDF document.

I’ve seen a couple of malicious, malformed PDF documents. The most recent was a malicious swine flu PDF document that contains another, benign PDF document with information about the swine flu (obtained from the CDC site). This second PDF document is displayed to mislead the user while the exploit runs.

20090513-211945

This second PDF document is XOR-encoded and appended to the end of the malicious PDF document, making the malicious PDF document malformed (FYI: the PDF file format supports embedded files, but this wasn’t used here). A PDF reader like Adobe Reader or Foxit Reader has no problem opening this malformed PDF, because it scans a PDF document for the trailer (%%EOF) starting from the end of the document. Everything that follows this trailer and doesn’t adhere to the PDF syntax is just ignored.

20090513-213940

I’ve added some new features to my PDF tools to handle malformed PDF documents.

PDFiD

The new version of PDFiD has an --extra option. As its name implies, use it to add extra analysis data to the PDFiD report. The --extra option adds entropy calculations to the report:

20090513-220050

For a normal PDF file, expect the total entropy and the entropy of bytes inside stream objects to be close to the maximum value 8.0. This means that the distribution of byte values is close to random, which is characteristic of compressed and encrypted data.

Outside stream objects, the data appears much less random, and the entropy is much lower, usually around 4.0 or 5.0.
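
For readers who want to experiment with these numbers, this is a minimal Python 3 sketch of the standard Shannon entropy calculation over byte values, expressed in bits per byte. The function name is my own, and I am not claiming this is exactly how PDFiD computes its figures.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)   # number of occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


uniform = byte_entropy(bytes(range(256)))   # every byte value exactly once
constant = byte_entropy(b'\x90' * 1000)     # a single repeated byte value
```

Data where every byte value is equally likely reaches the maximum of 8.0, while a run of one repeated byte has an entropy of zero.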

However, for malformed PDF documents, where data is added without using stream objects, the entropy outside stream objects is much higher. Here is the report for the malicious swine flu PDF:

20090513-203729

Another datum added to the report by using the --extra option is for the end-of-file marker %%EOF.

The “%%EOF” line mentions the number of times %%EOF appears in the document (more than once usually indicates incremental updates). “After last %%EOF” counts the number of bytes after the last %%EOF. This value will not be zero when data has been appended.
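
Both values are simple to compute. This is a minimal Python 3 sketch; the function name is my own, and skipping one line ending directly after the last %%EOF is an assumption on my part, not a statement about PDFiD’s exact counting.

```python
def eof_report(pdf_bytes):
    """Return (number of %%EOF markers, bytes after the last marker).

    The second value is None when the marker does not occur at all.
    Assumption: one end-of-line sequence right after the last %%EOF
    belongs to the marker line and is not counted.
    """
    marker = b'%%EOF'
    count = pdf_bytes.count(marker)
    if count == 0:
        return 0, None
    trailing = pdf_bytes[pdf_bytes.rfind(marker) + len(marker):]
    if trailing.startswith(b'\r\n'):
        trailing = trailing[2:]
    elif trailing[:1] in (b'\r', b'\n'):
        trailing = trailing[1:]
    return count, len(trailing)


normal = eof_report(b'%PDF-1.1\n%%EOF\n')
appended = eof_report(b'%PDF-1.1\n%%EOF\n' + b'A' * 4096)
```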

pdf-parser

The previous versions of pdf-parser output a lot of “todo 10” data (an indication of malformed PDF data) when they parse a malformed PDF document. I’ve suppressed this behavior; from now on you’ll need to use option --verbose to enable it, should you need it. Since I first use PDFiD to check a PDF document before using pdf-parser, I don’t consider the “todo” output relevant anymore, as PDFiD’s entropy and %%EOF report will tell me if a PDF document is malformed.

20090513-223049

But the other new option in pdf-parser, --extract, is more important. Example:

pdf-parser.py --extract payload.bin malformed.pdf

This option will extract all malformed data from malformed.pdf and write it to file payload.bin, giving you easy access to the embedded payload.

Samples

You can download a normal and a malformed Hello World PDF file here to familiarize yourself with my updated tools. 4096 random bytes have been appended to the end of the PDF document to make it malformed.

Here is a last example where the entropy calculation can be handy even if the payload is stored inside a stream object:

20090513-203522

The total entropy and the entropy of bytes inside stream objects are very low here because this malicious PDF document has a payload with a very long, uncompressed NOP sled (more than one million times 0x90).

Monday 11 May 2009

PDF Filter Abbreviations

Filed under: My Software, PDF — Didier Stevens @ 0:00

@binjo’s tweet made me realize PDF filter abbreviations do apply to stream objects too, although the PDF reference document only defines them for inline images. Here are the abbreviations:

  • ASCIIHexDecode -> AHx
  • ASCII85Decode -> A85
  • LZWDecode -> LZW
  • FlateDecode -> Fl
  • RunLengthDecode -> RL
  • CCITTFaxDecode -> CCF
  • DCTDecode -> DCT

This means that, for example, a FlateDecode filter for a stream object can be specified not only as /Filter /FlateDecode, but also as /Filter /Fl.
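
A tool that inspects filters can normalize the abbreviations before doing anything else. This is a minimal Python 3 sketch built from the list above; the dictionary and function names are my own and this is not the code of my PDF tools.

```python
# Abbreviation -> full filter name, taken from the list in this post
FILTER_ABBREVIATIONS = {
    'AHx': 'ASCIIHexDecode',
    'A85': 'ASCII85Decode',
    'LZW': 'LZWDecode',
    'Fl': 'FlateDecode',
    'RL': 'RunLengthDecode',
    'CCF': 'CCITTFaxDecode',
    'DCT': 'DCTDecode',
}


def canonical_filter_name(name):
    """Map an abbreviated filter name such as /Fl to its full name.

    Names are compared case-sensitively, as PDF names are case sensitive;
    unknown names are returned unchanged.
    """
    bare = name.lstrip('/')
    return '/' + FILTER_ABBREVIATIONS.get(bare, bare)


full = canonical_filter_name('/Fl')
```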

I updated my PDF-tools to support this.

And jprosco e-mailed me an update to my pdf-parser tool to support ASCIIHexDecode, because he had to analyze some malicious PDF documents that used it to encode the JavaScript.

Sunday 10 May 2009

Quickpost: Disinformational Tweets

Filed under: My Software, Nonsense, Quickpost — Didier Stevens @ 12:55

This useless Python program is the result of some lazy Sunday coding. It will create random tweets based on a template file. You could use it to try to protect your privacy on Twitter by disinforming potential data miners.

Will I use it for my Twitter account? No, I don’t need a program to disinform ;-)

20090510-142457

Each time you run the program, it will post one random tweet. This tweet is generated from a template file. Each line in the template file is the template for a tweet. You can use variables (between curly braces, example: {location}) in the templates to increase the number of possible tweets. Variables and their values are also stored in the template file, after the template lines. Your template file must allow the program to generate at least 2 different tweets, because it generates a tweet different from the last tweet.
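
The generation step can be sketched in a few lines of Python 3. This is an illustration only: the function names are hypothetical, the templates and variables are passed in as a list and a dictionary because parsing the template file is left out, and posting to Twitter is not shown.

```python
import random
import re


def expand(template, variables, rng=random):
    """Replace each {name} in the template with a random value for that name."""
    return re.sub(r'\{(\w+)\}',
                  lambda m: rng.choice(variables[m.group(1)]),
                  template)


def next_tweet(templates, variables, last_tweet, rng=random, max_tries=1000):
    """Generate a random tweet that differs from last_tweet."""
    for _ in range(max_tries):
        tweet = expand(rng.choice(templates), variables, rng)
        if tweet != last_tweet:
            return tweet
    raise ValueError('the templates cannot produce a different tweet')


# Hypothetical template data for the example
templates = ['Having a coffee in {location}']
variables = {'location': ['Brussels', 'Antwerp']}
tweet = next_tweet(templates, variables, 'Having a coffee in Brussels')
```

The retry loop is why the template file has to allow at least 2 different tweets: with only one possible tweet, no candidate would ever differ from the last one.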

20090510-143740

The program requires the twitter module, itself requiring the simplejson module.

And you need to create a credentials file (disinformational-tweets.cred) with the Twitter credentials of the account for which the program has to generate random tweets. The first line of the credentials file has to contain the username, the second line has to contain the password.

A Firefox plugin to generate these tweets would probably be more ‘useful’, but hey, it’s a lazy Sunday.

Download:

disinformational-tweets_v0_0_1.zip (https)

MD5: 36CDB584634ED299E7ACE0D64E846003

SHA256: C5FCE76443549C3A8882B799B6F7A754EF6AEE5F11F3E94FF255EE541205C17B


Wednesday 6 May 2009

Shellcode 2 VBScript

Filed under: Hacking, My Software — Didier Stevens @ 9:06

I had not posted my Python script to convert shellcode to VBScript, so here it is.

Download:

shellcode2vbscript_v0_1.zip (https)

MD5: AAB0431127C657C9A3EF67E1C73E6711

SHA256: D1CDDAFCB734EC3F35E558DECFF2EDB73DC0C394936814B602B605F09DE4A5E5
