Didier Stevens

Sunday 26 September 2021

Patching A Java .class File

Filed under: 010 Editor,Forensics,Hacking,Malware — Didier Stevens @ 0:00

010 Editor is one of few commercial applications that I use daily. It’s a powerful binary editor with scripting and templates.

I recently had to patch a Java .class file: extend a string inside that class. Before going the route of decompiling / editing / recompiling, I tried with 010 Editor.

Here is the file opened inside the editor:

When opening the file, 010 Editor recognized the .class extension and installed and ran the template for .class files. That’s what I wanted to know: is there a template for .class files? Yes, there is!

Here is how you can apply a template manually, in case the file extension is not the original extension:

And this is how the template results look like:

Under the hex/ascii dump, the template results are displayed: a set of nested fields that match the internal structure of .class file. For example, the first field I selected here, u4 magic, is the magic header of a .class file: CAFEBABE.

The string I want to extend is this one:

I need to extend string “1.2 (20210922)”. Into something like “1.2 (20210922a)”.

Doing so will make the string longer, thus I need to add a byte to the file (trivial), but I also need to make sure that the binary structure of .java files remain valid: for example, if there is something in that structure like a field length, I need to change the field length too.

I’m not familiar with the internal structure of .class files, that why I’m using 010 Editor’s .class template, hoping that the template will make it clear to me what needs to be changed.

To find the template result field I need to modify, I position my cursor on the string I want to modify inside the ASCII dump, I right-click and select “Jump To Template Variable”:

Which selects the corresponding template variable:

So my cursor was on the 10th byte (bytes[9]) of the string, which is part of template variable cp_info constant_pool[27]. From that I gather that the string I want to modify is inside a pool of constants.

I can select that template variable:

And here I can see which bytes inside the .class file were selected. It’s not only the string, but also bytes that represent the tag and length. The length is 14, that’s indeed the length of the string I want to extend. Since I want to add 1 character, I change the length from 14 to 15: I can do that inside the template results by double-clicking the value 14, I don’t need to make that change inside the hexdump:

Next I need to add a character to the string. I can do that in the ASCII dump:

I have to make sure that the editor is in insert mode (INS), so that when I type characters, they are inserted at the cursor, in stead of overwriting existing bytes:

And then I can type my extra character:

So I have changed the constant string I wanted to change. Maybe there are more changes to make to the internal structure of this .class file, like other length fields … I don’t know. But what I do as an extra check is: save the modified file and run the template again. It runs without errors, and the result looks good.

So I guess there are no more changes to make, and I decide to tryout my modified .class file and see what happens: it works, so there are no other changes to make.

Saturday 27 March 2021

FileZilla Uses PuTTY’s Registry Fingerprint Cache

Filed under: Encryption,Forensics,Networking — Didier Stevens @ 10:01

Today I figured out that FileZilla uses PuTTY‘s registry key (HKCU\SOFTWARE\SimonTatham\PuTTY\SshHostKeys) to cache SSH fingerprints.

This morning, I connected to my server over SFTP with FileZilla, and got this prompt:

That’s unusual. I logged in over SSH, and my SSH client did not show a warning. I checked the fingerprint on my server, and it matched the one presented by FileZilla.

What’s going on here? I started to search through FileZilla configuration files (XML files) looking for the cached fingerprints, and found nothing. Then I went to the registry, but there’s no FileZilla entry under my HKCU Software key.

Then I’m taking a look with ProcMon to figure out where FileZilla caches its fingerprints. After some searching, I found the answer:

FileZilla uses PuTTY’s registry keys!

And indeed, when I start FileZilla again and allow it to cache the key, it appears in PuTTY’s registry keys.

One last check: I modified the registry entry and started FileZilla again:

And now FileZilla warns me that the key is different. That confirms that FileZilla reads and writes PuTTY’s registry fingerprint cache.

So that answered my question: “Why did FileZilla warn me this morning?” “Because the key was not cached”.

But then I was left with another question: “Why is the key no longer cached, because it was cached?”

Well, I started to remember that some days ago today, I had been experimenting with PuTTY’s registry keys. I most likely deleted that key (PuTTY is not my default SSH client). I verified the last-write timestamp for PuTTY’s registry key, and indeed, 4 days ago it was last written to.

Update:

Thanks to Nicolas for pointing out that fzsftp is based on PuTTY:

Friday 12 March 2021

Quickpost: “ProxyLogon PoC” Capture File

Filed under: Forensics,Networking,Quickpost,Vulnerabilities — Didier Stevens @ 18:43

I was able to get the “ProxyLogon PoC” Python script running against a vulnerable Exchange server in a VM. It required some tweaks to the code, and also a change in Exchange permissions, as explained in this tweet by @irsdl.

I created a capture file:

More details will follow.

Update: I added a second capture file (proxylogon-poc-capture-with-keys-and-webshell.pcapng), this one includes a request to the webshell that was installed.

proxylogon-poc-capture-with-keys_V2.zip (https)
MD5: A005AC9CCE0F833C99B5113E79005C7D
SHA256: AA092E099141F8A09F62C3529D8B27624CD11FF348738F78CA9A1E657F999755


Quickpost info


Wednesday 15 April 2020

Analyzing Malformed ZIP Files

Filed under: Forensics,maldoc,My Software — Didier Stevens @ 0:00

With version 0.0.16 (we are now at version 0.0.18), I updated my zipdump.py tool to handle (deliberately) malformed ZIP files. My zipdump tool uses Python’s ZIP module to analyze ZIP files.

Now, zipdump has a an option (-f) to scan arbitrary binary files for ZIP records.

I will show here how this feature can be used, by analyzing a sample Xavier Mertens wrote a diary entry about. This sample is a Word document with macros, an OOXML (Office Open XML format) file (.docm). It is malformed, because 1) there’s an extra byte at the beginning and 2) there’s a byte missing at the end.

When you use my zipdump tool to look at the file, you get an error:

Using option -f l (list), we can find all PKZIP records inside arbitrary, binary files:

When using option -f with value l, a listing will be created of all PKZIP records found in the file, plus extra data. Some of these entries in this report will have an index, that can be used to select the entry.

In this example, 2 entries can be selected:

p: extra bytes at the beginning of the file (prefix)

1: an end-of-central-directory record (PK0506 end)

Using option -f p, we can select the prefix (extra data at the beginning of the file) for further analysis:

And from this hex/ascii dump, we learn that there is one extra byte at the beginning of the ZIP file, and that it is a newline characters (0x0A).

Using option -f 1, we can select the EOCD record to analyze the ZIP file:

As this generates an error, we need to take a closer look at the EOCD record by adding option -i (info):

With this info, we understand that the missing byte makes that the comment length field is one byte short, and this causes the error seen in previous image.

ZIP files can contain comments (for the ZIP container, and also for individual files): these are stored at the end of the PKZIP records, preceded by a 2-byte long, little-endian integer. This integer is the length of the comment. If there is no comment, this integer is zero (0x00).

Hence, the byte we are missing here is a NULL (0x00) byte. We can append a NULL byte to the sample, and then we should be able to analyze the ZIP file. In stead of modifying the sample, I use my tool cut-bytes.py to add a single NULL byte to the file (suffix option: -s #h#00) and then pipe this into zipdump:

File 5 (vbaProject.bin) contains the VBA macros, and can be piped into oledump.py:

I also created a video:

zipdump_v0_0_18.zip (https)
MD5: 34DC469E8CD4E5D3E9520517DEFED888
SHA256: 270B26217755D7ECBCB6D642FBB349856FAA1AE668DB37D8D106B37D062FADBB

Tuesday 28 January 2020

etl2pcapng: Support For Process IDs

Filed under: Forensics,Networking — Didier Stevens @ 0:00

You can start a packet capture on a vanilla Windows machine with command “netsh trace start capture=yes” (and end it with “netsh trace stop”).

This packet capture file, with extension .etl, can not be opened with Wireshark. Until recently, I used Microsoft’s Message Analyzer, but this tool is no longer supported and installation files have been removed from Microsoft’s site.

In comes etl2pcapng, a new open-source utility from Microsoft that converts an .etl file to .pcapng format:

Utility that converts an .etl file containing a Windows network packet capture into .pcapng format“.

I contributed to version 1.3.0 of etl2pcapng, by adding a comment containing the Process ID to each packet. etl files contain metadata (like the PID of the process associated with the network traffic) that got lost when translating to pcapng format. As the pcapng format has no option to store the PID for each packet, but it supports packet comments, I stored the PID inside packet comments:

Notice this warning by Microsoft:

The output pcapng file will have a comment on each packet indicating the PID of the current process when the packet was logged. WARNING: this is frequently not the same as the actual PID of the process which caused the packet to be sent or to which the packet was delivered, since the packet capture provider often runs in a DPC (which runs in an arbitrary process). The user should keep this in mind when using the PID information.

Monday 21 October 2019

Quickpost: ExifTool, OLE Files and FlashPix Files

Filed under: Forensics,maldoc,Malware,Quickpost — Didier Stevens @ 0:00

ExifTool can misidentify VBA macro files as FlashPix files.

The binary file format of Office documents (.doc, .xls) uses the Compound File Binary Format, what I like to refer as OLE files. These files can be analyzed with my tool oledump.py.

Starting with Office 2007, the default file format (.docx, .docm, .xlsx, …) is Office Open XML: OOXML. It’s in essence a ZIP container with XML files inside. However, VBA macros inside OOXML files (.docm, .xlsm) are not stored as XML files, they are still stored inside an OLE file: the ZIP container contains a file with name vbaProject.bin. That is an OLE file containing the VBA macros.

This can be observed with my zipdump.py tool:

oledump.py can look inside the ZIP container to analyze the embedded vbaProject.bin file:

And of course, it can handle an OLE file directly:

When ExifTool is given a vbaProject.bin file for analysis, it will misidentify it as a picture file: a FlashPix file.

That’s because when ExifTool doesn’t have enough metadata or an identifying extension to identify an OLE file, it will fall back to FlashPix file detection. That’s because FlashPix files are also based on the OLE file format, and AFAIK ExifTool started out as an image tool:

That is why on VirusTotal, vbaProject.bin files from OOXML files with macros, will be misidentified as FlashPix files:

When the extension of a vbaProject.bin file is changed to .doc, ExifTool will misidentify it as a Word document:

ExifTool is not designed to identify VBA macro files (vbaProject.bin). These files are not Office documents, neither pictures. But since they are also OLE files, ExifTool tries to guess what they are, based on the extension, and if that doesn’t help, it falls back to the FlashPix file format (based on OLE).

There’s no “bug” to fix, you just need to be aware of this particular behavior of ExifTool: it is a tool to extract information from media formats, when it analyses an OLE file and doesn’t have enough metadata/proper file extension, it will fall back to FlashPix identification.

 


Quickpost info


Tuesday 15 October 2019

PowerShell, Add-Type & csc.exe

Filed under: .NET,Forensics — Didier Stevens @ 0:00

Have you ever noticed that some PowerShell scripts result in the execution of the C# compiler csc.exe?

This happens when a PowerShell script uses cmdlet Add-Type.

Like in this command:

powershell -Command “Add-Type -TypeDefinition \”public class Demo {public int a;}\””

This command just adds the definition of a class (Demo) with one member (a).

When this Add-Type cmdlet is executed, the C# compiler is invoked by PowerShell to compile this class definition (a C# program) into an assembly (DLL) with the .NET type to be used by the PowerShell script.

A temporary file (oj5zlfcy.cmdline in this example) is created inside folder %appdata%\local\temp with extension .cmdline. This is passed as argument to the invoked C# compiler csc.exe, and contains directions to compile a C# program (oj5zlfcy.0.cs):

/t:library /utf8output /R:”System.dll” /R:”C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Management.Automation\v4.0_3.0.0.0__31bf3856ad364e35\System.Management.Automation.dll” /R:”System.Core.dll” /out:”C:\Users\testuser1\AppData\Local\Temp\oj5zlfcy.dll” /debug- /optimize+ /warnaserror /optimize+ “C:\Users\testuser1\AppData\Local\Temp\oj5zlfcy.0.cs”

The C# program (oj5zlfcy.0.cs in this example) contains the class definition passed as argument to cmdlet Add-Type:

public class Demo {public int a;}

Both these files start with a UTF-8 BOM (EF BB BF).

The C# compiler (csc.exe) can invoke compilation tools when necessary, like the resource compiler cvtres.exe.

This results in the creation of several temporary files:

All these files are removed when cmdlet Add-Type terminates.

 

Wednesday 25 July 2018

Extracting DotNetToJScript’s PE Files

Filed under: Forensics,Malware — Didier Stevens @ 0:00

I added a new option (-I, –ignorehex) to base64dump.py to make the extraction of the PE file inside a JScript script generated with DotNetToJScript a bit easier.

DotNetToJScript is James Forshaw‘s “tool to generate a JScript which bootstraps an arbitrary .NET Assembly and class”.

Here is an example of a script generated by James’ tool:

The serialized .NET object is embedded as a string concatenation of BASE64 strings, assigned to variable serialized_obj.

With re-search.py, I extract all strings from the script (e.g. strings delimited by double quotes):

The first 3 strings are not part of the BASE64 encoded object, hence I get rid of them (there are no unwanted strings at the end):

And now I have BASE64 characters, I just have to get rid of the doubles quotes and the newlines (base64dump searches for continuous strings of BASE64 characters). With base64dump‘s -w option I can get rid of whitespace (including newlines), and with option -i I can get rid of the double-quote character. Unfortunately, escaping of this character (\”) works on Windows, but then cmd.exe gets confused for the next pipe (it expects a closing double-quote). That’s why I introduced option -I, to specify characters with their hexadecimal value. Double-quote is 0x22, thus I use option -I 22:

This is the serialized object, and it contains the .NET assembly I want to analyze. .NET assemblies are .DLLs, e.g. PE files. With my YARA rule to detect PE files, I can find it inside the serialized data:

A PE file was found, and it starts at position 0x04C7. I can cut this data out with option -c:

Another method to find the start of the PE file, is to use a cut expression that searches for ‘MZ’, like this:

If there is more than one instance of string MZ, different cut-expressions must be tried to find the real start of the PE file. For example, this is the cut-expression to select data starting with the second instance of string MZ: -c “[‘MZ’]2:”

It’s best to pipe the cut-out data into pecheck, to validate that it is indeed a PE file:

pecheck also helps with finding the length of the PE file (with the given cut-expression, I select all data until the end of the serialized data).

Remark that there is an overlay (bytes appended to the end of the PE file), and that it starts at position 0x1400. Since I don’t expect an overlay in this .NET assembly, the overlay is not part of the PE file, but it is part of the serialization meta data.

Hence I can cut out the PE file precisely like this:

This PE file can be saved to disk now for reverse-engineering.

I have not read the .NET serialization format specification, but I can make an educated guess. Right before the PE file, there is the following data:

Remark the first 4 bytes (5 bytes before the beginning of the PE file): 00 14 00 00. That’s 0x1400 as a little-endian 32-bit integer, exactly the length of the PE file, 5120 bytes:

So that’s most likely another method to determine the length of the PE file.

 

Friday 29 December 2017

Cracking Encrypted PDFs – Conclusion

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

TL;DR: PDFs protected with 40-bit keys can not guarantee confidentiality, even with strong passwords. When you protect your PDFs with a password, you have to encrypt your PDFs with strong passwords and use long enough keys. The PDF specification has evolved over time, and with it, the encryption options you have. There are many encryption options today, you are no longer restricted to 40-bit keys. You can use 128-bit or 256-bit keys too.

There is a trade-off too: the more advanced encryption option you use, the more recent the PDF reader must be to support the encryption option you selected. Older PDF readers are not able to handle 256-bit AES for example.

Since each application capable of creating PDFs will have different options and descriptions for encryption, I can not tell you what options to use for your particular application. There are just too many different applications and versions. But if you are not sure if you selected an encryption option that will use long enough keys, you can always check the /Encrypt dictionary of the PDF you created, for example with my pdf-parser (in this example /Length 128 tells us a 128-bit key is used):

Or you can use QPDF to encrypt an existing PDF (I’ll publish a blog post later with encryption examples for QPDF).

But don’t use 40-bit keys, unless confidentiality is not important to you:

I first showed (almost 4 years ago) how PDFs with 40-bit keys can be decrypted in minutes, using a commercial tool with rainbow tables. This video illustrates this.

Later I showed how this can be done with free, open source tools: Hashcat and John the Ripper. But although I could recover the encryption key using Hashcat, I still had to use a commercial tool to do the actual decryption with the key recovered by Hashcat.

Today, this is no longer the case: in this series of blog posts, I show how to recover the password, how to recover the key and how to decrypt with the key, all with free, open source tools.

Overview of the complete blog post series:

 

Thursday 28 December 2017

Cracking Encrypted PDFs – Part 3

Filed under: Encryption,Forensics,Hacking,PDF — Didier Stevens @ 0:00

I performed a brute-force attack on the password of an encrypted PDF and a brute-force attack on the key of (another) encrypted PDF, both PDFs are part of a challenge published by John August.

The encryption key is derived from the password. it’s not just based on the password only, but also on metadata. This implies that different PDFs encrypted with the same user password, will have different encryption keys.

When you recover the user password of an encrypted PDF, you can just use it with PDF readers like Adobe Reader: they will ask you for the password, you provide it and the PDF will be decrypted and rendered.

But when you recover the key of an encrypted PDF, you can not use it with PDF reader: there is no feature that will allow you to input a key in stead of a password. The only method I knew to decrypt a PDF document with its encryption key, was to use Elcomsoft’s PDF cracking tool:

Now I worked out a second method: I modified the source code of QPDF so that it will accept encryption keys too. It’s a quick and dirty hack, I did not add a new option to QPDF but I “hijacked” the –password option. If the value to the option –password starts with string “key:”, then QPDF will not derive the key from the provided password, but it will use the key provided as hexadecimal characters. Here is how I use it to decrypt the “tough” PDF:

I also made a small modification to the –show-encryption option, to display the encryption key:

Update: I had an email exchange with Jay Berkenbilt, the author of QPDF, and he will look into this patch and possibly add a new key option to QPDF.

If you are interested in my modified version of QPDF, you can find the modified source code files and Windows binaries here:

qpdf-patched.zip (https)
MD5: 57E1A5A232E12B45D0A927181A1E8C3B
SHA256: 6F17E095B38AE72F229A6662216DDCE86057D2BA1C567B07FEF78B8A93413495

Update: this is the complete blog post series:

Next Page »

Blog at WordPress.com.