Didier Stevens

Tuesday 8 February 2022

Windows Explorer: Improper Exif Data Removal

Filed under: Forensics,Vulnerabilities,Windows 7 — Didier Stevens @ 20:53

Windows explorer has an option to remove properties from media files: “Remove Properties and Personal Information”. For example, removing Exif data from JPEG files.

There is an issue with this feature: it does not properly remove Exif data.

Within an open folder (Windows explorer), select a media file (I’m using Canon_40D.jpg), right-click and select properties:

Select Details:

Then click “Remove Properties and Personal Information”:

When you click OK, a new file will be created: Canon_40D – Copy.jpg (I renamed this file to Canon_40D-redacted-W11.jpg, because I tested this first on my Windows 11 machine).

File Canon_40D.jpg contains Exif data pertaining to the camera, like its maker and model:

File Canon_40D-redacted-W11.jpg (the redacted version of file Canon_40D.jpg) contains less Exif data: the maker and model properties have been removed:

Looking at the redacted file with binary editor 010 Editor, I noticed that these properties had not been completely removed. Let me explain.

JPEG files are composed of segments of data, these segments can be analyzed with my tool jpegdump.py.

Here is the output for file Canon_40D.jpg (original file), I’m using option -E to include the SHA256 hash of the data of each segment:

And here is the output for file Canon_40D-redacted-W11.jpg (redacted file):

Notice that all the hashes of the segments are identical, except for the third segment, APP1. This segment contains the Exif data. This means that only the Exif data has changed, nothing else, like the picture itself.

Segments APP1 of both pictures have the same size, 2476 bytes. Although properties have been removed, Microsoft Explorer’s removal feature did not shrink the segment.

When I open the original file (Canon_40D.jpg) with 010 Editor, a template for JPEG files is automatically used to parse the structure of the JPEG file. This can be seen in the Template Results below the hexadecimal dump:

The JPG template is also able to parse Exif data: I drilled down in the template hierarchy, until I found the Exif properties (circled in red). There are 11 properties, the first is Make (tag 0x010F) and the second one is Model (tag 0x0110).

Opening the template DIRENTRY structure for property Make reveals the following fields:

Remark that the string “Canon”, that is the string value of the Make property, is not contained inside the DIRENTRY structure for said property. What it does contain, is the size of the string (6 bytes) and an offset to the string itself (an offset of 146 bytes). Literal string values (StrAscii structures) are stored inside the Exif data structure, after the list of DIRENTRY structures.

It is the same for the Model property:

The Model DIRENTRY structure points to an ASCII string (StrAscii), size is 4 bytes and offset is 152 bytes.

Let’s take a look now, again with 010 Editor, at the cleaned file that was created by clicking “Remove Properties and Personal Information” (Canon_40D-redacted-W11.jpg):

Notice that this file has 8 DIRENTRY structures in stead of 11: 3 Exif properties have been removed (Make, Model & Software). And the StrAscii structures for these 3 properties do not appear in the template result.

However, these 3 StrAscii string values are still inside the APP1 segment:

They are at the exact same location as in the original file (Canon_40D.jpg):

Conclusion: when you use Windows Explorer’s “Remove Properties and Personal Information” feature, Exif properties will be removed, but if these are string properties, they are not completely removed.

Windows Explorer’s “Remove Properties and Personal Information” feature removes DIRENTRY structures (properties), but does not remove StrAscii structures (properties’ string values).

When the Exif data of JPEG files cleaned with this feature is viewed, the orphaned strings will not appear. But when they are viewed with a binary editor, these strings do appear. And of course, they can also be easily visualized with a strings utility (here I’m using my strings utility strings.py):

You will not know to which properties these strings belong, because that information has been erased (DIRENTRY structures). But here, just the string values themselves, are enough to know that this is a Canon camera and that GIMP software was used to produce the final picture.

In case you want to test this yourself and try to reproduce my findings, you can download file Canon_40D.jpg from here. The file I created by using Windows Explorer’s “Remove Properties and Personal Information” feature, Canon_40D-redacted-W11.jpg, has a SHA256 of 8B190028D0F9F2A6F7EDB1DC0955008D73173C32C19C32CE62372C7163EE1223. I tested this on a fully patched Windows 10 machine (21H2) and a fully patched Windows 11 machine. The results where completely identical.

And as I know that some remaining Windows 7 users will want to know if Windows 7 is also affected: a fully patched Windows 7 machine has the same issue (though the cleaned file was different from the W10/W11 file).

If you absolutely want to make sure that all metadata is gone from your media files, do not use Windows Explorer (for the moment). Use another tool. Ideally, use a tool that completely removes the segments containing metadata (APP1, APP2, …).

I did inform Microsoft about this issue.

Friday 26 July 2013

MSI: The Case Of The Invalid Signature

Filed under: Forensics,Malware,Windows 7 — Didier Stevens @ 22:01

I found a suspicious file on a Windows XP machine. I was able to trace its origin back to a Windows Installer package (.msi). This package in c:\windows\installer had an invalid digital signature. Like this:

20130726-233848

Very suspicious.

A bit later I found another msi package containing the same suspicious file. But this time, the package had a valid digital signature. What’s going on?

After a deep dive into the internals of msi packages, I found the answer.

When an msi package is installed, it is cached inside the Windows Installer directory (%windir%\Installer). Prior to Windows Installer 5.0 (released with Windows 7), cached packages were stripped of their embedded cab files. But with digitally signed msi files, the signature remained inside the file: the digitally signed file was modified, hence the signature was invalidated. This behavior changed with Windows Installer 5.0: cached packages are no longer stripped, hence the signature remains valid.

This blogpost by Heath Stewart explains this change in more detail. Unfortunately, my Google-skills were not good enough to find this blogpost prior to my deep dive into msi files. Hindsight Googling FTW! 😉

Saturday 17 December 2011

FORCE_INTEGRITY With DLLs

Filed under: Windows 7,Windows Vista — Didier Stevens @ 17:36

I’ve talked about using the FORCE_INTEGRITY flag with EXEs, but how about DLLs? Its effect is similar.

If flag FORCE_INTEGRITY is set for a DLL, and the DLL is not signed or the signature is invalid, Windows will not load the DLL inside a process.

The error code will be 577, or:

Windows cannot verify the digital signature for this file.
A recent hardware or software change might have installed
a file that is signed incorrectly or damaged, or that might
be malicious software from an unknown source.

Friday 9 December 2011

LoadDLLViaAppInit with FORCE_INTEGRITY

Filed under: My Software,Windows 7 — Didier Stevens @ 12:46

In Windows 7 and Windows Server 2008 R2, Microsoft added a feature to the AppInit_DLLs mechanism. When the REG_DWORD RequireSignedAppInit_DLLs is set to 1, the DLLs to be loaded via AppInit_DLLs have to be signed.

You can find properly signed versions of LoadDLLViaAppInit here:
LoadDLLViaAppInit_FI.zip (https)
MD5: 2867B6AADF6C9FFA224D2D6A0153AD91
SHA256: E732451401B37087FAC619BD500E370FE3C21FB764F2E2E99C76EDBADEC86204

Nothing has changed to these DLLs, I’ve not changed the version number. I only set the FORCE_INTEGRITY flag and signed them.

Thursday 17 November 2011

Hotfix For SRP/AppLocker Bypass

Filed under: Windows 7 — Didier Stevens @ 10:53

Remember Microsoft has features to bypass its own Software Restriction Policies and AppLocker: Circumventing SRP and AppLocker, By Design and Circumventing SRP and AppLocker to Create a New Process, By Design.

Microsoft has issued a hotfix for this bypass: KB2532445

It is only for Windows 7 and Windows Server 2008 R2 though, it will not help you if you use SRP on Windows XP or Vista.

Thanks to @mount_knowledge.

Circumventing SRP and AppLocker, By Design

Wednesday 2 November 2011

Ariad 64-bit

Filed under: My Software,Windows 7 — Didier Stevens @ 19:33

You can now download a 64-bit version of my Ariad driver.

I’ve been using this driver on my x64 Windows 7 test machine only for a couple of days, so this is still beta software.

As for the installation and configuration, it’s exactly the same as the 32-bit version: you need to download the 32-bit version for the .inf files and the GUI.

Thursday 27 October 2011

Using DLLCHARACTERISTICS’ FORCE_INTEGRITY Flag

Filed under: Windows 7,Windows Vista — Didier Stevens @ 17:46

I discovered the flag FORCE_INTEGRITY last year when I released my tool setdllcharacteristics. This flag will force a check of the executable’s digital signature (on Windows Vista and Windows 7) and will prevent the program from running if the signature is invalid (or missing).

But it’s only now that I hold all the pieces to test this flag. A normal authenticode signature is not enough. And you can not use a selfsigned certificate. You need to buy a certificate (aka Software Publisher Certificate, SPC) from a commercial CA for which Microsoft issues a cross-certificate. And then you need to use your SPC and the related cross-certificate to sign your executable (with flag FORCE_INTEGRITY set) as explained here.

This is the same process for signing kernel-mode binaries, or user-mode binaries for AppInit_DLLs or other protected components.

I have the habit of signing my tools with a self-signed cert, so that I can quickly check if my tool has not been altered when I use it on another system (think infected machine). But now that I have a commercial SPC, I can go a step further: I can force Windows to check the integrity of my tools before executing them. If they have changed, Windows will warn me and refuse to run my tools:

There is a small performance hit because the loader has to check the signature, but you will not feel this if you don’t run the executable hundreds of times per second. There’s no problem with casual use.

If you want to test this, you can download a dummy application I signed here (32-bit). When you change the executable (TestIntegrityCheckFlag.exe), Windows will refuse to run it.

If this feature of Windows interests you, consider also the fact that you don’t need to own the source code to sign executables. If you use applications that are not protected by this flag, you can set the flag yourself and then sign the executable. But I don’t recommend that you publish this application, unless you get the author’s permission.

This method is good to protect your tools from malware, but not from malicious individuals: they just need to remove the FORCE_INTEGRITY flag from your executable and Windows will happily execute it regardless of the validity of the signature (I’m not speaking about kernel-mode binaries or other protected processes that require the FORCE_INTEGRITY flag to be set).

Remember that this is for Windows Vista and Windows 7; Windows XP will just ignore this flag. Windows 2008 R2 should also honor this flag, but I’ve not tested this. And it works on 32-bit and 64-bit systems.

Thursday 29 September 2011

Add Bottom Up Randomization To (Your Own) Source Code

Filed under: Vulnerabilities,Windows 7,Windows Vista — Didier Stevens @ 19:14

EMET’s new Bottom Up Randomization spectacularly increased the entropy of DLL’s base addresses loaded into my test program. Instead of 15 different addresses, I had more than 200.

Matt Miller told me how he implemented Bottom Up Randomization:

“It works by reserving a random number (between [0,256]) of 64K regions via VirtualAlloc. This has the effect of consuming a small portion of the bottom part of the address space. Since the Windows kernel assigns base addresses for collided DLLs by searching for a free region starting at the bottom of the address space, bottom up randomization ensures that a random base address will be assigned. Without bottom up randomization the bottom part of the address space remains fairly static (with some exceptions, such as due to heap, stack, and EXE randomization).”

So I decided to add this algorithm at the start of my test program:

int iIter;
int iRand;

srand(time(NULL));
iRand = rand() % 256 + 1;
for (iIter = 0; iIter < iRand; iIter++)
 VirtualAlloc(NULL, 64*1024, MEM_COMMIT | MEM_RESERVE, PAGE_NOACCESS);

Again, the result is spectacular. In stead of 15 base addresses, with the most frequent address being using 30% of the time, my Bottom Up Randomization implementation gives me more than 300 addresses after 150.000 runs. And there’s no single address being used more than 0,5% of the time.

From now on, I’m going to include this in my programs, and I advise you to do the same with your programs. Or to open source programs you use.

Thursday 1 September 2011

Bottom Up Randomization Saves Mandatory ASLR

Filed under: Vulnerabilities,Windows 7,Windows Vista — Didier Stevens @ 17:32

I recently found out that pseudo-ASLR (or mandatory ASLR in EMET) has a lower entropy than real ASLR. While real ASLR has a 8-bit entropy for base addresses, mandatory ASLR turned out only to have about 4 bits of entropy, and the distribution was far from uniform. What I forgot to tell you in that post, is that I just enabled Mandatory ASLR as mitigation in EMET, and nothing else:

Matt Miller told me that a new feature of EMET version 2.1, Bottom Up Randomization, would greatly improve the entropy of mandatory ASLR.

The results are spectacular. When I let my test program run around 500,000 times, I get almost 200 different base addresses. And the distribution is more uniform too, no address appears more frequently than 3% of the time.

To get decent protection from mandatory ASLR, be sure to use the latest version of EMET (2.1) and enable Bottom Up Randomization. This gives you the same entropy than real ASLR, with the added bonus that the base address will change each time the application is started, compared to real ASLR which requires a reboot.

Tuesday 16 August 2011

So How Good is Pseudo-ASLR?

Filed under: Vulnerabilities,Windows 7,Windows Vista — Didier Stevens @ 0:29

Let me first define what I mean with pseudo-ASLR. Address Space Layout Randomization (introduced in Windows Vista) loads executable files at different memory addresses. Studies have shown that ASLR uses 256 different base addresses and that the distribution is pretty uniform.

Pseudo-ASLR is what EMET and my tool SE_ASLR enforce. When a DLL does not support ASLR, memory at the base address of this DLL is allocated right before the DLL is loaded into the process. Since the address is not free, the image loader will load the DLL at a different address, thereby « randomizing » the base address. But how good is this randomization?

As I pointed out in my article on EMET, this base address is different each time a new process is started (unlike ASLR which needs a reboot for the base address to change). So maybe this is better ?

I developed a test program that loads a DLL but pre-allocates memory at the address of the DLL before loading. Then I ran that program thousands of times on a Windows 7 32-bit machine.

Running this program about 50.000 times gives me 68 different addresses. That’s by far not as good as 256 with ASLR. But what’s more important, is that the distribution of these addresses is not uniform at all:

There’s one address (0x000E0000 in my test) that is used 30% of the time. 2 other addresses are used 10% of the time. Rebooting the machine does not change this distribution.

When I do the same test, but enforce ASLR with EMET, I get a similar result:

Again there’s an address that is selected 30% of the time, but it’s different from my previous test. Rebooting the Windows 7 machine doesn’t change the address.

In this test, EMET uses only 15 different addresses, compared to the 68 addresses in the first test. I’ll have to research this difference, I’ve no explanation for it.

Conclusion from this simple test: pseudo-ASLR is rather weak, because I can predict the base address and I will be right one time out of three, which is not bad at all when I can launch my attack several times.

Next Page »

Blog at WordPress.com.