Didier Stevens

Sunday 28 February 2021

Update: oledump.py Version 0.0.60

Filed under: My Software,Update — Didier Stevens @ 23:08

This new version of oledump.py brings an update to plugin plugin_biff to help with the recovery of protection passwords.

oledump_V0_0_60.zip (https)
MD5: BC7631059077294223BB225D16FB7186
SHA256: D847E499CB84B034E08BCDDC61ADDADA39B90A5FA2E1ABA0756A05039C0D8BA2

Monday 22 February 2021

re-search.py And Custom Validations

Filed under: My Software — Didier Stevens @ 0:00

My tool re-search.py is a tool that uses regular expressions to search through files. You can use regular expressions from a small builtin library, or provide your own regular expressions.

And these regular expressions can be augmented with extra conditions, like validation with a custom Python function.

I’m going to illustrate this here with a regular expression to match credit card numbers. Credit card numbers have a check digit (calculated with the Luhn algorithm) and I’m going to augment the regular expression to validate the check digit.

I’m using the following regular expression to match credit card numbers (I’m limiting myself to credit card numbers of 16 digits): \b(\d{4}( ?)\d{4}\2\d{4}\2\d{4})\b

This regular expression consist of 4 expressions to match 4 digits “\d{4}”. Each block of 4 digits could be separated with a space character ” ?”.

I’m putting this in a capture group “( ?)” so that I can refer back to this matched group with backreference \2 (it’s the second capture group, because the complete credit card number is also put in a capture group, e.g. the first capture group).

The reason I’m using a backreference to match the first optional space character, is because I want to match the next 2 separating space characters if and only if a first space character was matched. So I want to match (1111222233334444 and 1111 2222 3333 4444, but not 11112222 3333 4444 for example). Either all 4 groups are separated, or none are separated.

Finally, I put this expression in a capture group, and enclose it with a boundary check “\b”. This is to avoid matching credit card numbers that are immediately preceded or followed by letters or digits.

So I can use this regular expression with re-search.py on a test file:

You can see that the first 2 test credit card numbers are identical, except for the last digit: the check digit. So at most one of these 2 can be a valid credit card number.

This can be checked with the Luhn algorithm.

Here is a small Python script to calculate this Luhn check digit:

# 2020/02/06
# https://stackoverflow.com/questions/21079439/implementation-of-luhn-formula

import string

def luhn_checksum(card_number):
    def digits_of(n):
        return [int(d) for d in str(n)]
    digits = digits_of(card_number)
    odd_digits = digits[-1::-2]
    even_digits = digits[-2::-2]
    checksum = 0
    checksum += sum(odd_digits)
    for d in even_digits:
        checksum += sum(digits_of(d*2))
    return checksum % 10

def is_luhn_valid(card_number):
    return luhn_checksum(card_number) == 0

def CCNValidate(ccn):
    return is_luhn_valid(''.join(digit for digit in ccn if digit in string.digits))

Python function luhn_checksum calculates the check digit for an input of digits, and Python function is_luhn_valid return True when the calculate Luhn number matches the check digit.

To use this last function with the regular expression I created, I need another Python function: CCNValidate. This function receives the string matched by the regulator expression, extracts the digits and checks the Luhn check digit.

To let my tool re-search.py call this function CCNValidate when a credit card number is matched by the regular expression, I precede the regular expression with a comment, like this:

(?#extra=P:CCNValidate)\b(\d{4}( ?)\d{4}\2\d{4}\2\d{4})\b

(?#…) is a comment in the regular expression syntax. It is ignored by the parser (i.e. not used for matching). … is the comment itself, which can be anything.

re-search.py interprets this comment: when the comment starts with “extra=”, re-search is dealing with an augmented regular expression. P indicates that a Python function has to be called when the regular expression matches, and CCNValidate is the name of the Python functon to call when the regular expression matches.

All this combined gives me the following command:

You can see that the first credit card number that was matched in the first example, no longer matches: that’s because 6 is not the correct Luhn number for this credit card number.

Besides providing re-search.py with this augmented regular expression, I also need to provide the Python script containing the validation functions: I do this with option –script CCNValidate.py



Sunday 21 February 2021

Update: re-search.py Version 0.0.16

Filed under: My Software,Update — Didier Stevens @ 12:20

This new version of re-search.py, my tool to search files with a builtin library of regular expressions, brings an update to the url and url-domain regexes to match hostnames with underscores (_) and a Python 3 fix.

re-search_V0_0_16.zip (https)
MD5: 21A7096116F50CCA051A152066B2DB50
SHA256: 4A3AC1B1BED68660316011F14EFC84B344BE3FF7E335CDFA8F1AAA2C0D2D06B0

Friday 12 February 2021

Quickpost: oledump.py plugin_biff.py: Remove Sheet Protection From Spreadsheets

Filed under: Malware,My Software,Quickpost — Didier Stevens @ 0:00

My new version of plugin_biff.py has a new option: –hexrecord.

Here I’ll show how I use this to remove the sheet protection from malicious spreadsheets.

If you want to open a malicious spreadsheet (for example with Excel 4 macros) in a sandbox, to inspect its content with Excel, chances are that it is protected.

I’m not talking about encryption (this is something that can be handled with my tool msoffcrypto-crack.py), but about sheet protection.

Enabling sheet protection can be done in Excel as follows:

Although you have to provide a password, that password is not used to derive an encryption key. An .xls file with sheet protection is not encrypted.

If you use my tool oledump.py together with plugin_biff.py, you can select all BIFF records that have the string “protect” in their name or description (-O protect). This will give you different records that govern sheet protection.

First, let’s take a look at an empty, unprotected (and unencrypted) .xls spreadsheet. With option -O protect I select the appropriate records, and with option -a I get an hex/ascii dump of the record data:

We can see that there are several records, and that their data is all NULL (0x00) bytes.

When we do the same for a spreadsheet with sheet protection, we get a different view:

First of all we have 4 extra records, and their data isn’t zero: the flags are set to 1 (01 00 little-endian) and the Protection Password data is AB94. That is the hash of the password (P@ssw0rd) we typed to create this sheet.

To remove this sheet protection, we just need to set all data to 0x00. This is something that can be done with an hex editor.

First use option -R instead of option -a:

This will give you the complete records (type, length and data) in hexadecimal. Next you can search for each record using this hexadecimal data with an hex editor and set the data bytes to 0x00.

Searching for the first record 120002000100:

Setting the data to 0x00: 0100 -> 0000

Do this for the 4 records, and then save the spreadsheet under a different name (keep the original intact).

Now you can open the spreadsheet, and the sheet protection is gone. You can now unhide hidden sheets for example.

Quickpost info

Wednesday 10 February 2021

Update: oledump.py Version 0.0.59

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of oledump.py has a small change in the XML detection logic, and adds options –hexrecord and –xordeobfuscate to plugin plugin_biff.py.


oledump_V0_0_59.zip (https)
MD5: 89CC85EDADA0BB6978A75BA37065A65D
SHA256: BE62B45AE20D3BF5B3C335742F08067297079F6B8431A5CC82401BF67BFA50F6

Monday 1 February 2021

Overview of Content Published in January

Filed under: Announcement — Didier Stevens @ 0:00

Here is an overview of content I published in January:

Blog posts:

YouTube videos:

Videoblog posts:

SANS ISC Diary entries:

Blog at WordPress.com.