Didier Stevens

Friday 7 July 2006

Viewing strings in executables

Filed under: Reverse Engineering — Didier Stevens @ 8:30

I was having an interesting chat with a colleague. He is developing a VB6 application for his personal use, and has some of his friends beta-testing it. At one point in the conversation he asked me if it’s possible to view the strings in the compiled application, because he tried and didn’t see them. It turns out he implemented a popup message to remind his beta testers to check for an update, and wondered if this would be easy to circumvent.

This is usually easy to circumvent. I’ll explain how, for my colleague and for you.

Strings used in VB6 source code are stored as UNICODE strings in the compiled executable. On Windows, UNICODE strings are usually encoded in UTF-16 (little-endian), as UTF-16 is the standard format for the Windows API. In UTF-16, each ASCII character takes 2 bytes: the ASCII code followed by a zero byte.

So for example, the string “Hello” is represented as 48 00 65 00 6C 00 6C 00 6F 00 in UTF-16. It’s because of the 2 bytes per character that my colleague didn’t notice the string.
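To check the byte sequence yourself, here is a minimal sketch in Python (my choice for illustration, not the language of the VB6 program). It assumes the little-endian variant of UTF-16 without a byte order mark, which matches the bytes listed above.

```python
# Encode "Hello" as UTF-16 little-endian (no byte order mark) and show the bytes.
text = "Hello"
utf16_bytes = text.encode("utf-16-le")
hex_dump = utf16_bytes.hex(" ").upper()
print(hex_dump)

# For comparison: the same text in plain ASCII takes one byte per character.
ascii_hex = text.encode("ascii").hex(" ").upper()
print(ascii_hex)
```

The zero byte after every character is what trips up a search that expects one byte per character.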

There are many free utilities on the net to view strings in executables. I use these two: strings from Sysinternals and BinText.

Consider this simple VB6 program (it will display a message box “Hello” when executed after 1/1/2007):

    If Now > DateValue("1/1/2007") Then 
        MsgBox "Hello" 
    End If

You can view the strings in the compiled program, hello.exe, with the Sysinternals tool strings like this (it’s a command-line tool):

    strings hello.exe

The result will be this:

    Strings v2.2 
    Copyright (C) 1999-2005 Mark Russinovich 
    Sysinternals - www.sysinternals.com                         


You can see strings “1/1/2007” and “Hello” listed at the beginning.
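To give an idea of what such a tool does, here is a rough sketch in Python of a string extractor. It is not how the Sysinternals tool is implemented; it simply assumes that a "string" is a run of at least 4 printable ASCII characters, stored either as 1 byte per character or as UTF-16 little-endian. The names `extract_strings` and `MIN_LEN` are made up for this example.

```python
import re

MIN_LEN = 4  # assumption: ignore runs shorter than 4 characters

# Printable ASCII characters, one byte each.
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
# Printable ASCII characters stored as UTF-16LE: each one followed by a zero byte.
UTF16_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)


def extract_strings(data: bytes):
    """Return sorted (offset, kind, text) tuples; kind is "A" (ASCII) or "U" (UTF-16LE)."""
    found = []
    for m in ASCII_RE.finditer(data):
        found.append((m.start(), "A", m.group().decode("ascii")))
    for m in UTF16_RE.finditer(data):
        found.append((m.start(), "U", m.group().decode("utf-16-le")))
    return sorted(found)


# Small in-memory sample standing in for the contents of an executable.
sample = (
    b"\x01\x02"
    + "1/1/2007".encode("utf-16-le") + b"\x00\x00"
    + "Hello".encode("utf-16-le") + b"\x00\x00"
    + b"hello.exe\x00"
)
for offset, kind, text in extract_strings(sample):
    print(f"{offset:08X} {kind} {text}")
```

To try it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `extract_strings`.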

The BinText utility has a GUI, has more options and provides you with more info. Start BinText, use the Browse button to select the executable to analyze, and press Go to display the strings:


The red U indicates that it’s a UNICODE string. The first hexadecimal number is the position of the string in the file (hello.exe), and the second hexadecimal number is the position of the string in memory when the executable is running.

So you will find string 1/1/2007 at position 000016D0 in the file and position 004016D0 in memory.
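The two numbers differ by exactly 00400000 hexadecimal. As a small sketch of that arithmetic in Python: I assume here that 00400000 is the address where hello.exe is loaded, and that the file layout matches the layout in memory, which seems to hold for this file but need not hold for other executables (in general the mapping depends on the section table). The names `IMAGE_BASE` and `file_offset_to_memory` are made up for this example.

```python
IMAGE_BASE = 0x00400000  # assumed load address, inferred from the two numbers above


def file_offset_to_memory(offset: int, image_base: int = IMAGE_BASE) -> int:
    """Map a file offset to a memory address.

    Assumes the file layout equals the in-memory layout, which is an
    assumption for this example and not true for executables in general.
    """
    return image_base + offset


address = file_offset_to_memory(0x000016D0)
print(f"{address:08X}")
```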

BTW, dumping strings from the file like this will almost always fail when analyzing malware, because these files are often packed or encrypted. A simple trick to view the strings of such malware is to use Process Explorer by Sysinternals. When the malware is running (you’ll want to run it on an isolated machine, like a virtual machine), start Process Explorer and display the properties of the running malware. The Strings tab will show you the strings in the file (Image) and in memory (Memory). Since the malware has unpacked/decoded itself when it started, you’ll be able to view the strings in memory.

The strings in an executable will give you an idea of what this program will do: you’ll recognize registry keys, URLs, filenames, …

Back to our example, hello.exe. We recognize a date, 1/1/2007. When you run the program (on a day before 1/1/2007), nothing happens. When you run the program after 1/1/2007 (by changing your computer clock), the Hello message box appears. So now we have clearly established that the string 1/1/2007 is indeed the date after which the message box is displayed.

How would someone modify this program to avoid the message box? A simple trick, which requires no programming skills, is to change the date 1/1/2007 to a later date, say 1/1/3007.

You’ll need a hex editor to do this, like the free XVI32.

  1. Start XVI32 and open the file hello.exe
  2. We know that string 1/1/2007 is at file position 000016D0: type CTRL+G, select hexadecimal, type 000016D0 and click OK
  3. The cursor is now positioned at the beginning of string 1/1/2007.
  4. Select digit 2 in the right pane
  5. Type 3, this will replace digit 2 (32 hexadecimal) with digit 3 (33 hexadecimal)
  6. Save the file hello.exe
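The same patch can be scripted instead of done by hand. The following Python sketch is only an illustration of the steps above: the function name `patch_utf16_char` is made up, and the demo runs on a throwaway file that stands in for hello.exe. Digit 2 is the fifth character of 1/1/2007 (index 4), and at 2 bytes per character it sits 8 bytes past the start of the string.

```python
import os
import tempfile


def patch_utf16_char(path, string_offset, char_index, old_char, new_char):
    """Replace one character of a UTF-16LE string stored in a file.

    Refuses to write unless the expected old character is found at that spot.
    Returns the byte offset that was patched.
    """
    byte_offset = string_offset + 2 * char_index  # 2 bytes per character
    old_bytes = old_char.encode("utf-16-le")
    new_bytes = new_char.encode("utf-16-le")
    if len(old_bytes) != len(new_bytes):
        raise ValueError("replacement must have the same size as the original")
    with open(path, "r+b") as f:
        f.seek(byte_offset)
        if f.read(len(old_bytes)) != old_bytes:
            raise ValueError(f"unexpected bytes at offset {byte_offset:#x}")
        f.seek(byte_offset)
        f.write(new_bytes)
    return byte_offset


# Demo on a throwaway file: 16 filler bytes, then the date as UTF-16LE.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 0x10 + "1/1/2007".encode("utf-16-le"))
demo_path = tmp.name

patched_at = patch_utf16_char(demo_path, 0x10, 4, "2", "3")
with open(demo_path, "rb") as f:
    patched_text = f.read()[0x10:].decode("utf-16-le")
os.remove(demo_path)
print(patched_text)
```

For the real file the call would be `patch_utf16_char("hello.exe", 0x16D0, 4, "2", "3")`, preferably on a copy of the executable.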

One can argue that this technique can only be applied to very simple programs, which have few strings and which store dates as strings. This is only partially true, because it’s still easy to find dates in larger programs that don’t use dates for their normal operation, like the program of my colleague. It’s almost 1 MB in size, but it only uses dates to decide when to display the message box. So they are easy to find, and easy to modify.

In fact, I didn’t use this string patching technique to show him how his message box could be disabled. Instead, I changed the program logic by analyzing the assembler code and patching a few bytes. I will explain this in a later post.
Still, string patching shows my colleague how easy it is to disable his message box.

He can make it more difficult by hiding the dates:

  1. don’t use strings to represent dates
  2. use string manipulation to hide the date
  3. encrypt the strings and decrypt them at runtime
  4. pack the executable
  5. protect the executable with specialized software (e.g. ASprotect)
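As an illustration of option 3, here is a minimal sketch in Python (not VB6) of hiding the date with a single-byte XOR and recovering it at runtime. The key value and the name `xor_bytes` are made up for this example, and XOR with a fixed key is about the weakest "encryption" there is: it only keeps the date out of a plain strings listing.

```python
KEY = 0x5A  # hypothetical single-byte key; trivially recoverable by an attacker


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


# Done once, ahead of time: only these obfuscated bytes would be embedded.
hidden = xor_bytes("1/1/2007".encode("utf-16-le"))

# Done at runtime, just before the date is needed.
recovered = xor_bytes(hidden).decode("utf-16-le")

print(hidden.hex(" ").upper())
print(recovered)
```

The embedded bytes no longer look like a date, but anyone who traces the decoding routine gets the date back.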

But this will not stop a determined attacker: there are even generic unprotection tools for ASprotect.
