Didier Stevens

Monday 21 June 2021

Update: oledump.py Version 0.0.61

Filed under: My Software,Update — Didier Stevens @ 0:00

This new version of oledump.py comes with Excel 4 formula parsing improvements in the plugin_biff plugin.

oledump_V0_0_61.zip (https)
MD5: 6DC34FFAF4ED0066696ED230878AEED9
SHA256: 41A68ABA19BBA74DAE653BE62D4A63A5AE409FB6DC1DAEEB2D419AA1B493728A


  1. Hi Didier,

    I have come across a problem with oledump.

    If there are German umlauts in the VBA scripts that I try to export with, for example, "oledump.py -s A4 -v c:\myfile.xlsm >my_exported_vb_file.txt", the results differ between cmd and PowerShell when redirecting Python's stdout (for example version 3.8.8) to a text file.

    The redirection only works correctly within a cmd shell, which creates an ANSI text file.
    I have not found a way to make it work within PowerShell (tried: changing the code page, $outputparameters, input parameters regarding encoding, and so on).
    If I put a separate print("öäü") in a Python script and start it via PowerShell, the output is nice and correct…
    I think it is a decoding problem within your script?

    If you like, I can give you a file for testing purposes. Just send me a mail at blog.didierstevens.com_20210713[at]noldi.de

    Best regards and thank you for this gorgeous tool

    Comment by Markus A. — Tuesday 13 July 2021 @ 10:30

  2. Hello. What do you want to achieve? By default, redirection with cmd.exe gives you an ANSI text file, and redirection with PowerShell gives you a UNICODE file. Do you want to create an ANSI file with PowerShell redirection?
    And regarding print("öäü"): did you also redirect that to a file? Because when I do that, it's no longer nice and correct. It is only correct when I print it, not when I redirect it.

    Comment by Didier Stevens — Tuesday 13 July 2021 @ 21:29

  3. Hi Didier,

    What I want to achieve: I want to iterate over all macros in a file, select each stream in the iteration with "-s --vbadecompressskipattributes", and then export the macro to a text file for later investigation.
    I have written a PowerShell script which calls oledump.py in order to work further with the stdout. (The main goal of the whole script is to automatically generate an MD5 hash of every macro in a file in order to build an allowlist/blocklist of macros.)
    But at this point I am stuck.

    You are right: redirection within cmd.exe is no problem. Everything is fine.
    But whatever settings I make within PowerShell, when I redirect the output of oledump.py to a file with ">" or directly into a string variable, the encoding (of the Python output and of the input handling within PowerShell) seems to be a problem. (No German umlauts are redirected correctly.)

    Perhaps it is easier to understand with the following details (some unimportant lines are omitted to keep the focus):

    $result = (python $path_to_oledump $_.FullName);
    foreach ($line in ($result | Select-String -Pattern ": M" -CaseSensitive -AllMatches)) {
        # The stream index is the text before the first colon, e.g. "A4"
        $stream_with_macros = (($line -split ":")[0]).Trim()
        $macro = (python $path_to_oledump -s $stream_with_macros --vbadecompressskipattributes $_.FullName) -join "`n"
        $macro | Out-File $myfilename
    }

    Meanwhile I have asked every programmer I know nearby, but no one has a solution or further ideas, so you are my last hope. Hopefully you have an idea how to call oledump.py from PowerShell and export the output correctly.
    Just test it with an xlsm file containing a macro like
    sub test()
    end sub

    Best regards and have a nice day

    Comment by Markus A. — Tuesday 13 July 2021 @ 22:21
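
The goal described in comment 3 (find the macro streams, then hash each macro) could also be sketched entirely in Python, so that no shell re-encodes the text on the way. This is only a rough sketch under stated assumptions: the function names and the sample listing are hypothetical, the ": M" test simply mirrors the Select-String pattern above, and only the oledump.py options already quoted in this thread are used.

```python
import hashlib
import subprocess
import sys


def macro_stream_ids(listing):
    """Return the stream indexes of listing lines flagged ': M' (uppercase M only)."""
    ids = []
    for line in listing.splitlines():
        if ": M" in line:
            # The index is the text before the first colon, e.g. "A4".
            ids.append(line.split(":")[0].strip())
    return ids


def md5_of_macro(oledump_path, stream_id, document):
    """Run oledump.py on one stream and hash the raw stdout bytes.

    Hashing bytes avoids any decoding step; the hash still depends on the
    encoding the child Python process uses for its output.
    """
    result = subprocess.run(
        [sys.executable, oledump_path, "-s", stream_id,
         "--vbadecompressskipattributes", document],
        capture_output=True,
        check=True,
    )
    return hashlib.md5(result.stdout).hexdigest()


# Hypothetical listing for illustration only, not real oledump.py output.
SAMPLE_LISTING = """A: xl/vbaProject.bin
 A1:       468 'PROJECT'
 A2:        86 'PROJECTwm'
 A3: m     977 'VBA/Sheet1'
 A4: M     974 'VBA/Module1'
"""

print(macro_stream_ids(SAMPLE_LISTING))  # ['A4']
```

In a real run, the listing would come from calling oledump.py on the document first, and md5_of_macro would then be called once per returned index.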

  4. Markus,

    1) What kind of text file do you want to create with PowerShell: does it have to be an ANSI or a UNICODE file? Or something else? I need to know the format you want.
    2) What happens when you do: python -c "print('öäü')" > somefile.txt ?

    Comment by Didier Stevens — Tuesday 13 July 2021 @ 22:39

  5. Hi Didier,

    1) UNICODE would be nice. PowerShell (depending on the version) normally uses UTF-8.

    This is $OutputEncoding from powershell 7
    Preamble :
    BodyName : utf-8
    EncodingName : Unicode (UTF-8)
    HeaderName : utf-8
    WebName : utf-8
    WindowsCodePage : 1200
    IsBrowserDisplay : True
    IsBrowserSave : True
    IsMailNewsDisplay : True
    IsMailNewsSave : True
    IsSingleByte : False
    EncoderFallback : System.Text.EncoderReplacementFallback
    DecoderFallback : System.Text.DecoderReplacementFallback
    IsReadOnly : True
    CodePage : 65001

    2) I have tested python -c "print('öäü')" > somefile.txt within cmd, PowerShell 5.1 and PowerShell 7.1.
    The content of every file created was broken. So, for some reason, it is not possible to redirect stdout to a file correctly with ">", but I don't know why.
    (Without the redirection, python -c "print('öäü')" displays correctly in the cmd/PowerShell window.)

    Perhaps this helps as well:
    I have viewed the contents of the cmd and PowerShell output files with a hex editor. With cmd it is
    F6 E4 FC 0D 0A = öäü..

    In PowerShell 7.1 it is
    C3 B7 C3 B5 C2 B3 0D 0A = ÷õ³..
    so complete gibberish in my opinion.

    In PowerShell 5.1 it is the same gibberish. :-/
    FF FE F7 00 F5 00 B3 00 0D 00 0A 00 = ÿþ÷.õ.³.....

    In my opinion, the only clean solution for exporting a macro with oledump is to implement a command-line parameter like "--export" and/or "encoding" which writes the output directly from within Python, because every other environment is unreliable, I think. What's your opinion?

    Best regards and a nice day to you
    Markus A.

    Comment by Markus A. — Wednesday 14 July 2021 @ 9:02

  6. I have to correct/add something regarding 2):
    The file created with cmd.exe displays correctly in notepad.exe. For my statement above I had opened it with Notepad++ (this seems to be a bug in Notepad++).
    But my statements regarding PowerShell and the files it created still stand.
    Greetings Markus

    Comment by Markus A. — Wednesday 14 July 2021 @ 9:48
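
One possible reading of the hex dumps in comment 5 is that the bytes were decoded with the wrong code page on their way through PowerShell. The sketch below reproduces the three dumps (without the trailing CR LF) under two assumptions that the thread does not confirm: that Python wrote Windows-1252 ("ANSI") bytes, and that PowerShell decoded them as OEM code page 850.

```python
import codecs

ORIGINAL = "\u00f6\u00e4\u00fc"  # the string "öäü"

# Assumption: Python wrote the text as Windows-1252 ("ANSI") bytes.
ansi_bytes = ORIGINAL.encode("cp1252")

# Assumption: PowerShell decoded those bytes as OEM code page 850.
misread = ansi_bytes.decode("cp850")

# The misread string saved as UTF-8 (the PowerShell 7.1 dump) ...
as_utf8 = misread.encode("utf-8")
# ... and saved as UTF-16 little endian with a BOM (the PowerShell 5.1 dump).
as_utf16 = codecs.BOM_UTF16_LE + misread.encode("utf-16-le")

print(ansi_bytes.hex(" ").upper())  # F6 E4 FC
print(as_utf8.hex(" ").upper())     # C3 B7 C3 B5 C2 B3
print(as_utf16.hex(" ").upper())    # FF FE F7 00 F5 00 B3 00
```

If that reading is right, the cmd.exe file is correct because cmd.exe passes the bytes through untouched, while PowerShell decodes and re-encodes them.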

  7. I did some tests.
    With the environment variable PYTHONIOENCODING, you can specify what encoding Python should use for its output.
    And with the cmdlet Out-File, you can specify the encoding of the file written to.

    So, the following creates a UTF8 file with correct representation of ä:

    C:\Python38-32\python.exe -c "print('ä')" | out-file -encoding utf8 .\umlaud.txt

    So I’m outputting UTF16, and letting Out-File convert this to UTF8.

    Comment by Didier Stevens — Wednesday 14 July 2021 @ 14:58
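
The effect of PYTHONIOENCODING can be checked without any shell in between by capturing the raw bytes a child Python process writes. This is a minimal sketch; the helper name child_output is hypothetical, and the character is produced with chr(0xE4) so that the command line itself stays plain ASCII.

```python
import os
import subprocess
import sys


def child_output(encoding):
    """Run a child Python that prints 'ä' and return the raw bytes it wrote."""
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    result = subprocess.run(
        [sys.executable, "-c", "print(chr(0xE4))"],
        env=env,
        capture_output=True,
        check=True,
    )
    # Drop the line ending, which differs between Windows and other systems.
    return result.stdout.rstrip(b"\r\n")


print(child_output("utf-8").hex())    # c3a4
print(child_output("latin-1").hex())  # e4
```

The same character leaves Python as two bytes or one byte depending on the variable, which is why the consumer of the pipe has to agree on the encoding.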

  8. Hi Didier,

    Thank you, but then it must be a problem within oledump.py.
    Just try this with a macro file that contains umlauts.
    Below you see the black squares with question marks (in the normal command-line output).
    After that I create umlaud.txt. The content of that file is shown further below (snip/snap).

    PS C:\Users\m.arnoldi> $env:PYTHONIOENCODING="UTF16"
    PS C:\MAKRO-OFFICE-SIGNATUR\oledump_V0_0_60> C:\Users\m.arnoldi\AppData\Local\Programs\Python\Python38\python.exe .\oledump.py -s A4 -v C:\install\LW8060.xlsm
    Attribute VB_Name = "Mod_Ausf�hren"
    'A7 ist Ausgangszelle f�r last Row
    'D1 ist f�r Sub Pos. Nr.

    Sub Ausf�hren()

    Dim Abbruch As Boolean

    Abbruch = False

    ListFiles Abbruch
    If Abbruch = True Then Exit Sub

    End Sub

    Sub Formel()

    ActiveCell.FormulaR1C1 = "=SUM(R[3]C:R[994]C)"

    End Sub
    PS C:\MAKRO-OFFICE-SIGNATUR\oledump_V0_0_60> C:\Users\m.arnoldi\AppData\Local\Programs\Python\Python38\python.exe .\oledump.py -s A4 -v C:\install\LW8060.xlsm | Out-File -Encoding UTF8 .\umlaud.txt

    --- snip file content of umlaud.txt ---
    Attribute VB_Name = "Mod_Ausf³hren"
    'A7 ist Ausgangszelle f³r last Row
    'D1 ist f³r Sub Pos. Nr.

    Sub Ausf³hren()

    Dim Abbruch As Boolean

    Abbruch = False

    ListFiles Abbruch
    If Abbruch = True Then Exit Sub

    End Sub

    Sub Formel()

    ActiveCell.FormulaR1C1 = "=SUM(R[3]C:R[994]C)"

    End Sub

    --- snap ---

    Comment by Markus A. — Thursday 15 July 2021 @ 11:32

  9. I’ll look into this.

    Meanwhile, this is how you can make proper ANSI:
    set PYTHONIOENCODING to latin and the Out-File encoding to oem.

    Comment by Didier Stevens — Thursday 22 July 2021 @ 21:37
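
Why the latin/oem combination in comment 9 can yield an ANSI file may be easier to see as a byte round trip. The sketch below assumes, without confirmation from the thread, that PowerShell decodes the piped bytes as OEM code page 850 and that Out-File with the oem encoding writes them back with the same code page, so the wrong decoding is undone on the way out.

```python
ORIGINAL = "\u00f6\u00e4\u00fc"  # the string "öäü"

# PYTHONIOENCODING=latin: Python emits one byte per character.
python_bytes = ORIGINAL.encode("latin")

# Assumption: PowerShell reads the pipe as OEM code page 850 ...
in_powershell = python_bytes.decode("cp850")

# ... and Out-File with the oem encoding writes with the same code page,
# which restores the original bytes.
file_bytes = in_powershell.encode("cp850")

print(file_bytes.hex(" ").upper())              # F6 E4 FC
print(file_bytes.decode("cp1252") == ORIGINAL)  # True
```

Inside PowerShell the string still looks wrong in this scenario; only the bytes that reach the file are correct.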
