Here is the version I talked about in my Bitcoin virus posts.
It also has an embedded man page (use option --man).
find-file-in-file_v0_0_4.zip (https)
MD5: CD381616158BD233D94B368554B824C6
SHA256: FD5C4E3EC99371754E58B93D3D96CBA7A86C230C47FC9C27C9B871ED8BFB9149
Man page:
Usage: find-file-in-file.py [options] file-contained file-containing [...]
Find if a file is present in another file
Arguments:
file-containing can be a single file, several files, and/or @file
@file: run the command on each file listed in the text file specified
wildcards are supported
batch mode is enabled when more than one file is specified
Source code put in the public domain by Didier Stevens, no Copyright
Use at your own risk
https://DidierStevens.com
Options:
  --version             show program’s version number and exit
  -h, --help            show this help message and exit
  -m MINIMUM, --minimum=MINIMUM
                        Minimum length of byte-sequence to find (default 10)
  -o, --overlap         Found sequences may overlap
  -v, --verbose         Be verbose in batch mode
  -p, --partial         Perform partial search of contained file
  -O OUTPUT, --output=OUTPUT
                        Output to file
  -b RANGEBEGIN, --rangebegin=RANGEBEGIN
                        Select the beginning of the contained file (by default
                        byte 0)
  -e RANGEEND, --rangeend=RANGEEND
                        Select the end of the contained file (by default last
                        byte)
  -x, --hexdump         Hexdump of found bytes
  -q, --quiet           Do not output to standard output
  --man                 Print manual
Manual:
find-file-in-file is a program to test if one file (the contained
file) can be found inside another file (the containing file).
Here is an example.
We have a file called contained-1.txt with the following content:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
and have a file called containing-1.txt with the following content:
0000ABCDEFGHIJKLM1111NOPQRSTUVWXYZ2222
When we execute the following command:
find-file-in-file.py contained-1.txt containing-1.txt
We get this output:
0x00000004 0x0000000d (50%)
0x00000015 0x0000000d (50%)
Finished
This means that the file contained-1.txt was completely found inside
file containing-1.txt. At position 0x00000004 we found a first part
(0x0000000d bytes) and at position 0x00000015 we found a second part
(0x0000000d bytes).
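The result above can be approximated with a greedy longest-prefix search. The sketch below is only an illustration of that idea, not the actual code of find-file-in-file.py: the function name find_parts is made up here, the percentage uses integer division as an assumption, and the overlap handling of option -o is ignored.

```python
def find_parts(contained, containing, minimum=10):
    """Greedy sketch: repeatedly look for the longest prefix of the
    remaining contained bytes that occurs in the containing bytes.

    Returns (parts, remaining): parts is a list of (position, length)
    tuples, remaining is the number of trailing bytes not found.
    Illustration only; the real tool may work differently.
    """
    minimum = max(minimum, 1)  # a zero-length match would never advance
    parts = []
    offset = 0
    while offset < len(contained):
        rest = contained[offset:]
        # Try the longest candidate first, then shrink down to the minimum.
        for length in range(len(rest), minimum - 1, -1):
            position = containing.find(rest[:length])
            if position != -1:
                parts.append((position, length))
                offset += length
                break
        else:
            # No prefix of at least `minimum` bytes was found: stop here.
            break
    return parts, len(contained) - offset


contained = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
containing = b'0000ABCDEFGHIJKLM1111NOPQRSTUVWXYZ2222'
parts, remaining = find_parts(contained, containing)
for position, length in parts:
    print('0x%08x 0x%08x (%d%%)' % (position, length, 100 * length // len(contained)))
print('Finished' if remaining == 0 else 'Remaining %d' % remaining)
```

Run on the two example strings, this prints the same two positions and lengths as the command above, followed by Finished.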
We can use option hexdump (-x) to see which bytes were found:
find-file-in-file.py -x contained-1.txt containing-1.txt
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000015 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished
The containing file may contain the contained file in an arbitrary
order, like file containing-2.txt:
0000NOPQRSTUVWXYZ1111ABCDEFGHIJKLM2222
Example:
find-file-in-file.py -x contained-1.txt containing-2.txt
0x00000015 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000004 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished
The containing file does not need to contain the complete contained
file, like file containing-3.txt:
0000ABCDEFGHIJKLM1111
Example:
find-file-in-file.py -x contained-1.txt containing-3.txt
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
Remaining 13 (50%)
The message “Remaining 13 (50%)” means that the last 13 bytes of the
contained file were not found in the containing file (that’s 50% of
the contained file).
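To make the arithmetic behind these lines explicit, here is a small sketch that builds a report in the same style from a list of found parts. The function name format_report and its parameters are hypothetical, the contained file is assumed to be non-empty, and rounding the percentage down is an assumption; the real tool may round differently.

```python
def format_report(parts, total, hexdump=False):
    """Build report lines from found parts.

    parts: list of (position, found_bytes) tuples.
    total: size in bytes of the contained file (must be > 0).
    Illustration only, not the tool's own code.
    """
    lines = []
    found = 0
    for position, data in parts:
        percentage = 100 * len(data) // total  # assumption: round down
        lines.append('0x%08x 0x%08x (%d%%)' % (position, len(data), percentage))
        if hexdump:
            # One line of space-separated, lowercase hexadecimal bytes.
            lines.append(' '.join('%02x' % byte for byte in data))
        found += len(data)
    remaining = total - found
    if remaining == 0:
        lines.append('Finished')
    else:
        lines.append('Remaining %d (%d%%)' % (remaining, 100 * remaining // total))
    return lines


# Only the first 13 of 26 bytes were found, as with containing-3.txt.
for line in format_report([(0x04, b'ABCDEFGHIJKLM')], total=26, hexdump=True):
    print(line)
```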
If the contained file starts with a byte sequence not present in the
containing file, nothing will be found. Example with file
contained-2.txt:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
Nothing is found:
find-file-in-file.py -x contained-2.txt containing-1.txt
Remaining 36 (100%)
If you know how long that initial byte sequence is, you can skip it.
Use option rangebegin (-b) to specify the position in the contained
file from where you want to start searching.
Example:
find-file-in-file.py -x -b 10 contained-2.txt containing-1.txt
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000015 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished
If you want to skip bytes at the end of the contained file, use option
rangeend (-e).
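Selecting a range amounts to slicing the contained file before searching. In the sketch below the function name select_range is hypothetical, and treating rangeend as an exclusive end offset is an assumption; check the tool itself to see how it counts the end position.

```python
def select_range(data, rangebegin=0, rangeend=None):
    """Return the part of the contained data that will be searched for.

    Assumption: rangeend is an exclusive end offset; None means
    "up to the last byte". The real tool may count differently.
    """
    if rangeend is None:
        rangeend = len(data)
    return data[rangebegin:rangeend]


contained_2 = b'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
# Skip the 10 leading digits, as with option -b 10.
print(select_range(contained_2, rangebegin=10))
```

With the first 10 bytes skipped, the selected range is 26 bytes long, which is why each 13-byte part is reported as 50%.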
If you don’t know how long that initial byte sequence is, you can
instruct find-file-in-file to “brute-force” it. With option partial
(-p), one byte at a time will be removed from the beginning of the
contained file until a match is found.
Example:
find-file-in-file.py -x -p contained-2.txt containing-1.txt
File: containing-1.txt (partial 0x0a)
0x00000004 0x0000000d (50%)
41 42 43 44 45 46 47 48 49 4a 4b 4c 4d
0x00000015 0x0000000d (50%)
4e 4f 50 51 52 53 54 55 56 57 58 59 5a
Finished
“(partial 0x0a)” tells you that the first 10 bytes of the contained
file were skipped before a match was found.
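The brute-force idea can be sketched as a loop that drops one leading byte at a time. This is an illustration under the assumption that "a match" means a sequence of at least the minimum length (10 by default) occurring in the containing file; the function name partial_skip is made up and this is not the tool's own code.

```python
def partial_skip(contained, containing, minimum=10):
    """Return how many leading bytes of `contained` must be skipped
    before a sequence of `minimum` bytes is found in `containing`,
    or None if no such sequence exists.
    """
    for skip in range(len(contained) - minimum + 1):
        # Test the first `minimum` bytes that remain after skipping.
        if containing.find(contained[skip:skip + minimum]) != -1:
            return skip
    return None


contained_2 = b'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
containing_1 = b'0000ABCDEFGHIJKLM1111NOPQRSTUVWXYZ2222'
skip = partial_skip(contained_2, containing_1)
print('(partial 0x%02x)' % skip)
```

For the example files this yields a skip of 10 bytes, matching the "(partial 0x0a)" shown above.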
There are some other options:
-m minimum: by default, find-file-in-file searches for byte sequences
that are at least 10 bytes long. If you want to change this minimum,
use option -m minimum.
-o overlap: by default, find-file-in-file does not let found byte
sequences overlap. Use option -o overlap to remove this restriction.
-v verbose: be verbose in batch mode (more than one containing file).
-O output: besides writing output to stdout, also write the output to
the given file.
-q quiet: do not output to stdout.