CUSTOM HEURISTICS GUIDE

Hello, guys! Today I'm back with another exciting DiskDigger topic: custom heuristics. This article covers how to create a custom filter and find exactly the file type you are looking for. DiskDigger supports custom heuristics, which means you can make it scan for any type of file, including files that aren't supported internally by the program.
You can create one or more custom filters just by creating a file called custom.xml. This file should be in the same directory as the DiskDigger executable. Its general layout looks like this:
<?xml version="1.0"?>
<diskdigger>
<customfilter>
<extension>MyExtension</extension>
<description>Description of this file format</description>
<beginbytes type="ascii" offset="0">BeginBytes</beginbytes>
<endbytes type="hex" offset="0">01020304</endbytes>
</customfilter>
<customfilter>
<extension>AnotherExtension</extension>
<description>Another description</description>
<beginbytes type="ascii" offset="0">Test1234</beginbytes>
<size offset="8" width="4" endian="little" adjust="8" />
</customfilter>
...
</diskdigger>
The file contains one or more <customfilter> elements, each of which describes a custom file format. 
DiskDigger requires only two pieces of information to recover a file: a starting sequence of bytes, and the size of the data that follows it (or alternatively an ending byte sequence).

The starting byte sequence is expressed using the <beginbytes> tag. The contents of this tag can be written as an ASCII string (type="ascii") or a string of hexadecimal numbers (type="hex"). The “offset” attribute specifies the byte offset where the specified byte sequence should be found. If the “offset” is set to -1, then DiskDigger will search the entire sector for the specified byte sequence, instead of expecting it at a certain offset.
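
To make the “type” and “offset” attributes concrete, here is a minimal Python sketch of how such a rule could be checked against one sector. It is only an illustration of the behavior described above, not DiskDigger's actual code; the helper names and the 512-byte sector size are assumptions.

```python
SECTOR_SIZE = 512  # assumed sector size, for illustration only


def decode_pattern(text: str, kind: str) -> bytes:
    """Turn the tag contents into raw bytes, based on the "type" attribute."""
    if kind == "hex":
        return bytes.fromhex(text)
    if kind == "ascii":
        return text.encode("ascii")
    raise ValueError(f"unknown type: {kind!r}")


def begin_matches(sector: bytes, text: str, kind: str, offset: int) -> bool:
    """Check one sector against a <beginbytes> rule (hypothetical helper)."""
    pattern = decode_pattern(text, kind)
    if offset == -1:
        # offset -1: the pattern may appear anywhere in the sector
        return pattern in sector
    # otherwise the pattern must sit exactly at the given offset
    return sector[offset:offset + len(pattern)] == pattern


# A fake sector that starts with the 8-byte PNG signature, padded with zeros
sector = b"\x89PNG\r\n\x1a\n" + bytes(SECTOR_SIZE - 8)

print(begin_matches(sector, "89504E47", "hex", 0))  # True
print(begin_matches(sector, "PNG", "ascii", 1))     # True
print(begin_matches(sector, "PNG", "ascii", 0))     # False
print(begin_matches(sector, "PNG", "ascii", -1))    # True
```

Note how the same three letters match at offset 1 or with offset -1, but not at offset 0.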

The ending byte sequence is expressed using the tag <endbytes>. This tag’s contents can similarly be written as an ASCII string or hex values.

Some file formats embed the actual size of the file somewhere in the file header. In this case, you can use the <size> tag instead of the <endbytes> tag. The <size> tag contains the following attributes: “offset” tells the offset from the beginning of the file where the embedded file size occurs; “width” specifies the byte width of the embedded file size (some file sizes might be 4 bytes, 8 bytes, etc.); “endian” specifies the byte order of the embedded size (can be “big” or “little”); and “adjust” specifies how much to add to or subtract from the embedded size (some files embed the total size minus the header, or something similar).
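
The four attributes boil down to a small calculation. The Python sketch below shows one plausible reading of it, assuming “adjust” is simply added to the embedded value; the function name and the made-up "DEMO" header are hypothetical and exist only for this example.

```python
def total_size(header: bytes, offset: int, width: int, endian: str, adjust: int) -> int:
    """Compute the size to recover from an embedded size field (illustrative)."""
    field = header[offset:offset + width]
    if len(field) != width:
        raise ValueError("header is too short to contain the size field")
    # "endian" is "big" or "little", which is exactly what int.from_bytes expects
    return int.from_bytes(field, byteorder=endian) + adjust


# Made-up format: 4 magic bytes, then a 4-byte big-endian payload size of 1000.
# The payload size excludes the 8 header bytes, so adjust is 8.
header = b"DEMO" + (1000).to_bytes(4, "big")

print(total_size(header, offset=4, width=4, endian="big", adjust=8))  # 1008
```

Reading the same field with the wrong byte order gives a very different number, which is a common reason a filter recovers files of absurd sizes.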

If a <size> tag is provided, then an <endbytes> tag is not necessary.

Both the <size> and <endbytes> tags are optional. However, if you don’t provide either, DiskDigger won’t know how much data to recover for your file type, so it will ask you to manually enter how many bytes you want to save.

The <extension> tag provides the file extension that you would like to use for this file type, and the <description> tag provides a short description of the file format. Both of these are optional.

Example:

What would you do if DiskDigger didn’t support the PNG format? No idea? Don’t worry, here is how that file type could be implemented using a custom filter:

<?xml version="1.0"?>
<diskdigger>
<customfilter>
<extension>png</extension>
<description>Portable Network Graphics</description>
<beginbytes type="hex">89504E47</beginbytes>
<endbytes type="hex">49454E44AE426082</endbytes>
</customfilter>
</diskdigger>

Let’s break down what the above tags mean. We know that a PNG file begins with a byte sequence of 89 50 4E 47, so we write a <beginbytes> tag with the contents “89504E47”, and a “type” attribute set to “hex”.

We also know that a PNG file ends with a byte sequence of 49 45 4E 44 AE 42 60 82, so we similarly write an <endbytes> tag with the contents “49454E44AE426082”, and the “type” attribute also set to “hex”.

Alternatively, we could have written the tags using “ascii” notation, since we know that the starting sequence of bytes contains the letters “PNG”, and the ending sequence contains the letters “IEND”:

<beginbytes type="ascii" offset="1">PNG</beginbytes>
<endbytes type="ascii" offset="4">IEND</endbytes>

In the above tags, notice that there’s also an “offset” attribute. In the <beginbytes> tag, it means that the specified sequence occurs 1 byte after the beginning of the file, and in the <endbytes> tag, it means that the sequence occurs 4 bytes before the end of the file.
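
If you want to convince yourself that the “hex” and “ascii” versions describe the same bytes, a few lines of Python are enough. This is just a sanity check on the byte values quoted above; reading the ending offset as "the number of bytes that follow the sequence" is an inference from this example, not something confirmed elsewhere.

```python
begin = bytes.fromhex("89504E47")
end = bytes.fromhex("49454E44AE426082")

print(begin)  # b'\x89PNG'
print(end)    # b'IEND\xaeB`\x82'

# "PNG" starts 1 byte into the beginning sequence, hence offset="1"
print(begin.find(b"PNG"))  # 1

# After "IEND" there are 4 more bytes before the file ends, hence offset="4"
trailing = len(end) - (end.find(b"IEND") + len(b"IEND"))
print(trailing)  # 4
```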

One more example

Now suppose DiskDigger didn’t support the WAV audio format. Don’t worry! We can implement the file type using a custom filter:

<?xml version="1.0"?>
<diskdigger>
<customfilter>
<extension>wav</extension>
<description>Wave audio</description>
<beginbytes type="ascii" offset="8">WAVEfmt</beginbytes>
<size offset="4" width="4" endian="little" adjust="8" />
</customfilter>
</diskdigger>

We know that a WAV file contains the characters “WAVEfmt”, which appear 8 bytes from the beginning of the file, so we write exactly that in the <beginbytes> tag.

We also know that a WAV file has its own size embedded at an offset of 4 bytes from the beginning of the file. We express this using a <size> tag with an “offset” of 4. We also know that the embedded size is 4 bytes wide, and little-endian. In addition, the embedded size is actually 8 bytes short of the total file size (it doesn’t take into account the first 8 bytes of the file, which is the RIFF header), so we include an “adjust” attribute to compensate for this.
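
You can check these WAV facts yourself without hunting for a sample file. The sketch below uses Python's standard `wave` module to build a tiny WAV file in memory and then reads the same fields the filter relies on. The audio parameters are arbitrary choices for this example.

```python
import io
import wave

# Build a small mono, 16-bit WAV file in memory
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(bytes(200))  # 100 frames of silence

data = buf.getvalue()

# The signature used in <beginbytes>, found 8 bytes into the file
print(data[8:15])  # b'WAVEfmt'

# The 4-byte little-endian size stored at offset 4
embedded = int.from_bytes(data[4:8], "little")
print(embedded, len(data))  # 236 244

# Adding the "adjust" value of 8 gives the real file size
print(embedded + 8 == len(data))  # True
```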

More examples:

Visit the Custom Heuristics Catalog page for a list of custom file types that have already been implemented!

Making sure it’s working

To be certain that DiskDigger is actually using your custom filter, make sure the file custom.xml is in the same directory as the DiskDigger executable, and launch the program in “deeper” mode.

Your custom filter(s) should show up in the list of supported file types.

If your filter does not show up in the list, then you may have an error in the formatting of the XML file. DiskDigger should display errors if it detects any inconsistencies in your custom filters.
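
Before launching DiskDigger, you can also catch basic formatting mistakes yourself. The following Python sketch checks that the XML is well-formed and that every filter has a <beginbytes> tag. It only reflects the rules described in this article; the function name is hypothetical, and DiskDigger's own validation may be stricter or different.

```python
import xml.etree.ElementTree as ET


def check_filters(xml_text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were spotted."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != "diskdigger":
        problems.append(f"root element is <{root.tag}>, expected <diskdigger>")
    filters = root.findall("customfilter")
    if not filters:
        problems.append("no <customfilter> elements found")
    for number, flt in enumerate(filters, start=1):
        # Per this article, only the starting byte sequence is mandatory
        if flt.find("beginbytes") is None:
            problems.append(f"filter {number}: missing <beginbytes>")
    return problems


good = (
    '<?xml version="1.0"?><diskdigger><customfilter>'
    '<beginbytes type="hex" offset="0">89504E47</beginbytes>'
    "</customfilter></diskdigger>"
)
# A filter whose closing tag was mistyped as an opening tag
bad = (
    "<diskdigger><customfilter>"
    '<beginbytes type="ascii" offset="0">Test1234</beginbytes>'
    "<customfilter></diskdigger>"
)

print(check_filters(good))       # []
print(len(check_filters(bad)))   # 1
```

To check a real file, read custom.xml as text and pass its contents to `check_filters`.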

Considerations: 

There are several things to keep in mind when using custom filters:

  • The beginning and/or ending byte sequence mustn’t be too short. Normally it should be at least four bytes. If the beginning sequence is too short, you’ll likely get thousands of false positive results. And if the ending sequence is too short, your recovered files will probably be truncated prematurely.
  • Before using your custom filter, make sure to test it on a known existing file of the correct type. For example, take a newly-formatted flash drive and load just that file onto it, then make sure that DiskDigger can detect it using your filter.
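
To get a feel for why short signatures are a problem, here is a rough back-of-the-envelope estimate in Python. It assumes the signature is checked once per 512-byte sector at a fixed offset and that the disk holds uniformly random data. Both are simplifying assumptions; real disks are far from random, so treat the numbers as a lower-bound intuition rather than a prediction.

```python
def expected_false_matches(disk_bytes: int, signature_len: int, sector_size: int = 512) -> float:
    """Expected number of chance matches for a signature of the given length."""
    sectors = disk_bytes // sector_size
    # Each sector matches an n-byte signature by chance with probability 1 / 256**n
    return sectors / 256 ** signature_len


disk = 64 * 1024 ** 3  # a 64 GiB flash drive

for length in (1, 2, 3, 4):
    print(length, expected_false_matches(disk, length))
# 1 524288.0
# 2 2048.0
# 3 8.0
# 4 0.03125
```

Under these assumptions, a two-byte signature already produces thousands of chance hits, while four bytes brings the expected count well below one.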