Gamasutra is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Gamasutra: The Art & Business of Making Gamesspacer
Asset Recovery: What to do When the Data is Gone!
View All     RSS
January 16, 2021
arrowPress Releases
January 16, 2021
Games Press
View All     RSS







If you enjoy reading this site, you might also want to check out these UBM Tech sites:


 

Asset Recovery: What to do When the Data is Gone!


August 3, 2001 Article Start Previous Page 2 of 5 Next
 

Know Your Data!

Ok, so you've determined that you really need to change or extract this unknown data, and you currently don't have any tools to do it. Maybe you've looked at the game source and still don't quite understand what is going on in the file, or need confirmation of what you think you know. It might be a good idea to have a look at your data.

The ASCII Dump
The easiest way to look at your data is the ASCII dump. This is most appropriate when your data is supposed to be text, however, even when your data is graphics data you can still discern that there are graphics patterns in it. Many files have some ASCII data in them for block headers and file types. This is usually a good place to start. If your file is small enough, you can load it into "Wordpad", or you can use the "type" command from the DOS prompt. I like to use JP Software's 4DOS command prompt, and their "list" command makes it very easy to browse files as both ASCII and Hex data. I would suggest that you don't edit binary files in your text editor though because many text editors convert linefeeds and OEM characters. Generally, this will corrupt your data if it's not in the same format of ASCII as the editor. Viewing as ASCII text is purely for exploratory purposes.

When viewing the ASCII dump, you are usually looking for one of three things:

  • First, is this an ASCII file, or does it contain an ASCII section? You might be surprised how many programs still parse ASCII data at runtime, especially PC games.
  • Second, does this file have ASCII headers that betray it as a common format? Windows DLLs and executables usually have "This program cannot be run in DOS mode" stuck in right near the top, no matter what they are named. PKZip format files usually start with "PK" as the first two letters. MP3 files may have ID3 ASCII tag data at the end. Windows WAV files generally have "RIFF" as the first four letters. If you see an unfamiliar header on the file, try looking at files that have a familiar format to see if there is a match.

    Viewing a PKZIP file with Notepad.


    Viewing a WAV file with 4DOS list (ASCII mode).

Finally, if you are looking at graphics, you may be able to discern patterns in the data that are useful for determining the size and format, especially if you can control how many letters are displayed before the line automatically wraps. For small sprites and font data, you often start to slightly see the images in the ASCII dump. This also works with game map data.

RAW Mode Bitmap Reading in Paint Programs

When you are trying to find graphics data, paint programs like Adobe Photoshop and JASC's Paint Shop Pro can be very helpful. Even if your data isn't in a standard graphics data format, these programs allow you to read a binary file as RAW graphic data using the "Open As" command. However, the methods these programs allow you in interpreting your data, are quite limited. For example, here is the Photoshop 4 dialog for Open As RAW:

Open As RAW Options from Photoshop 4.

As you can see, although we can specify the width and height of our target bitmap, we can't use any bit depth less than 8 bits per pixel, and planar data is non-existent. However, this doesn't mean that these tools are completely useless to us in the case of data that is less than eight bits per pixel. First of all, since most data will be a multiple of eight bits, you can still use the tool to determine the exact file specifications since you can see something, which resembles what you want.

The most important point when doing this kind of analysis is to pick good values for width. You should pick your width based on what you expect the data being extracted to be. If it is a screen image, choose the horizontal resolution of a screen on your system. If it is texture data, try different texture page sizes that are common for your system. If it is tile data, try a single tile width. If it is map data, guess the size of the map. Then set the height to as high a value as your program will allow you to set that keeps you within your file size. When you zoom in on the data, you can scroll up and down to see if its patterns jump out at you. Here is an example of a 256x256, 4 bits per pixel bitmap loaded as 128x256, 8 bits per pixel:

128x256 four bpp bitmap loaded as 8bpp raw single channel.

The garbage near the top left of the image is the image's header information, which is responsible for the right offset of the image. This can easily be skipped by changing the header field. If we are trying to guess the image format and guess incorrectly, we still have clues to help us determine the proper format.

The same loaded at 64x256 eight bpp single channel.

The alternating scan lines and overlaid images in this image show us that we have loaded the data at half its normal horizontal size.

Again, loaded at 256x128 8 bpp single channel.

The doubled image here indicates that we have loaded the data at double its normal size.

Loaded at 130,252

The skew to the left in this image indicates that the width we have chosen is slightly too thin. A skew to the right indicates that it is slightly too wide.

Of course, determining the graphics format is useful, but this data is clearly unusable. However, it is easy to write a tool to take nibble-packed data and convert it to byte-packed data. Then you can convert the file to byte-packed data and read it into your art tool as raw data, which can then be saved in a reasonable format for editing. The advantage of this rather than just writing a raw converter yourself, is that you can scroll through the data in your art tool, it's a lot quicker to code up, and the raw mode of your art program can be set to any width you'd like. The problem is you are still missing the color palette data. Your image is still being imported as a single channel grayscale image. Later, I'll explain how to identify palette data using a hex dump.

At this point, I've only talked about palette images. You can also use the same techniques for RGB data, and it usually works much better as long as you have at least eight bits per channel. In order to actually extract the data, you may need to reorder the RGB bytes in the final output. You can identify RGB data because the image will have vertical stripes in it when viewed as a single channel. Here is an example of 16 bit RGB data viewed as an eight-bit single channel image.

16bit RGB data viewed as an eight-bit single channel image.

 

For planar data and data less than four bpp, you will need to write a program to convert it to byte-packed data in order to do any graphical browsing.


Article Start Previous Page 2 of 5 Next

Related Jobs

Sucker Punch Productions
Sucker Punch Productions — Bellevue, Washington, United States
[01.15.21]

Producer
Remedy Entertainment
Remedy Entertainment — Espoo, Finland
[01.14.21]

Senior Development Manager (Gameplay)
Square Enix Co., Ltd.
Square Enix Co., Ltd. — Tokyo, Japan
[01.12.21]

Experienced Game Developer





Loading Comments

loader image