It's obvious
when asset recovery is necessary. For instance, a disk crashes while you're
mid-project, and without proper backups you have lost the source files for some
of your data. What if you've been put in charge of localizing a game from
another team (or another company), but they can't (or won't) find all
of the necessary data files for you? Perhaps you are porting a project
developed on another platform and don't have the necessary hardware or
software tools to edit or convert the files in their native format. Alternatively,
you have the source data but you don't have the tools or scripts necessary
to convert the data to its in-game format. No matter what your situation,
you don't have many choices; you can cancel the project, recreate the
data from scratch, or recover it from what you have. Usually, recovery
is the preferable option.
I spent
over four years as the lead programmer at Working Designs, handling many
east-west localization projects. In that time, not once was the project
source data we got from Japan anywhere close to complete. Asset recovery
was a large part of my job. In this article, I'll share with you my secrets
for handling missing data.
What are your goals?
The first
thing to determine is what your goals are. Does the screen whose bitmap you
can't find really need to change, or can you simply reuse
the post-processed binary file? Do you actually need to get the original
version of this text file, or will all the data change anyway? Do you
need to change every sound in the file, or just the one in the middle?
If you can
hack the change in with a hex editor in an hour, why bother writing tools
to extract every file back to its original form, and then put them back
together again? Always remember that the goal is to ship your product,
not to have a perfect set of data!
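As an illustration of that kind of quick hack, here is a minimal Python sketch of a same-length byte patch. The function name, file name, and offset are hypothetical and do not come from any real project. The sketch refuses to write unless the bytes it finds match what you expected, which guards against patching the wrong build.

```python
from pathlib import Path


def patch_bytes(path, offset, expected, replacement):
    """Overwrite bytes at `offset`, but only if the old bytes match `expected`."""
    # Keeping the length identical means no other offsets in the file shift.
    if len(expected) != len(replacement):
        raise ValueError("replacement must be the same length as the original")
    data = bytearray(Path(path).read_bytes())
    found = bytes(data[offset:offset + len(expected)])
    if found != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {found!r}")
    data[offset:offset + len(replacement)] = replacement
    Path(path).write_bytes(data)
```

A call might look like `patch_bytes("TITLE.BIN", 0x1A40, b"START", b"ENTER")`. Work on a copy of the file, never on your only original.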
What have you got?
The next
step in any recovery effort is to determine what you have to work with.
Usually you've got some sort of binary file, which might be the commercially
available version of a game to be localized, the CD build you gave to
test last week, or what was left in the object directory after the disk
crash. You have to be creative when thinking of places your data might
be hiding. Often you will have intermediate files lying around,
which would be easier to decipher than the final binary data. If you are
doing localization work, the data files might be hiding in some data format
that you are unfamiliar with.
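One way to start looking inside a large binary is to scan it for the signatures of well-known file formats. The Python sketch below assumes the embedded data is stored uncompressed and unencrypted, which is often not the case. The signature table is a small illustrative sample, and any hit is a lead to check by hand, not proof.

```python
# A few well-known file signatures; extend this table for your own project.
SIGNATURES = {
    b"RIFF": "RIFF container (WAV, AVI)",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"GIF89a": "GIF image",
    b"PK\x03\x04": "ZIP local file header",
}


def find_signatures(blob, signatures=SIGNATURES):
    """Return a sorted list of (offset, description) for every signature hit."""
    hits = []
    for magic, label in signatures.items():
        start = 0
        while True:
            pos = blob.find(magic, start)
            if pos < 0:
                break
            hits.append((pos, label))
            start = pos + 1  # keep searching after this hit
    return sorted(hits)
```

Short signatures match by accident in random data, so treat each offset as a place to point your hex editor, nothing more.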
If you were
sent incomplete data as part of a porting or localization kit, it's always
good to try to contact the original team and see if they can help you.
Maybe they have an older version of the file and you can simply redo the
revisions. The best asset recovery is the one you don't have to do at all.
Once you
know what kind of data you do have, the next thing is to look at what
resources you have for understanding the format of the data.
If you are
localizing a game, you may have documentation in a foreign language, which
may or may not be applicable. This is a situation where it is good to
use machine translation (MT) software. Even though the translations given
by MT software are usually terrible, you can often get an idea of whether
the document you are looking at will be helpful or not, before you pay
real money to a real translator for a real translation. Good examples
of machine translation software include Digital River's Systran (the
software that powers AltaVista's online Babel Fish translations), and
Fujitsu's Translingo and Atlas products. These utilities often do strange
things to the formatting of their output, so you'll have to write
pre- and post-processing utilities if you wish to use them for source code
comments. Still, if these products save you (or your translator) a few
hours (or days) of work, they've paid for themselves!
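A pre- and post-processing pair for source code comments can be quite small. The Python sketch below is one possible approach, not a description of any particular tool. It assumes single-line comments introduced by a marker such as `//`, and it does not notice a marker that appears inside a string literal. The function names are my own invention.

```python
def extract_comments(source_lines, marker="//"):
    """Pre-process: return (line_index, comment_text) for each commented line."""
    comments = []
    for i, line in enumerate(source_lines):
        pos = line.find(marker)
        if pos >= 0:
            comments.append((i, line[pos + len(marker):].strip()))
    return comments


def reinsert_comments(source_lines, translated, marker="//"):
    """Post-process: swap each comment for its translation, keeping the code."""
    result = list(source_lines)
    for i, text in translated:
        line = result[i]
        pos = line.find(marker)
        # Everything before the marker is code and is copied through unchanged.
        result[i] = f"{line[:pos]}{marker} {text}"
    return result
```

Only the comment text goes through the translation software, so whatever it does to formatting never touches the code itself. The line index travels with each comment so the translated text goes back where it came from.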
If you are
using utilities and data from another group, don't forget to check to
see if they sent you any useful utility software. There may already be
a utility to return the binary file to a usable format, or you might have
source code for the original conversion program that converted it to that
format. If you don't have either of those, at least you have the game
source code that reads or processes the file data. Sometimes the easiest
thing to do is to drop extra code into the game that runs on the target
and echoes the decompressed data back to a file on the host.
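The idea of echoing decoded data back to the host can be sketched as a wrapper around the loader. A real hook would be written in the game's own language and use whatever host file access the development kit provides. The Python below only illustrates the pattern: `load_asset` is a hypothetical stand-in for the game's loader, and zlib is a placeholder for whatever compression the game actually uses.

```python
import functools
import zlib
from pathlib import Path


def dump_output(dump_dir):
    """Build a decorator that saves the bytes a loader returns into `dump_dir`."""
    def decorator(loader):
        @functools.wraps(loader)
        def wrapper(name, raw):
            data = loader(name, raw)  # let the game's own code do the decoding
            out = Path(dump_dir) / f"{name}.bin"
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(data)     # echo the decoded bytes to the host
            return data               # the caller sees no difference
        return wrapper
    return decorator


def load_asset(name, raw):
    """Hypothetical loader; zlib stands in for the game's real decompressor."""
    return zlib.decompress(raw)
```

Wrapping it with `hooked = dump_output("dump")(load_asset)` and then playing through the game leaves one decoded file per asset the game touched. You never have to understand the compression scheme, because the game decodes the data for you.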
If you are
working on a product that is or has a sequel, you might inquire about
utilities, data, or documentation from the other versions. They often
use the same or a very similar data format, or are backward- or
forward-compatible with other versions. Sometimes the source code for the
tools has comments in it about what has changed from the previous versions.
Sometimes,
with ports and localizations, you may have data that was generated
by a tool you don't recognize. When I was working on Princess Maker 2, the art was
delivered in files with the ".ZIM" extension. Researching on
the Internet, I found that this was a format of "Kid98", an
art program that exists only on NEC's PC-98 computer line. Source code
for a tool that converts ".ZIM" files into the Maki-chan ".MKI"
format was found. Although ".MKI" is similarly obscure, we added
support for good old ".PCX" files to the tool and were back
in business! Sometimes, it may even be more economical to buy, borrow,
or emulate the strange computer that the data came from so that you can
run the native editing software, rather than rewriting it yourself. Also,
always check to see if there is a third-party tool that can do the job.
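To give a feel for how little code a format like ".PCX" needs, here is a Python sketch of its run-length scheme for one scanline. This is not the code we added to the ".ZIM" converter. It covers only the compression step and leaves out the 128-byte header and the palette. The function names are my own.

```python
def pcx_rle_encode(scanline):
    """Run-length encode one scanline of bytes using the PCX scheme."""
    out = bytearray()
    i = 0
    while i < len(scanline):
        value = scanline[i]
        run = 1
        # The count lives in the low six bits, so a run is at most 63 bytes.
        while i + run < len(scanline) and scanline[i + run] == value and run < 63:
            run += 1
        # Values of 0xC0 and up would look like counts, so they always
        # get an explicit count byte, even for a run of one.
        if run > 1 or value >= 0xC0:
            out.append(0xC0 | run)
        out.append(value)
        i += run
    return bytes(out)


def pcx_rle_decode(data):
    """Inverse of pcx_rle_encode, handy for checking a round trip."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        i += 1
        if byte >= 0xC0:  # top two bits set: this is a count byte
            out.extend(data[i:i + 1] * (byte & 0x3F))
            i += 1
        else:
            out.append(byte)
    return bytes(out)
```

Having the decoder next to the encoder lets you run a round trip on every scanline before you trust the files the tool writes.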
Finally,
remember to always ask the providers of the data for help if at all possible.
In addition, make sure you explain the complete situation, rather than
just your current plan for retrieving
the data. Their methods of creating the data may be completely different
from what you thought, and if you focus on something strange, they might
not understand what it is you really need, especially if you are operating
through a translator. Sometimes, your request may go unanswered for a
long period of time before your data suddenly arrives!