Gamasutra is part of the Informa Tech Division of Informa PLC

# Localizing MMOGs

September 12, 2003 Page 3 of 3

Implementing the Meta-Language

The meta-language runs on the client in real time - that is, the grammar parses every string right before it is displayed to the player. It's implemented as a simple black box interface that takes a StringInfo as input and returns a string as output. I implemented the meta-language with a yacc-like tool. I was unable to use a lex program as my tokenizer because the input was Unicode text and none of the lex tools I could find handled Unicode input elegantly, so I wrote my own tokenizer.

The yacc grammar started off astoundingly simple, almost not worth the effort of using yacc at all. A top-down parser would have taken only slightly longer to write. However, near the end of the project, when I needed to add a few last-minute features to the meta-language very quickly, the yacc grammar proved its worth.

I spent a bit of time working on the tokenizer to make it as user-friendly as possible. For instance, when someone wants to display a string with a '#', '{', '}', etc. in it, they are supposed to put backslashes in front of those characters - entering '\#' to display '#', for instance - so that the meta-language can discern which '#' are part of a meta-language expression and which are characters that are supposed to be displayed. However, since '#' can only be used in one place as part of the meta-language, the parser automatically decides that if the '#' is out of place, the user probably just forgot to precede it with a backslash. The meta-language was configured in such a way that you very rarely need to actually remember to insert the "\" before a special symbol.
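A minimal sketch of that forgiving behaviour, in Python rather than the original C and yacc-like tooling. The set of special characters and the rule for where '#' is legal are assumptions made purely for illustration (this page does not say where '#' belongs, so the sketch pretends it is only meaningful inside square brackets); `tokenize` is a hypothetical name, not the game's actual tokenizer.

```python
# Assumed set of characters that are always meta-language symbols.
SPECIAL = set("{}[]|$^")


def tokenize(text):
    """Split a string into ("TEXT", s) and ("SYM", c) tokens."""
    tokens = []
    buf = []
    in_brackets = False  # assumption: '#' only has meaning inside [...]

    def flush():
        if buf:
            tokens.append(("TEXT", "".join(buf)))
            buf.clear()

    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # An explicit escape: the next character is always literal.
            buf.append(text[i + 1])
            i += 2
            continue
        # An out-of-place '#' is treated as if the author had escaped it.
        is_symbol = ch in SPECIAL or (ch == "#" and in_brackets)
        if is_symbol:
            flush()
            tokens.append(("SYM", ch))
            if ch == "[":
                in_brackets = True
            elif ch == "]":
                in_brackets = False
        else:
            buf.append(ch)
        i += 1
    flush()
    return tokens


# '#' outside brackets and escaped braces both come out as plain text.
print(tokenize("Room #3 \\{locked\\}"))  # [('TEXT', 'Room #3 {locked}')]
```

The point of the design is that the lenient path and the escaped path produce the same tokens, so a forgotten backslash costs the string author nothing.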

Also, the meta-language automatically combines whitespace. If given

"You      die!"

it will return

"You die!"

This proves useful for allowing arbitrary spacing in between blocks of meta-language to improve legibility. Thus the designers can type

"When you pick up {the[!n]} $ITEM$, {he [m] | she [f] | it } explodes in your hand!"

without having to worry that there will be two spaces after "he". They can put extra spaces wherever they want and the string will come out okay.
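To make the interaction between alternatives and whitespace concrete, here is a rough Python sketch, not the yacc grammar itself. It assumes that a tag such as `[m]` selects an alternative when the item carries that tag, that `!` negates the test, and that an untagged alternative is the default; what individual tags like `n` mean is defined by the game data and is not modeled here. `render` and `choose` are hypothetical names, and only the single variable $ITEM$ is handled.

```python
import re

BLOCK = re.compile(r"\{([^{}]*)\}")                # one {...} block
CASE = re.compile(r"^(.*?)\s*\[(!?)(\w+)\]\s*$")   # "text [tag]" or "text [!tag]"


def choose(body, tags):
    """Return the first alternative in a block whose condition holds."""
    for alt in body.split("|"):
        m = CASE.match(alt)
        if m is None:
            return alt.strip()          # untagged alternative: the default
        text, negated, tag = m.groups()
        if (tag in tags) != bool(negated):
            return text.strip()
    return ""                           # nothing matched: emit nothing


def render(template, item, tags):
    text = BLOCK.sub(lambda m: choose(m.group(1), tags), template)
    text = text.replace("$ITEM$", item)
    # Combine runs of whitespace (this also trims both ends).
    return " ".join(text.split())


TEMPLATE = ("When you pick up {the[!n]} $ITEM$, "
            "{he [m] | she [f] | it } explodes in your hand!")

print(render(TEMPLATE, "golem", {"m"}))
# When you pick up the golem, he explodes in your hand!
print(render(TEMPLATE, "Excalibur", {"n"}))
# When you pick up Excalibur, it explodes in your hand!
```

In the second call the first block produces nothing, which would leave two spaces before "Excalibur"; the final whitespace pass is what makes that harmless.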

One feature we thought we would use a lot turned out to be very rarely needed. If you precede a variable with a ^ symbol, the first letter of the variable is automatically capitalized (using the towupper() C function). That way you could say

"^$NAME$ bows gracefully."

And the first letter of $NAME$ would always be capitalized correctly. However, it turned out that we almost never began a sentence with a variable, and when we did, it was usually in situations where we knew the first letter was going to be capitalized already anyway. In all, we needed to use the ^ feature exactly once out of all of our thousands of strings. On the other hand, we didn't expect to need the negatively numbered variable feature (which lets variables be invisible in the string) much at all. It was really just added for completeness, because it was trivial to add and it might prove handy. We ended up using that feature quite a lot, and in many creative ways. In general, my guesses as to which parts of the meta-language would be most useful were completely wrong.
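For reference, the ^ feature amounts to very little code. The sketch below is a Python approximation under assumed names (`substitute`, a plain dictionary of variable values); the original used towupper() in C, and Python's `str.upper()` can behave differently for a few characters (for example, it may expand one character into two).

```python
import re

# An optional ^ followed by a $NAME$-style variable.
VAR = re.compile(r"(\^?)\$(\w+)\$")


def substitute(template, values):
    """Replace $VAR$ with its value; ^$VAR$ capitalizes the first letter."""
    def repl(match):
        caret, name = match.groups()
        value = values[name]
        if caret and value:
            # Only the first character changes; the rest is left untouched.
            value = value[0].upper() + value[1:]
        return value

    return VAR.sub(repl, template)


print(substitute("^$NAME$ bows gracefully.", {"NAME": "drudge"}))
# Drudge bows gracefully.
```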

The documentation, however, proved very useful all the time. We gave it to the translators, modified it based on their comments, and kept it up to date as we added features. Good documentation is completely mandatory for something that you're going to hand off to another team - in this case the translators. However, one mistake I made was to use very technical terminology. Some of the translation team quickly understood how it worked, but others took a while to understand it. It pays to remember that translators tend to be brilliant people but NOT programmers. Hence the documentation should be as approachable as possible and use lots of relevant examples. Meta-languages are relatively rarely used in games because they are only appropriate for games with lots and lots of text; thus it is quite possible that your translation team will never have worked with one.

[Screenshot: AC2 in French.]

That points out another possibility - that your translation team will balk at the idea of using a meta-language. This is an understandable position because a lot of the effort will be put on the translators' shoulders. Your designers need only write the strings once; the translators will need to rewrite those meta-sentences for every other language. This will definitely increase the translation time somewhat, and your project simply might not be able to afford it. In this case you can at least use the meta-language for English. The translated versions would not have correct pronouns and modifiers, but you can still get them right in the English sentences. Simply write a script that goes over all your strings and condenses each meta-language block down to its last case:

"You take {the[!n]} $ITEM$ and store {him[m] | her[f] | it} away."

Becomes

"You take the $ITEM$ and store it away."

And then give these resulting strings to the translators. Of course, if you do this you can only use the meta-language for simple grammar bits, such as "a", "an", "the", "he", or "she." You cannot insert alternate verbs or adjectives, like one of the earlier examples that allowed both "You murder him" and "You destroy it" in the same string. Anywhere that you would have used the meta-language to insert verbs or adjectives like that, you must instead create separate sentences for each verb.
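Such a condensing script can be quite small. The following Python sketch is one possible version, assuming that blocks are never nested, that alternatives are separated by `|`, and that a trailing `[...]` is a condition tag to be dropped. It does not handle backslash-escaped braces, and `condense` is a hypothetical name.

```python
import re

BLOCK = re.compile(r"\{([^{}]*)\}")          # one {...} block, no nesting
TAG = re.compile(r"\s*\[[^\[\]]*\]\s*$")     # a trailing [tag] condition


def last_case(body):
    """Keep only the last alternative of a block, without its tag."""
    alt = body.split("|")[-1]
    return TAG.sub("", alt).strip()


def condense(template):
    text = BLOCK.sub(lambda m: last_case(m.group(1)), template)
    return " ".join(text.split())            # tidy any leftover spacing


print(condense('You take {the[!n]} $ITEM$ and store '
               '{him[m] | her[f] | it} away.'))
# You take the $ITEM$ and store it away.
```

Variables such as $ITEM$ are left in place, since the translators still need to see where they fall in the sentence.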

Talk The Talk

The importance of high quality sentences is a subjective thing and varies between individuals. To a large extent, then, the need for good grammar depends on your target audience. It certainly isn't mandatory, especially if your game doesn't rely heavily on text. On the other hand, it isn't as difficult as it may seem.

The biggest concern people had when we were developing the meta-language was that it sounded like it would greatly increase the time needed to author strings. Actually, though, only the combat messages and item usage strings need to have meta-language in them. Big blocks of NPC dialog, exposition, or descriptions tend to need absolutely no meta-language because they aren't interactive - those strings don't contain any variables.

The meta-language was perhaps the most interesting part of the localization effort, but not the hardest. A fair amount of time went into integrating the IME to support Korean. Even though our publisher (Microsoft) provided us with plenty of assistance in this area, we still had to integrate it into our engine, which proved tricky.

But the most time-consuming task of all was the documentation. The translators needed a dictionary of game-specific terms, of course, but more than that, they needed to know the context and possible values for every variable in every string. They needed to know when various ambiguous strings actually appeared in the game so that they could understand how to translate them correctly. They needed, in short, documentation on the strings themselves. Our string editor allows comments to be added, but most people didn't bother - we didn't realize their importance until the translators started sending us long emails daily, each with 20 or 30 ambiguous strings that needed comments. Then we'd have to dive into the code and figure out where that string was actually used. It would have been a lot less stressful and time-consuming if we had just added some quick comments when the strings were originally added. The hardest part of localization on our end was just preparation and organization - a task that will come easier next time.
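One cheap way to catch missing comments early is a check that runs whenever strings are added. The sketch below is only an illustration of the idea, not a description of our string editor: the `StringEntry` record and its fields are hypothetical, and the rule it applies (every $VARIABLE$ in a string should be mentioned by name in that string's comment) is an assumption about what a useful minimum would be.

```python
import re
from dataclasses import dataclass

VAR = re.compile(r"\$(\w+)\$")


@dataclass
class StringEntry:
    key: str            # hypothetical identifier of the string
    text: str           # the string as authored
    comment: str = ""   # note for translators: context, variable values


def undocumented_variables(entry):
    """Return the variables in the text that the comment never mentions."""
    missing = []
    for name in VAR.findall(entry.text):
        if name not in entry.comment and name not in missing:
            missing.append(name)
    return missing


entry = StringEntry(
    key="give_item",
    text="You take the $ITEM$ and give it to $NAME$.",
    comment="ITEM: any inventory object.",
)
print(undocumented_variables(entry))  # ['NAME']
```

A check like this cannot tell whether a comment is actually helpful, and it does nothing for ambiguous strings that contain no variables, but it does flag the entries most likely to generate questions later.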

Resources

The Unicode Standard Version 3.0, from the Unicode Consortium. Good to have for the reference CD, if nothing else. Partially available online at http://www.unicode.org/unicode/standard/standard.html

The Inform Designer's Manual, 4th ed., by Graham Nelson. Inform is a system for writing text adventures, and has multilingual support. The designer's manual has some interesting discussions on meta-grammar concepts. Available online at http://www.inform-fiction.org/manual/html/index.htm
