Reflections On Building Three Scripting Languages

By Bruce Wilcox

My role in the games industry is “supposed” to be that of an AI expert. But since the year 2000, I have built three scripting languages for three companies. This article looks at my experiences doing so.

In 2000, I wrote ICE for 3DO. It was a traditional game scripting language. Key virtues were its built-in support for game programming (events, timers, script multi-tasking, save & restore) and its ability to intermix compiled and hot-loaded interpreted scripts.

In 2005, I wrote HIPE for Radical Entertainment. It was a real-time Hierarchical Task Network planner language specifically for next-generation game AI.

In 2006, I wrote FLIRT for LimeLife. With cell-phones coming in various screen sizes and flawed operating systems and some using J2ME and others using BREW, porting applications is a nightmare. FLIRT allows you to write your application once and run it on all these platforms with almost no additional porting work.

Build, Buy, or API?

The question really comes down to: do you make common library routines accessed via an API to your existing programming language, do you buy (or use free) existing software, or do you build it yourself?

An API is suitable when you are a programmer (no fear of coding), and when the library routines are relatively independent areas of code. It is NOT suitable for non-programmer scripters or for when you really need an entirely new runtime system model.

The Case FOR a Scripting Language

  1. Scripting allows you to express what you want the game to do with less code and often with less-skilled personnel. Rapid development is usually the main reason for a scripting language: it builds the most commonly done things into the language and usually has a simpler syntax requiring less typing.

  2. Scripting lets you create a virtual machine within your game, where scripters’ code can be executed much more safely than real code. This reduces engine complexity and allows you to share significant debugged code across multiple projects. In particular you can normally control memory management automatically so that scripters need not concern themselves with that sinkhole.

  3. Scripts can often be hot-loaded, speeding up the compile, load, debug cycle.

The Case AGAINST Writing Your Own

Existing languages allow you to ramp up faster, take less maintenance, have better documentation and often come with significant libraries of pre-existing code.

Writing a scripting language is a significant undertaking. Management should definitely try to make you NOT do it. You have to create a translator, a run-time system, documentation, regression test code and build scripts. You have to train people to use it. Then there are tools. Tools, tools, and more tools. If it takes you a lot longer to debug script code than normal code, you are losing your rapid coding advantage.

Here are the size comparisons (in code lines) of the three systems I built:

The runtime system of FLIRT is largest, because it encompasses both scripting and screen layout. Its translator is smallest because FLIRT started life entirely without a translator, almost as an assembly language, and so its translation needs are simple. (It performs a lot of validation at runtime when running on a phone emulator.) To run under BREW, FLIRT requires a second runtime system (not shown).

I can’t really say if documentation turns out to be of a similar size because it makes sense that way or merely that I become exhausted after a certain point.

HIPE was a contract project, so I know it took five months of full-time work. I wrote both ICE and FLIRT as an employee while doing other tasks as well, so it’s hard to know how long they really took to build, but 4-5 months is a reasonable guess of how long it took to build the J2ME version of FLIRT.


Language 1: ICE

In 2000, when 3DO (now extinct) was beginning its multi-platform common library strategy, one of the elements was a scripting language. 3DO was tired of each project designing and building its own language, then training the level designers in it.

When our project (Green Rogue for PS2) started, we, of course, wrote in our design doc that we would adhere to the new 3DO policy, i.e., that we would use the common library code, including its scripting language. Then I read the scripting language documentation. Or rather, tried to read it.

It seems the new scripting language was really just a macro preprocessor for C. I struggled to make sense of it and I knew no level designer was going to read it and comprehend it. The language presumed some event handling mechanism and timer, but didn’t supply one. That meant each project would create their own macros and timers and event mechanisms. Really, each project would be completely unique in its scripting language with just a minimal common core.

Finally, all the language did was preprocess C. That meant compiling the script into the load file for the PS2. If you wanted to change a script, you had to recompile and redownload the new ELF.

In those days, the Ethernet to the PS2 DevStation was EXTREMELY slow. The turnaround for designers when they wanted to change their scripts would have been horrendous. Which meant we had conflicting needs. On one hand, sometimes scripts needed to be compiled for maximal speed. On the other hand, mostly we needed interpreted scripts hotloaded during gameplay to speed up game production.

Build, Buy, or API?

We needed a scripting language, as this was for our level designers, not programmers, so forget API. Various people suggested using existing off-the-shelf scripting languages, but there were problems associated with each. All of the existing scripting languages were prepared to be general purpose programming languages. That meant scripters could try to write code that could hang our game.

We needed to tightly control just what things scripters could do and ensure no crash was possible. With complex scripts we knew there was no hope of ever completely testing all paths of the script. So it was best to ensure it could "do no harm."

Existing scripting languages also each had their own independent weaknesses. Lua, for example, had a built-in garbage collector. This is not something you wanted triggering during a real-time game. Python was strictly interpreted, which would be too slow for some scripts. And none of these languages addressed our primary need for both compiled AND hotloaded interpreted scripts. With the continued evolution of these languages, nowadays I would probably have chosen one of the two, but at that time neither was appropriate.

So I wrote up a two-page paper detailing the flaws of the library scripting language and got approval to build yet another scripting language. It helped that I spent part of a month building a prototype system to prove the basic concepts of ICE worked before submitting my request. But my goal went beyond merely writing a scripting language for our project.

That huge an investment could only be recouped if other projects also used it. So I intended to completely supplant the existing library language with my own, which would achieve what 3DO really wanted.

Many people have written scripting languages for their companies, only to have the language disappear after zero or one use. To make my language truly useful and universal, I needed to do the following: a) build a better language; b) document it so people could learn to use it; c) support it so that any project that needed new features could get them; and d) promote it in lunch lectures and by other means so that people knew why they should want to switch.

Without all of those steps, it would become yet another scripting language used on yet another project. And right up there in importance was creating a great name, something that people would react well to (this is a marketing exercise). I chose ICE.

ICE combined a simplified C-like syntax with runtime support for multitasking, events, timers and interpreted code. All code was “protected” from really dumb scripter errors. This made it much more than any simple API to common library code could be. It is a classic example of where a scripting language is needed and where a bunch of shared library routines called from your programming language just won’t be as valuable.

The resulting system consisted of a C translator that generated either C source or interpretable byte-code (depending upon whether you wanted speed or hotloading).

ICE supported events with arguments and used those with all asynchronous calls instead of callbacks. The system supported “listening” for an event with a pattern match on its two arguments. I greatly prefer this extremely loose coupling mechanism to callbacks. So when, for example, the game engine’s animation system finished running some animation on a character, it would broadcast a completion event naming the animation and the character. Then ANY code, including the code that initiated the animation, could react and do things.

One could write independent routines that did different things with the information, without ever changing the original code. So special effects code could be added independently later. This is much better than virtual functions of class objects, because with those you still have to have planned a call in advance, knowing what you intend to do. So if you want to do something more later, you have to go back and modify existing code.
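The loose coupling described above can be sketched in a few lines of Python. This is only an illustration of broadcasting an event with two arguments and letting any listener pattern-match on them; it is not ICE syntax, and the names `EventBus`, `listen`, `broadcast`, `ANY` and the event name `AnimationDone` are assumptions made up for the example.

```python
from collections import defaultdict

ANY = object()  # hypothetical wildcard: matches any argument value


def _matches(pattern, value):
    """A pattern matches if it is the wildcard or equals the value."""
    return pattern is ANY or pattern == value


class EventBus:
    """Broadcast two-argument events to every listener whose pattern matches."""

    def __init__(self):
        # event name -> list of (pattern1, pattern2, handler)
        self._listeners = defaultdict(list)

    def listen(self, name, pattern1, pattern2, handler):
        self._listeners[name].append((pattern1, pattern2, handler))

    def broadcast(self, name, arg1, arg2):
        # Iterate over a copy so a handler may add listeners while we loop.
        for pattern1, pattern2, handler in list(self._listeners[name]):
            if _matches(pattern1, arg1) and _matches(pattern2, arg2):
                handler(arg1, arg2)


bus = EventBus()
log = []

# The code that started the animation listens for its exact completion.
bus.listen("AnimationDone", "Salute", "Sarge",
           lambda anim, who: log.append(f"{who} finished {anim}"))

# Special-effects code added later reacts to ANY animation on Sarge,
# without touching the listener above or the animation system.
bus.listen("AnimationDone", ANY, "Sarge",
           lambda anim, who: log.append(f"sparkles on {who}"))

bus.broadcast("AnimationDone", "Salute", "Sarge")
bus.broadcast("AnimationDone", "Wave", "Sarge")
bus.broadcast("AnimationDone", "Salute", "PickupTruck1")
```

The broadcaster never learns who is listening, which is the point: the second listener was added without any change to the code that plays or initiates animations.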

Because ICE needed visibility over arbitrary game engine functions, a registration interface was provided whereby the engine registered the name, code address, argument description, category, and documentation for routines being made available to its copy of ICE. ICE also had a command to dump out the documentation for all routines and engine variables so registered, so scripters ALWAYS had up-to-date documentation even as new capabilities were added daily to the game engine’s interface to ICE.
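A registration interface of this kind might look roughly like the following Python sketch. The article lists what was registered (name, code address, argument description, category, documentation) but not the actual interface, so `Registry`, `register`, `call`, `dump_docs` and the two sample routines are all hypothetical.

```python
class Registry:
    """Table of engine routines exposed to scripts, with their documentation."""

    def __init__(self):
        self._routines = {}

    def register(self, name, func, args, category, doc):
        # 'func' stands in for the code address the engine would supply.
        self._routines[name] = {
            "func": func, "args": list(args), "category": category, "doc": doc,
        }

    def call(self, name, *values):
        entry = self._routines[name]
        if len(values) != len(entry["args"]):
            raise TypeError(f"{name} expects {len(entry['args'])} argument(s)")
        return entry["func"](*values)

    def dump_docs(self):
        """Return one documentation line per registered routine, sorted by name."""
        lines = []
        for name in sorted(self._routines):
            entry = self._routines[name]
            signature = ", ".join(entry["args"])
            lines.append(f"[{entry['category']}] {name}({signature}): {entry['doc']}")
        return "\n".join(lines)


registry = Registry()
registry.register("PlayAnimation", lambda who, anim: f"{who} plays {anim}",
                  ["who", "animation"], "animation",
                  "Start an animation on a character.")
registry.register("GetHealth", lambda who: 100,
                  ["who"], "status", "Return a character's health.")

print(registry.dump_docs())
```

Because the documentation travels with the registration call, the dump can never lag behind the set of routines actually available to scripts.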

ICE supported a range of relevant data types: objects, strings, events, integers, floats, enumerations, arrays, and records. Two predefined record types, XY and XYZ, were built in to directly support locations and vectors. Type checking was done at compile time. Objects could be used by name in a script, and as soon as that name was given an object at run time (perhaps much later in the game), the object was “registered” and the places in the script that used that name would be filled in appropriately. So scripts could easily refer to "Sarge" or "PickupTruck1" and not be concerned about finding the object involved.

ICE could directly read simplified C header files. This allowed it to share constant names and values with the engine instead of having to copy names and values into the scripting language by hand, a source of potential error that happened frequently on earlier projects.

ICE became the de facto scripting language for 3DO, and the one people were initially required to use never saw real use.


Language 2: HIPE

My connection with Radical started when I interviewed for employment with them. They made me an offer but I turned them down. Nonetheless, they kept in touch and nearly a year later asked me to write a White Paper on what areas of AI to develop further. They were interested in creating revolutionary new AI to take advantage of the increased horsepower of next generation consoles.

The typical game engine uses explicit scripting and/or finite state machines (FSM) to control its AI.

The disadvantage with scripting is that you must explicitly write out every control transition. That is, if under some conditions you want the AI to do X and under other conditions you want it to do Y, you must write all the ifs and all the thens in one place, creating a long nightmare piece of code. And that assumes that you can know, in advance, the one right way to do things. But situations are rarely that simple. Often you must generate a solution, try it out hypothetically, discard it if it fails, and try another solution. Scripts are not good for that.

FSMs improve upon the act of writing scripts by providing orderly areas of understandable code to carry out an activity, with explicit tests to decide when to go to which piece of code next. It’s an improvement in organizing code and the underlying FSM mechanism can become shared code among games. Still, FSMs have no ability to explore hypothetical paths and you must still explicitly code which piece of code to move to next. When FSMs become complex, they still become a nightmare spaghetti-tangle of control paths.

If you want really interesting AI for a game like Command and Conquer -- if you want a system that can figure out how to build the next widget by stringing together mining for materials, manufacturing components, defending the transportation along the way, etc., then what you want is a planner. It’s all about how to do something based on what you have or can do.

So I wrote them a paper which started out: This white paper covers the application of planning, goals, drives, emotion, personality, and historical memory to AI in games. What these all have in common is that they provide one of the three things needed to decide what to do next. They provide the bias in the selection of actions. The other two are current sensory input and actual capabilities.

Later I wrote: In the last five to seven years, academic planning systems have evolved significantly. Biennial planning competitions started in 1998 with five programs competing. Consequently, planners have been handling ever larger and more complex problems. The original competitions covered pure “strips-based” planning. They were still working on the blocks world in which all facts are true or false. In 2002, planners were pushed into handling time-based domains and numeric-based domains (ones with constraints on numerically-valued variables) as well. In 2004 they added probabilistic domains and had 19 competing systems. A quote from the results page of that competition: “For runtime performance, the observations to be made are much more interesting and diverse. Indeed, we were stunned to see the performance that some of the planners achieved in domains that we thought were completely infeasible!!”.

I argued that a planning language would become the next High Order Language for AI. Radical agreed.

Build, Buy, or API?

The features of a real-time planner require tight integration and are not suitable as an API.

Planners are not built to the demands of the video game industry. They are strongly academic, written in LISP, with an obscure syntax. The planner operates in an isolated world environment, where it controls all the actions and the world is completely predictable as a consequence. The planner does not pay attention to memory usage issues, takes all the time it wants, and is not prepared to integrate seamlessly into any game engine. So buy was out of the question.

Radical asked me to design one. After that, they asked me to build it -- HIPE. Another name chosen for its marketability within Radical.

HIPE is a totally ordered Hierarchical Task Network (HTN) planner built for video games. As a totally ordered HTN planner, it can be efficient in backtracking and make good use of existing world state. Built for video games, HIPE also allows the planner to initiate real world action immediately, if desired, even before planning is complete.

HIPE is not the first planner I wrote. I did one for my own use early on at 3DO for a galactic conquest type game. If you told it that to have a ship of a particular class you could either build it, or trade for it, or steal it, and that to build a ship required various materials that could be acquired by mining, trading, scrapping, or stealing, then it could discover for itself that in some situation where it desperately needed a class of ship, it could go steal an enemy ship, scrap it, and build a new one. But that planner could not do lookahead and was designed for a turn-based game. HIPE had to go way beyond that.

A complete plan specification contains three parts: the model, the logic, and the problem.

The model is the definition of the world: what entities the world contains, what acts can be taken, and what effects those acts have. The model is a common shared ground in which acts can take place. It does not specify how or why useful acts arise. The model may either be a complete self-sufficient world (all information can be found within the planner's internal world model) or it can interface to an external world (in which case some information may come about as the result of queries to that external world).

The model consists of these things:

1. user-defined types (categories of objects)
2. objects in the world
3. numeric constants
4. events that can happen asynchronously, arising from the game engine or plans
5. priorities that can be used to reorder plans according to different criteria
6. how objects are described (object attributes and relations among objects)
7. external functions available from the game engine or simulator
8. allowable acts supported by the world (acts change the world)
9. interrupt functions that can respond to events independently of a plan

The logic defines how to reason with the model to accomplish something useful. The logic consists entirely of plans and acts (descriptions of things to do if certain conditions prevail, so as to achieve an effect that will typically take many actions over time).

Acts are atomic actions. Plans are composites of plans and acts.

The problem is the current configuration of the world and the future configuration you want to accomplish and/or the particular goal or goals to achieve.
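To make the split between acts, plans, and problem concrete, here is a minimal Python sketch of totally ordered HTN decomposition with backtracking, applied to the ship example described earlier. It is not HIPE and does not use HIPE syntax; the function `plan`, the state layout, and every act and method name below are assumptions invented for illustration.

```python
def plan(tasks, state, acts, methods):
    """Decompose tasks left to right; return a list of acts, or None on failure."""
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    name, args = head[0], head[1:]
    if name in acts:                        # atomic act: apply it to the state
        new_state = acts[name](state, *args)
        if new_state is None:               # precondition failed
            return None
        tail = plan(rest, new_state, acts, methods)
        return None if tail is None else [head] + tail
    for method in methods.get(name, []):    # compound plan: try each alternative
        subtasks = method(state, *args)
        if subtasks is None:
            continue
        result = plan(subtasks + rest, state, acts, methods)
        if result is not None:
            return result                   # otherwise backtrack to next method
    return None


# Acts return a NEW state (or None), so failed branches leave no trace.
def steal(state, cls):
    if cls not in state["enemy_ships"]:
        return None
    return dict(state, enemy_ships=state["enemy_ships"] - {cls},
                ships=state["ships"] | {cls})

def scrap(state, cls):
    if cls not in state["ships"]:
        return None
    return dict(state, ships=state["ships"] - {cls},
                materials=state["materials"] + 1)

def build(state, cls):
    if state["materials"] < 1:
        return None
    return dict(state, materials=state["materials"] - 1,
                ships=state["ships"] | {cls})

def mine(state):
    return dict(state, materials=state["materials"] + 1) if state["has_mine"] else None

def via_scrapping(state):
    if not state["enemy_ships"]:
        return None
    victim = min(state["enemy_ships"])
    return [("steal", victim), ("scrap", victim)]

ACTS = {"steal": steal, "scrap": scrap, "build": build, "mine": mine}
METHODS = {
    "have_ship": [
        lambda s, cls: [] if cls in s["ships"] else None,      # already own one
        lambda s, cls: [("get_materials",), ("build", cls)],   # build it
        lambda s, cls: [("steal", cls)],                       # steal it
    ],
    "get_materials": [lambda s: [("mine",)], via_scrapping],
}

# The problem: no mine, no materials, and the enemy only owns a scout.
problem = {"ships": frozenset(), "enemy_ships": frozenset({"scout"}),
           "materials": 0, "has_mine": False}
solution = plan([("have_ship", "destroyer")], problem, ACTS, METHODS)
print(solution)
```

Mining fails because there is no mine, so the planner backtracks inside `get_materials` and discovers the steal, scrap, build sequence on its own. Real-time concerns such as acting before planning completes are outside the scope of this sketch.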

The initial demo scenario Radical proposed was a “Zookeeper” problem. Given a zoo with a couple of exits, two cages, and some random placement of chickens and foxes, keep as many chickens and foxes alive and within the grounds of the zoo as possible. Animals slowly move toward their nearest exit (if not caged) but if a chicken is within small range of a fox it will back away from the fox and if a fox is in somewhat larger range of a chicken, it will make a beeline to it and eat it. The zookeeper can move next to an animal and pick it up or drop it (including into a cage) or stake it (which temporarily immobilizes it until the animal gnaws through the tie to the stake).

So the zookeeper is given this problem, and must control the game engine and save animals. Then for spice, the HUMAN is allowed to arbitrarily teleport any animal anywhere at any time and watch the zookeeper suddenly have to change behavior to still save as many as possible given the new configuration.

Here is a quick illustration of HIPE script to code Zookeeper.

Type declarations define the organization of entities in the world. Entities can be concrete objects or abstract concepts. You can name specific objects and/or name types as being members of your type. Hence in the code below, the first declaration is a type consisting of two named chickens, and a later declaration is a type consisting of the type holding the zookeeper and the type holding the animals (chicken1, chicken2, fox1, and fox2).

TYPE : chicken1 chicken2
TYPE : fox1 fox2
TYPE :
TYPE : zookeeper
TYPE :
TYPE : cage1 cage2
TYPE : exit1 exit2
TYPE : nothing
TYPE : free immobile escaped

Most types are “obvious”. The type that lists the “nothing” object also includes all the animals, so as to be able to represent the zookeeper being empty-handed. The last type (free, immobile, escaped) is an abstract concept type naming conditions of the animals. HIPE also allows you to declare dynamic types, reserving a range of objects to be created on demand at runtime. So you could reserve room for 100 chickens with only a couple of them having current existence.


Once you have named the types, you then name the attributes and relationships objects can have. Attributes are tuples of up to five arguments, where there can be any number of “selector arguments” but only one “value argument”. That is, attributes are mutually exclusive. For example, objects of type vehicle might have an attribute fuel-level, which can be a specific number, but only one number at a time. Relationships are like attributes, but are not mutually exclusive. A human might have a relationship of loves, and can love any number of people or things at the same time.

Attributes and relationships are propositional data that gets stored in a world database. You assert and retract facts in this database to describe the current world and how it evolves over time (or in lookahead). Some of these facts match facts of the game engine itself, so you can reason about a world that is not pretend.
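The difference between a mutually exclusive attribute and a non-exclusive relationship can be shown with a small Python sketch of such a world database. `WorldDB` and its method names are hypothetical; the `fuel-level` and `loves` facts come from the examples above, and the `snapshot` method is only one assumed way to support lookahead.

```python
class WorldDB:
    """Toy fact store: attributes hold one value, relationships hold many."""

    def __init__(self):
        self.attributes = {}     # (name, selectors) -> the single current value
        self.relations = set()   # (name, selectors, value) facts

    def set_attribute(self, name, selectors, value):
        # Asserting a new value implicitly retracts the old one.
        self.attributes[(name, tuple(selectors))] = value

    def get_attribute(self, name, selectors):
        return self.attributes.get((name, tuple(selectors)))

    def assert_relation(self, name, selectors, value):
        self.relations.add((name, tuple(selectors), value))

    def retract_relation(self, name, selectors, value):
        self.relations.discard((name, tuple(selectors), value))

    def related(self, name, selectors):
        key = tuple(selectors)
        return {v for (n, s, v) in self.relations if n == name and s == key}

    def snapshot(self):
        """Copy the facts so lookahead can change them without touching reality."""
        clone = WorldDB()
        clone.attributes = dict(self.attributes)
        clone.relations = set(self.relations)
        return clone


db = WorldDB()
db.set_attribute("fuel-level", ["truck1"], 40)
db.set_attribute("fuel-level", ["truck1"], 35)      # replaces 40
db.assert_relation("loves", ["alice"], "bob")
db.assert_relation("loves", ["alice"], "chocolate")  # both facts coexist

what_if = db.snapshot()
what_if.set_attribute("fuel-level", ["truck1"], 0)   # hypothetical future only
```

After this runs, the real database still reports a fuel level of 35 while the lookahead copy reports 0, and alice loves both bob and chocolate in either copy.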

Attributes are defined by a name and an argument list. The only essential things in the argument list are the type and order of arguments. The names supplied are there only to act as documentation for the user.

ATTRIBUTE holding( what)
ATTRIBUTE status( who state)

The zookeeper’s hands are represented by attribute holding (he can only hold one animal or nothing at a time). Each animal can exist in a single state at a time.

The basic actions available in the world are called ACTs and can be predeclared as they are here. Eventually you have to supply the code for them or link them up to game engine routines. HIPE also provides a predefined supertype over all types.

ACT intercept( who what)
ACT pickup(what)
ACT putincage( where)
ACT Intercept( where)
ACT stake( who)

Once you have acts, you can write plan pieces. A plan is a name, an IF section (consisting of the AND of all its clauses) and a THEN clause consisting of ACTs and/or PLANs to be executed in order.

Below is a quick rendition of a way to save animals that depends primarily on automatic lookahead to find the order in which to save them. The plan only goes to an animal, picks him up, goes to a cage and then dumps him in the cage. It does not look at staking an animal, or carrying one around and placing it down on the ground elsewhere.

Plans can call functions that return values, including making queries upon the built-in database system. Whenever you use a ?variable, it means you are seeking an answer to be bound to that variable. A function like FINDLOW will iterate over the set of possible answers to its query (here it iterates over the set of animal objects) and perform the AND of all the tests in brackets. So SaveAnimals looks at each animal and first asks the database, using the attribute Status, whether the animal is free or not. If not, the find fails and goes on to the next instance. If the animal is free, FINDLOW calls a function ObjectDistance, which computes how far the animal is from the zookeeper and binds that onto ?dist. FINDLOW is a filter that will return the animal (?what) with the lowest distance (?dist). It is also backtrackable, so if the plan later fails, it can try again with the second closest animal, etc.
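The behavior of a backtrackable lowest-value filter can be approximated in Python with a generator. This is a sketch of the idea only, not HIPE's FINDLOW: `find_low`, `free_distance`, and the sample statuses and coordinates are assumptions for the example.

```python
import math


def find_low(candidates, score):
    """Yield (candidate, score) pairs, lowest score first.

    'score' returns None when a candidate fails the tests, which filters
    it out. Pulling the next item from the generator plays the role of
    backtracking to the next-best answer.
    """
    scored = []
    for candidate in candidates:
        value = score(candidate)
        if value is not None:
            scored.append((value, candidate))
    scored.sort(key=lambda pair: pair[0])
    for value, candidate in scored:
        yield candidate, value


# Hypothetical world facts for the Zookeeper scenario.
status = {"chicken1": "free", "chicken2": "free",
          "fox1": "immobile", "fox2": "free"}
position = {"zookeeper": (0, 0), "chicken1": (3, 4), "chicken2": (6, 8),
            "fox1": (1, 0), "fox2": (0, 2)}


def free_distance(animal):
    """Test that the animal is free, then bind its distance to the zookeeper."""
    if status[animal] != "free":
        return None
    ax, ay = position[animal]
    zx, zy = position["zookeeper"]
    return math.hypot(ax - zx, ay - zy)


choices = find_low(["chicken1", "chicken2", "fox1", "fox2"], free_distance)
first = next(choices)    # closest free animal
second = next(choices)   # what the plan would retry with if 'first' failed
```

Here fox1 is closest but staked, so it is filtered out; fox2 is returned first and chicken1 is the fallback if the rest of the plan fails.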

Taking a cue from the design of ICE, HIPE supports events and filtered waiting for some event/argument combination to happen.

EVENT Arrived(from,to)
EVENT Escaped( who )

In Zookeeper the interesting event is Arrived; you can’t do much about Escaped.

The following code illustrates acts that are defined in this example, but there is no point in going into tremendous detail about what they do. Infer it from context and comments.

In addition to the basic translator/runtime package, there was also a source-level debugger. Debugging something running through the loop of an interpreter is a big pain if you haven’t got source-level debugging.


Language 3: FLIRT

LimeLife didn’t know it at the time, but they needed a technology to manage their application porting and they readily accepted it when offered. Writing code in J2ME (Java) and then recoding it in BREW (C) is tedious, and the multiple screen sizes of phones make generating code for each screen size effortful. An API doesn’t work so well when your code needs to be in two different languages.

Build, Buy, or API?

Some mobile application companies have written standardized UI generators and some have written things like a scripting language that rehosts Java just so they can download their code as data at runtime. The primary commercial candidate for the J2ME/BREW network portable language was the UIEvolution system. But it was expensive, only handled scripting (not layout), and generated code similar in size to Java itself. I thought we could do much better, hence I built FLIRT.

FLIRT is not just a scripting language repackaging Java/BREW code in one-to-one correspondence. FLIRT integrates scripting and display in a language designed specifically for cell-phones. Size is the paramount concern. While Java is designed to produce highly compact code, FLIRT script saves 50% over Java code. Java tries to be a universal language across machines big and small. But cell phones imply significant limitations and a well-designed scripting language can take advantage of that.

Since a major “reusable” component of mobile applications is code to manage the UI, I organized FLIRT around a state description of each screen. The state describes what softkeys are to be used, what script code to execute in response to incoming events, and what graphical elements to draw on the screen in what order.

The first design decision I made was to restrict the range of all script arguments to one byte, using indirection as needed to access tables of things taking more than a byte (e.g., 32-bit integers, colors, fonts, strings, graphics).

The second design decision was to eliminate local variables. All user data is normally kept in global two-dimensional arrays, one each for bytes, 32-bit integers, strings, and graphics. The [0][] row is for transient unrelated values, while other first dimensions hold sets of related information. All the strings of a menu, for example, might be in strings[3][]. So a script element defining a menu could refer to that using the single byte value 3. Of course the language allowed you to automatically label dimensions so you never used the number.

For example, defining your global strings array might look like this:

This would define stringSets[0] to be anonymously labeled but its members have labels (labels are names followed by a colon) and sometimes initial values. stringSets[1] is called sTRENDS_MENU and is a list of anonymously named strings which would be the names of menu choices.

The user script will have named initial values for the data arrays, as well as arrays describing fonts, colors, and softkeys, all of which can then be accessed using a single byte value (name) in the appropriate instruction or element context.
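The one-byte indirection can be illustrated with a short Python sketch. It is not FLIRT syntax or its real instruction encoding: the opcode value, the function names, and the menu strings are invented for the example, and only the label sTRENDS_MENU and the idea of labeled global string sets come from the description above.

```python
# Global two-dimensional tables: the first index selects a set of related data.
STRING_SETS = [
    ["", ""],                      # set 0: transient, unrelated values
    ["Hair", "Makeup", "Shoes"],   # set 1: hypothetical menu choices
]

# Labels assigned by the translator so scripts never use raw numbers.
LABELS = {"sTRENDS_MENU": 1}

OP_SHOW_MENU = 0x10  # hypothetical opcode, not a real FLIRT instruction


def encode_show_menu(label):
    """Translate 'show menu <label>' into two bytes: opcode and set index."""
    index = LABELS[label]
    if not 0 <= index <= 255:
        raise ValueError("set index must fit in one byte")
    return bytes([OP_SHOW_MENU, index])


def run(instruction):
    """Interpret one two-byte instruction; the argument is only an index."""
    opcode, arg = instruction
    if opcode == OP_SHOW_MENU:
        return STRING_SETS[arg]    # indirection reaches data wider than a byte
    raise ValueError(f"unknown opcode {opcode}")


code = encode_show_menu("sTRENDS_MENU")
menu = run(code)
```

The instruction stays two bytes long no matter how many strings the menu holds or how long they are, which is the size advantage the byte-only argument rule is after.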

For example, a TextRect display element can be thought of as a subroutine call to display a set of text somewhere on the screen. It’s actually more than that, because it is not code but data to be interpreted, and the data can be modified by the scripting language. This means the script can scroll the content, move it around, change what text it points to, etc. Be that as it may, this call takes 15 arguments to completely define the element. Since each argument is always a byte, it means it takes 15 bytes (half the size it would take J2ME to do it as a function call).

These bytes tell it:

  1. the kind of element (TextRect)

  2. an ID label so that script can interact with it and other elements can lay themselves out relative to it

  3. a bunch of modifier bits to control its behavior (including left, right or center alignment, and whether scrolling wraps around the ends or pins)

  4. which text string to display

  5. at what location relative to some other element on the screen (which means it automatically adjusts to different size screens when the other elements are smaller or larger graphics)

  6. how much size to reserve for this element in x and y

  7. what font to use (a font includes custom graphics for regular and highlight state)

  8. the current scroll and limit (text too big for a field can be scrolled by line or by page)

The third design decision was to make screen layout relative. FLIRT display elements are designed to be laid out in a relative manner, autolaying themselves out as appropriate for different size screens and data.
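Relative layout of this kind can be sketched in Python as follows. This is an assumed, much-reduced model (two relations, no alignment or clipping), and `Element`, `layout`, `build_screen` and the sizes used are hypothetical rather than FLIRT's actual element set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    name: str
    width: int
    height: int
    anchor: Optional[str] = None   # element this one is positioned against
    relation: str = "below"        # "below" or "right_of" the anchor


def layout(elements):
    """Resolve each element to (x, y, width, height).

    Assumes an anchor always appears earlier in the list than the
    elements that refer to it.
    """
    placed = {}
    for element in elements:
        if element.anchor is None:
            x, y = 0, 0
        else:
            ax, ay, aw, ah = placed[element.anchor]
            if element.relation == "below":
                x, y = ax, ay + ah
            elif element.relation == "right_of":
                x, y = ax + aw, ay
            else:
                raise ValueError(f"unknown relation {element.relation!r}")
        placed[element.name] = (x, y, element.width, element.height)
    return placed


def build_screen(banner_width, banner_height):
    """Same description for every phone; only the banner graphic differs."""
    return layout([
        Element("banner", banner_width, banner_height),
        Element("menu", banner_width, 60, anchor="banner", relation="below"),
    ])


small_phone = build_screen(128, 40)
large_phone = build_screen(240, 80)
```

The menu is never given coordinates; it lands at y = 40 on the small screen and y = 80 on the large one simply because the graphic above it changed size.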

Events include softkey events, keypad events, and pseudo events the user can create. Since there are no local variables, there are no “calling arguments” either, except that you set up some reserved spots in the global data arrays to be used in the call. Hence triggering an event and making a subroutine call to event code of some state use the same mechanism.

FLIRT acts as our common library resource, in that sections of it can be compiled in or out, including custom fonts and textwrapping, networking, sorting, and RMS (file) access. Interface elements include textblocks, horizontal and vertical menus, marquees, graphics, scrollbars, screen keyboard, clipping areas, and camera abilities.

The fourth design decision was to make scrolling a fundamental metaphor over all graphical elements. In combination with scripts, FLIRT supports scrolled graphics (including animation), scrolled text, and scrolled shapes (e.g., progress bars). Scrollbars following a display element automatically reflect the scroll state of that element, allowing you to have classic vertical scrollbars or horizontal scrollbars with text labels (like captions) for the current state of a scroll.

LimeLife products, being aimed at the women’s market, have a heavy amount of esthetic design, so simple menus are just plain unacceptable. The emulator screen at right is of LimeLife’s Instyle product. The menu is not a single menu element; it is composed of display elements including character elements (the numbers in blue), text elements (the menu entries in black), and colored shape elements (highlights and separator bars). A router element controls behavior among the display elements. It names a list of other elements and as you scroll the router, you change which other element is selected (highlighted) by it.

This screen with its routers takes about 70 individual elements to compose it (where half of them are control or non-visible layout elements like an element which evenly spaces out a collection of elements vertically). The screen will automatically re-lay itself out if the graphic size or font size changes (or even if the graphic is missing because the network connection failed). It works on all screen sizes and on BREW or J2ME phones.

The script code for this state reacts to arrows, numbers for quick access, and softkeys, and caches the main graphic in RMS if it can. The entire state description is 1093 bytes (script + display elements), which makes it one of the more complex states (since the average is 219 bytes per state). The product has 83 states taking 18,169 bytes (that’s display elements and script, and does not include text or images).

Having code and display information in one place (the state description) is a lot handier than our old J2ME-coded applications, where display code went in one big switch statement and control code went in a different switch. It was hard to see how one related to the other. (J2ME size issues incline one to write non-object oriented code in massive switch statements.)

Because all screen elements and script code are just “data”, new or revised states can be downloaded from the server if desired.

New product mockups can be rapidly created with FLIRT. It took a week to create a full-fledged visual demo of our second FLIRT-based product (visual in that the controls only acted to take you from screen to screen and had no other functionality). When the BREW FLIRT engine was built, three completely different FLIRT-based products ran on it immediately. One merely took the data files built for the J2ME engine and ran them on the BREW one. Autolayout screens and all scripts ran correctly.

Conclusion:

Writing a new scripting language is a major undertaking, which goes far beyond merely coding it up. But when you have a really compelling reason to do so, you are not reinventing the wheel. You are creating a way to concisely express your thoughts in a new language. And language is what gives humans enormous leverage over the universe.

About the Author

I am both a consultant and an employee (in all cases as a long-distance telecommuter, living in Hawaii in the past and in Gloucester, UK, by the time this article is uploaded). Want to reach me? gowilcox@gmail.com or look at my resume on Gamasutra.

Copyright © UBM Tech, All rights reserved