The following is an excerpt from A K Peters, Ltd.'s COLLADA: Sailing the Gulf of 3D Digital Content Creation, specifically the book's opening chapter. It is reproduced here by permission of its original publisher and authors.
This chapter explains why the COLLADA technology has been developed. It provides a global view, defining the problems the technology addresses, the main actors, and the goals and perspectives for this new technology. It also provides a historical overview and information on how COLLADA is being designed and adopted. The goal is to give the reader insight into how the design choices are made and how this technology might evolve.
An interactive application is composed of two major components: the content, the data that describes the virtual world, and the runtime, the code that processes that data and responds to the player.
COLLADA focuses on the domain of interactive applications in the entertainment industry, where the content is three-dimensional and the application is a game or a related interactive experience. Therefore, the user of the application will be referred to as the player.
The types of information that can be provided to the player depend on the output devices available. Most games use one or several screens to display the visual information, sometimes a system with stereo visualization, and a set of speakers for the audio information. Often, some physical sensation can be rendered, as simple as a vibrating device embedded in the joystick or as sophisticated as a moving cabin in arcade settings. The application may output to, or render, several sensors at the same time, often in different places. The term observer denotes a group of sensors that move together. For example, an observer in a (virtual) car may have at least two visual sensors to represent the out-of-the-window view (in this case, the view through the windshield) and the rear-mirror view. Several observers, which may be simultaneous users, can be displayed together by the same application, for example, in split-screen games where two or more players share the same screen.
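The grouping of sensors under an observer can be sketched as a small data structure. This is an illustrative sketch only: the `Sensor` and `Observer` names and their fields are invented here and are not part of COLLADA or any runtime API.

```python
from dataclasses import dataclass, field

@dataclass
class Sensor:
    """A single output viewpoint (hypothetical), positioned relative to its observer."""
    name: str
    offset: tuple  # position relative to the observer's origin

@dataclass
class Observer:
    """A group of sensors that move together through the scene."""
    position: tuple
    sensors: list = field(default_factory=list)

    def sensor_world_positions(self):
        # Every sensor moves with the observer: world = observer + offset.
        ox, oy, oz = self.position
        return {s.name: (ox + s.offset[0], oy + s.offset[1], oz + s.offset[2])
                for s in self.sensors}

# A virtual car with two visual sensors, as in the example above:
# the windshield view and the rear-mirror view share one moving origin.
car = Observer(position=(10.0, 0.0, 5.0),
               sensors=[Sensor("windshield", (0.0, 1.2, 2.0)),
                        Sensor("rear_mirror", (0.0, 1.3, 0.5))])
```

Moving the observer moves every sensor in the group at once, which is exactly what makes the grouping useful.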
The content must include all data required by all the sensors that the application wants to use. The content can have multiple representations stored, partially sharing some of the elements. For example, if the content represents a landscape, different materials may be needed to represent the four seasons. The different representations of the data are called scenes.
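The idea of scenes that partially share elements can be sketched as follows. All names here (`build_scene`, the season keys, the material names) are invented for illustration and are not COLLADA elements.

```python
# One geometry shared by every scene; only the material bindings differ.
geometry = {"terrain_mesh": "vertex data for the landscape"}

materials = {
    "spring": {"terrain_mesh": "grass_green"},
    "summer": {"terrain_mesh": "grass_dry"},
    "autumn": {"terrain_mesh": "leaves_brown"},
    "winter": {"terrain_mesh": "snow_white"},
}

def build_scene(season):
    """A scene = the shared geometry + a season-specific material binding."""
    return {"geometry": geometry, "bindings": materials[season]}

winter = build_scene("winter")
summer = build_scene("summer")

# The geometry object is shared (same identity); only the bindings differ.
assert winter["geometry"] is summer["geometry"]
```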
Another type of interactive application is a training simulator in which the goal is to teach the user how to behave in real situations. These applications are not games, and the trainees’ reactions when they make a deadly mistake in a simulation make this quite clear. COLLADA does not focus on simulation applications, but it could certainly be used in this domain as well.
The computer-generated animation movie industry is also very interested in COLLADA. Its productions are completely scripted, not interactive. That industry’s goal is to produce previsualizations of scenes that look as close to final as possible, in a very short amount of time, which is why it is interested in integrating game technology into its production process.
Other types of applications may also profit from COLLADA, but its goal is to concentrate on interactive game applications, not to expand the problem domain. Other applications that require the same types of technology will benefit from it indirectly.
The first interactive real-time applications rendering three-dimensional graphics required very expensive dedicated hardware and were used mainly in training simulation. Physical separation between content and runtime did not exist in the early applications (such as the GE Apollo lunar landing trainer). The content was embedded in the code; more specifically, some subroutine was coded to render a specific part of the content. Eventually, effort was made to store the embedded content as data arrays, and the code became increasingly generic so it could render all kinds of data.
The next logical step was to separate the data physically from the code. This made it possible to create several products with the same application but different data. Increasingly, products were defined by the data itself, and content creation soon became a completely separate task.
The real-time application was then referred to as the runtime, and the content for the runtime was stored in the runtime database. In the game industry, the runtime is called the game engine.
Digital content creation (DCC) tools were created, but the data structures and algorithms used for modeling did not match the data that can be processed in real time by the application. DCC tools were also used by the movie industry for the production of computer-generated movies, in which an advanced rendering engine was attached to the tool to produce the set of still images that compose the frames of the movie.
DCC tools and advanced rendering techniques, such as ray tracing or shading languages such as the RenderMan Shading Language, required more advanced concepts than a real-time application could handle. Mathematical descriptions of surfaces, such as splines and Bézier surfaces, became necessary in the computer-aided design (CAD) market.
Interactive applications needed both the advanced modeling techniques and the simpler representation usable in real time. Because of this, compilation techniques, used to create binary executable code from high-level languages, were adapted for the content processing. The database used in the DCC tool was therefore called the source. The data compiler takes the source data and creates the runtime data.
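The source-to-runtime step can be sketched as a toy data compiler. The field names and the "compilation" performed here (dropping DCC-only data, flattening vertices into a contiguous array) are invented for illustration; real pipelines perform far more elaborate conditioning.

```python
def compile_asset(source_asset):
    """Toy data compiler: keep only what the runtime needs, in a flat layout."""
    # Drop DCC-only data (construction history, editor metadata) and
    # flatten per-vertex tuples into one contiguous array of floats.
    flat_vertices = [coord for vertex in source_asset["vertices"] for coord in vertex]
    return {"name": source_asset["name"], "vertices": flat_vertices}

# A source asset as a DCC tool might store it, rich in modeling-only data.
source = {
    "name": "triangle",
    "vertices": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    "construction_history": ["extrude", "bevel"],   # meaningless to the runtime
    "editor_metadata": {"layer": "props"},
}

runtime = compile_asset(source)
# runtime holds only the name and a flat vertex array, ready for the engine.
```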
The runtime database soon became too large to fit in the memory of the target system. The content had to be sliced and paged in real time, depending on where the observers were. This quite challenging problem sometimes required specific hardware assistance and necessitated a very specific encoding of the content. Specific algorithms had to be developed for terrain paging and texture paging.
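A minimal sketch of the paging idea, assuming a square grid of terrain tiles and a single observer; real systems add movement prediction, prioritized streaming, and eviction policies on top of this residency test.

```python
def resident_tiles(observer_pos, tile_size, radius):
    """Return the set of terrain tile indices to keep in memory.

    Tiles whose center lies within `radius` of the observer stay resident;
    tiles leaving the radius become eviction candidates, and tiles entering
    it must be streamed in from storage.
    """
    ox, oz = observer_pos
    r = int(radius // tile_size) + 1                      # tile search window
    cx, cz = int(ox // tile_size), int(oz // tile_size)   # observer's tile
    tiles = set()
    for ix in range(cx - r, cx + r + 1):
        for iz in range(cz - r, cz + r + 1):
            # Distance from the tile center to the observer decides residency.
            tx = (ix + 0.5) * tile_size
            tz = (iz + 0.5) * tile_size
            if (tx - ox) ** 2 + (tz - oz) ** 2 <= radius ** 2:
                tiles.add((ix, iz))
    return tiles
```

Re-evaluating this set as the observer moves is what makes the working set fit in memory even when the full database does not.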
Because of economic constraints, severe limitations exist in the hardware targeted by the game industry, so the data must be organized as efficiently as possible. For performance, many game applications use their own file system and combine the various elements into a single file. Some developers run optimization programs for hours to find the placement of elements that yields the best interactivity: the idea is to minimize seek time and place data accordingly. This is very similar to the optimization stage of a compiler, with a complexity comparable to the NP-hard traveling salesman problem.
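The placement problem can be approximated with a simple heuristic. The sketch below uses a greedy nearest-neighbor ordering over a toy "accessed together" affinity table; this is only one of many possible heuristics for this TSP-like layout problem, and all asset names and counts are invented.

```python
def layout_order(assets, affinity):
    """Greedy nearest-neighbor heuristic for on-disk placement.

    Repeatedly append the asset most often accessed together with the last
    one placed, so co-accessed data ends up adjacent and seeks stay short.
    `affinity[(a, b)]` counts how often a and b are loaded together
    (a toy stand-in for real access traces).
    """
    def aff(a, b):
        return affinity.get((a, b), 0) + affinity.get((b, a), 0)

    remaining = list(assets)
    order = [remaining.pop(0)]            # start from the first asset
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda a: aff(last, a))
        remaining.remove(nxt)
        order.append(nxt)
    return order

# Toy access trace: the menu and HUD load together, the HUD precedes level 1, ...
traces = {("menu", "hud"): 9, ("hud", "level1"): 7, ("level1", "boss"): 5}
order = layout_order(["menu", "hud", "level1", "boss"], traces)
# → ['menu', 'hud', 'level1', 'boss']
```

A greedy pass like this runs quickly but can get stuck in a poor ordering, which is why production tools spend hours on stronger (and much slower) search.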
Another example of paging technology used outside the game industry is the Google Earth application. It enables the user to look down on the planet from any altitude and renders a view based on satellite altimetry and imagery information. It is the result of generalizing the terrain- and image-paging technology developed for high-end simulation applications.