The following is an excerpt from A K Peters, Ltd.'s COLLADA: Sailing the Gulf of 3D Digital Content Creation, specifically the book's opening chapter. It is reproduced here by permission of its original publisher and authors.
Overview
This
chapter explains why the COLLADA technology has been developed. It
provides a global view, defines the problems addressed by the
technology, the main actors, and the goals and perspectives for this
new technology. It provides a historical overview and information on how
COLLADA is being designed and adopted. The goal is to give the reader
an insight into how the design choices are made and how this technology
might evolve.
Problem Domain
An interactive application is composed of two major components:
- the application, which provides information in real time to the user and the means to interact with it;
- the content, which contains the information through which the application navigates and provides a view to the user.
COLLADA
focuses on the domain of interactive applications in the entertainment
industry, where the content is three-dimensional and is a game or
related interactive application. Therefore, the user of the application
will be referred to as the player.
The
types of information that can be provided to the player depend on the
output devices available. Most games use one or several screens to
display the visual information, sometimes a system with stereo
visualization, and a set of speakers for the audio information. Often,
some physical sensation can be rendered, as simple as a vibrating
device embedded in the joystick, or as sophisticated as a moving cabin
in arcade settings. The application may output, or render, to several sensors at the same time, often in different places. An observer is
a term that defines a group of sensors that move together. For example,
an observer in a (virtual) car may have at least two visual sensors to
represent the out-of-the-windows view (in this case, the view through
the windshield) and the rear-mirror view. Several observers, which may
be simultaneous users, can be displayed together by the same
application, for example, in split-screen games where two or more
players are sharing the same screen.
The content
must include all data required by all the sensors that the application
wants to use. The content can have multiple representations stored,
partially sharing some of the elements. For example, if the content
represents a landscape, different materials may be needed to represent
the four seasons. The different representations of the data are called scenes.
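The vocabulary above (sensors, observers, scenes) can be pictured as a small data model. The following is only an illustrative sketch: the class names, the `kind` labels, and the element identifiers are assumptions made for this example and are not taken from COLLADA itself.

```python
from dataclasses import dataclass, field


@dataclass
class Sensor:
    """One output channel, e.g. a windshield view or a rear-mirror view."""
    name: str
    kind: str  # hypothetical labels such as "visual", "audio", "haptic"


@dataclass
class Observer:
    """A group of sensors that move together."""
    name: str
    sensors: list[Sensor] = field(default_factory=list)


@dataclass
class Scene:
    """One representation of the content; elements are referenced by id so scenes can share them."""
    name: str
    elements: dict[str, str] = field(default_factory=dict)  # element name -> element id


# The landscape geometry is shared; only the material differs per season.
shared = {"terrain_mesh": "terrain_v1"}
scenes = [
    Scene(season, {**shared, "ground_material": f"ground_{season}"})
    for season in ("spring", "summer", "autumn", "winter")
]

# An observer in a virtual car with two visual sensors.
driver = Observer(
    "driver",
    [Sensor("windshield", "visual"), Sensor("rear_mirror", "visual")],
)
```

Here the four scenes reference the same terrain mesh and differ only in the ground material, which mirrors the four-seasons example in the text.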
Another
type of interactive application is a training simulator in which the
goal is to teach the user how to behave in real situations. These
applications are not games, and the trainees’ reactions when they make
a deadly mistake in a simulation make this quite clear. COLLADA does
not focus on simulation applications, but it could certainly be used in
this domain as well [1].
The computer-generated
animation movie industry is also very interested in COLLADA. This
application is completely scripted, not interactive. That industry’s
goal is to be able to produce previsualization of scenes that look as
final as possible, in a very small amount of time, which is why they
are interested in integrating game technology in their production
process.
Other types of applications may also
profit from COLLADA, but its goal is to concentrate on interactive game
applications, not to expand the problem domain. Other applications that
require the same type of technologies will indirectly benefit from it.
Separation between Content and Runtime
The
first interactive real-time applications rendering three-dimensional
graphics required very expensive dedicated hardware and were used
mainly in training simulation. Physical separation between content and
runtime did not exist in the early applications (such as in the GE
Apollo lunar landing trainer) [2]. The content was embedded in the
code, or more specifically, some subroutine was coded to render a
specific part of the content. Eventually, effort was made to store the
embedded content as data arrays, and the code became increasingly
generic so it could render all kinds of data.
The
next logical step was to separate the data physically from the code.
This allowed creating several products with the same application, but
with different data. More products were defined by the data itself, and
content creation soon became a completely separate task.
The real-time application was then referred to as the runtime, and the content for the runtime was stored in the runtime database. In the game industry, the runtime is called the game engine.
Digital
content creation (DCC) tools were created, but the data structures and
algorithms used for modeling did not match the data that could be
processed in real time by the application. DCC tools were also used by
the movie industry for the production of computer-generated movies in
which an advanced rendering engine was attached to the tool to produce
a set of still images that compose the frames of the movie.
DCC
tools and advanced rendering techniques, such as ray tracing [3] or
shader languages such as RenderMan [4], required more advanced concepts
than a real-time application could handle. Mathematical descriptions of
surfaces such as splines and Bézier surfaces became necessary in the
computer-aided design (CAD) market [5].
Interactive
applications needed both the advanced modeling techniques and the
simpler representation usable in real time. Because of this,
compilation techniques, used to create binary executable code from
high-level languages, were adapted for the content processing. The
database used in the DCC tool was therefore called the source. The data compiler takes the source data and creates the runtime data.
Figure 1.1. Content pipeline synopsis.
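To make the source-to-runtime step concrete, here is a minimal sketch of a data compiler, assuming a source database that stores a quadratic Bézier curve as three control points and a runtime that only accepts a packed vertex array. The function names and the binary layout are hypothetical and chosen purely for illustration.

```python
import struct


def bezier_point(p0, p1, p2, t):
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(
        u * u * a + 2.0 * u * t * b + t * t * c
        for a, b, c in zip(p0, p1, p2)
    )


def compile_curve(control_points, segments):
    """Turn a source-side curve into runtime data: a packed array of 2D vertices."""
    p0, p1, p2 = control_points
    vertices = [bezier_point(p0, p1, p2, i / segments) for i in range(segments + 1)]
    # Hypothetical runtime layout: little-endian vertex count,
    # followed by (x, y) pairs stored as 32-bit floats.
    blob = struct.pack("<I", len(vertices))
    for x, y in vertices:
        blob += struct.pack("<2f", x, y)
    return blob


source = ((0.0, 0.0), (1.0, 2.0), (2.0, 0.0))     # editable control points (source data)
runtime_blob = compile_curve(source, segments=4)  # fixed vertex array (runtime data)
```

The source form stays easy to edit (move one control point and recompile), while the runtime form is a flat buffer that can be consumed without evaluating any curve mathematics.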
The
runtime database soon became too large to fit in the memory of the
target system. The content had to be sliced and paged in real time,
depending on where the observers were. This quite challenging problem
sometimes required specific hardware assistance and necessitated a very
specific encoding of the content. Specific algorithms had to be
developed for terrain paging [6] and texture paging [7].
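The idea of paging content according to observer position can be sketched as follows. This is a simplified illustration that assumes the terrain is cut into a regular grid of square tiles; it is not the algorithm of references [6] or [7], and the function names, tile size, and radius are assumptions.

```python
def tiles_needed(observers, tile_size, radius):
    """Return the set of (col, row) tiles within `radius` tiles of any observer."""
    needed = set()
    for x, y in observers:
        col, row = int(x // tile_size), int(y // tile_size)
        for dc in range(-radius, radius + 1):
            for dr in range(-radius, radius + 1):
                needed.add((col + dc, row + dr))
    return needed


def page(resident, observers, tile_size=100.0, radius=1):
    """Compute which tiles to load and which to evict for the current observers."""
    needed = tiles_needed(observers, tile_size, radius)
    return needed - resident, resident - needed


# Tiles in memory around the starting position (tile (1, 1)).
resident = tiles_needed([(150.0, 150.0)], 100.0, 1)

# The observer moves one tile east: one column is loaded, one is evicted.
to_load, to_evict = page(resident, [(250.0, 150.0)])
```

With several observers, as in a split-screen game, the needed set is simply the union of the tiles around each one.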
Because
of economic constraints, severe limitations exist in hardware targeted
by the game industry. The data must be organized as efficiently as
possible. To improve performance, many game applications use
their own file system and combine the various elements in a single
file. Some developers run optimization programs for hours to find the
best placement of the elements for better interactivity. The idea is to
optimize seek time and place data accordingly. This is very similar to
the optimization section of a compiler, with a complexity similar to
the NP-complete traveling salesman problem [8].
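Because an exact solution to this kind of placement problem is impractical for large data sets, heuristics are a common compromise. The sketch below uses a simple greedy nearest-neighbor ordering, which is an assumption made for illustration and not the method of any particular game developer. The `affinity` table (how often two assets are read one after the other) and the asset names are hypothetical.

```python
def greedy_layout(assets, affinity, start):
    """Order assets so each is followed by the unplaced asset it is most often read with."""
    order = [start]
    remaining = set(assets) - {start}
    while remaining:
        last = order[-1]
        # Highest affinity to the last placed asset wins; ties go to the
        # alphabetically first name because max() keeps the first maximum.
        nxt = max(
            sorted(remaining),
            key=lambda a: affinity.get(frozenset((last, a)), 0),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


assets = ["level1_mesh", "level1_tex", "level2_mesh", "menu_tex"]
affinity = {
    frozenset(("level1_mesh", "level1_tex")): 9,
    frozenset(("level1_tex", "level2_mesh")): 4,
    frozenset(("menu_tex", "level1_mesh")): 1,
}
layout = greedy_layout(assets, affinity, start="menu_tex")
```

A greedy pass like this runs quickly but gives no optimality guarantee, which is one reason an offline optimizer might search for hours to find a better placement.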
Another
example of paging technology used outside the game industry is the
Google Earth application [9]. This application enables the user to look
down on the planet from any altitude and render a view based on
satellite altimetry and imagery information. It is the result of
generalizing the terrain- and image-paging technology developed for
high-end simulation applications [10].