Shader Integration: Merging Shading Technologies on the Nintendo Gamecube

October 2, 2002, Page 2 of 5

Shading subsystem

To clarify the term shader, Figure 3 shows the data structure that defines one. A shader is a data structure that describes “how to compute colors” for rendering polygons. The term building a shader refers to the process of transforming such a data structure into a stream of GX commands, which configure the hardware to produce the desired output. Note that many features are activated dynamically during the shader build. For instance, if an object should be tinted, a color multiplication is appended to the final output color, regardless of which shader was set up before. Similarly, layered fog adds another stage at the end of the color computation; it blends between a fog color and the original pixel color, depending on the pixel’s height in world space and its distance from the camera in eye space. Of course, it would be nice if the complete shader subsystem were dynamic like that. However, experience shows that many shaders can only be brought into an optimal setup (i.e. one with the lowest cycle usage) by building them by hand.

typedef struct tShader {
    char                  mName[16];       // shader name, as in Maya...

    tShaderLimit          mShaderLimit;    // shading limit, i.e. type...
    tShaderFlags          mShaderFlags;    // additional flags...

    tShaderColor          mColor;          // material color, if used...
    tShaderColor          mSpecularColor;  // specular color, if used...

    f32                   mReflectivity;   // reflectivity strength...
    f32                   mEmbossScale;    // emboss depth...
    f32                   mMovieFade;      // movie shader weight...

    tShaderDataDescriptor mDataInfo;       // texture info...
} tShader;

Figure 3: A ‘tShader’ data structure describing how to render polygons.

Most of the structure’s members describe various properties of a shader (such as the material color, the specular color and cosine power for phong shaders, the reflectivity for reflective phong shaders, etc.). The most important member is mShaderLimit, which describes what kind of shading is actually performed. The term limit indicates the point at which the color computation should stop. Roughly, the following shading limits were implemented:

  • diffuse
  • mapped
  • illuminated mapped
  • phong
  • phong mapped
  • phong mapped gloss
  • phong illuminated mapped
  • phong illuminated mapped gloss
  • reflective phong
  • reflective phong mapped
  • reflective phong mapped gloss
  • reflective phong illuminated mapped
  • reflective phong illuminated mapped gloss
  • bump mapped
  • bump illuminated mapped
  • bump phong mapped
  • bump phong mapped gloss
  • bump phong illuminated mapped
  • bump phong illuminated mapped gloss
  • bump reflective phong mapped
  • bump reflective phong mapped gloss
  • bump reflective phong illuminated mapped
  • bump reflective phong illuminated mapped gloss
  • emboss mapped
  • emboss illuminated mapped

This list clearly shows that an automatic way of generating the shader setup code would be very nice, since all of the shaders listed above need to be maintained. However, as mentioned above, a couple of features, such as self-shadowing and tinting, were already added automatically.
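The split between hand-built setups and automatically added features can be pictured as a dispatch on the shading limit followed by optional extra stages. The sketch below is only an illustration under assumptions: the enum values, the flag names, the function sketch_CountStages and the stage counts are all hypothetical and do not come from the original code. It counts stages instead of emitting GX commands.

```c
#include <assert.h>

/* Hypothetical subset of shading limits; the real tShaderLimit is not shown. */
typedef enum {
    SKETCH_LIMIT_DIFFUSE,
    SKETCH_LIMIT_MAPPED,
    SKETCH_LIMIT_PHONG_MAPPED
} tSketchLimit;

/* Hypothetical flags for features that are added on top of any setup. */
enum {
    SKETCH_FLAG_TINT        = 1 << 0,
    SKETCH_FLAG_LAYERED_FOG = 1 << 1
};

/* Returns how many stages a setup would use. The per-limit stage counts
   are made up for illustration and say nothing about the real shaders. */
int sketch_CountStages(tSketchLimit limit, unsigned flags)
{
    int stages = 0;

    /* Hand-built part: one case per shading limit. */
    switch (limit) {
    case SKETCH_LIMIT_DIFFUSE:      stages = 1; break;
    case SKETCH_LIMIT_MAPPED:       stages = 1; break;
    case SKETCH_LIMIT_PHONG_MAPPED: stages = 2; break;
    }

    /* Dynamic part: appended regardless of which limit was built. */
    if (flags & SKETCH_FLAG_TINT)        stages += 1;
    if (flags & SKETCH_FLAG_LAYERED_FOG) stages += 1;

    return stages;
}
```

The point of the layout is that the dynamic features live in one place, so they do not multiply the number of hand-maintained cases.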

typedef struct tConfig {

    // resources currently allocated...
    GXTevStageID stage;   // current tevstage...
    GXTexCoordID coord;   // current texture coordinate...
    GXTexMapID   map;     // current texture map...

    // ...

} tConfig;

void shadingConfig_ResetAllocation(tConfig *pConfig);
void shadingConfig_Allocate(tConfig *pConfig, s32 stages, s32 coords, s32 maps);
void shadingConfig_Flush(tConfig *pConfig);

Figure 4: Data structure ‘tConfig’ describing resource allocation.

Since a shader setup is built by various functions, one needs to keep track of various hardware resources. For example, texture environment stages, texture matrices, texture coordinates and texture maps need to be allocated sequentially. Some resources have special ordering requirements; for example, texture coordinates for emboss mapping always need to be generated last. Therefore, infrastructure is needed to deal with these allocation problems. The solution here is another tiny data structure, tConfig (cf. Figure 4).

This structure holds information about the resources used during setup. Before any resources are used, the allocation information is reset using the shadingConfig_ResetAllocation(); call. For each tevstage, texture coordinate, and so on that is used, a call to shadingConfig_Allocate(); is made, which marks the corresponding resources as allocated. When a subroutine that inserts additional GX commands is called, the tConfig structure is passed along as a parameter. The called function can then look at the structure’s members and knows which tevstage, texture coordinate, and so on to use next. When the shader construction is done, the function shadingConfig_Flush(); is called, which actually passes the number of resources used to the hardware. Error checking can be performed here as well.
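The allocation protocol described here might look roughly like the following sketch. It is not the original implementation: it uses plain integers instead of the GX enum types, all sketch-prefixed names are hypothetical, and the flush only records the counts instead of talking to the hardware. The limits of 16 tevstages, 8 texture coordinates and 8 texture maps are those of the GameCube hardware.

```c
#include <assert.h>

#define SKETCH_MAX_STAGES 16
#define SKETCH_MAX_COORDS 8
#define SKETCH_MAX_MAPS   8

/* Plain-integer stand-in for tConfig (the real one uses GX enum types). */
typedef struct {
    int stage;          /* next free tevstage */
    int coord;          /* next free texture coordinate */
    int map;            /* next free texture map */
    int flushedStages;  /* what the last flush would have sent */
    int flushedCoords;
} tSketchConfig;

void sketchConfig_ResetAllocation(tSketchConfig *c)
{
    c->stage = c->coord = c->map = 0;
    c->flushedStages = c->flushedCoords = 0;
}

void sketchConfig_Allocate(tSketchConfig *c, int stages, int coords, int maps)
{
    c->stage += stages;
    c->coord += coords;
    c->map   += maps;
}

/* Returns 1 on success, 0 if the setup exceeds the hardware limits. */
int sketchConfig_Flush(tSketchConfig *c)
{
    if (c->stage > SKETCH_MAX_STAGES ||
        c->coord > SKETCH_MAX_COORDS ||
        c->map   > SKETCH_MAX_MAPS) {
        return 0;
    }
    /* In real code this is presumably where calls such as
       GXSetNumTevStages and GXSetNumTexGens would go. */
    c->flushedStages = c->stage;
    c->flushedCoords = c->coord;
    return 1;
}

/* Example subroutine: appends one stage and reports which index it used. */
int sketch_AddTintStage(tSketchConfig *c)
{
    int stage = c->stage;
    sketchConfig_Allocate(c, 1, 0, 0);
    return stage;
}
```

A subroutine such as sketch_AddTintStage never needs to know what was built before it; it simply takes the next free index from the structure.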

Figure 5: Shading subsystem client control flow.

The shading subsystem needs to maintain a global GX state as well. This is because the shading subsystem is not the only client of GX. Other parts of the game program will issue GX commands and set up their own rendering methods in specific ways. However, since the shading subsystem has to initialize quite a bit of GX state to work properly, a protocol needs to be introduced to reduce the number of redundant state changes (cf. Figure 5). The straightforward solution of initializing the complete GX state every time the shading subsystem needs it is of course way too slow. Instead, a boolean variable keeps track of whether the shading subsystem has initialized GX for its usage. Every time the subsystem is about to be used, a function shading_PreRenderInit(); is called. This function checks the flag. If it is false, GX is initialized and the flag is set to true. The next time some shading needs to be done, the boolean is already true and the setup can be skipped. On the other hand, when other parts of the game program do some ‘hard coded’ GX usage, they need to reset that boolean by calling shading_GxDirty();.
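The flag protocol can be sketched in a few lines. The function names below mirror the ones in the text but carry a sketch prefix, since the bodies are assumptions; the counter exists only to make the skipped initializations visible and is not part of the described system.

```c
#include <assert.h>
#include <stdbool.h>

/* True while GX still holds the state the shading subsystem expects. */
static bool sGxInitialized = false;

/* Instrumentation for this sketch only: counts full initializations. */
static int sFullInitCount = 0;

/* Stand-in for the expensive full GX state setup. */
static void sketch_InitGxState(void)
{
    sFullInitCount++;
}

/* Called every time the shading subsystem is about to be used. */
void sketch_PreRenderInit(void)
{
    if (!sGxInitialized) {
        sketch_InitGxState();
        sGxInitialized = true;
    }
}

/* Called by any other code that has issued its own GX commands. */
void sketch_GxDirty(void)
{
    sGxInitialized = false;
}

int sketch_FullInitCount(void)
{
    return sFullInitCount;
}
```

Two consecutive calls to sketch_PreRenderInit cost one initialization; a call to sketch_GxDirty in between forces a second one.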

Finally, the shading subsystem keeps track of various default settings for the texture environment. If subsequent shader setups share some settings, a couple of GX commands can be skipped, since GX is already in a known state. When another shader is set up, the function shading_CleanupDirtyState(); cleans up the state that was marked dirty and leaves GX in the expected state. In the end, those optimizations helped quite a bit to maintain a reasonable frame rate.

Lighting Pipeline

The actual purpose of the so-called lighting pipeline is to deliver light color values per shaded pixel. As mentioned before, all lights are classified into either global or local lights and the methods of computing light color values vary. The results of global lighting can be computed in three different ways: per vertex, per pixel using emboss mapping, and per pixel using bump mapping.

All three of these methods come in two variants: one with self-shadowing and one without. When self-shadowing is enabled, the directional component of the global light is not added to the output color value if the pixel to be shaded falls in shadow. The ambient color is then the only term that contributes to the global lighting. The conditional add is facilitated by using the two different color channels GX_COLOR0 and GX_COLOR1. The first one carries the directional component of the global light, whereas the second one is assigned to all local lights and the ambient term.

Local lights are always computed per vertex using the lighting hardware and are fed into GX_COLOR1. Note that both channels are combined when self-shadowing is not enabled. There is a tiny problem when color per vertex is used for painting and two channels are used. The hardware is not able to feed one set of color per vertex data into both channels without the same data being sent twice into the graphics processor. Therefore, one needs to decide whether the local lights are computed unpainted (which only leads to visible artifacts if local lights are contributing) or whether color per vertex data is sent twice, once for each color channel, eating up vertex performance. Experience showed that not painting the local lights was quite acceptable, and nobody really noticed.
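Expressed as plain arithmetic, the conditional add of the two channels could be sketched as follows. This is a CPU-side illustration only; on the hardware the combination happens in the texture environment. The type and function names are hypothetical, and clamping the sum to 1.0 is an assumption of this sketch.

```c
#include <assert.h>

typedef struct { float r, g, b; } tSketchColor;

/* Clamp to the displayable maximum (assumed behavior for this sketch). */
static float sketch_Saturate(float v)
{
    return v > 1.0f ? 1.0f : v;
}

/* color0: directional component of the global light (GX_COLOR0).
   color1: local lights plus the ambient term (GX_COLOR1).
   The directional part is left out only when self-shadowing is enabled
   and the pixel is in shadow. */
tSketchColor sketch_CombineChannels(tSketchColor color0, tSketchColor color1,
                                    int selfShadowing, int inShadow)
{
    tSketchColor out = color1;

    if (!(selfShadowing && inShadow)) {
        out.r = sketch_Saturate(out.r + color0.r);
        out.g = sketch_Saturate(out.g + color0.g);
        out.b = sketch_Saturate(out.b + color0.b);
    }
    return out;
}
```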

The control flow of the lighting pipeline is a bit tricky. The problem here is that, at an object’s rendering time, all local lights that possibly intersect the object’s bounding sphere need to be known (cf. Figure 6).

Figure 6: Lighting pipeline, control flow.

This requires the creation of a local light database that holds information about all local lights influencing the visible geometry. The game program needs to add all local lights to the database before any rendering takes place. Care needs to be taken when culling lights for visibility against the view frustum. The reason is that the distance attenuation function used by GX by default has no precise cutoff point. Therefore, setting up a point light with a 50m radius does not mean that the light contributes nothing to polygons farther than 50m away. Lights will pop on and off if they are collected by software culling that assumes a 50m radius. A fudge factor of 2.0f proved to be quite successful here.
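One plausible reading of the fudge factor is that the light's radius is scaled before the culling test. The sketch below makes that assumption explicit; the plane representation, the names and the sphere-versus-plane test are illustrative choices and are not taken from the original code.

```c
#include <assert.h>

typedef struct { float x, y, z; } tSketchVec3;

/* Plane with an inward-pointing unit normal n and offset d:
   a point p is inside when dot(n, p) + d >= 0. */
typedef struct { tSketchVec3 n; float d; } tSketchPlane;

/* Assumption: the fudge factor scales the nominal light radius. */
#define SKETCH_LIGHT_FUDGE 2.0f

/* Returns 1 if the inflated light sphere touches the volume bounded by
   the given planes, 0 if it lies completely outside at least one plane. */
int sketch_LightIsVisible(const tSketchPlane *planes, int numPlanes,
                          tSketchVec3 center, float radius)
{
    float inflated = radius * SKETCH_LIGHT_FUDGE;
    int i;

    for (i = 0; i < numPlanes; i++) {
        float dist = planes[i].n.x * center.x +
                     planes[i].n.y * center.y +
                     planes[i].n.z * center.z + planes[i].d;
        if (dist < -inflated) {
            return 0;   /* entirely behind this plane */
        }
    }
    return 1;
}
```

With a single plane at x = 0, a 50m light centered 75m behind the plane would be rejected by an exact test but is kept by this one, which is the behavior that avoids the popping.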

Once all visible lights are added, rendering can begin. First, all logical lights are transformed into an array of GXLightObj objects. This array is double (or triple) buffered so that the graphics processor always has a private copy to read from while the CPU is generating the array for the next frame. An array is constructed because each object is likely to receive its own set of local lights. Instead of storing many copies of GXLightObj objects in the FIFO using GXLoadLightObjImm();, we instruct the graphics processor to fetch lights indirectly from the constructed array using the faster GXLoadLightObjIdx(); function.
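The buffering scheme for the light array can be sketched as below. The light structure is a hypothetical stand-in for GXLightObj, the array size is arbitrary, and the steps that make the array visible to the graphics processor are deliberately left out.

```c
#include <assert.h>

#define SKETCH_NUM_BUFFERS 2    /* 3 would give triple buffering */
#define SKETCH_MAX_LIGHTS  64   /* arbitrary size for this sketch */

/* Hypothetical stand-in for GXLightObj. */
typedef struct {
    float pos[3];
    float color[3];
} tSketchLightObj;

static tSketchLightObj sLightBuffers[SKETCH_NUM_BUFFERS][SKETCH_MAX_LIGHTS];
static int sWriteBuffer = 0;

/* Switches to the next buffer and returns the array the CPU may fill for
   the frame being built. The previously returned array stays untouched,
   so the graphics processor can keep reading from it. */
tSketchLightObj *sketch_BeginLightFrame(void)
{
    sWriteBuffer = (sWriteBuffer + 1) % SKETCH_NUM_BUFFERS;
    return sLightBuffers[sWriteBuffer];
}
```

Lights are then referenced by their index into the current array, which is what allows them to be loaded by index rather than copied into the FIFO.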

As rendering of an object starts, all intersecting lights are collected (up to a maximum of eight lights) and loaded from the array. Note that one should remember which lights are loaded to avoid unnecessary loads. The light mask (an eight bit value as sent to GXSetChanCtrl();) is then computed; it describes which lights actually contribute to the lighting calculation in hardware. This mask value is the interface between the shading subsystem and the lighting pipeline. This is because GXSetChanCtrl(); not only specifies which lights are enabled but also how color per vertex is used and what kind of light calculation is performed. Therefore, parameters to this GX call come in from the two different subsystems at different times. This problem can be solved by storing the light mask value in an easily accessible variable and making sure that no bogus values are loaded in the case of the first shader setup (i.e. when no lights have been collected yet).
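The per-object collection step might be sketched as follows, assuming one mask bit per hardware light slot and a simple sphere-versus-sphere test. All names are hypothetical, and the actual load is replaced by a counter so that the effect of remembering loaded lights can be observed.

```c
#include <assert.h>

#define SKETCH_MAX_HW_LIGHTS 8

typedef struct { float x, y, z, radius; } tSketchSphere;

static int sketch_SpheresIntersect(const tSketchSphere *a, const tSketchSphere *b)
{
    float dx = a->x - b->x, dy = a->y - b->y, dz = a->z - b->z;
    float r = a->radius + b->radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

/* Fills slots[] with the array indices of the lights that touch the object
   (at most eight) and returns a mask with bit i set for each used slot i. */
unsigned sketch_CollectLights(const tSketchSphere *object,
                              const tSketchSphere *lights, int numLights,
                              int slots[SKETCH_MAX_HW_LIGHTS])
{
    unsigned mask = 0;
    int used = 0, i;

    for (i = 0; i < numLights && used < SKETCH_MAX_HW_LIGHTS; i++) {
        if (sketch_SpheresIntersect(object, &lights[i])) {
            slots[used] = i;
            mask |= 1u << used;
            used++;
        }
    }
    return mask;
}

/* Remembers which array index sits in each slot to skip redundant loads. */
static int sLoadedIndex[SKETCH_MAX_HW_LIGHTS] = {-1, -1, -1, -1, -1, -1, -1, -1};
static int sLoadCount = 0;

void sketch_LoadSlot(int slot, int lightIndex)
{
    if (sLoadedIndex[slot] != lightIndex) {
        /* Real code would issue the indexed light load here. */
        sLoadedIndex[slot] = lightIndex;
        sLoadCount++;
    }
}

int sketch_LoadCount(void)
{
    return sLoadCount;
}
```

An object that touches no lights yields a mask of zero, which is also a safe value to hand to the first shader setup before any lights have been collected.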
