
Implementing Modular HLSL with RenderMonkey
One of the largest problems with getting shaders into a game seems to
be the learning curve associated with shaders. Simply stated, shaders
are not something that your lead graphics programmer can implement
over the weekend. There are two main issues with getting shaders
implemented in your game:
1. Understanding what shaders can do and how they replace the existing
graphics pipeline.
2. Getting the supporting code implemented into your game so that
you can use shaders as a resource.
In this article we're going to continue the series of Gamasutra articles
about shaders by examining how to make shaders work. The actual
integration of shader support is the stuff for a future article.
(Note: You don't need a high-end video card to try your hand at
writing shaders. All you need is the DirectX 9.0 SDK installed.
With that you can select the reference device (REF). While this
software driver will be slow, it'll still give you the same results
as a DirectX 9 capable video card.) RenderMonkey works on any hardware
that supports shaders, not just ATI's hardware.
If you have already read Wolfgang Engel's article, Mark Kilgard's and
Randy Fernando's Cg article, or you've perused
the DirectX 9 SDK documentation, then you've got a fairly good idea
of the capabilities of the High-Level Shader Language (HLSL) that's
supported by DirectX 9. HLSL, Cg, and the forthcoming OpenGL shading
language are all attempts to make it as easy to write shaders as
possible. You no longer have to worry (as much) about allocating
registers, using scratch variables, or learning a new form of assembly
language. Instead, once you've set up your stream data format and
associated your constant input registers with more user-friendly
labels, using shaders in a program is no more difficult than using
a texture.
Rather than go through the tedious setup on how to use shaders in your
program, I'll refer you to the DirectX 9 documentation. Instead,
I'm going to focus on a tool ATI created called RenderMonkey. While
RenderMonkey currently works with the DirectX high-level and low-level
shader languages, ATI and 3Dlabs are working to add support for OpenGL
2.0's shader language, which we should see in the
next few months. The advantage of a tool like RenderMonkey is that
it lets you focus on writing shaders, not worrying about infrastructure.
It has a nice hierarchical structure that lets you set up a default
rendering environment and make changes at lower levels as necessary.
Perhaps the biggest potential advantage of using RenderMonkey is
that the RenderMonkey files are XML files. Thus by adding a RenderMonkey
XML importer to your code or an exporter plug-in to RenderMonkey
you can use RenderMonkey files in your rendering loop to set effects
for individual passes. This gives RenderMonkey an advantage over
DirectX's FX files because you can use RenderMonkey as an effects
editor. RenderMonkey even supports an "artist's mode"
where only selected items in a pass are editable.
Using HLSL
While HLSL is very C-like in its semantics, there is the challenge of
relating the input and output of the shaders with what is provided
and expected by the pipeline. While shaders can have constants set
prior to their execution, when a primitive is rendered (i.e., when
some form of a DrawPrimitive call is made) the input for each vertex
shader is the vertex values provided in the selected vertex streams.
After the vertex shader has run, the pipeline rasterizes each primitive
into individual pixels and uses the (typically) interpolated vertex
outputs as input to the pixel shader, which then calculates the
resulting color(s) as its output. This is shown in Figure 1, which
traces the path from application space, through vertex processing, and
finally to a rendered pixel. The application space shows
where shaders and constants are set in blue text. The blue boxes
show where vertex and pixel shaders live in the pipeline.
The inputs to the vertex shader function contain the things you'd expect
like position, normals, colors, etc. HLSL can also use things like
blend weights and indices (used for things like skinning), and tangents
and binormals (used for various shading effects). The following
tables show the inputs and output for vertex and pixel shaders.
The [n] notation indicates an optional index.
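As a rough sketch of how such inputs look in HLSL source, the struct below binds each member to an input semantic. The struct name, member names, and the worldViewProj constant are hypothetical and chosen only for illustration; only the semantic names after the colons come from HLSL itself.

```hlsl
// Hypothetical constant; the application (or RenderMonkey) would set it.
float4x4 worldViewProj;

// Hypothetical vertex input layout. Member names are arbitrary;
// the semantic after each colon ties the member to vertex stream data.
struct VS_INPUT
{
    float4 position     : POSITION;      // object-space position
    float3 normal       : NORMAL;        // vertex normal
    float4 diffuse      : COLOR0;        // the [n] index in use: COLOR0, COLOR1, ...
    float2 uv           : TEXCOORD0;     // first set of texture coordinates
    float4 blendWeights : BLENDWEIGHT;   // skinning weights
    float4 blendIndices : BLENDINDICES;  // skinning matrix indices
    float3 tangent      : TANGENT;       // tangent, for tangent-space effects
    float3 binormal     : BINORMAL;      // binormal, for tangent-space effects
};

// Minimal vertex shader that only transforms the position.
float4 vs_main(VS_INPUT input) : POSITION
{
    return mul(input.position, worldViewProj);
}
```

A real shader would declare only the inputs that its vertex streams actually supply.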

The output of vertex shaders hasn't changed from the DirectX 8.1 days.
You can have up to two output colors, eight output texture coordinates,
the transformed vertex position, and a fog and point size value.

The output from the vertex shader is used to calculate the input for
the pixel shaders. Note there is nothing preventing you from placing
any kind of data into the vertex shader's color or texture coordinate
output registers and using them for some other calculations in the
pixel shader. Just keep in mind that the output registers might
be clamped and range limited, particularly on hardware that doesn't
support 2.0 shaders.
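The following is a minimal sketch of that idea, not code from RenderMonkey or the SDK. It passes a world-space normal through a texture coordinate output and uses it for diffuse lighting in the pixel shader. All names (worldViewProj, world, lightDir, baseMap, and the function names) are assumptions made for the example.

```hlsl
// Hypothetical constants set by the application.
float4x4 worldViewProj;
float4x4 world;
float3   lightDir;   // assumed: unit vector toward the light, in world space
sampler2D baseMap;   // hypothetical texture sampler

struct VS_OUTPUT
{
    float4 position    : POSITION;   // transformed vertex position
    float2 uv          : TEXCOORD0;  // an ordinary texture coordinate
    float3 worldNormal : TEXCOORD1;  // not a texture coordinate at all
};

VS_OUTPUT vs_main(float4 position : POSITION,
                  float3 normal   : NORMAL,
                  float2 uv       : TEXCOORD0)
{
    VS_OUTPUT output;
    output.position = mul(position, worldViewProj);
    output.uv = uv;
    // Rotating the normal by the upper 3x3 of the world matrix is only
    // valid when that matrix has no non-uniform scale.
    output.worldNormal = mul(normal, (float3x3)world);
    return output;
}

float4 ps_main(float2 uv          : TEXCOORD0,
               float3 worldNormal : TEXCOORD1) : COLOR
{
    // Interpolation does not preserve length, so renormalize per pixel.
    float3 n = normalize(worldNormal);
    float diffuse = saturate(dot(n, lightDir));
    return tex2D(baseMap, uv) * diffuse;
}
```

The sketch uses a texture coordinate output rather than a color output for the normal because color outputs are the ones most likely to be clamped to the 0 to 1 range.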

DirectX 8 pixel shaders supported only a single color register to specify
the final color of a pixel. DirectX 9 has support for multiple render
targets (for example, the back buffer and a texture surface simultaneously)
and multi-element textures (typically used to generate intermediate
textures used in a later pass). However, you'll need to check the
CAPS bits to see what's supported by your particular hardware. For
more information, check the DirectX 9 documentation. While RenderMonkey
supports rendering to a texture on one pass and reading it in another,
I'm going to keep the pixel shader simple in the following examples.
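To make the difference concrete, here is a hedged sketch of both cases: a simple pixel shader that returns one color, and a variant that writes two render targets. The sampler and function names are hypothetical, and the second shader will only work on hardware that reports support for more than one simultaneous render target.

```hlsl
sampler2D baseMap;   // hypothetical texture sampler

// The simple case: one color out, written to the current render target.
float4 ps_simple(float2 uv : TEXCOORD0) : COLOR0
{
    return tex2D(baseMap, uv);
}

// Multiple render targets: each member maps to one target.
struct PS_OUTPUT
{
    float4 color0 : COLOR0;   // first render target
    float4 color1 : COLOR1;   // second render target
};

PS_OUTPUT ps_mrt(float2 uv : TEXCOORD0)
{
    PS_OUTPUT output;
    float4 texel = tex2D(baseMap, uv);

    // Target 0 receives the texture color unchanged.
    output.color0 = texel;

    // Target 1 receives a grayscale version, as an arbitrary example
    // of an intermediate result that a later pass might read.
    float luminance = dot(texel.rgb, float3(0.299, 0.587, 0.114));
    output.color1 = float4(luminance, luminance, luminance, 1.0);
    return output;
}
```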

Aside from the semantics of the input and output mapping, HLSL gives you
a great deal of freedom to create shader code. In fact, HLSL looks
a lot like a version of C written for graphics (which
is why NVIDIA calls its C-like shader language Cg,
as in "C for Graphics"). If you're familiar with C (or
pretty much any procedural programming language) you can pick up
HLSL pretty quickly. What can be a bit intimidating, if you're not
expecting them, are the graphics traits of the language itself. Not only are there
the expected variable types of boolean, integer and float, but there's
also native support for vectors, matrices, and texture samplers,
as well as swizzles and masks that allow you to selectively
read, write, or replicate individual elements of vectors and matrices.
This is due to the single-instruction multiple-data (SIMD) nature of
the graphics hardware. An operation such as c = a * b, where a, b,
and c are all of type vector, results
in an element-by-element multiplication, since type vector
is an array of four floats. This is the same as computing
c.x = a.x * b.x, c.y = a.y * b.y, c.z = a.z * b.z, and
c.w = a.w * b.w, where
I've used the element selection swizzle and write masks to show
the individual operations. Since the hardware is designed to operate
on vectors, performing an operation on a vector is just as expensive
as performing one on a single float. A ps_1_x pixel shader can actually
perform one operation on the red-green-blue elements of a vector
while simultaneously performing a different operation on the alpha
element.
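A small sketch of these vector features follows. The function and variable names are hypothetical, and float4 is used in place of vector; with no type arguments, vector defaults to four floats, so the two are interchangeable here.

```hlsl
// Hypothetical pixel shader used only to exercise vector syntax.
float4 ps_vector_demo(float4 a : COLOR0, float4 b : COLOR1) : COLOR
{
    // One statement, four multiplications.
    float4 c = a * b;

    // The same result written one element at a time. The selector on
    // the left is a write mask; the ones on the right are swizzles.
    float4 d;
    d.x = a.x * b.x;
    d.y = a.y * b.y;
    d.z = a.z * b.z;
    d.w = a.w * b.w;

    float4 reversed   = a.wzyx;   // swizzle: read the elements in reverse order
    float4 replicated = a.xxxx;   // replicate: copy x into all four elements

    float4 masked = 0;            // scalar 0 is promoted to (0, 0, 0, 0)
    masked.xy = b.zw;             // write mask: only x and y are written

    // c and d are equal, so (c - d) contributes zero; everything is
    // combined just so that each value is used.
    return (c - d) + reversed * replicated + masked;
}
```

The .rgba selectors can be used in place of .xyzw when the vector holds a color, though the two sets cannot be mixed within a single swizzle.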
In addition to graphics-oriented data types there is also a collection
of intrinsic functions that are oriented to graphics, such as dot
product, cross product, vector length and normalization functions,
etc. The language also supports things like multiplication of vectors
by matrices and the like. Talking about it is one thing, but it's
much easier to comprehend when you have an example in front of
you, so let's start programming.
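As a first taste, here is a minimal sketch of a vertex shader that exercises the intrinsics just mentioned. The constants worldViewProj and lightDir, the struct, and the function name are assumptions made for the example, and the output color is contrived so that every intermediate value gets used.

```hlsl
// Hypothetical constants set by the application.
float4x4 worldViewProj;
float3   lightDir;   // assumed: unit vector toward the light, in object space

struct VS_OUTPUT
{
    float4 position : POSITION;
    float4 color    : COLOR0;
};

VS_OUTPUT vs_intrinsics(float4 position : POSITION,
                        float3 normal   : NORMAL,
                        float3 tangent  : TANGENT)
{
    VS_OUTPUT output;

    float  rawLength = length(normal);   // vector length
    float3 n = normalize(normal);        // scale to unit length
    float3 t = normalize(tangent);
    float3 b = cross(n, t);              // perpendicular to both n and t
    float  nDotL = dot(n, lightDir);     // cosine of the angle between them

    // Vector-by-matrix multiplication.
    output.position = mul(position, worldViewProj);

    // Remap b from [-1, 1] to [0, 1] so it can be shown as a color,
    // then scale by the clamped lighting term and the original length.
    output.color = float4(b * 0.5 + 0.5, 1.0) * saturate(nDotL) * rawLength;
    return output;
}
```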