Sh is a free, open-source system developed at the University
of Waterloo that lets you program graphics processors
(GPUs) in C++. Basically, it consists of a C++ library
that supports run-time specification and compilation
of GPU shader programs. This library uses operator
overloading to build a clean, high-level API, so that
defining Sh programs is as straightforward as defining
functions in C++, and as expressive as writing shaders
in a specialized shading language. In addition, Sh
integrates with the scope rules of C++ in such a way
that all the capabilities of C++ can be used to manipulate
and modularize GPU code, including classes, templates,
functions, and user-defined types. No additional glue
code is required to bind Sh programs to the host application:
they act like an extension of it. For instance, shader
parameters and textures can just be declared as variables,
then used inside shader definitions, and Sh will do
the rest. Sh can be used as a shading language, for
complex multipass rendering algorithms, or to implement
general-purpose stream computations (such as simulation).
Why should you, as a game developer, be interested
in Sh? First of all, Sh is a much more powerful, modular,
and complete programming system than other available
real-time shading languages. It's more than a shading
language: it also tracks textures and shader parameters,
and the associations between these and shaders. Using
the object-oriented features of C++, you can share
code between shaders, and encapsulate complex algorithms
and data representations so that they can be more
easily reused. You can also easily use Sh to build
custom compilers to convert your own data into shaders
(metaprogramming). In general, you can write more
sophisticated shaders far more efficiently in Sh than
in other systems. Second, Sh can improve your productivity
by eliminating a lot of the annoyances and glue code
requirements of other shading languages. Third, you
can use Sh to accelerate other game engine computations,
such as simulations and AI. Since Sh compiles to both
the GPU and the CPU, writing these components in Sh
does not commit you to running them on the GPU. You
can even defer that decision to runtime. For instance,
you could profile the GPU and CPU at install time
and decide to run simulation components on whichever
processor leads to a load-balanced system. In general,
Sh makes all the computational capabilities of a system
available to you with a common interface.
In the future, we hope that Sh's ability to encapsulate
data representations and algorithms will lead to a
large set of implementations of advanced algorithms
being made available as Sh classes and functions by
researchers. Sh is also a useful platform for shader
compiler research since it is completely open source.
Finally, we plan to extend Sh to a number of other
compilation targets, including parallel machines and
game platforms. Sh's conceptual model is platform-,
vendor-, and API-independent. Ideally, this will ease
porting between different platforms and allow greater
reuse of code.

The Sh Architecture

Let's begin by looking at how Sh is structured. The
library is built around a set of classes, such as
ShPoint3f, ShVector3f, or ShMatrix4x4f,
that can be used directly as a graphics utility library.
A number of useful operators and functions are defined
that act on objects of these classes. You can add
or subtract vectors, take dot or cross products, do
matrix/vector and matrix/point transformations, normalize
vectors to unit length, and so forth. Sh also supports
swizzling (extraction and rearrangement of elements
of a tuple or matrix) and writemasking (assignment
to only some elements of a tuple or matrix).
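
To make the two terms concrete, here is a minimal sketch in plain C++. It does not use Sh; `Tuple3`, `swizzle`, and `write_masked` are hypothetical names chosen for illustration, and Sh's own syntax for these operations differs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 3-tuple used only for illustration; this is not an Sh type.
struct Tuple3 {
    std::array<float, 3> v;

    // Swizzle: build a new tuple whose elements are picked (and possibly
    // reordered or repeated) from this one by index.
    Tuple3 swizzle(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < 3 && j < 3 && k < 3);
        return Tuple3{{v[i], v[j], v[k]}};
    }

    // Writemask: assign to elements i and j only; the third is untouched.
    void write_masked(std::size_t i, std::size_t j, float a, float b) {
        assert(i < 3 && j < 3 && i != j);
        v[i] = a;
        v[j] = b;
    }
};
```

A swizzle produces a new value and leaves its source alone, while a writemask modifies part of an existing value in place.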
You can specify operations on Sh objects in two modes.
In immediate mode, which is the default, operations
take place as soon as they are specified. In retained
mode, rather than executing a sequence of operations,
Sh records them in a program object. Retained
mode is indicated by wrapping a section of code in
the keywords SH_BEGIN_PROGRAM and SH_END.
Recorded operation sequences can then be compiled
for a specified target (usually the GPU, although
Sh can also dynamically generate code for the host
CPU). Program objects can be loaded into the vertex
and fragment shader units of GPUs, in which case they
affect rendering with standard graphics APIs. Alternatively,
they can be used directly as stream functions for
general-purpose computation, without any need to invoke
a graphics API.
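
The difference between the two modes can be sketched with a toy recorder in plain C++. This is only an illustration of the general operator-overloading idea, not Sh's implementation: `Value`, `Program`, `Instr`, and `g_recording` are all hypothetical names, and a tiny interpreter stands in for a real compiler backend.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One recorded instruction: dst = lhs (op) rhs, over register indices.
struct Instr { char op; std::size_t dst, lhs, rhs; };

// A recorded operation sequence. run() is a toy interpreter standing in
// for a backend; registers 0 and 1 hold the two inputs.
struct Program {
    std::vector<Instr> code;
    std::size_t num_regs = 0;

    float run(float a, float b) const {
        assert(num_regs >= 2 && !code.empty());
        std::vector<float> r(num_regs, 0.0f);
        r[0] = a;
        r[1] = b;
        for (const Instr& in : code)
            r[in.dst] = (in.op == '+') ? r[in.lhs] + r[in.rhs]
                                       : r[in.lhs] * r[in.rhs];
        return r[code.back().dst];
    }
};

Program* g_recording = nullptr;  // non-null while in "retained mode"

struct Value {
    float x;          // concrete value, meaningful in immediate mode
    std::size_t reg;  // register index, meaningful in retained mode
};

Value apply(char op, Value a, Value b) {
    if (!g_recording)  // immediate mode: compute the result right away
        return Value{op == '+' ? a.x + b.x : a.x * b.x, 0};
    // Retained mode: append an instruction instead of computing anything.
    std::size_t dst = g_recording->num_regs++;
    g_recording->code.push_back(Instr{op, dst, a.reg, b.reg});
    return Value{0.0f, dst};
}

Value operator+(Value a, Value b) { return apply('+', a, b); }
Value operator*(Value a, Value b) { return apply('*', a, b); }
```

The same expression, such as `a + b * b`, either yields a number immediately or appends two instructions to the active `Program`, depending only on whether recording is switched on. The recorded program can then be run later with different inputs.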
In addition to supporting the dynamic generation of
code, Sh also manages textures and streams. Textures
act like arrays and, like other parameters, are bound
to Sh programs using the scope rules of C++. This
means that data abstractions can be built around textures.
For instance, suppose you want to build a special
compressed texture type that is decompressed by a
particular sequence of shader code. You can declare
a class that encapsulates a built-in texture class
to store the compressed data, but redefines the access
operators to insert the necessary code into the calling
shader. If your new class supports the same interface
as one of the built-in textures, it can be used anywhere
they can be used.
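
The pattern can be sketched without Sh as follows. The classes here are hypothetical stand-ins: `PlainTexture` plays the role of a built-in texture, `QuantizedTexture` stores 8-bit codes and decodes them in its access operator, and `row_average` is generic code that works with either. The quantization scheme is an assumption picked to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a built-in texture type: a 2D array of floats.
class PlainTexture {
public:
    PlainTexture(std::size_t w, std::size_t h) : w_(w), h_(h), data_(w * h, 0.0f) {}
    float operator()(std::size_t x, std::size_t y) const {
        assert(x < w_ && y < h_);
        return data_[y * w_ + x];
    }
    void set(std::size_t x, std::size_t y, float v) {
        assert(x < w_ && y < h_);
        data_[y * w_ + x] = v;
    }
private:
    std::size_t w_, h_;
    std::vector<float> data_;
};

// Same interface, but values in [lo, hi] are stored as 8-bit codes and
// decoded on every read, so callers never see the compressed form.
class QuantizedTexture {
public:
    QuantizedTexture(std::size_t w, std::size_t h, float lo, float hi)
        : w_(w), h_(h), lo_(lo), hi_(hi), codes_(w * h, 0) { assert(hi > lo); }
    float operator()(std::size_t x, std::size_t y) const {
        assert(x < w_ && y < h_);
        return lo_ + (hi_ - lo_) * (codes_[y * w_ + x] / 255.0f);
    }
    void set(std::size_t x, std::size_t y, float v) {
        assert(x < w_ && y < h_);
        float t = std::min(std::max((v - lo_) / (hi_ - lo_), 0.0f), 1.0f);
        codes_[y * w_ + x] = static_cast<std::uint8_t>(std::lround(t * 255.0f));
    }
private:
    std::size_t w_, h_;
    float lo_, hi_;
    std::vector<std::uint8_t> codes_;
};

// Generic "shader-side" code written once against the shared interface.
template <typename Texture>
float row_average(const Texture& tex, std::size_t y, std::size_t width) {
    assert(width > 0);
    float sum = 0.0f;
    for (std::size_t x = 0; x < width; ++x) sum += tex(x, y);
    return sum / static_cast<float>(width);
}
```

Because both classes expose the same access operator, `row_average` never needs to know which representation it is reading.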
Streams are used to support a general-purpose computational
model on GPUs. A stream program is like a shader:
it is a function that maps a certain number of inputs
to a certain number of outputs. Stream objects in
Sh are similar to textures. They refer to a sequence
of data in memory that can be acted upon or generated
by stream programs. Stream programs can be applied
to streams with a simple operator or function call
syntax. Streams can also be decomposed into or constructed
from individual channels of data. A sophisticated
stream syntax is provided that supports many advanced
features, such as shared substreams, conversion of
parameters to inputs and the reverse, program composition,
and currying.
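
As a rough CPU-only analogy for these ideas, the sketch below models a stream as a `std::vector<float>` and a stream program as a `std::function`. None of the names (`Stream`, `Kernel`, `compose`, `bind_first`, `zip_apply`) come from Sh, and the choice of `<<` for application is simply an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Stream  = std::vector<float>;
using Kernel  = std::function<float(float)>;
using Kernel2 = std::function<float(float, float)>;

// Apply a one-input kernel independently to every element of a stream.
Stream operator<<(const Kernel& k, const Stream& s) {
    Stream out;
    out.reserve(s.size());
    for (float x : s) out.push_back(k(x));
    return out;
}

// Program composition: the result computes f(g(x)).
Kernel compose(Kernel f, Kernel g) {
    return [f, g](float x) { return f(g(x)); };
}

// Currying: fix the first input of a two-input kernel to a constant,
// turning what was a per-element input into a parameter.
Kernel bind_first(Kernel2 k, float a) {
    return [k, a](float x) { return k(a, x); };
}

// Combine two channels of equal length with a two-input kernel.
Stream zip_apply(const Kernel2& k, const Stream& a, const Stream& b) {
    assert(a.size() == b.size());
    Stream out;
    out.reserve(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out.push_back(k(a[i], b[i]));
    return out;
}
```

Each output element depends only on the corresponding input elements, which is the property that lets a stream program be mapped onto parallel hardware.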

Example: Blinn-Phong Shader

The simplest way to introduce Sh is with some examples.
The following code defines a Blinn-Phong shader for
a single point source (the shader equivalent of "Hello
World") by defining a vertex shader and a fragment
shader. This shader will also transform vertices into
view space for lighting and into device space for
rendering. A rendering produced with this shader is
given in Figure 1.
First we will define a number of global variables giving
the transformation matrices and the parameters of
the lighting model:

    ShMatrix4x4f modelview;          // MCS to VCS transformation
    ShMatrix4x4f perspective;        // VCS to DCS transformation
    ShColor3f phong_kd;              // diffuse color
    ShColor3f phong_ks;              // specular color
    ShAttrib1f phong_spec_exp;       // specular exponent
    ShPoint3f phong_light_position;  // VCS light position
    ShColor3f phong_light_color;     // light source color
    ShProgram phong_vert, phong_frag;

We will build the vertex and fragment shaders themselves
in an initialization function as follows:

    void phong_init () {
        // Create vertex shader
        phong_vert = SH_BEGIN_PROGRAM("gpu:vertex") {
            // Declare shader inputs
            ShInputNormal3f nm;     // normal vector (MCS)
            ShInputPosition3f pm;   // position (MCS)

            // Declare shader outputs
            ShOutputNormal3f nv;    // normal (VCS)
            ShOutputVector3f lv;    // light-vector (VCS)
            ShOutputVector3f vv;    // view vector (VCS)
            ShOutputColor3f ec;     // irradiance
            ShOutputPosition4f pd;  // position (HDCS)

            // Specify shader computations
            ShPoint3f pv = (modelview | pm)(0,1,2);
            vv = normalize(-pv);
            lv = normalize(phong_light_position - pv);
            nv = normalize(modelview | nm);
            ec = phong_light_color * pos(nv|lv);
            pd = perspective | pv;
        } SH_END; // End of vertex shader

        // Create fragment shader
        phong_frag = SH_BEGIN_PROGRAM("gpu:fragment") {
            // Declare shader inputs
            ShInputNormal3f nv;     // normal (VCS)
            ShInputVector3f lv;     // light-vector (VCS)
            ShInputVector3f vv;     // view vector (VCS)
            ShInputColor3f ec;      // irradiance

            // Declare shader outputs
            ShOutputColor3f fc;     // fragment color

            // Specify shader computations
            vv = normalize(vv);
            lv = normalize(lv);
            nv = normalize(nv);
            ShVector3f hv = normalize(lv + vv);
            fc = phong_kd * ec;
            fc += phong_ks * pow(pos(hv|nv), phong_spec_exp);
        } SH_END; // End of fragment shader
    } // End of phong_init

We have wrapped Sh shader program definitions in the
SH_BEGIN_PROGRAM and SH_END keywords. The SH_BEGIN_PROGRAM
keyword returns a program object that will represent the
recorded sequence of operations. Inputs and outputs to the
program objects are indicated by appropriate Input
and Output prefixes on instances of Sh types.
The "|" operator is used for dot product
and matrix multiplication, although you can also use
a dot function for the former and "*" for
the latter.
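
For readers who want to check the arithmetic, the lighting computation from the two shaders can be mirrored on the CPU in plain C++. This is a single-channel sketch that does not use Sh: `Vec3` and `blinn_phong` are hypothetical names, and the `|`, `normalize`, and `pos` definitions below are simple stand-ins that only imitate the behavior described in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(Vec3 a, float s) { return Vec3{a.x * s, a.y * s, a.z * s}; }

// Dot product spelled with "|", imitating the notation used in the listing.
float operator|(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v | v);
    assert(len > 0.0f);  // a zero vector has no direction
    return v * (1.0f / len);
}

// Clamp negative values to zero.
float pos(float x) { return std::max(x, 0.0f); }

// One color channel, one light. As in the listing, the diffuse term is
// scaled by the light color and the specular term is not.
float blinn_phong(Vec3 n, Vec3 l, Vec3 v,
                  float kd, float ks, float spec_exp, float light_color) {
    n = normalize(n);
    l = normalize(l);
    v = normalize(v);
    Vec3 h = normalize(l + v);               // half vector
    float ec = light_color * pos(n | l);     // irradiance (vertex shader)
    return kd * ec + ks * std::pow(pos(h | n), spec_exp);
}
```

With the normal, light vector, and view vector all aligned, both dot products equal one and the result reduces to `kd * light_color + ks`.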
Once defined, the program objects phong_vert and phong_frag
can be loaded into the vertex and fragment shading
units of the GPU using the shBind API call. You can
now use a normal graphics API to
specify geometry, and the shaders will be applied
to that geometry. Right now, Sh only supports OpenGL,
although we are working on a DirectX binding and it
should be available soon. In your graphics API, you
need to set up the correct vertex attributes for the
shaders you have loaded, and fragment and vertex shader
pairs need to be consistent in their inputs and outputs.
A set of rules based on type and order of declaration
defines how shader inputs map onto vertex attributes.
You can also ask program objects for a human-readable
string describing the interface binding.
The "uniform" parameters of these shaders, that is,
the values that are the same for all shaded vertices
or fragments, such as phong_kd,
are simply referenced directly by the shader definitions.
No additional glue code is required to set up these
parameters, and a simple assignment (outside of a
shader definition) is all that is needed to modify
one. Which parameters get bound to each shader program
is controlled by the scope rules of C++. For instance,
we could have made the parameters data members of
a class and defined the shader program objects in
a member function. Then the member function creating
the shader programs would have picked up the data
members and an encapsulated shader would have been
created. In general, Sh is designed to integrate with
C++ cleanly, and most C++ modularity constructs can
be used with Sh programs. Many other programming
techniques are enabled by this integration, and by
the fact that C++ can manipulate Sh programs in arbitrary
ways at runtime.
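
The scoping idea has a loose analogy in ordinary C++ closures, sketched below without Sh. The class `Tinted` and its members are hypothetical: `gain` plays the role of a uniform parameter held as a data member, and `shade` plays the role of a program object built in the constructor that picks up that member by reference.

```cpp
#include <cassert>
#include <functional>

class Tinted {
public:
    float gain = 1.0f;                  // stands in for a uniform parameter
    std::function<float(float)> shade;  // stands in for a program object

    Tinted() {
        // The closure refers to the data member, not to a copy of its value.
        shade = [this](float x) { return gain * x; };
    }

    // The closure captures "this", so copying would leave it pointing at
    // the original object; disallow copies to keep the sketch safe.
    Tinted(const Tinted&) = delete;
    Tinted& operator=(const Tinted&) = delete;
};
```

After construction, a plain assignment such as `t.gain = 3.0f` changes what `t.shade` computes, with no separate binding step.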
If we wanted to texture map this shader, instead of
ShColor3f for phong_kd we could have used ShTexture2D<ShColor3f>.
Then we would have to modify the shader definitions
to pass in a texture coordinate, and then index the
texture object. The bindings of textures work in exactly
the same way as uniform parameters, so as with parameters,
we can create data abstractions using the object-oriented
features of C++.
The following code example brings several of these
techniques together. It encapsulates parameters as data
members in a class, and uses template arguments and
construction-time arguments to parameterize the shader.
A template class coordinates the vertex shader outputs
and fragment shader inputs; incidentally, this also
demonstrates the more generic, template-based mechanism
for declaring Sh types, which can also be used to declare
tuples of arbitrary length. Finally, the example uses
C++ control constructs to manipulate shader code: in
this case, by unrolling a loop to support multiple
light sources, with C++ arrays holding the light
source properties. We can also use ordinary C++ functions
to implement functions in shader code. This is roughly
how the standard library functions such as normalize
(and, in fact, the operators) are implemented. A rendering
produced with this shader is also given in Figure 1.

    template <int NLIGHTS>
    class BlinnPhong {
    public:
        // Declare parameters and textures as data members
        ShTexture2D<ShColor3f> kd;
        ShTexture2D<ShColor3f> ks;
        ShAttrib1f spec_exp;
        ShPoint3f light_position[NLIGHTS];
        ShColor3f light_color[NLIGHTS];

        // Declare I/O type to coordinate vertex and fragment shaders
        template <ShBindingType IO> struct VertFrag {
            ShPoint<4,IO,float> pv;     // position (VCS)
            ShTexCoord<2,IO,float> u;   // texture coordinate
            ShNormal<3,IO,float> nv;    // normal (VCS)
            ShColor<3,IO,float> ec;     // total irradiance
        };

        // Declare program objects for shaders
        ShProgram vert, frag;

        // Constructor: parameterized by texture resolution
        BlinnPhong (int res) : kd(res,res), ks(res,res) {
            // Create vertex shader
            vert = SH_BEGIN_PROGRAM("gpu:vertex") {
                // Declare shader inputs
                ShInputNormal3f nm;     // normal vector (MCS)
                ShInputTexCoord2f u;    // texture coordinate
                ShInputPosition3f pm;   // position (MCS)

                // Declare shader outputs
                VertFrag<SH_OUTPUT> vf;
                ShOutputPosition4f pd;  // position (HDCS)

                // Specify shader computations
                vf.pv = modelview | pm;
                vf.u = u;
                vf.nv = normalize(modelview | nm);
                pd = perspective | vf.pv;
                for (int i = 0; i < NLIGHTS; i++) {
                    ShVector3f lv =
                        normalize(light_position[i] - vf.pv(0,1,2));
                    vf.ec += light_color[i] * pos(vf.nv|lv);
                }
            } SH_END; // End of vertex shader

            // Create fragment shader
            frag = SH_BEGIN_PROGRAM("gpu:fragment") {
                // Declare shader inputs
                VertFrag<SH_INPUT> vf;

                // Declare shader outputs
                ShOutputColor3f fc;     // fragment color

                // Specify shader computations
                ShVector3f vv = normalize(-vf.pv(0,1,2));
                ShNormal3f nv = normalize(vf.nv);
                fc = kd(vf.u) * vf.ec;
                ShColor3f kst = ks(vf.u);
                for (int i = 0; i < NLIGHTS; i++) {
                    ShVector3f lv =
                        normalize(light_position[i] - vf.pv(0,1,2));
                    ShVector3f hv = normalize(lv + vv);
                    fc += kst * pow(pos(hv|nv),spec_exp)
                              * light_color[i];
                }
            } SH_END; // End of fragment shader
        } // End of constructor
    }; // End of BlinnPhong class

In the fragment shader, notice that a texture read
is indicated with the "()" operator on a
texture object, as in kd(vf.u) and ks(vf.u).
This operator treats the texture as a tabulated function
with a normalized texture coordinate range of 0 to
1 in each coordinate. Sh also supports the "[]"
operator for texture lookups, which works the same
but places a texel at each integer. The "[]"
lookup operator is useful when textures are being
used as arrays to hold data structures (for instance,
a ray-tracer accelerator). Sh also supports several
additional texture types for rectangular textures,
1D and 3D textures, and cube textures.
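
The difference between the two lookup styles can be illustrated with a small 1D table in plain C++. This sketch does not use Sh. `Table1D` is a hypothetical class, its `[]` operator takes an integer index for simplicity, and the nearest-texel rule used by `()` is an assumption; the filtering and texel placement Sh actually applies may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Table1D {
public:
    explicit Table1D(std::vector<float> texels) : texels_(std::move(texels)) {
        assert(!texels_.empty());
    }

    // Array-style lookup: index i addresses texel i directly.
    float operator[](std::size_t i) const {
        assert(i < texels_.size());
        return texels_[i];
    }

    // Function-style lookup: u in [0, 1] spans the whole table, whatever
    // its resolution. Picks the texel whose interval contains u.
    float operator()(float u) const {
        assert(u >= 0.0f && u <= 1.0f);
        std::size_t i = static_cast<std::size_t>(u * texels_.size());
        if (i == texels_.size()) i = texels_.size() - 1;  // u == 1 exactly
        return texels_[i];
    }

private:
    std::vector<float> texels_;
};
```

Code that uses `()` keeps working if the table is resampled to a different size, while code that uses `[]` addresses specific entries, which is what a data structure stored in a texture needs.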
Sh programs can write to inputs and read from outputs.
Writing to an input does not change the original data;
Sh inputs are passed by value. When mapping to a backend
that does not support these operations, Sh will introduce
an additional temporary automatically. Temporaries
(including automatically introduced temporaries) are
also always initialized to zero if their value is
used before they are assigned to. These transformations
are included as conveniences to simplify the input
code. For instance, the ability to use += on a zero-initialized
output is very useful if you want to accumulate several
sources of light in an output color but don't want
to keep track of which source is "first".
Several additional shader examples are given in Figure
2. Many of these examples use the Perlin and Worley
noise functions built into Sh. Wood, for example,
adds some Perlin noise to a quadratic function and
then feeds the result through a periodic sawtooth
function stored in a texture map. The Worley noise
functions are based on the kth nearest distance
to a set of procedurally generated feature points
(we use a jittered grid). The library function computing
the Worley noise functions can be parameterized with
different distance functions. The giraffe shader shown
here uses a Manhattan distance metric with the Worley
function and passes the distance to the closest feature
point through a threshold function.
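
A Worley-style distance function with a pluggable metric can be sketched in plain C++ as follows. This is not the Sh library function: the hash, the 3x3 cell search, and the names `worley`, `hash01`, and `giraffe_mask` are assumptions made for illustration. Searching only the neighboring cells is a common shortcut and is not guaranteed to find the true kth nearest point in every case.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <functional>
#include <vector>

using Metric = std::function<float(float, float)>;

float euclidean(float dx, float dy) { return std::sqrt(dx * dx + dy * dy); }
float manhattan(float dx, float dy) { return std::fabs(dx) + std::fabs(dy); }

// Deterministic hash of a grid cell to [0, 1); supplies the jitter.
float hash01(std::int32_t ix, std::int32_t iy, std::uint32_t salt) {
    std::uint32_t h = static_cast<std::uint32_t>(ix) * 374761393u
                    + static_cast<std::uint32_t>(iy) * 668265263u
                    + salt * 2246822519u;
    h = (h ^ (h >> 13)) * 1274126177u;
    h ^= h >> 16;
    return static_cast<float>(h >> 8) * (1.0f / 16777216.0f);  // top 24 bits
}

// Distance from (x, y) to the k-th nearest feature point (k = 1 is the
// closest), with one jittered feature point per unit grid cell.
float worley(float x, float y, int k, const Metric& metric) {
    assert(k >= 1 && k <= 9);
    int cx = static_cast<int>(std::floor(x));
    int cy = static_cast<int>(std::floor(y));
    std::vector<float> dist;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int gx = cx + dx, gy = cy + dy;
            float px = gx + hash01(gx, gy, 0u);
            float py = gy + hash01(gx, gy, 1u);
            dist.push_back(metric(px - x, py - y));
        }
    }
    std::sort(dist.begin(), dist.end());
    return dist[k - 1];
}

// Threshold the Manhattan distance to the closest feature point, in the
// spirit of the giraffe pattern described above.
bool giraffe_mask(float x, float y, float threshold) {
    return worley(x, y, 1, manhattan) < threshold;
}
```

Swapping `euclidean` for `manhattan` changes the cell shapes from rounded to diamond-like without touching the search code.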