Features

The Sh GPU Metaprogramming Toolkit
Stream Functions
Stream functions are generalizations of shaders compiled with
gpu:stream or cpu:stream as a target. Streams and channels are containers for
sequences of data that can be acted upon by stream
functions. Channels hold a sequence of data from a
basic tuple data type while streams represent particular
combinations of channels. Special operators are used
for stream function application and construction of
streams from channels: "<<" for application
and "&" for stream construction. For
instance, suppose we have a stream function f that
takes three input channels (a point, a normal, and
a color) and two output channels (a color and a scalar).
We could declare some channels as follows:
ShChannel<ShPoint3f> p;
ShChannel<ShNormal3f> n;
ShChannel<ShColor3f> c1, c2;
ShChannel<ShAttrib1f> d;
We can then create some streams as follows:
ShStream input_s = (p & n & c1);
ShStream output_s = (c2 & d);
After initializing the channels with data, we can apply
the stream function f
to generate data for the output streams:
output_s = f << input_s;
We don't have to create the intermediate streams if
we don't want to; we can just use the "&"
operator inline:
(c2 & d) = f << (p & n & c1);
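To make the semantics of the two operators concrete, here is a minimal standalone C++ sketch. It is not Sh code and does not use the Sh classes: Channel, Stream, and StreamFunction are hypothetical toy types, channels hold plain floats rather than tuples, and the result is returned by value instead of being written into existing output channels.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the Sh classes.
struct Channel { std::vector<float> data; };        // one value per element
struct Stream  { std::vector<Channel> channels; };  // ordered group of channels

// A stream function maps one value per input channel
// to one value per output channel.
struct StreamFunction {
    std::function<std::vector<float>(const std::vector<float>&)> kernel;
};

// "&" builds a stream from channels, left to right.
Stream operator&(const Channel& a, const Channel& b) { return Stream{{a, b}}; }
Stream operator&(Stream s, const Channel& c) {
    s.channels.push_back(c);
    return s;
}

// "<<" applies the function independently to every element of the input stream.
Stream operator<<(const StreamFunction& f, const Stream& in) {
    Stream out;
    const std::size_t n =
        in.channels.empty() ? 0 : in.channels.front().data.size();
    // All input channels must have the same length.
    for (const Channel& c : in.channels) assert(c.data.size() == n);

    std::vector<float> args(in.channels.size());
    for (std::size_t i = 0; i < n; ++i) {
        // Gather element i from each input channel.
        for (std::size_t k = 0; k < args.size(); ++k)
            args[k] = in.channels[k].data[i];
        const std::vector<float> result = f.kernel(args);
        // The first result decides how many output channels there are.
        if (out.channels.empty()) out.channels.resize(result.size());
        assert(result.size() == out.channels.size());
        // Scatter the results into the output channels.
        for (std::size_t k = 0; k < result.size(); ++k)
            out.channels[k].data.push_back(result[k]);
    }
    return out;
}
```

With three input channels p, n, and c1 and a kernel that returns two values, the expression f << (p & n & c1) yields a stream of two output channels, mirroring the shape of the example above.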
The "<<"
operator can also be used to apply programs to single
tuples, which then act like a channel all of whose
elements have that value. Partial evaluation (currying)
is also supported, so you don't have to provide the
inputs to a program all at once; you can give them
one at a time. The following is equivalent to the
above expression:
ShProgram g = f << p;
(c2 & d) = g << (n & c1);
To avoid data copies during partial evaluation, the channel
p is not read until the second line of the above example,
when the stream function actually executes. This "deferred
read" is also used when an input is a tuple rather
than a stream channel. This is actually an interesting
feature: "<<"
can be used to convert a "varying" input
attribute to a "uniform" parameter! Since
that's a useful operation, we also define its inverse,
specified with the ">>"
operator, that converts a parameter to an input. You
give the program on the left and an existing parameter
that program uses on the right. The result is a program
in which that parameter dependence has been removed
and with an additional input. This actually leads
to some interesting programming techniques. For instance,
if you have a value (like a texture coordinate) that
is used in a lot of places, rather than passing it
around everywhere you can just declare a global parameter,
then convert it into an input later.
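The binding behavior described above can be modeled in a few lines of standalone C++. This is a sketch under stated assumptions, not the Sh implementation: Program and run are hypothetical names, values are single floats, and a parameter released by ">>" goes back to its original argument position rather than being appended as an additional input.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model, NOT the Sh API. Each argument slot of the program is either
// a varying input (nullptr) or bound to a uniform parameter (its address).
struct Program {
    std::function<float(const std::vector<float>&)> body;
    std::vector<const float*> slots;
};

// "<<" binds the first unbound input to a value. Only the address is
// stored, so the value is read when the program runs ("deferred read").
// The bound variable must outlive the program.
Program operator<<(Program f, const float& value) {
    for (const float*& s : f.slots) {
        if (s == nullptr) { s = &value; return f; }
    }
    assert(false && "no unbound input left");
    return f;
}

// ">>" is the inverse: a parameter the program depends on becomes
// a varying input again.
Program operator>>(Program f, const float& param) {
    for (const float*& s : f.slots) {
        if (s == &param) { s = nullptr; return f; }
    }
    assert(false && "program does not use this parameter");
    return f;
}

// Evaluate for one element: bound slots read their parameter now,
// unbound slots consume the supplied inputs in order.
float run(const Program& f, const std::vector<float>& inputs) {
    std::vector<float> args;
    std::size_t next = 0;
    for (const float* s : f.slots)
        args.push_back(s != nullptr ? *s : inputs.at(next++));
    assert(next == inputs.size());
    return f.body(args);
}
```

For a two-input program f and a float variable scale, g = f << scale needs only one input per element and sees later changes to scale, while g >> scale again expects two inputs.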
The "<<" and "&"
operators can also be applied directly to program
objects to create other program objects. On program
objects, "<<"
feeds the outputs of one program object into the inputs
of another, creating a new program object. In other
words, it performs functional composition. The "&"
operator concatenates all the inputs, outputs, and
operations in two program objects, creating a new
program object. Because of the way Sh syntax works,
this is equivalent to concatenating the source code
of the two program objects (in two separate name scopes).
Eventually, after applying any number of such operations,
the resulting composite program object is fed through
the complete compilation chain, including optimization,
which reduces it to a single optimized implementation.
These operators turn out to be incredibly useful, especially
when you combine preexisting program objects with
small glue programs. For instance, using "<<"
you can reorganize the inputs and outputs of program
objects, delete outputs (an operation called program
specialization; the optimizer will delete any unnecessary
computations), or replace inputs with texture accesses.
The Sh library includes a number of generators for
common glue program patterns like these.
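The two program-level operators can be illustrated with a small standalone C++ model. It is only a sketch: Program here is a hypothetical struct wrapping a std::function, keep_outputs is an invented glue generator rather than one from the Sh library, and composition requires the output count of the inner program to match the input count of the outer one exactly. Unlike Sh, nothing is optimized; the composite simply calls its parts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model, NOT the Sh API: a program maps n_in values to n_out values.
struct Program {
    std::size_t n_in;
    std::size_t n_out;
    std::function<std::vector<float>(const std::vector<float>&)> body;
};

// "<<" on programs: functional composition, (f << g)(x) = f(g(x)).
Program operator<<(const Program& f, const Program& g) {
    assert(g.n_out == f.n_in);
    return {g.n_in, f.n_out,
            [f, g](const std::vector<float>& x) { return f.body(g.body(x)); }};
}

// "&" on programs: concatenation. Inputs of a come first, then inputs of b;
// outputs are ordered the same way. The two bodies do not interact.
Program operator&(const Program& a, const Program& b) {
    return {a.n_in + b.n_in, a.n_out + b.n_out,
            [a, b](const std::vector<float>& x) {
                assert(x.size() == a.n_in + b.n_in);
                const auto split =
                    x.begin() + static_cast<std::ptrdiff_t>(a.n_in);
                std::vector<float> out = a.body({x.begin(), split});
                const std::vector<float> rest = b.body({split, x.end()});
                out.insert(out.end(), rest.begin(), rest.end());
                return out;
            }};
}

// Hypothetical glue generator: forwards only the listed positions,
// which drops (or reorders) the outputs of whatever it is composed with.
Program keep_outputs(std::size_t n, std::vector<std::size_t> keep) {
    const std::size_t n_out = keep.size();
    return {n, n_out, [n, keep](const std::vector<float>& x) {
                assert(x.size() == n);
                std::vector<float> out;
                for (std::size_t i : keep) out.push_back(x.at(i));
                return out;
            }};
}
```

Given programs add (two inputs, one output) and twice (one input, one output), add & twice has three inputs and two outputs, and keep_outputs(2, {0}) << (add & twice) keeps only the first output. In Sh the optimizer would then remove the computation feeding the dropped output; this toy model still evaluates it.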
An application of stream processing is shown in Figure
3. This figure shows the result of a particle system
simulation running on the GPU using a stream program
to update the state of each particle and a shader
to convert the updated particle state to a visual
rendering. Code for the state update part of this
application is given in our recent SIGGRAPH paper
[3]. That paper also includes an example that converts
a Phong lighting model to a wood shader using the
program manipulation (shader algebra) operators.
Future Work
Sh is a work in progress. It is well past the research
prototype phase and can now certainly be used for
the robust and flexible specification of shaders.
However, we are in the midst of adding some additional
features to better support stream computing and large
commercial game development projects (including integration
with asset management systems, file externalization,
and additional backends). Formally, Sh is still in
alpha but is in final testing before a beta release
planned for August 2004. This beta status is not meant
to indicate that Sh is unstable, only that it has
not yet reached its final feature set. We are particularly
interested in hearing feedback from the game community
to identify any important missing features that would
block the adoption of Sh in a commercial game project.
We are committed to addressing any such issues by
the end of the year.
Sh is distinguished from other shading languages both
by its close integration with a C++ host application
and by its direct support for stream computing. Both
of these attributes are aimed at applications in which
a combination of CPU control and GPU computation is
necessary to implement a complex algorithm. However,
additional stream operations not yet directly supported
by Sh, such as reduction and indexing, are potentially
useful to implement some stream processing algorithms.
Unfortunately, efficient implementation of some of
these multipass operations on GPUs requires driver
support for flexible buffer and memory management
on the graphics accelerator, an area which is still
in a state of flux. Assuming the driver issues get sorted
out, these additional features should also be available
soon.
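To see why reduction is a multipass operation, consider the following standalone C++ sketch of a pairwise reduction. It is a CPU illustration of the general technique, not a proposed Sh interface: reduce_multipass is a hypothetical name, and each iteration of the outer loop stands in for one pass over a buffer that is half the size of the previous one.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Reduce a non-empty sequence with a binary operator by repeated halving.
// Every pass is an independent per-element computation, which is the kind
// of work a stream function can do; the passes themselves must run in order.
float reduce_multipass(std::vector<float> data,
                       const std::function<float(float, float)>& op,
                       std::size_t* passes = nullptr) {
    assert(!data.empty());
    std::size_t count = 0;
    while (data.size() > 1) {
        // Output buffer for this pass: half the input size, rounded up.
        std::vector<float> next((data.size() + 1) / 2);
        for (std::size_t i = 0; i < next.size(); ++i) {
            const std::size_t j = 2 * i;
            // Combine a pair; a trailing unpaired element is copied through.
            next[i] = (j + 1 < data.size()) ? op(data[j], data[j + 1])
                                            : data[j];
        }
        data = std::move(next);  // the output becomes the next pass's input
        ++count;
    }
    if (passes != nullptr) *passes = count;
    return data.front();
}
```

Each pass needs a freshly sized intermediate buffer whose contents feed the next pass, which is where flexible buffer and memory management on the accelerator comes in.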
We are also looking at targeting additional backends.
Currently, Sh supports ATI and NVIDIA floating-point
GPUs and the host CPU (stream functions can be dynamically
compiled to CPU code). Game platforms and parallel
distributed-memory clusters via MPI are also interesting
targets that we are working on. The semantics of Sh
have been kept intentionally simple to make an efficient
mapping onto such platforms possible, but these machines
have significantly different architectures than GPUs
and so additional development work will be necessary.
Our ultimate goal is to provide a unified computational
model for all platforms that supports both stream
computing and shaders, and then to build a library
of useful algorithms on top of that. We plan to keep
as much of this development as possible in an open
source form, except for components (such as game platform
backends) that might require licensing fees to develop.
Further Reading
AK Peters will publish a book [2] on the Sh system
in August 2004, called Metaprogramming GPUs with
Sh. This book includes a detailed user tutorial,
a reference manual, and a guide to the internals of
the open-source distribution available from the Sh
SourceForge web site. The web site contains additional
information, such as links to sample shaders, research
papers [3,4], and videos.
The creators of Sh will be running a programming contest.
Four high-end video cards, two from ATI and two from
NVIDIA, will be offered as prizes. The winning entries
of this contest will be the programs that best exploit
the novel capabilities of Sh, and will not be limited
to shaders. Details are available on the Sh web site
[1].
References
1. Sh Web Site, http://libsh.org
2. Michael McCool and Stefanus Du Toit, Metaprogramming
GPUs with Sh, AK Peters, 2004, http://www.akpeters.com
3. Michael McCool, Stefanus Du Toit, Tiberiu Popa,
Bryan Chan and Kevin Moule, Shader Algebra, ACM Transactions
on Graphics (Proceedings of SIGGRAPH), August 2004.
4. Michael McCool, Zheng Qin, and Tiberiu Popa, Shader
Metaprogramming, Proceedings of SIGGRAPH/Eurographics
Conference on Graphics Hardware, September 2002, pp.
57-68.