Building a Million-Particle System
Conclusion
Processors are, of course, never fast enough. So far,
the implementation on the first generation of
floating-point GPUs simulates and renders a 1024×1024
texture of particles in real time only with few effects
and without sorting. When the GPU is shared with other
techniques and the full feature set is used, the current
limit is 512×512 particles. Performance is expected to
improve significantly with the upcoming generation of
PC graphics hardware.
This paper has shown how to design
and implement a state-preserving physical particle
simulation on current programmable graphics hardware.
The simulation can use either Euler or Verlet integration
to update the particle positions. Other particle attributes
are simulated with simpler algorithms; they have no
permanent storage and are always evaluated on demand.
Additionally, an efficient parallel sorting algorithm
for particles has been introduced.
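As a minimal CPU-side sketch of the two integration
schemes named above, one update step of each might look
as follows. The paper runs these updates on the GPU; the
function names, the list-based vectors and the choice to
update the velocity before the position in the Euler step
are illustrative assumptions, not the original shader code.

```python
def euler_step(pos, vel, acc, dt):
    """Euler-style step: velocity is stored state and is updated first."""
    new_vel = [v + a * dt for v, a in zip(vel, acc)]
    # The position is advanced with the freshly updated velocity.
    new_pos = [p + v * dt for p, v in zip(pos, new_vel)]
    return new_pos, new_vel


def verlet_step(pos, prev_pos, acc, dt):
    """Position Verlet step: velocity is implicit in the two stored positions."""
    new_pos = [2.0 * p - q + a * dt * dt
               for p, q, a in zip(pos, prev_pos, acc)]
    # The current position becomes the previous position of the next step.
    return new_pos, pos
```

The Euler variant needs a position and a velocity per
particle, whereas the Verlet variant needs the current
and the previous position and no explicit velocity.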
The main strength of GPU-based particle
systems is the low cost of individual operations on
the data set. Once a basic algorithm is implemented,
endless ideas for manipulating velocity and position
come up and are easily implemented in high-level
shading languages. Ideas not yet implemented include
collision with arbitrary geometry and local forces
attached to the particles themselves. Such forces
essentially lead to a second-order particle system
(cf. [Ilmonen2003]).
Other topics for further discussion
are the application of the algorithm to constrained
particle systems (e.g. to simulate cloth or hair) and
its adaptation to rendering techniques other than
individual geometry.
The only part of the particle simulation
remaining on the CPU is the allocation and deallocation
of particles as these do not map well to current parallel
hardware. Future graphics hardware might support simple
allocation algorithms with special-case serial registers
that could be used as stack pointers.
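A minimal sketch of such a CPU-side allocator, assuming
a simple stack of free slots, is given below. The class
name, the method names and the mapping from a linear
index to a texel coordinate are hypothetical and only
illustrate why a single stack pointer would suffice.

```python
class ParticleAllocator:
    """Hands out free texels of a width x height particle texture."""

    def __init__(self, width, height):
        self.width = width
        # Stack of free slot indices; reversed so that slot 0 is popped first.
        self.free = list(range(width * height - 1, -1, -1))

    def allocate(self):
        """Return the (x, y) texel of a free slot, or None if the texture is full."""
        if not self.free:
            return None
        index = self.free.pop()
        return (index % self.width, index // self.width)

    def release(self, texel):
        """Return the slot of a dead particle to the stack."""
        x, y = texel
        self.free.append(y * self.width + x)
```

Both operations touch only the top of the stack, which
makes them cheap but inherently serial.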
Further research should also address
how the sorting algorithm can better exploit
frame-to-frame coherence. Currently, the
full sorting sequence of the algorithm is divided
evenly over several consecutive frames. Since the
sorting result is not necessarily exact, it is possible
that certain parts of the sorting sequence are visually
more important than others and ought to be executed
more often.
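The even division of the sorting sequence over frames
can be sketched on the CPU as follows. The sketch
assumes Batcher's odd-even merge sort network
[Batcher1968] for a power-of-two number of keys; the
names `odd_even_merge_passes`, `IncrementalSorter` and
`passes_per_frame` are hypothetical, and the GPU version
would execute each pass as one rendering pass rather
than as a Python loop.

```python
def odd_even_merge_passes(n):
    """Return the passes of Batcher's odd-even merge sort for n = 2**m keys.

    Each pass is a list of independent (low, high) index pairs.
    """
    passes = []
    p = 1
    while p < n:
        k = p
        while k >= 1:
            pairs = []
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k, n - j - k)):
                    # Only compare keys that lie in the same merge block.
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        pairs.append((i + j, i + j + k))
            passes.append(pairs)
            k //= 2
        p *= 2
    return passes


class IncrementalSorter:
    """Runs a fixed number of sorting passes per frame, wrapping around."""

    def __init__(self, n, passes_per_frame):
        self.passes = odd_even_merge_passes(n)
        self.passes_per_frame = passes_per_frame
        self.next_pass = 0

    def run_frame(self, keys):
        for _ in range(self.passes_per_frame):
            for lo, hi in self.passes[self.next_pass]:
                if keys[lo] > keys[hi]:
                    keys[lo], keys[hi] = keys[hi], keys[lo]
            self.next_pass = (self.next_pass + 1) % len(self.passes)
```

With keys that do not change between frames, the list
is fully sorted once every pass has run once. With
moving particles the keys change while the sequence is
in progress, so the result is only approximately sorted.
A weighted schedule that repeats some passes more often
would replace the simple wrap-around counter.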
It remains to be seen how applicable the
state-preserving particle simulation introduced here
is to upcoming video game console hardware. But the
trend towards increased multi-processing hardware
is a good indication that the parallel computation
of particle systems will grow in importance.
Acknowledgments
I would like to thank my colleagues
at Massive Development, especially Ingo Frick, Dr.
Christoph Luerig and Mark Novozhilov, and Prof. Andreas
Kolb from the University of Siegen for the fruitful
discussions and support in writing this paper. I also
thank Sieggi Fleder very much for never giving up
on my Germanic English. Furthermore, I am grateful
to Matthias Wloka and his colleagues at NVIDIA, who
helped the demo implementation to stay on the cutting
edge of technology.
References
Batcher1968:
Batcher, Kenneth E.; Sorting Networks and their Applications.
In Spring Joint Computer Conference, AFIPS Proceedings
1968
Buck2003:
Buck, Ian; Data Parallel Computing on Graphics Hardware,
2003,
http://graphics.stanford.edu/~ianbuck/GH03-Brook.ppt
Burg2000:
van der Burg, John; Building an Advanced Particle
System, Game Developer
Magazine, 03/2000
GPGPU2003:
Harris, Mark et al.; GPGPU Website, 2003-2004, http://www.gpgpu.org/
Green2003:
Green, Simon; Stupid OpenGL Shader Tricks, 2003,
http://developer.nvidia.com/docs/IO/8230/GDC2003_OpenGLShaderTricks.pdf
Harris2003:
Harris, Mark; Real-Time Cloud Simulation and Rendering,
Department of Computer Science, University of North
Carolina at Chapel Hill, 2003
Ilmonen2003:
Ilmonen, Tommi; Kontkanen, Janne; The Second Order
Particle System. In WSCG Proceedings 2003
Jakobsen2001:
Jakobsen, Thomas; Advanced Character Physics. In GDC
Proceedings 2001
Lang2003:
Lang, Hans W.; Odd-Even Merge Sort, 2003, http://www.iti.fhflensburg.de/lang/algorithmen/sortieren/oemen.htm
Mark2003:
Mark, William R.; Glanville, R. Steven; Akeley, Kurt;
Kilgard, Mark J.; Cg: A System for Programming Graphics
Hardware in a C-like Language. In SIGGRAPH Proceedings
2003
McAllister2000:
McAllister, David K.; The Design of an API for Particle
Systems, Technical Report, Department of Computer
Science, University of North Carolina at Chapel Hill,
2000
Microsoft2002:
Microsoft Corporation; DirectX9 SDK, 2002, http://msdn.microsoft.com/directx/
NVIDIA2001:
NVIDIA Corporation; NVIDIA SDK, 2001-2003, http://developer.nvidia.com/
NVIDIA2002:
NVIDIA Corporation; OpenGL Extension NV_pixel_data_range,
2002,
http://oss.sgi.com/projects/ogl-sample/registry/NV/pixel_data_range.txt
OpenGL2003:
OpenGL ARB; OpenGL Extension ARB_vertex_shader, 2003,
http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_shader.txt
Percy2003:
Percy, James; OpenGL Extensions, 2003,
http://www.ati.com/developer/SIGGRAPH03/Percy_OpenGL_Extensions_SIG03.pdf
Purcell2003:
Purcell, Timothy J.; Donner, Craig; Cammarano, Mike;
Jensen, Henrik W.; Hanrahan, Pat; Photon Mapping on
Programmable Graphics Hardware. In Graphics Hardware
Proceedings 2003
Reeves1983:
Reeves, William T.; Particle Systems - A Technique
for Modeling a Class of Fuzzy Objects. In SIGGRAPH
Proceedings 1983
Sims1990:
Sims, Karl; Particle Animation and Rendering Using
Data Parallel Computation. In SIGGRAPH Proceedings
1990
Verlet1967:
Verlet, Loup; Computer Experiments on Classical Fluids.
I. Thermodynamical Properties of Lennard-Jones Molecules,
Physical Review, 159/1967