We are Mario Palmero and Norman Schaar, from Tequila Works, though by the time this article is published, Norman will probably have left the company to face a new adventure.
Mario is a game programmer with interests in art, graphics and game design, who tries to get involved in everything and, at the end of the day, bothers everybody with his curiosity. Norman is a former artist turned technical artist. We both share an eagerness for creating new things and pushing the limits of what we can do.
Mario knew about Norman’s work in the field of vertex shaders and borrowed some of his ideas for Rime months before Norman joined Tequila Works. So Norman was a respected professional for him, though that changed once they met. And Mario was an unknown person for Norman before he arrived at Tequila Works, but unluckily for him that isn’t the case anymore either.
Before DirectX 11, there was no fast and easy way to sample textures in the vertex shader. You could still do a lot of fancy things there: you could use vertex attributes to store additional data in your 3D package and read it back in the vertex shader. You could essentially store morph targets in the vertex color to do fancy mesh deformations, or pivot points in the UV channels to move and rotate sub-objects independently; we’ve used these techniques in the past to generate thousands of GPU particles on mobile. It is still very powerful! But you are limited by how many attributes you can store in a geometry.
Things however are different when textures are involved...
What textures allow you to do is store an incredible amount of data that is easily retrievable by the GPU. XYZ positions can be represented as RGB colors. A rotation can be expressed as a quaternion, which can be stored as an RGBA value.
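As a rough sketch of the idea (the bounds and function names here are our own illustration, not tied to any particular engine), encoding positions and quaternions into 8-bit color channels could look like this:

```python
# Hypothetical bounds of the baked animation; positions must be
# normalized into [0, 1] before they can be stored as color channels.
BOUNDS_MIN, BOUNDS_MAX = -10.0, 10.0

def position_to_rgb8(pos):
    """Map an XYZ position inside known bounds to an 8-bit RGB pixel."""
    scale = BOUNDS_MAX - BOUNDS_MIN
    return tuple(round((c - BOUNDS_MIN) / scale * 255) for c in pos)

def rgb8_to_position(rgb):
    """Inverse mapping, as the shader would do it after sampling."""
    scale = BOUNDS_MAX - BOUNDS_MIN
    return tuple(BOUNDS_MIN + c / 255 * scale for c in rgb)

def quaternion_to_rgba8(q):
    """A unit quaternion has components in [-1, 1]; remap to [0, 1]."""
    return tuple(round((c * 0.5 + 0.5) * 255) for c in q)
```

The round trip is lossy (8 bits per channel), which is exactly the precision drawback we discuss later.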
With that in mind, you can create crazy particle animations in PFlow, for example, and bake those out. You can store particle attributes as colors: scale can be the red channel of a texture, while “temperature” can be the green channel, etc. You just need to make sure that your shader samples those colors and knows what to do with them.
In a 4096x4096 texture that uses RGB for particle position and A for scale you could fit a 120-frame animation of 139,810 particles. For a smaller particle system of just 2,048 particles you could fit 8,192 frames of animation; at 15 frames per second, that is over 9 minutes of particle animation. We take advantage of the texture’s bilinear filtering to interpolate between our values.
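The capacity numbers above follow from simple arithmetic: every particle needs one pixel per frame. A quick sanity check:

```python
def max_frames(texture_width, texture_height, particle_count):
    """How many frames fit if every frame needs one pixel per particle."""
    return (texture_width * texture_height) // particle_count

# A 4096x4096 texture holding 139,810 particles -> 120 frames.
assert max_frames(4096, 4096, 139810) == 120
# 2,048 particles -> 8,192 frames, i.e. over 9 minutes at 15 fps.
assert max_frames(4096, 4096, 2048) == 8192
assert max_frames(4096, 4096, 2048) / 15 / 60 > 9
```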
Similarly, you can bake rigid body physics animations to textures. Instead of baking each vertex’s position as a pixel, you can store each element’s position and rotation to achieve the same result with a much smaller texture. Two 1024x1024 textures (one for position, the other for rotation) can store a brick wall made of 1,024 bricks collapsing over the span of 1,024 frames. Additionally, you have to store each element’s pivot position in the UVs or vertex color. We can still make use of the vertex attributes!
Another technical challenge that is solved with this approach is bringing baked fluid simulations into a game. Of course, the problem here is that topologies are different, vertex counts are different, etc. So the mindset is to think of a mesh sequence as a cloud of triangles that are arranged differently in every frame. Sometimes there will be unused triangles, which we simply collapse to [0,0,0]. Here is the gist of the workflow:
Mesh optimization. Each mesh in the sequence is optimized to something reasonable, say around 3,000 to 10,000 triangles.
Triangle pairing. We attempt to group triangles into pairs; the more triangles are paired, the better, since two paired triangles only need 4 verts to be defined, whereas two unpaired triangles need 6. Pairing means we use fewer pixels on the texture.
Textures. Two textures are generated: one that contains the vertex positions and another that contains the vertex normals.
Mesh. A mesh is generated with the paired and unpaired triangles. The mesh UVs are used to sample the proper position of the texture.
Shader. The shader only has to retrieve the vertex position and normal from the textures and make sure that the UVs jump from frame to frame in the texture, as interpolating between two different triangle clouds doesn’t make sense. We can use the texture’s nearest-neighbor filtering for this.
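As a small illustration of the sampling in the last step (a hypothetical layout with one column per baked vertex and one row per frame), the coordinates can be snapped to texel centers so that nearest filtering returns exact values:

```python
def baked_vertex_uv(vertex_index, frame, texture_width, texture_height):
    """UV for sampling one baked vertex at one frame, assuming one
    column per vertex and one row per frame. Coordinates are snapped
    to texel centers so nearest filtering is exact, and V jumps from
    row to row per frame instead of interpolating between frames."""
    u = (vertex_index + 0.5) / texture_width
    v = (frame + 0.5) / texture_height
    return u, v
```

With this convention, advancing the animation is just offsetting V by whole rows in the shader.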
As a bonus you can use the alpha channel of both textures to store additional data such as temperature, thickness, or even the percentage of two colors to fake the mixing of two different colored liquids.
Using textures as data containers has some cool advantages, but also some drawbacks. Let’s take a look at the advantages first:
Access. The information is fed straight to the GPU without the intervention of the CPU. The communication between those two systems can often become a bottleneck; with textures we avoid that problem.
Share. Textures are a very standardized format, and engines usually have tools to configure them easily. It’s a natively portable format that can be edited with several software packages.
Order. The textures provide an intrinsic order based on spatial coherence and we can take advantage of that.
Now let’s check the drawbacks:
Debugging. If anything fails, it is more difficult to track down the problem. We can assure you that you are going to have to sharpen your wits to find any bug in the pipeline.
Precision. The formats have limited precision, which has to be dealt with. Try to keep things as local as you can; in the worst case, you can use newer texture formats that allow for higher bit depths.
Creation. To convert the data into colors we need to create some custom tools. The most popular suites don’t have proper ways to do this conversion.
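To put a number on the precision drawback: the worst-case quantization error is half a step of the channel’s value range. This little sketch shows why keeping ranges local (or moving to 16-bit formats) matters:

```python
def quantization_error(value_range, bits):
    """Worst-case error when a continuous range is stored in an
    integer channel of the given bit depth (half the step size)."""
    steps = 2 ** bits - 1
    return value_range / steps / 2

# World-space positions over a 1000-unit range in 8 bits:
# errors of almost 2 units -- visibly wrong.
assert quantization_error(1000.0, 8) > 1.9
# Keeping the data local (a 10-unit range), or using a 16-bit
# texture format, brings the error down to harmless levels.
assert quantization_error(10.0, 8) < 0.02
assert quantization_error(1000.0, 16) < 0.008
```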
The time has come to talk about the idea that inspired this article. We were setting up the parameters for a VR project here at Tequila Works and, given Norman’s previous work with vertex shaders, we were discussing baking animation and cloth simulation into textures using Vertex-Count-Agnostic Morph Targets.
"Could we bake cloth simulation into textures for cinematic scenes? Could we bake amazing facial animation?”
"Yes! Of course! But the amount of data is going to be huge.”
"So, can we compress the data depending on the amount of movement? Maybe we can store just the data when we can’t interpolate.”
More or less, those were our thoughts. We already had the script to bake the vertices into a texture (thanks to Norman’s previous work), we were running some numbers (several gigabytes of data textures), trying to work out some compression improvements, and flirting with an idea taken from Hans Godard’s skinning solver when…
When Mario came out of a pee break with an idea:
Mario: “What if, instead of storing vertex positions, we store the data of the bones in the textures?”
Mario: “The vertex shader already does the skinning, we just need the same info stored in different places.”
Norman: “It’s so insane it could actually work.”
And after some hours full of excitement and adrenaline playing with the puzzle, we had a good picture of what we could do. And everybody around the office thought those two guys had gone crazy, talking about putting animation into textures.
How do we actually do it? It’s simple: we just need to feed the information that the skinning algorithm uses through textures and mesh data, so the CPU doesn’t have to pass it all. Let’s take a look:
Where to store the translation of the bones? Let’s put those vectors in a texture. Each pixel will be the position of a bone: one row per frame and one column per bone.
Where to store the rotation of bones? That’s an easy one, isn’t it? Another texture.
Where to store the weighting of the vertices? This is a tricky one. Two different extra UV channels.
Where to store the indices of the relevant bones? Another tricky one. Vertex color, or maybe even another UV channel if we needed more bones, as it offers higher precision.
And a non-trivial question. Where to store the initial offset of the vertices? More textures!
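A sketch of how the per-vertex data just described could be unpacked (the renormalization of the weights is our own convention here, added for robustness, not something the setup above requires):

```python
def unpack_bone_indices(vertex_color_rgba8):
    """Bone indices stored as 8-bit vertex color: each channel maps
    directly to a bone column in the textures (up to 256 bones)."""
    return tuple(vertex_color_rgba8)

def unpack_bone_weights(uv1, uv2):
    """Bone weights stored in two extra UV channels: (U1, V1, U2, V2)
    correspond to the bones indexed by (R, G, B, A). Renormalize so
    the weights sum to one."""
    raw = (uv1[0], uv1[1], uv2[0], uv2[1])
    total = sum(raw)
    return tuple(w / total for w in raw)
```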
Let’s break down the process we follow to bring animations using textures in the vertex shader:
For each vertex we need to know which bones affect it. We get that information from the vertex color: four channels (RGBA), four bones affecting that vertex. This is a limitation that we could lift by using more textures.
With the bone indices we can read the position and rotation of each bone from the two textures. We have to combine the object position with the original offset from the offset texture to calculate the final positions.
Once we have the bones that affect each vertex, and the position and rotation of each of them, we only lack the influence of those bones over the final transformation of the vertex. As we said, the influences are stored in the extra UV channels: the U coordinate of the first channel stores the weight of the bone whose index is stored in the R value of the vertex color, and so on (R ⇒ U of channel 1, G ⇒ V of channel 1, B ⇒ U of channel 2, A ⇒ V of channel 2).
We already have all the information, so all that is left is to apply the linear blend skinning algorithm.
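As a CPU-side reference of what the vertex shader ends up doing (a sketch under our own simplified conventions: quaternions as (x, y, z, w), per-bone rest-pose offsets supplied directly rather than fetched from the offset texture):

```python
def rotate(q, v):
    """Rotate vector v by unit quaternion q = (x, y, z, w):
    v' = v + 2*w*(u x v) + 2*(u x (u x v)), with u = (x, y, z)."""
    x, y, z, w = q
    cx = y * v[2] - z * v[1]          # c = u x v
    cy = z * v[0] - x * v[2]
    cz = x * v[1] - y * v[0]
    ccx = y * cz - z * cy             # cc = u x c
    ccy = z * cx - x * cz
    ccz = x * cy - y * cx
    return (v[0] + 2.0 * (w * cx + ccx),
            v[1] + 2.0 * (w * cy + ccy),
            v[2] + 2.0 * (w * cz + ccz))

def skin_vertex(bone_positions, bone_rotations, offsets, indices, weights):
    """Linear blend skinning: each influencing bone moves its copy of
    the vertex (rest-pose offset rotated by the bone, then translated
    to the bone position), and the copies are blended by weight."""
    result = [0.0, 0.0, 0.0]
    for i, w in zip(indices, weights):
        rotated = rotate(bone_rotations[i], offsets[i])
        for axis in range(3):
            result[axis] += w * (bone_positions[i][axis] + rotated[axis])
    return tuple(result)
```

In the real shader, `bone_positions` and `bone_rotations` come from the two texture fetches described above, and `indices`/`weights` from the vertex color and extra UV channels.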
And this is an example of how it looks:
We can take a look at the textures that create the previous animation.
The one on the left represents rotation over time (rows are frames, columns are bones). The one on the right represents position in a similar way.
With this technique we can store 166 minutes of facial animation (56 bones) in two 4096x4096 textures (rotation and translation) running at 30 frames per second.
If we want to have an awesome facial animation (450 bones) we can have 21 minutes of it.
Imagine packaging all the animations of your main characters into a single pair of textures!
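Those figures again follow from one pixel per bone per frame; a quick check:

```python
def animation_minutes(tex_width, tex_height, bone_count, fps=30):
    """Minutes of animation that fit in one texture when each pixel
    stores one bone's data for one frame."""
    frames = (tex_width * tex_height) // bone_count
    return frames / fps / 60

# 56 bones in a 4096x4096 texture -> about 166 minutes at 30 fps.
assert round(animation_minutes(4096, 4096, 56)) == 166
# 450 bones -> about 21 minutes.
assert round(animation_minutes(4096, 4096, 450)) == 21
```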
The main improvements we think we can apply to our current works are:
Normal solving. Rotating the normal by the bone’s quaternion is the simplest way to recalculate normals, but there are more accurate and equally efficient solutions that we want to investigate and try.
Animation blending. As is, the presented solution is only useful for cinematic animations, but some improvements could be made to support animation blending. We could store all the animations of a character in one texture and read from several positions to blend them. Some issues arise from that approach, but we think they can be addressed properly.
Support for bone scaling. With another texture for scaling, this feature is very straightforward to implement.
Enhance compression. We are storing information for each frame, but that can be improved with a smarter compression and interpolation system.
Pipeline improvements. The pipeline can be built so that all the conversion happens when packaging the game, letting artists and programmers ignore the fact that the animation is stored in a texture.
We want to thank a lot of people for inspiring and supporting us on this journey, with a special mention for Hans Godard, who blew our minds with his solver, and for the team at Tequila Works, who had to endure our enthusiasm and long dissertations on the subject.
For more details or questions contact us at: