
By Ronen Zohar & Haim Barad
Gamasutra
April 16, 1999



The 3D Pipeline Structure and Body

Let’s look at a rudimentary 3D pipeline. In the simplest terms, a 3D pipeline usually takes this form:

For each vertex do:

// transform.

1. Calculate w (the dot product of the fourth matrix column and the vertex position).

2. Calculate we (the reciprocal of w).

3. Calculate x, y, z (dot product of the first, second and third matrix columns and the vertex position).

4. Multiply x, y, z by we.

// light

1. Color <- 0.

2. For each light do:

a. Calculate dir - the direction vector from the light source to the vertex in the world space (for directional light this vector is the light vector).

b. Normalize dir.

c. Calculate d - the dot product between dir and the vertex normal (assuming the vertex normal is already normalized, this gives the cosine of the angle between the vertex normal and the light direction).

d. If (d > 0) then

color <- color + light_color * material_color * d

3. Write x, y, z, color and other vertex data to the output buffer.
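The steps above can be sketched in scalar C++ before any SIMD conversion. This is a minimal illustration rather than the authors' code: the types Vec3, Color, Mat4, Light and OutVertex and the function process_vertex are hypothetical names, the matrix is indexed so that m[r][c] stands for element _(r+1)(c+1), and only directional lights are handled.

```cpp
#include <cmath>
#include <vector>

struct Vec3      { float x, y, z; };
struct Color     { float r, g, b; };
struct Mat4      { float m[4][4]; };          // hypothetical layout: m[r][c] stands for _(r+1)(c+1)
struct Light     { Vec3 dir; Color color; };  // directional light only (assumption of this sketch)
struct OutVertex { float x, y, z; Color color; };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Combines matrix elements m[r][0..3] with the position (p.x, p.y, p.z, 1).
static float transform_component(const Mat4& t, int r, const Vec3& p) {
    return p.x * t.m[r][0] + p.y * t.m[r][1] + p.z * t.m[r][2] + t.m[r][3];
}

OutVertex process_vertex(const Mat4& t, const Vec3& pos, const Vec3& normal,
                         const Color& material, const std::vector<Light>& lights) {
    OutVertex out;

    // transform: w, its reciprocal we, then x, y, z scaled by we
    const float w  = transform_component(t, 3, pos);
    const float we = 1.0f / w;
    out.x = transform_component(t, 0, pos) * we;
    out.y = transform_component(t, 1, pos) * we;
    out.z = transform_component(t, 2, pos) * we;

    // light: start from black and accumulate each light's contribution
    out.color = Color{0.0f, 0.0f, 0.0f};
    for (const Light& light : lights) {
        Vec3 dir = light.dir;  // for a directional light, dir is the light vector
        const float inv_len = 1.0f / std::sqrt(dot(dir, dir));
        dir = Vec3{dir.x * inv_len, dir.y * inv_len, dir.z * inv_len};

        const float d = dot(dir, normal);  // normal is assumed to be unit length
        if (d > 0.0f) {
            out.color.r += light.color.r * material.r * d;
            out.color.g += light.color.g * material.g * d;
            out.color.b += light.color.b * material.b * d;
        }
    }
    return out;  // the caller writes this to the output buffer
}
```

Keeping the per-vertex work in one function like this makes it easier to see which parts are pure arithmetic (the transform) and which contain a branch (the d > 0 test).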

 

Initializing the SIMD pipeline

After the main program calculates the transformation matrix and the lighting data, each of these scalar values should be expanded into an F32vec4 object whose four elements all hold that same value. The simplest way to perform this expansion is with the F32vec4 float constructor. Typical code looks like this:

x_mat->_11 = F32vec4(float_matrix->_11);

x_mat->_12 = F32vec4(float_matrix->_12);

where float_matrix is a structure similar to the SsimdMatrix, except that its data elements are all floats.
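To make the expansion concrete, here is a minimal sketch written with the SSE intrinsic _mm_set1_ps from <xmmintrin.h>, which copies one float into all four lanes of a __m128. It assumes the F32vec4 float constructor performs the same broadcast; FloatMatrix, SimdMatrix, expand_matrix and lane are hypothetical names used only for this illustration.

```cpp
#include <xmmintrin.h>

// Scalar matrix as the main program might compute it (hypothetical layout).
struct FloatMatrix {
    float _11, _12, _13, _14,
          _21, _22, _23, _24,
          _31, _32, _33, _34,
          _41, _42, _43, _44;
};

// Expanded matrix: every element holds one scalar replicated in all four lanes.
struct SimdMatrix {
    __m128 _11, _12, _13, _14,
           _21, _22, _23, _24,
           _31, _32, _33, _34,
           _41, _42, _43, _44;
};

SimdMatrix expand_matrix(const FloatMatrix& f) {
    SimdMatrix x;
    x._11 = _mm_set1_ps(f._11); x._12 = _mm_set1_ps(f._12);
    x._13 = _mm_set1_ps(f._13); x._14 = _mm_set1_ps(f._14);
    x._21 = _mm_set1_ps(f._21); x._22 = _mm_set1_ps(f._22);
    x._23 = _mm_set1_ps(f._23); x._24 = _mm_set1_ps(f._24);
    x._31 = _mm_set1_ps(f._31); x._32 = _mm_set1_ps(f._32);
    x._33 = _mm_set1_ps(f._33); x._34 = _mm_set1_ps(f._34);
    x._41 = _mm_set1_ps(f._41); x._42 = _mm_set1_ps(f._42);
    x._43 = _mm_set1_ps(f._43); x._44 = _mm_set1_ps(f._44);
    return x;
}

// Reads lane i (0..3) of a vector; only meant for inspecting results.
float lane(__m128 v, int i) {
    float tmp[4];
    _mm_storeu_ps(tmp, v);
    return tmp[i];
}
```

Because the matrix and lighting data are the same for every vertex, this expansion only needs to happen once per batch, outside the per-vertex loop.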

Transforming SIMD Coordinates

Since the transformation part of the algorithm is purely computational, there is no problem converting it to SIMD. The main issue in this part of the pipeline is calculating we (the reciprocal of w). One way to perform this calculation is by dividing an expanded "one" value by the calculated w, but in this method we pay a performance penalty for the divide operation.

There’s also a second technique for computing the reciprocal, which makes use of the new Pentium III SIMD and scalar instructions (rcpps, rcpss) that estimate the reciprocal of a given value. Since these instructions only provide estimates, we’ll also use one iteration of the Newton-Raphson method to improve the accuracy of the result. Even with this extra step we still come out ahead: the whole sequence takes fewer clock cycles than a divide. Both the plain reciprocal and the reciprocal with Newton-Raphson refinement are already encapsulated in Intel’s SIMD classes as the functions rcp and rcp_nr.
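For readers who want to see what the refinement involves, the following is a minimal sketch using the SSE intrinsics from <xmmintrin.h>. Given an estimate x0 of 1/w, one Newton-Raphson step computes x1 = x0 * (2 - w * x0). The function name rcp_newton is hypothetical, and Intel's rcp_nr may arrange the arithmetic differently.

```cpp
#include <xmmintrin.h>

// Approximate 1/w for four floats at once:
// hardware estimate (rcpps) followed by one Newton-Raphson iteration.
inline __m128 rcp_newton(__m128 w) {
    const __m128 x0  = _mm_rcp_ps(w);       // low-precision estimate of 1/w
    const __m128 two = _mm_set1_ps(2.0f);
    // x1 = x0 * (2 - w * x0)
    return _mm_mul_ps(x0, _mm_sub_ps(two, _mm_mul_ps(w, x0)));
}
```

Each iteration roughly doubles the number of correct bits, which is why a single step is usually enough to bring the estimate close to single-precision accuracy.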

The transform part of the pipeline looks like this:

// calculating w.

w = pos->x * x_mat->_41 + pos->y * x_mat->_42 + pos->z * x_mat->_43 + x_mat->_44;

// calculating we = 1/w.

we = rcp_nr(w);

// transforming x, y, z and multiplying by we.

x = (pos->x * x_mat->_11 + pos->y * x_mat->_12 + pos->z * x_mat->_13 + x_mat->_14) * we;

y = (pos->x * x_mat->_21 + pos->y * x_mat->_22 + pos->z * x_mat->_23 + x_mat->_24) * we;

z = (pos->x * x_mat->_31 + pos->y * x_mat->_32 + pos->z * x_mat->_33 + x_mat->_34) * we;
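The listing above can be read as transforming four vertices per pass, assuming each position component (pos->x, pos->y, pos->z) is a four-wide vector holding that coordinate for four different vertices. Under that assumption, a minimal self-contained sketch with raw SSE intrinsics looks like the following. SimdVec3, SimdMatrix, combine and transform4 are hypothetical names, m[r][c] stands for the expanded element _(r+1)(c+1), and a plain divide is used for the reciprocal to keep the block short; an estimate refined with Newton-Raphson is the faster variation the text describes.

```cpp
#include <xmmintrin.h>

struct SimdVec3   { __m128 x, y, z; };    // one coordinate of four vertices per vector
struct SimdMatrix { __m128 m[4][4]; };    // each element is a scalar broadcast to all lanes

// Computes x * m[r][0] + y * m[r][1] + z * m[r][2] + m[r][3] for four vertices.
static inline __m128 combine(const SimdMatrix& t, int r, const SimdVec3& p) {
    __m128 acc = _mm_mul_ps(p.x, t.m[r][0]);
    acc = _mm_add_ps(acc, _mm_mul_ps(p.y, t.m[r][1]));
    acc = _mm_add_ps(acc, _mm_mul_ps(p.z, t.m[r][2]));
    return _mm_add_ps(acc, t.m[r][3]);
}

SimdVec3 transform4(const SimdMatrix& t, const SimdVec3& pos) {
    // calculating w for four vertices
    const __m128 w = combine(t, 3, pos);
    // calculating we = 1/w (exact divide in this sketch)
    const __m128 we = _mm_div_ps(_mm_set1_ps(1.0f), w);

    // transforming x, y, z and multiplying by we
    SimdVec3 out;
    out.x = _mm_mul_ps(combine(t, 0, pos), we);
    out.y = _mm_mul_ps(combine(t, 1, pos), we);
    out.z = _mm_mul_ps(combine(t, 2, pos), we);
    return out;
}
```

Note that no lane ever needs data from another lane, which is what makes this part of the pipeline convert to SIMD so directly.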

