The 3D Pipeline Structure and Body

Let’s look at a rudimentary 3D pipeline. In the simplest terms, a 3D pipeline usually takes this form:

For each vertex do:
3. Write x, y, z, color and other vertex data to the output buffer.
Initializing the SIMD pipeline

After the main program calculates the transformation matrix and the lighting data, each of these data elements should be expanded into an F32vec4 so that all four lanes hold the same value. The simplest way to perform this expansion is with the F32vec4 float constructor. Typical expansion code looks like this:

    x_mat->_11 = F32vec4(float_matrix->_11);
    x_mat->_12 = F32vec4(float_matrix->_12);

where float_matrix is a structure similar to the SsimdMatrix, except that its data elements are all floats.

Transforming SIMD Coordinates

Since the transformation part of the algorithm is purely computational, converting it to SIMD is straightforward. The main issue in this part of the pipeline is calculating we (the reciprocal of w). One way to perform this calculation is to divide an expanded "one" value by the calculated w, but this method pays a performance penalty for the divide operation.

There is also a second technique, which makes use of the new Pentium III SIMD and scalar instructions (rcpps, rcpss) that estimate the reciprocal of a given value. Since these instructions provide only estimates, we also use one iteration of the Newton-Raphson method to improve the accuracy of the result. Even with this extra step we come out ahead: the estimate plus one refinement still takes fewer clock cycles than a divide. The reciprocal and the reciprocal with Newton-Raphson are already encapsulated in Intel’s SIMD classes as the functions rcp and rcp_nr.

The transform part of the pipeline looks like this:

    // calculating w.
    w = pos->x * x_mat->_41 + pos->y * x_mat->_42 +
        pos->z * x_mat->_43 + x_mat->_44;

    // calculating we = 1/w.
    we = rcp_nr(w);

    // transforming x, y, z and multiplying by we.
    x = (pos->x * x_mat->_11 + pos->y * x_mat->_12 +
         pos->z * x_mat->_13 + x_mat->_14) * we;
    y = (pos->x * x_mat->_21 + pos->y * x_mat->_22 +
         pos->z * x_mat->_23 + x_mat->_24) * we;
    z = (pos->x * x_mat->_31 + pos->y * x_mat->_32 +
         pos->z * x_mat->_33 + x_mat->_34) * we;