Video Applications For the Pentium III Processor
by Asi Elbaz [Programming]
November 5, 1999
 

After the great success of Intel's MMX technology, the growing demand for more complex algorithms based on floating-point calculations drove Intel to define another new technology. This time, it defined a new set of instructions and data types for floating-point-based algorithms, such as 3D and advanced signal and image processing, and extended MMX technology support for integer-based algorithms, all while maintaining compatibility with existing software designed for the Intel architecture. It also added new memory operations that can accelerate any memory-based algorithm, especially multimedia applications, which typically work on large blocks of memory.

Subsequent projects in 3D and video applications have demonstrated that the Pentium III processor is an excellent processor for multimedia applications. One of the most impressive of these projects is a high-resolution, real-time MPEG-2 encoder. This paper describes how the Pentium III processor and the Streaming SIMD Extensions can improve the performance of integer-based applications, using examples from the MPEG encoder application.


Motion Estimation & Motion Compensation

To make the discussion concrete, the examples that follow are built around two of the most basic operations in video compression: Motion Estimation (ME) and Motion Compensation (MC).

ME is performed during encoding. It exploits the fact that the next frame in a sequence is almost the same as the previous frame. For each block of the current frame, the technique looks for the block's location in the previous frame by comparing it with a set of candidate blocks in the previous frame. The output of this operation for each block is a motion vector.
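The block comparison described above can be sketched in plain C. This is a minimal illustration, not the encoder's actual search: it assumes an exhaustive search over a small window and uses the sum of absolute differences as the matching criterion. The names `block_sad`, `estimate_motion`, `MotionVector`, and the 8x8 `BLOCK` size are hypothetical and chosen only for this example.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 8 /* hypothetical block size used only for this sketch */

typedef struct { int dx, dy; } MotionVector;

/* Sum of absolute differences between two BLOCK x BLOCK blocks
   that live in frames with the same line stride. */
static unsigned block_sad(const uint8_t *a, const uint8_t *b, int stride)
{
    unsigned sad = 0;
    for (int y = 0; y < BLOCK; y++)
        for (int x = 0; x < BLOCK; x++)
            sad += (unsigned)abs(a[y * stride + x] - b[y * stride + x]);
    return sad;
}

/* Exhaustive search: compare the block at (bx, by) in the current frame
   with every candidate within +/-range pixels in the previous frame and
   return the displacement with the smallest difference. */
static MotionVector estimate_motion(const uint8_t *cur, const uint8_t *prev,
                                    int width, int height,
                                    int bx, int by, int range)
{
    MotionVector best = {0, 0};
    unsigned best_sad = UINT_MAX;
    const uint8_t *block = cur + by * width + bx;

    for (int dy = -range; dy <= range; dy++) {
        for (int dx = -range; dx <= range; dx++) {
            int px = bx + dx, py = by + dy;
            /* Skip candidates that fall outside the previous frame. */
            if (px < 0 || py < 0 || px + BLOCK > width || py + BLOCK > height)
                continue;
            unsigned sad = block_sad(block, prev + py * width + px, width);
            if (sad < best_sad) {
                best_sad = sad;
                best.dx = dx;
                best.dy = dy;
            }
        }
    }
    return best;
}
```

The inner loop of `block_sad` is the part that dominates the cost of a search like this, which is why a single instruction for it matters.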

MC is the opposite operation. Given a motion vector and a difference block, MC builds a new block by taking the block in the previous frame that the motion vector points to and adding the difference block to it.
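As a rough sketch of that reconstruction step, the following C function adds a difference block to the block that a motion vector selects in the previous frame. It assumes 8-bit pixels, a signed 16-bit difference block stored contiguously, and saturation of the result to the 0..255 range. The names `compensate_block` and `clamp_u8` and the 8x8 `BLOCK` size are hypothetical.

```c
#include <stdint.h>

#define BLOCK 8 /* hypothetical block size used only for this sketch */

/* Saturate an intermediate result to the 8-bit pixel range. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Rebuild the block at (bx, by) in 'out' from the block at
   (bx + dx, by + dy) in 'prev' plus the difference block 'diff'.
   'diff' holds BLOCK * BLOCK signed values, row by row. */
static void compensate_block(uint8_t *out, const uint8_t *prev,
                             const int16_t *diff, int stride,
                             int bx, int by, int dx, int dy)
{
    const uint8_t *ref = prev + (by + dy) * stride + (bx + dx);
    uint8_t *dst = out + by * stride + bx;

    for (int y = 0; y < BLOCK; y++)
        for (int x = 0; x < BLOCK; x++)
            dst[y * stride + x] =
                clamp_u8(ref[y * stride + x] + diff[y * BLOCK + x]);
}
```

The caller is assumed to have checked that the displaced block lies inside the previous frame.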

Streaming SIMD Extensions

The Streaming SIMD Extensions meet the demand for specialized yet fundamental operations in video and communication applications.

The Streaming SIMD Extensions include the following instructions:

pavgb - SIMD average, with rounding, of two sets of unsigned byte-sized operands. A crucial operation in MC & ME algorithms

psadbw - Sum of the absolute differences of two sets of unsigned byte-sized operands. Crucial for block-matching algorithms

pminub, pmaxub, pminsw & pmaxsw - SIMD minimum or maximum of two sets of unsigned byte or signed word operands.
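To make the semantics of the first two instructions concrete, here is a scalar C model of what they compute on one 64-bit MMX register, that is, on eight packed bytes. It is only a reference model for reasoning about results, not how the instructions are invoked. The function names `model_pavgb` and `model_psadbw` are hypothetical; the sum-of-absolute-differences instruction is listed as psadbw in Intel's instruction set reference.

```c
#include <stdint.h>

/* Packed byte average: each unsigned byte pair is averaged with
   rounding, computed as (a + b + 1) >> 1 in a wider type so the
   ninth bit of the sum is not lost. */
static void model_pavgb(uint8_t dst[8], const uint8_t a[8], const uint8_t b[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* Packed sum of absolute differences: the absolute differences of the
   eight unsigned byte pairs are added into one 16-bit result
   (at most 8 * 255 = 2040, so it always fits). */
static uint16_t model_psadbw(const uint8_t a[8], const uint8_t b[8])
{
    unsigned sum = 0;
    for (int i = 0; i < 8; i++)
        sum += (unsigned)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    return (uint16_t)sum;
}
```

Note that the rounding in the byte average means its results can differ by one from an average computed by adding and shifting right without rounding.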

As the following examples show, these new instructions simplify and speed up many of the basic kernels in video applications and other integer-based algorithms.

The following example shows the basic loop for MC using MMX technology:

Motion_Comp_Loop:

Movq mm0,[eax+ecx] // read eight pixels from one block.
Movq mm4,[eax+ecx+8] // next eight pixels.
Movq mm1,[ebx+ecx] // read eight pixels from second block.
Movq mm5,[ebx+ecx+8] // next eight pixels.
Movq mm2,mm0
Movq mm3,mm1
Movq mm6,mm4 // mm0-mm6 are now all in use.
// mm7 was initialized to zero.
Punpcklbw mm0,mm7 // convert the first four pixels
Punpcklbw mm1,mm7 // from byte format to short format.
Punpcklbw mm4,mm7
Punpckhbw mm2,mm7 // convert the second four pixels
Punpckhbw mm3,mm7 // from byte format to short format.
Punpckhbw mm6,mm7

// Calculate the average values.
Paddw mm0,mm1 // after add values are 9 bits.
Paddw mm2,mm3

Movq mm1,mm5 // Now mm1 is free.
Punpcklbw mm5,mm7
Punpckhbw mm1,mm7

Paddw mm4,mm5
Paddw mm6,mm1

Psrlw mm0,1 // divide by two.
Psrlw mm2,1 // after division values are 8 bits.
Psrlw mm4,1 // divide by two.
Psrlw mm6,1 // after division values are 8 bits.
Packuswb mm0,mm2 // convert back to byte format.
Packuswb mm4,mm6 // convert back to byte format.

Movq [edx+ecx],mm0 // store results.
Movq [edx+ecx+8],mm4 // store results.

// Advance the pointers to the next line, then jump back to
// Motion_Comp_Loop until the end of the macroblock is reached.

 

Example 1. Motion Compensation Using MMX Technology

Because the sum of two 8-bit pixels can need nine bits, you have to convert the values to short (16-bit) format before calculating the average. Shifting each pixel right by one (dividing by 2) before the addition would avoid the conversion, but it would cost one bit of accuracy.
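The accuracy trade-off can be checked with a few lines of scalar C. The sketch below compares the widened average, which matches the add-then-shift sequence in Example 1, with the shift-before-add shortcut. The function names are hypothetical. For the pixel values 3 and 5, the widened form gives (3 + 5) >> 1 = 4, while shifting first gives (3 >> 1) + (5 >> 1) = 1 + 2 = 3.

```c
#include <stdint.h>

/* Add in a wider type, then halve: no bits are lost before the shift. */
static uint8_t avg_widened(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b) >> 1);
}

/* Halve each pixel first so the sum fits in eight bits; the low bit of
   each input is discarded before the addition. */
static uint8_t avg_shift_first(uint8_t a, uint8_t b)
{
    return (uint8_t)((a >> 1) + (b >> 1));
}

/* Count the input pairs for which the two methods disagree. */
static unsigned count_mismatches(void)
{
    unsigned n = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if (avg_widened((uint8_t)a, (uint8_t)b) !=
                avg_shift_first((uint8_t)a, (uint8_t)b))
                n++;
    return n;
}
```

The two methods disagree exactly when both pixels are odd, which is one quarter of all input pairs, and in those cases the shortcut is low by one.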

 