Nintendo shares its tricks for reducing latency on Wii U's GamePad
October 17, 2012 | By Eric Caoili

Minimizing wireless latency is critical for Wii U, as the console's GamePad tablet controller needs to display gameplay that matches what's on TV screens, without any noticeable delay.

"[We] had to take on a challenge that no one else had before," said Nintendo head Satoru Iwata in a roundtable discussion with the company's R&D team. "With the usual wireless video transfer methods, even if a slight latency occurred, it was okay as long as it didn't get stuck along the way.

"So with ordinary video playback, [when the flow of data is interrupting smooth playback,] the system buffers a certain amount of data before it plays... With the Wii U GamePad, however, Mario has to jump as soon as you press the button. So if there's latency, it's fatal for the game."

Nintendo's Product Development Department employed a variety of tricks to reduce latency. For example, when sending video wirelessly the traditional way, a single frame of image data is compressed, sent, decompressed at the receiving end, and then displayed on screen.

"We thought of a way to take one image and break it down into pieces of smaller images," said the team's Kuniaki Ito. "We thought that maybe we could reduce the amount of delay in sending one screen if we dealt in those smaller images from output from the Wii U console GPU on through compression, wireless transfer, and display on the LCD monitor."

He adds, "Generally, compression for a single screen can be done per a 16x16 macroblock. And on Wii U, it rapidly compresses the data, and the moment the data has built up to a packet-size that can be sent, it sends it to the Wii U GamePad."

This solution worked out particularly well for the team because it solved multiple problems: it not only minimizes latency by reducing the amount of data that must be buffered before transmission, it also requires less memory and less power than traditional techniques.
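
To make the pipelining described above concrete, here is a minimal sketch, in Python, of why sending slices as they finish compressing beats compressing the whole frame first. Every constant below (slice count, per-slice timings) is invented purely for illustration; this is not Nintendo's implementation.

    # Illustrative sketch only -- invented timings, not Nintendo's code.
    # Compares end-to-end latency of "compress whole frame, then send"
    # against "send each compressed slice as soon as it's ready."

    FRAME_SLICES = 16            # slices of 16x16 macroblocks per frame (assumed)
    COMPRESS_MS_PER_SLICE = 0.5  # assumed per-slice compression time
    SEND_MS_PER_SLICE = 0.6      # assumed per-slice transmission time

    def whole_frame_latency():
        # Traditional path: the first byte can't leave until the last
        # slice has been compressed, so the stages run back to back.
        compress = FRAME_SLICES * COMPRESS_MS_PER_SLICE
        send = FRAME_SLICES * SEND_MS_PER_SLICE
        return compress + send

    def pipelined_latency():
        # Sliced path: a slice is queued for the radio the moment it is
        # compressed, so compression and transmission overlap in time.
        send_done = 0.0
        for i in range(FRAME_SLICES):
            compress_done = (i + 1) * COMPRESS_MS_PER_SLICE
            send_done = max(compress_done, send_done) + SEND_MS_PER_SLICE
        return send_done

    print(f"whole frame: {whole_frame_latency():.1f} ms")  # 17.6 ms
    print(f"pipelined:   {pipelined_latency():.1f} ms")    # 10.1 ms

With these made-up numbers, the pipelined path is bounded by the slower of the two stages plus one slice of head start, rather than by the sum of both stages, and only a slice or two ever sits in memory at once, which lines up with the smaller buffers the team describes.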

The team notes that this technique, combined with other tricks, reduced latency to the point where images can be transmitted and displayed on the GamePad faster than on some TVs connected to a Wii U by cable, since many newer TVs introduce latency through their video processing components.

Michel Ancel, who is leading development on Rayman Legends for Wii U, recently praised the system's GamePad in an interview with Nintendo Power: "The response time is crazy, in fact, and I think the competitors will need some time to [get their controls] this responsive.

"It's crazy because the game is running in full HD [on the television], we are streaming another picture on the GamePad screen, and it's still 60 frames per second. The latency on the controller is just 1/60 of a second. ... It's almost instant. That’s why it responds so well. So it can be used as a real game-design thing."


Related Jobs

FitGoFun
FitGoFun — Mountain View, California, United States
[07.22.14]

Unity 3D Programmer
Vicarious Visions / Activision
Vicarious Visions / Activision — Albany, New York, United States
[07.22.14]

Concept Artist (Temporary)-Vicarious Visions
Treyarch / Activision
Treyarch / Activision — Santa Monica, California, United States
[07.22.14]

Cinematic Animator (temporary) - Treyarch
Treyarch / Activision
Treyarch / Activision — Santa Monica, California, United States
[07.22.14]

Associate Cinematic Animator (temporary) - Treyarch










Comments


Chris Hendricks
Wow... hadn't even considered that issue. Nicely done, Nintendo!

Michael Pianta
I don't think Nintendo gets enough credit. They are routinely criticized for being behind the times technologically when, in actuality, I feel that they just focus on different areas of technological innovation. I suspect that it's down to their history as a toy company, rather than a tech company. They aren't going to just upgrade the CPU/GPU and be done. In the end that road leads to death, because manufacturing and development costs just spiral out of control, as we are seeing. Instead, Nintendo focuses on making a compelling new "toy" - a physical object that you physically interact with in an interesting way.

Jeremy Alessi
This is sweet! I read an interview a while back in which an executive called it the "secret sauce" and wouldn't talk about it (I'm sure he didn't know how it worked either). With wireless transmission becoming so crucial, it's nice to get a read on Nintendo's solution.

Langdon Oliver
Can someone clear something up for me? I'm gathering that they're saying it's somehow faster to send twenty-five 10x10 images than it is to send a single 250x10 image? How is that possible? Isn't there some sort of overhead with each transmission? Is it a TCP/UDP packet size limitation I don't understand?

Merc Hoffner
What I understood from it is that if you used an off-the-shelf compression/decompression routine, the compressor would compile the entire image into a buffer and then send it, meaning the first part of the image to be compressed has to wait around for the last part to be done before being sent. Presumably, if they have tight control of the packaging and the downstream wireless communication hardware and protocols (which they apparently do; remember, the link appears to be entirely proprietary and need not conform to any existing encoding standard), they can immediately send small early pieces over and start decoding even before the image has finished encoding.

Moreover, the reduced buffer size would mean it's cheap enough to integrate the compression silicon and associated buffers (and possibly the error tolerance/wireless encoding, etc.) directly into the GPU, cutting down latency further. It's not clear that's what they've done, but considering the discussion in last week's Iwata Asks, and given the I/O control and extracurricular functions built into the original Flipper chip that this is backwards compatible with, it seems part of their MO.

Ian Uniacke
I also assume that you get better compression on smaller fragments because you can use looser optimisations, in a similar way to how DXT compression works. Obviously I don't know the specific details, but I suspect they have worked out some advantage that means they need to send less data.
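
As an aside, the fixed ratio that the reply below cites comes straight from the arithmetic of S3/DXT1-style block compression; a quick Python illustration, using the standard DXT1 block sizes, purely for context:

    # DXT1/S3TC-style fixed-rate block compression arithmetic.
    raw_block = 4 * 4 * 3   # a 4x4 pixel block at 24 bits/pixel = 48 bytes
    dxt1_block = 8          # two 16-bit endpoint colors + 16 two-bit indices
    print(raw_block / dxt1_block)  # 6.0 -> a fixed 6:1 ratio, no temporal coding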

Stanley de Bruyn
Just guessing: compression also adds computation delay, and the connection has a bandwidth limit, so it's crucial to start streaming compression results quickly. I think the key benefit is that instead of waiting for full-frame compression, which leaves the bandwidth idle for a long time, you start sending the first tile as soon as possible, then the next, using time and bandwidth efficiently. Compressing a fraction of the whole frame takes little time, so the bandwidth comes into use quickly. It comes down to compressing, sending, and decompressing concurrently.

So with the overhead it might take 1.2 times longer to compute, but by then three quarters of the data is already at its destination and five eighths of it is decompressed.

Merc Hoffner
@Ian: while it's possible, I doubt they used S3-type compression algorithms on ultra-small fragments. While those are designed for very high speed (and low latency), they're so simplistic that they only get 6:1 ratios at best, and there's no temporal component to those algorithms. For the GamePad's resolution that would work out at a minimum of about 98 Mbit/s, which would just barely fit down a single-antenna 802.11n connection with error resilience if conditions are really good. On top of that, while S3-type compression is generally useful for textures, which get transformed all over the place, on a flat raw image its artifacting can become very apparent, as its simplistic nature means it has a low coding efficiency. More likely they're using some traditional Fourier/vector combination technique, which can readily hit 100:1 with appreciable quality: if they throw enough data at it for, say, 30:1, then they can readily fit in the dependable bandwidth and offer essentially artifact-free imagery. Their references to macroblocks, a common term in that type of compression, lend some weight to this. How they get around I-frames without spikes in the data rate is anyone's guess (they even mention this as a kind of 'error carried forward' problem). Perhaps they use some kind of 'rolling' I-frame fragments with extra redundancy code.
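
For what it's worth, the figures in the comment above check out if you assume the GamePad's 854x480 panel at 24 bits per pixel and 60 fps; the following arithmetic is purely illustrative:

    # Raw and compressed bandwidth for the GamePad's 854x480 screen.
    raw_bps = 854 * 480 * 24 * 60   # ~590 Mbit/s uncompressed at 60 fps
    print(raw_bps / 6e6)            # ~98 Mbit/s at DXT1's fixed 6:1 ratio
    print(raw_bps / 30e6)           # ~20 Mbit/s at the 30:1 mentioned above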

Merc Hoffner
Considering we know that Sony can't have retrospectively gone to these hardware lengths with Remote Play, I was wondering if anyone had any measurements on Sony's latency? AFAIK it does at least form an ad-hoc connection for direct linkage.

Roderick Hossack
I've used Remote Play. It's laggy to the point of being unplayable. That's why high-performance games tend not to support it, even though little/no extra work needs to be done to add that functionality.

Merc Hoffner
I see. Thanks. Incidentally, I did hear that for OnLive to make their systems work as well as they did (or not well, whatever the case may be), their custom server racks had a hardware compression chip tightly integrated on the boards, and full control over GPU processing allocation and the shuttling of data on those boards to reduce latency. It's no wonder that you can't just make it work after the fact. The same problems apparently plague Apple TV mirroring.

Derek Kowaluk
Sounds a lot like the method used in VNC. Apple's AirPlay works similarly, leveraging onboard MPEG-4 hardware to encode blocks.

Justin Meiners
Compression? Wouldn't that be the first thing you would try to reduce latency with wireless packets?

Aiden Eades
The only problem I can really imagine with this would be artifacts. The problem they aimed to fix was buffering of the image, because let's face it, that 0.2 of a second does count in most games these days. But they're splitting the image into thousands of little squares, which get sent out, probably on a first-compressed, first-served basis, and then decoded on the other side. If certain squares are missing, they probably just don't get displayed.

It wouldn't be overly noticeable, I mean, it's a 10x10 square, but if there are a couple of them, somebody might pick up on it.

But really I think this is an either/or situation: you either have buffering and skipped frames, or you have a few rather unnoticeable artifacts.

For video games, I suppose the artifacts are preferable.

