Microsoft's Xbox Live Vision Camera launched in the US on September 19th and in Europe and Australia on the 6th of October. The unit, which is capable of 640x480 video at 30 frames per second and can take 1.3 megapixel still photos, uses an SDK and gesture sensing technology licensed from computer camera technology company GestureTek, Inc.
The company has been working in the field for over 15 years, and has previously licensed technology to companies like Hasbro and BMW, as well as licensing Video Gesture Control (VGC) patents for Sony's EyeToy unit. The technology is also used in the Vision Camera, with GestureTek releasing a library of VGC tools to Xbox developers in the hope that more sophisticated uses of gesture control will emerge in the future.
Gamasutra contacted Chief Technology Officer and company co-founder Francis MacDougall via email to ask about the camera and GestureTek's hopes for how it will be used.
When did GestureTek begin working with computer camera technology?
GestureTek began as Very Vivid, Inc. in 1986, working on the Amiga computer with monochrome cameras and a simple video capture board. Our initial goal was to create an interactive musical instrument where the performer could reach their hand into predefined regions to trigger musical sounds on MIDI driven synthesizers. We accomplished this in about 6 months and my partner, Vincent John Vincent, performed live on stage at Siggraph.
We were seen there by a woman who was setting up a new exhibit at the Smithsonian Institute, and she invited us to be a part of it. This launched us into the Museum and Exhibit market. We have since installed our systems in over 1000 museums, science centers and location based entertainment locations around the world.
What advances have been made in the field in that time?
That is a huge question. When we began, we were able to capture only a single "bit" per pixel in real time, meaning that a 320x240 camera image took up only 40 bytes of data per row. We used a "threshold" to identify what was "on" and what was "off" to create a silhouette image that we used for the tracking and interaction. Today we have full color realtime high resolution imagery with enhanced processing speeds allowing for complex computer vision analysis of these images.
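The one-bit capture described above can be illustrated with a short sketch. This is not GestureTek's code: the function names, the threshold value of 128, and the assumption of 8-bit grayscale input are all hypothetical, chosen only to show how thresholding a 320-pixel row and packing eight pixels per byte gives the 40 bytes per row mentioned in the answer.

```python
def threshold_row(pixels, threshold=128):
    """Turn 8-bit grayscale pixels into 0/1 silhouette flags.

    The threshold of 128 is an illustrative assumption, not a value
    from the interview.
    """
    return [1 if p >= threshold else 0 for p in pixels]


def pack_bits(bits):
    """Pack 0/1 flags into bytes, eight pixels per byte, first pixel in the high bit."""
    packed = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        # Left-align a final partial chunk so unused low bits are zero.
        byte <<= 8 - len(chunk)
        packed.append(byte)
    return bytes(packed)


# A made-up 320-pixel row: dark background with a bright "performer" in the middle.
row = [0] * 100 + [200] * 120 + [0] * 100
packed_row = pack_bits(threshold_row(row))
print(len(packed_row))  # 40
```

At one bit per pixel, a full 320x240 frame in this layout would occupy 40 x 240 = 9,600 bytes.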
What features does the camera offer?
The camera can run at 60 frames per second, which is faster than most webcams, and can provide a native 640x480 resolution.
What does the Xbox camera offer over something like EyeToy?
The frame rate is a big plus for rapid interaction, while the resolution is a plus for face tracking and other applications.
What can GestureTek offer to developers and publishers to make working with the camera easier?
Through our deal with Microsoft, we have provided a basic library of functions for initializing and allowing basic interaction with onscreen objects. In addition, as a third party tools provider, we are providing an advanced set of tracking libraries including our face tracker and motion libraries. These are thoroughly tested algorithms that are optimized for the Xbox 360 platform, and are designed to be dropped onto a specific thread to run in the background.
How long do you expect it will be before we see full length titles using the camera?
There is a title (TotemBall) launching at the same time as the camera, so immediately.
How long do you expect it will be before we see the camera being used to its full potential?
This is tricky. The current set of tracking libraries will take 6 months or a year to get into some really prime titles. Our hope is to see the face tracker used in a premier title to add an extra "control" in a first person shooter, where leaning your head to one side would be a "peek around a corner" or other subtle advantage to the player.
As far as full potential goes, that depends on how many algorithms are embraced. We see our "funny nose and glasses" face augmentation routines as a simple thing to add to network games, but some of our advanced 3D object trackers and spell casting software could take longer to find a home in the right games.
What do you expect, and hope, to see from developers in terms of compatible software for the camera?
It would be great if certain metaphors were used repeatedly for implementing control in games. For instance, for a game that is a "full body" game (no game controller used, only camera), then it would be great if all games used a uniform control model for "moving through an environment". A natural model is to use the face as the centerpoint of a joystick.
Think of a snowboarding game. If you crouch down, then you should accelerate down the hill. If you stand up then you should slow down. If you lean left and right then you should turn. It would be great if the level of control could be selected by the user. This would allow someone with limited space or with fine motor control issues to play at the same level as other players.
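As a rough sketch of the "face as the centerpoint of a joystick" idea, the snippet below maps a tracked face position to steering and speed values. Everything in it is an assumption made for illustration: the function name, the normalised 0 to 1 image coordinates, the `reach` and `dead_zone` parameters, and the premise that a face tracker supplies the face centre. It does not represent GestureTek's libraries or any Xbox 360 API.

```python
def face_to_joystick(face_x, face_y, rest_x, rest_y, reach=0.25, dead_zone=0.1):
    """Map a face centre to joystick-style axes in the range [-1, 1].

    Coordinates are assumed to be normalised image coordinates (0..1),
    with y growing downward. (rest_x, rest_y) is the player's neutral
    standing position. `reach` is how far the face must move for full
    deflection; a smaller value lets a player with limited space or
    limited movement reach full control with smaller motions.
    """
    def axis(value, rest):
        raw = (value - rest) / reach
        raw = max(-1.0, min(1.0, raw))  # clamp to joystick range
        # Ignore small jitter around the neutral position.
        return 0.0 if abs(raw) < dead_zone else raw

    steer = axis(face_x, rest_x)  # leaning left or right turns
    speed = axis(face_y, rest_y)  # crouching (face lower in the image) accelerates
    return steer, speed


# Player crouches fully while standing centred: no turn, full acceleration.
print(face_to_joystick(0.5, 0.75, rest_x=0.5, rest_y=0.5))  # (0.0, 1.0)
```

Exposing `reach` as a user setting is one way to provide the selectable level of control described above, since it scales how much physical movement is needed without changing the game's control model. Whether a lean to the player's left appears as a lower or higher x value depends on whether the camera image is mirrored, which this sketch leaves open.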
Why do you think we have rarely seen console based cameras viewed as anything more than novelties in the past, and what role do you think the Xbox 360 camera will fill?
The style of interactions that have been put into the games, and the lack of variety have been the biggest problems to date. All of the games so far have used simple control models like "wipes" and "touches". Better tracking models will generate better games. The Xbox 360 camera is being marketed first as a communication and entertainment device, and second as a game control device.
I think this is a great approach. The goal is to get the attachment rate of the camera to the console to be as high as possible, and this will be facilitated by adding more reasons for people to purchase the camera. Applications like video conferencing, cool video dance effects, and intelligent home workout software will help to broaden its use. With that kind of penetration, the larger game makers will take the effort to integrate it more broadly into their game offerings.
Your press release mentions a "proprietary stereoscopic vision chip" - what is this, and what will it offer to gamers?
We have been working with various stereoscopic solutions for over 10 years. We feel that this is the next generation for evolving the interactive camera model. It is still 3 years away from any kind of market readiness; however, the capability it brings is the true full 3 dimensional tracking of an individual for controlling an onscreen avatar. This is the technology that will literally let you make kung fu moves in your living room and have an onscreen avatar replicate them in a game environment.
What peripherals do you see as the main competitors for the camera?
I don't think people will see other peripherals as competition for the camera. If a gamer loves driving games they will buy an excellent racing peripheral. The camera will be bought for a collection of uses. The hope is that it is viewed as indispensable.
Finally, where will we see computer camera technology heading in the next five years?
Higher resolutions, faster frame rates, and of course, realtime 3D capture.