Does that have more overhead than other solutions?
DQ: You'd be surprised. 1600 clips sounds like a lot, but the thing is we record them quite quickly. I just wrote a tool, basically, that we ran on five dev kits at once. We had family days at Rare, so everyone would bring in their kids and partners. We wanted a wide cross-section of people doing the swings. Everyone would stand in front of their dev kits, and we would say, "Turn to the side. Do a golf swing." And we would just record them all onto the server.
The other interesting thing is, once we had all those clips, the engineers don't really need to tag them up. We actually gave them to our testers and said, "Here's a hundred clips. Spend the next hour tagging them." They can just go through in a video-editing tool and say, "Here it is. Here it is. Here it is."
So it's not really an engineering-driven problem. That really helps as well. That's basically how we did all those tags. Now we've done that with golf, we're actually doing that with all of our events.
Is it a more effective way to determine natural motion -- the kind of motions players will do?
DQ: It's another tool in the tool belt, basically. The machine learning system we use in golf is very discrete; it's good at detecting specific events: the ball should release now. For example, table tennis is a very analog, skill-driven system, so it's a different kind of gesture.
You have to look at what you're trying to detect and then pick the right tool. Machine learning is just another one of those tools -- a very powerful one. I don't think we could have done golf to the level that we did without having that system.
You've worked with Kinect since the Project Natal days. Has Kinect come further in terms of recognizing people's movements and recognizing multiple people in front of the camera than you actually anticipated?
DQ: Yeah, I think so. Since the Kinect launched, we've had two upgrades to the tracking system: more data sets, more training. Every time that's happened we've seen it getting better and better. Whether it's beyond my expectations, I was pretty blown away the first time I saw it (laughs), so it's a very high bar.

I know you're working on Sports, and that sort of does limit things. You're going to pick specific sports. When you're working with the designers, do they have to come to you and say, "This is the idea we want to do. Can you figure out a way to do this in engineering?" Or is it more of a back-and-forth where you're like, "This is the kind of tracking that is possible"?
DQ: It's definitely a back-and-forth. I'd say for Sports 2, they picked a ton of tough ones for us: darts, baseball, and golf. When they first suggested darts, I was almost in disbelief.
Because your hand's going to be right in front of your face.
DQ: Absolutely. For the precise motion that they wanted, I was almost one of the guys going, "No, no, no. We can't do that." But then you look at it, and you start asking, "How could we do this?" Darts is actually brilliant; it's one of my favorite games in Sports 2.
Darts uses a system nobody used at all in Sports 1; it's almost entirely around the depth feeds, that image feed of how far everyone is from the thing. We actually don't use the skeleton as much.
That's not something we really did much in Sports 1. It's just looking at all the information Kinect gives you and working out which bits you should look at to run the system that you want. The skeleton tracking, the depth feed -- all kinds of stuff.
Is it as much about excluding information as it is about including information?
DQ: It's definitely working out the context -- exactly what you're looking at. An example of tailoring the information was the boxing punch in Sports 1. Initially, we were looking at the skeleton feed, thinking that would be the best way to detect it; obviously, as the hand launches forward, that's a punch.
But since the hand's in front of your body, it's one of those occlusion issues. The skeleton feed can struggle with occlusion. So in the end, we turned to the depth feed and painted these panes of glass in front of the player. If they all broke when you punched through them, that was a punch; that's how we did it.
It's one of those instances of taking the consistent information the game is receiving, and trying to look at specific bits [of that information]. That can vary from sport to sport depending upon if it's an analog-y moving game, or a precise dart throw, or a specific moment like a golf swing for when we want to release the ball. So they're all quite different problems.
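[Editor's illustration: the "panes of glass" idea described above can be sketched roughly as follows. This is a minimal sketch and not Rare's code. It assumes a depth frame is a grid of distances in millimeters with 0 meaning "no reading", and that the player's torso depth is already known. The function names, pane count, pane spacing, and pixel threshold are all hypothetical.]

```python
from typing import List, Sequence


def pane_depths(body_depth_mm: int, count: int = 3, spacing_mm: int = 150) -> List[int]:
    """Place `count` imaginary panes in front of the torso, each one closer to the camera.

    The count and spacing are made-up values for illustration.
    """
    return [body_depth_mm - spacing_mm * (i + 1) for i in range(count)]


def broken_panes(
    depth_frame: Sequence[Sequence[int]],
    panes: Sequence[int],
    min_pixels: int = 4,
) -> List[bool]:
    """Return one flag per pane: True if enough pixels are nearer to the camera than it.

    A depth of 0 is treated as "no reading" and ignored (an assumption).
    """
    flags = []
    for pane in panes:
        nearer = sum(1 for row in depth_frame for d in row if 0 < d < pane)
        flags.append(nearer >= min_pixels)
    return flags


def is_punch(depth_frame: Sequence[Sequence[int]], body_depth_mm: int) -> bool:
    """A punch is registered only when every pane is broken in this frame."""
    return all(broken_panes(depth_frame, pane_depths(body_depth_mm)))
```

[The point of the sketch is that only the depth values are consulted, so a hand hidden from the skeleton tracker by the body behind it can still break the panes.]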
At some point, I'd love to have "semi-intelligent conversations" with NPCs by using my voice (as opposed to the hackneyed dialog wheel/tree where none of the proposed choices presented are options I would ever do/say).
Dynamically interacting with NPCs has been around since the old King's Quest and Ultima games (where you could type in keywords and have characters respond to queries), and now that the technology is here, can't we improve upon it?
I'm not asking for a completely revolutionary artificially intelligent avatar system (a la Milo), I'd just like to be able to interact in a way that is less "mechanical" (static dialog trees) and more natural (in a way that resembles a "conversation").
Wouldn't it be cool to play an open world detective game (a la LA Noire) where one component of the game would be interviewing people (witnesses, suspects) to find clues using your voice, and NPCs may respond to specific queries/keywords (i.e. "Where were you Friday Night?", "What do you know about Fredrick Pierce?")
...Or have a way of "bartering" with NPCs over price in an RPG? ("I'll give you $500... How about $580...$520 is my final offer")
Seems to me the voice aspect of Kinect has the most potential and the one most criminally underutilized.
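(To make the keyword idea concrete, here is a rough sketch of the kind of matching I mean. It assumes the speech has already been turned into text by some recognizer, which is the hard part and is not shown. The table of keywords and replies is invented purely for illustration, and `npc_reply` is a made-up name.)

```python
import re
from typing import Dict, Tuple

# Hypothetical keyword sets mapped to canned NPC replies.
RESPONSES: Dict[Tuple[str, ...], str] = {
    ("where", "friday"): "I was at the club all night.",
    ("fredrick", "pierce"): "Never heard of him.",
}


def npc_reply(
    utterance: str,
    responses: Dict[Tuple[str, ...], str] = RESPONSES,
    fallback: str = "I don't follow.",
) -> str:
    """Pick the reply whose keywords all appear in the utterance.

    If several keyword sets match, the one with the most keywords wins.
    """
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    best_reply = fallback
    best_size = 0
    for keywords, reply in responses.items():
        if len(keywords) > best_size and all(k in words for k in keywords):
            best_reply, best_size = reply, len(keywords)
    return best_reply
```

(So "Where were you Friday night?" hits the first entry, and anything the table doesn't cover falls back to a stock line. It is still a lookup table underneath, but the player gets to phrase the question.)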
On a different note, it's interesting that articles like this one and the one a few days ago from Harmonix are coming out just as Steel Battalion is released to terrible reviews that are calling it flat broken and a black mark on Kinect itself.
You might be right (I sure hope not). Google does have a developer API which transcribes as you talk:
http://android-developers.blogspot.com/2011/12/add-voice-typing-to-your-ime.html
(but similar to Siri, it does require internet access). The utilization of language on Kinect I've seen seems half-baked most of the time. (I still laugh at the memory of "Lightsaber...ON!" as presented at E3.) That being said, I've been very pleased with Google's recognition accuracy (even without a grammar to choose from "options" like Kinect does)
I'm not sure if the previous Kinect voice-enabled games (i.e. Mass Effect 3) suffer from a "limited grammar" due to technical reasons (i.e. the recognition accuracy is just not up to par for anything advanced) or lack of "inspiration" (BioWare didn't want to invest much time adapting the experience for Kinect and in effect making 2 separate games).
I'd just like to see something from Microsoft moving toward the Natal/Kinect vision they sold us 3/4 years ago. (I'm not talking full fledged "Milo" here, I'm just asking for some rudimentary non-critical character interactions). It's one of those things where if you can show us something compelling (and provide us the tools) we'd jump at the chance to offer new interesting experiences with the tech.
http://www.xbitlabs.com/news/multimedia/display/20120620215832_John_Carmack_Virtual_Reality_Gaming_Is_The_Next_Big_Step.html
which I think was by far the best thing at E3.
http://itunes.apple.com/us/itunes-u/linguistics-lectures/id425738097
On top of context, there's a chaotic pattern of cadence, speed, tonal sweeps, and so on that humans use to understand each other.
If you listen to isolated words from natural language speech it's freakin hilarious.
That said, if SIRI was a DARPA project licensed to private industry, I wonder what the military is using right now to monitor phone conversations? I wonder when they will allow private industry to license that? Did IBM's WATSON use a speech recognition system or did they fake it? I wonder if they are reaching out to the game developer community?
Gaming studios have been disappearing over the course of this generation: it's quite startling, and I can't help but fear that Rare is next. In an ideal world, I feel Rare should go the way of Bungie. With Scott Henson at the helm, I doubt it's possible at this point, but in my eyes, it's probably their best bet to survive.