Gamasutra.com - Statistically Speaking, It's Probably a Good Game, Part 2: Statistics for Game Designers
Gamasutra
January 24, 2007

Statistically Speaking, It's Probably a Good Game, Part 2: Statistics for Game Designers


You Can Never Be Sure

This brings up another rule of statistics (and probability, actually):



The Customary Pose of Probabilists and Statisticians

100% Does Not Exist: You will never achieve a confidence level of 100%. Inferential statistics can never guarantee that a predicted data point will take a specific value.

The only sure things in life are death, taxes, and the inability to find the last Yeti Hide you need when trying to complete a World of Warcraft quest. Accept these facts and move on.

Misappropriation
I mentioned earlier that statistics works as a skill of villainy. To illustrate why, I wrote this short, bullet-form love poem:

Sonnet 1325: Beautiful statistics, let me count the ways that I abuse and misuse you.

  1. Misunderstanding
  2. Not stating confidence intervals
  3. Discarding valid conclusions because you don’t like them
  4. Drawing conclusions based upon flawed or influenced data
  5. Sportscaster errors – blending errors of probability and statistics
  6. Drawing conclusions based upon unrelated factors

Misunderstanding
People misunderstand statistical statements all the time. I know, it’s hard to believe.

Not Stating Confidence Intervals or Margins of Error
Confidence intervals and margins of error are vital pieces of information. There is a huge difference between saying that 43% of PC owners have purchased a downloadable game in the past 30 days with a margin of error (MoE) of 40% and making the same statement with a MoE of 2%. When the MoE is left out, always assume the worst. Remember, small sample = high MoE.
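To make "small sample = high MoE" concrete, here is a minimal sketch that uses the common normal-approximation formula for the margin of error of a proportion at roughly 95% confidence. The function name `margin_of_error` and the sample sizes are assumptions chosen for illustration; they do not come from any real survey, and the approximation itself is rough for very small samples.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: observed proportion (0.43 means 43%)
    n: number of people surveyed (hypothetical here)
    z: z-score for the confidence level (1.96 is roughly 95%)
    """
    return z * math.sqrt(p * (1 - p) / n)

# Same headline number (43%), very different sample sizes.
for n in (6, 100, 2400):
    moe = margin_of_error(0.43, n)
    print(f"n = {n:>4}: 43% +/- {moe * 100:.1f} points")
```

Under these assumptions, a survey of 6 people lands near the 40% margin mentioned above, while it takes a couple of thousand respondents to get down to about 2%.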

Discarding Valid Conclusions Because You Don’t Like Them
When used properly, statistics don’t lie. But people lie to themselves all the time. We see this a lot in politics, where statistical studies are ignored simply because the conclusions don’t match those that were hoped for. The same thing sometimes happens with focus groups. Of course, we also see statistics misused terribly in politics, so it’s a wash, I guess.

Drawing Conclusions Based Upon Flawed Data
This one happens a lot, especially in market research. Your statistical conclusions are only as good as the data you make them from. If the data is flawed, then the conclusions are worthless. Flawed data can come in a variety of forms, with causes ranging from honest errors to severe manipulation. Asking loaded questions is one easy way to get flawed data that supports whatever conclusion you were hoping to make anyway. “Do you prefer Product X, or that crappy Product Y that only idiots use?” quickly leads to seemingly bullet-proof statements like “95% of consumers prefer Product X!”

Sportscaster Errors


“Let me consult the oracle…”

Sportscasters are the shamans of our day. They take a little statistics, a little probability, a little gut feel and then mix them together to make something terrible. If you ever want to see a bunch of statistics thrown around with tenuous conclusions that typically have no basis, just watch a football game.

For instance, an announcer might say that “Team A hasn’t blocked a kick against Team B in the last 5 games.” The dangling conclusion is that Team A is less likely to block a kick today than it would be if it had blocked one against Team B during those 5 games. But you could just as easily argue the reverse: maybe they are more likely to block one, since they haven’t done it in a while!

The truth is, there isn’t enough information to say either one. And it’s probably more a matter of probability, anyway. Does the chance of blocking a kick really depend on whether one was blocked the game before? They are probably independent events, unless there are recognizable interrelated factors.
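Here is a small simulation sketch of what independence looks like in practice. It assumes a made-up 10% chance of a blocked kick in each game and, by construction, makes every game independent of the others. The function name `simulate_block_rates` and all of its parameters are hypothetical, and the simulation says nothing about real football; it only shows that a five-game drought, on its own, tells you nothing when events are independent.

```python
import random

def simulate_block_rates(p_block=0.10, games=200_000, streak=5, seed=1):
    """Compare the overall block rate with the rate right after a drought.

    Every game is simulated independently with the same (assumed) chance
    of a blocked kick, so past games cannot influence future ones.
    """
    rng = random.Random(seed)
    blocked = [rng.random() < p_block for _ in range(games)]

    # Keep only the games that follow `streak` straight games without a block.
    after_drought = [
        blocked[i]
        for i in range(streak, games)
        if not any(blocked[i - streak:i])
    ]

    overall = sum(blocked) / games
    conditional = sum(after_drought) / len(after_drought)
    return overall, conditional

overall, conditional = simulate_block_rates()
print(f"block rate overall:              {overall:.3f}")
print(f"block rate after 5-game drought: {conditional:.3f}")
```

The two rates come out nearly identical, which is the point: if the announcer wants the drought to mean something, he needs evidence that the events are related in the first place.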

This is not to say that all sports conclusions are flawed. Statistics is very important to baseball, for instance. Statistical analysis sometimes guides what pitch is thrown or what the batting lineup will be.

It all comes down to data: when you have a lot of data, you get better statistical conclusions. Baseball supplies a lot of data: 162 regular-season games per team in Major League Baseball! With football, there just aren’t enough games to go around, so margins of error are bigger. I’m not saying statistics is never useful for football; it is just harder to mine useful, contextual data.

Drawing Conclusions Based Upon Unrelated Factors
When two sets of numbers are compared side by side, it’s easy to infer deeper relationships that don’t actually exist. My all-time favorite example of this is the well-known Pirates vs. Global Warming graph featured in the Church of the Flying Spaghetti Monster’s Open Letter to the Kansas School Board:

http://www.venganza.org/about/open-letter/

Please, for the love of all that is statistical, go look at the graph contained in that article. PLEASE, I BEG YOU!
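If you would rather see the trap in numbers, here is a minimal sketch. It builds two synthetic series that have nothing to do with each other except that one drifts down over time and the other drifts up, then computes their Pearson correlation. Every value is generated on the spot and is purely hypothetical; none of it is real pirate or temperature data, and the names `falling` and `rising` are just placeholders.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rng = random.Random(7)
years = range(20)

# Two made-up series, generated independently of each other.
falling = [50_000 - 2_500 * t + rng.gauss(0, 2_000) for t in years]  # trends down
rising = [14.0 + 0.05 * t + rng.gauss(0, 0.05) for t in years]       # trends up

r = pearson(falling, rising)
print(f"correlation = {r:.2f}")
```

The correlation comes out strongly negative even though neither series was built from the other. Both simply follow the calendar, which is all that a chart of this kind can really show.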







Copyright © 2006 CMP Media LLC
