Prediction in the Gaming Industry, Part 1: All About Prediction
The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.
This article is the first in a three-part series on prediction and using predictive metrics in the gaming industry.
As a data scientist, I deal with predicting the future. Sounds fishy, right? Well, it is true that no one can truly predict the future. There is no “Minority Report.” What’s real are probabilities and patterns that play out.
I’ve already written about data and how analytics are used in games (and you can read that series here), and I briefly touched on predictive analytics. This series of articles will dive more into the topic.
With today’s advances in big data analytics, the ability to accurately predict user behavior with a fairly high degree of certainty is a reality -- that’s not just hype, and it’s what allows me to do my research at the intersection of social and computer science. In the gaming industry, specifically, this means predicting player actions and using that information to inform strategy, improve product, retain key players, and increase monetization opportunities.
We’re getting a little ahead of ourselves now, though. First, let’s back up and look at the fundamentals of predictive modeling, how it works, and how we can be sure it works.
Though the details can get confusing, there’s a fairly simple way of understanding predictive modeling. Using a gaming example, let’s say you record and monitor all of the events that happen in a game, and your computer starts to recognize patterns. Some patterns repeat, others don’t. When the computer recognizes repeated patterns, though, it “learns” and starts to look for the pattern to occur again, then begins making predictions about the next sequence.
For example, when a sequence (like A-B-C-D) repeats over and over, the computer will start to recognize it. From here, you can ask the computer to make a prediction: When an A-B-C pattern comes up, the computer will predict what will come next (in this case, “D,” of course). Beyond that, though, it can also tell you how likely this prediction is to be correct. “D” could be any number of things that are of interest to design, retention or monetization. Imagine “D” is quitting, or completing a level, or spending in the store.
How can it do that, and why is it important? Well, looking at all the previous sequences, “A-B-C” wasn’t always followed by “D” -- occasionally, the pattern is A-B-C-X. So, the more data the algorithm has access to, the more it understands likelihood, and the more it can tell you how often that guess has turned out right.
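To make the A-B-C-D idea concrete, here is a minimal sketch of a pattern counter. It is not any particular product's algorithm: the function names `build_model` and `predict_next` and the toy event logs are assumptions made up for illustration. It counts what followed each three-event sequence and reports the most frequent follower along with how often it occurred.

```python
from collections import Counter, defaultdict

def build_model(sequences, context_len=3):
    """Count which event follows each run of `context_len` events."""
    model = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - context_len):
            context = tuple(seq[i:i + context_len])
            model[context][seq[i + context_len]] += 1
    return model

def predict_next(model, context):
    """Return (most likely next event, share of times it followed the context)."""
    counts = model.get(tuple(context))
    if not counts:
        return None, 0.0  # this pattern was never seen
    event, hits = counts.most_common(1)[0]
    return event, hits / sum(counts.values())

# Hypothetical logs: A-B-C is followed by D three times and by X once.
logs = [list("ABCD"), list("ABCD"), list("ABCX"), list("ABCD")]
model = build_model(logs)
prediction, probability = predict_next(model, "ABC")
print(prediction, probability)  # D 0.75
```

With only four logs the 75% figure is shaky; the more sequences the counter sees, the more stable that share becomes.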
And that’s it. That’s prediction. Skeptics are right to ask how accurate these predictions are. Well, that’s where science backs prediction up. There are a few tests that data scientists use to verify predictive models -- keeping in mind that verify means “yes, it worked” rather than “I can read the future and know it will work tomorrow.” If nothing changes, there’s no reason it won’t work tomorrow. Failures happen when tomorrow is different in an unaccounted-for way. A terrorist attack or the end of the school year can throw predictions off, and some of those events are easier to see coming than others. The end of the school year can be built into the pattern; a terrorist attack, not so much.
One common way of testing accuracy is called cross-validation. Here’s how it works in its simplest form: a computer splits a very big data set (like your gaming data) into two halves. It takes the first half and does its analysis, looking for patterns and building its model. By the end, it comes up with a figure, like “We see A-B-C and then D happens 75% of the time.” Then it takes that model and checks its accuracy against the second, totally untouched half of the data. If A-B-C-D happens 75% of the time in this data as well, we start feeling pretty good about the prediction. Well, we feel 75% good about it, and we report that percentage as a confidence number. (Fuller versions of the test repeat this process across several different splits of the data and average the results.)
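The split-and-check idea can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic (1,000 made-up cases of what followed A-B-C, with D occurring about 75% of the time), and the "model" is simply the most common outcome in the first half.

```python
import random
from collections import Counter

# Synthetic stand-in for real game logs: what followed A-B-C in 1,000 cases.
outcomes = ["X" if i % 4 == 0 else "D" for i in range(1000)]
random.Random(0).shuffle(outcomes)  # fixed seed so the split is repeatable

# Split into two halves: one to build the model, one left untouched for checking.
half = len(outcomes) // 2
train, test = outcomes[:half], outcomes[half:]

# "Model": the most common follower in the first half, and how often it occurred.
predicted, hits = Counter(train).most_common(1)[0]
train_rate = hits / len(train)

# Check the same prediction against the untouched second half.
test_rate = sum(outcome == predicted for outcome in test) / len(test)

print(f"predicted={predicted} train={train_rate:.2f} test={test_rate:.2f}")
```

If the two rates land close together, the pattern held up on data the model never saw. A large gap would be a warning sign.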
Why isn’t it 100%? The concept that you can’t be 100% certain might sound vaguely familiar from a high school statistics class, but there’s a slightly different reason behind predictive certainty. We’re predicting human behavior in the real world, and there are a lot of moving pieces that a computer can’t account for. For example, you might notice a pattern that John buys a large coffee every morning, and make a prediction that he’ll do the same tomorrow. Pretty safe bet, right? But what if something happens that prevents him from buying his coffee -- like poor John gets in a car accident? You had no way of knowing that would happen, and certainly didn’t plan for it in your model, but it wasn’t an incorrect model. It was just incomplete, and therefore not 100% accurate.
There’s another reason we don’t count predictive models as 100% accurate: it’s very easy for some pseudo-scientists to cheat the system. For example, they’ll include everyone in their prediction, ignoring any false positives or false negatives. This isn’t science, because anyone could just include all the players in their model and predict that they’ll buy $10 in credits tomorrow. Since everyone is included, they’ll correctly flag 100% of the players who do spend -- but they’ll also incorrectly predict the behavior of most of the player base.
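Here is a small sketch of that cheat in numbers. The figures are invented for illustration (100 players, 5 of whom actually spend), and `precision_recall` is a hypothetical helper, not a library function. Recall is the share of real spenders the model caught; precision is the share of its "will spend" calls that were right.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall from two parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)      # predicted spend, did spend
    fp = sum(p and not a for p, a in pairs)  # predicted spend, did not
    fn = sum(a and not p for p, a in pairs)  # missed a real spender
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Invented example: 100 players, only the first 5 actually spend.
actual = [i < 5 for i in range(100)]
# The "cheat": predict that every single player will spend.
predicted = [True] * 100

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.05 recall=1.00
```

Every spender was "found," but 95 of the 100 predictions were wrong.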
This casts a shadow on predictive analytics, and happens more often than it should. So, to be responsible, you need proof, which comes in the form of an “F-score” (there are other metrics worth using as well). This takes one stat that penalizes false positives (precision) and another that penalizes false negatives (recall) and combines them with a harmonic mean, a kind of average that gets pulled toward the lower of the two numbers. The result is a score between 0 and 1, and it is hard to game: a model that does well on one stat by ignoring the other still scores poorly, which weeds out anyone who isn’t really predicting. Games have really good data, but high scores are achieved by a team that knows what variables to include. Computer scientists are traditionally not very good at that job, by the way. Social scientists are, but usually don’t understand the tech, so the best solutions and teams combine both approaches. For reference, a good F-score in the telecommunication industry is .40. In gaming we can do better because we have richer data. Here a good score is .50 to .70. Anything over that is pretty amazing. Remember, it’s not a straight percentage, since it combines two scores and a weak value in either one drags it down. It’s an inherently conservative stat.
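For readers who want the arithmetic, the standard F-score (often written F1) is the harmonic mean of precision and recall. The sketch below uses invented precision and recall values to show why it is harder to game than a plain average; `f_score` is an illustrative helper name.

```python
def f_score(precision, recall):
    """F1: the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Invented numbers: a "predict everyone" model vs. a genuinely balanced one.
cheater = f_score(0.05, 1.0)      # perfect recall, terrible precision
balanced = f_score(0.60, 0.60)    # decent on both counts
simple_average = (0.05 + 1.0) / 2 # what a plain average would have reported

print(f"cheater F1={cheater:.3f}")
print(f"balanced F1={balanced:.3f}")
print(f"cheater plain average={simple_average:.3f}")
```

The plain average makes the cheater look better than a coin flip, while the F-score drops it to roughly 0.1, far below the balanced model.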
At the end of the day, seeing that confidence value in a predictive model is key. You need to know how certain you can be before acting on it. How certain is certain enough depends on the business case and how much is at stake. The nice thing about these scores is that they get you to the scientific values of transparency and provability. There’s no “take my word for it.” I like to say that we should let the data tell the story, and then we should listen to what it says rather than what we want it to say.