In the first part of this "big data and games" blog series, we described the characteristics of big data and how it will challenge the culture of asking and knowing why before acting. Part 2 covered how big data combined with algorithms creates a new type of expert, and how management must use both these algorithmic experts and human experts in the appropriate contexts. In this final part 3, we will discuss the utility value of big data and games. As before, we will rely heavily on Mayer-Schonberger and Cukier, as well as the colossal "Game Analytics" edited by El-Nasr, Drachen and Canossa.
First let us talk through how data is different from other types of assets. Many resources such as coal or oil can only be used once. Additionally some assets such as cars diminish in value and utility the more you use them. Some digital assets such as software may have license restrictions only allowing one person to use it at any one time. Data does not have these restrictions. It can be used repeatedly without losing its value; it can be used by many different people simultaneously.
These unique properties make it very difficult to estimate the actual value of data, because the data can be used for purposes which we have not yet thought of. In the example of predicting which manholes in New York City would explode, "[f]ew executives at Con Edison in New York could have imagined that century old cable information and maintenance records might be used to prevent future accidents." (p. 103) In effect, data has a future option value that is very difficult to estimate.
In "Big Data", the authors describe three ways to unleash this option value: basic reuse; merging datasets; and finding twofers.
Basic data reuse and primary use
Data that is collected is usually applied to its immediate, primary use. In games, there are many primary uses of captured data, including but not limited to troubleshooting and customer service to verify user actions, economy balancing, smoothing user level progression, identifying sticking points, ferreting out gold farmers and bot accounts, and reducing fraudulent charges.
But this data can be reused for different purposes. Hitwise allows search query data to be used by marketers to learn about consumer preferences. Farecast uses air travel ticket prices to help consumers predict whether prices will rise or fall. SWIFT, the global interbank system for wire transfers, uses the wire payments info to offer GDP forecasts. In the case of games, game logs originally used for customer service queries can be used to target marketing campaigns and promotions; user in-game behaviors used to level up players can be used to predict likely churners and purchasers. Player churn predictions can be used for optimal server provisioning.
Merging datasets
Recombinant data can make new things possible. Most of us have used "mashups", which are essentially merged datasets or services. Let us examine the example of Zillow, a real estate service. Zillow combines data about recent home purchases, information on for-sale listings, US census information about the geographic location, and school information from local governments, and places it all on an intuitive map. Zillow makes things very easy for home buyers because it combines all the different sources of data into a coherent service.
Game developers can likewise combine the data they are collecting with external datasets. For example, most mobile game servers record the IP address of the player session. When you combine this IP information with a mapping table of geolocation or internet service provider, you get a new kind of data that can be very valuable. Mobile phone companies are willing to pay up to half a million dollars to understand the switching patterns of their consumers. Game developers have the best view of these patterns from their gameplay logs, but only if the logs are combined with the IP-to-mobile-operator mapping. The same applies to switching behavior between home internet service providers.
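To make the IP-to-operator idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the operator names, the address ranges (taken from the blocks reserved for documentation), the session tuples, and the helper names `operator_for` and `count_switches`. A real mapping table would come from a commercial or public IP intelligence dataset.

```python
import ipaddress
from collections import Counter

# Hypothetical mapping table: network range -> mobile operator.
# Real tables are far larger and are licensed from a data provider.
OPERATOR_RANGES = [
    (ipaddress.ip_network("198.51.100.0/24"), "OperatorA"),
    (ipaddress.ip_network("203.0.113.0/24"), "OperatorB"),
]


def operator_for(ip):
    """Return the operator whose range contains this IP, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for network, operator in OPERATOR_RANGES:
        if addr in network:
            return operator
    return None


def count_switches(sessions):
    """Count operator-to-operator switches per (from, to) pair.

    sessions: iterable of (player_id, timestamp, ip) tuples from game logs.
    Sessions whose IP is not in the mapping table are skipped.
    """
    last_seen = {}
    switches = Counter()
    for player_id, _timestamp, ip in sorted(sessions, key=lambda s: (s[0], s[1])):
        operator = operator_for(ip)
        if operator is None:
            continue
        previous = last_seen.get(player_id)
        if previous is not None and previous != operator:
            switches[(previous, operator)] += 1
        last_seen[player_id] = operator
    return switches


# Made-up session log rows: (player_id, timestamp, ip)
SESSIONS = [
    ("p1", 1, "198.51.100.7"),
    ("p1", 2, "198.51.100.9"),
    ("p1", 3, "203.0.113.20"),
    ("p2", 1, "203.0.113.5"),
    ("p2", 2, "192.0.2.44"),  # not in the mapping table, so skipped
    ("p2", 3, "198.51.100.1"),
]

if __name__ == "__main__":
    for (src, dst), n in count_switches(SESSIONS).items():
        print(f"{src} -> {dst}: {n}")
```

A production version would also have to separate genuine operator switches from players who simply alternate between Wi-Fi and cellular connections, which this sketch does not attempt.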
One common problem in acquiring users who are likely to stay and pay is that advertising networks have different targeting criteria, so game developers need to use broad targeting criteria such as "income range" or zip code. By merging US census data on income with the paying status of their own players, a game developer can arrive at a blended payer percentage for each zip code or each income range. This helps the marketing manager estimate the value of targeting ads to different zip codes and income ranges.
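The blended payer percentage can be sketched in a few lines of Python. The zip codes, income figures, bracket boundaries and player rows below are all made up for illustration, and `income_bracket` and `payer_rate` are hypothetical helper names; in practice the income table would be loaded from census data and the player rows from the game's own records.

```python
from collections import defaultdict

# Hypothetical census extract: zip code -> median household income (USD).
ZIP_MEDIAN_INCOME = {"00001": 95000, "00002": 48000, "00003": 120000}

# Made-up player rows: (zip_code, is_payer)
PLAYERS = [
    ("00001", True), ("00001", False), ("00001", False), ("00001", False),
    ("00002", False), ("00002", False),
    ("00003", True), ("00003", True), ("00003", False), ("00003", False),
]


def income_bracket(income):
    """Bucket an income into a broad range; the cut-offs are arbitrary."""
    if income < 50000:
        return "under_50k"
    if income < 100000:
        return "50k_to_100k"
    return "100k_plus"


def payer_rate(players, key_fn):
    """Return the share of payers per group, where key_fn maps a zip code
    to a group key (or None to skip the player)."""
    totals = defaultdict(int)
    payers = defaultdict(int)
    for zip_code, is_payer in players:
        key = key_fn(zip_code)
        if key is None:
            continue
        totals[key] += 1
        payers[key] += int(is_payer)
    return {key: payers[key] / totals[key] for key in totals}


def bracket_for_zip(zip_code):
    """Merge step: look up the census income for a zip and bucket it."""
    income = ZIP_MEDIAN_INCOME.get(zip_code)
    return None if income is None else income_bracket(income)


by_zip = payer_rate(PLAYERS, lambda z: z)
by_income = payer_rate(PLAYERS, bracket_for_zip)

if __name__ == "__main__":
    print(by_zip)
    print(by_income)
```

The same `payer_rate` function serves both groupings, so the marketing manager can compare payer share by zip code and by income range from one pass over the merged data.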
Finding twofers
When data capture is designed with multiple uses in mind, you have the opportunity to get twofers. This third method focuses on the front end of the process, when the capture is being designed, rather than on what to do after the data has already been captured. A good example is that when Google Street View cars were driving around snapping pictures, they were also collecting Wi-Fi information and tagging it with GPS data. This gives Google its own Wi-Fi location database to use for geo-tagging whenever GPS coordinates are not available.
One example in games is that many customer service call systems allow the user to provide feedback on the quality of the customer service. This automated feedback system can also be used to gather simple one-question feedback about the game, thus helping the development team improve the game design.
Preparing to leverage data
How can a game developer or publisher prepare to take advantage of its data? Here are some areas to get started with.
This concludes the 3-part series on big data and games. Of course, there are many other facets which we have not explored, and this is just the tip of the iceberg. Feel free to contact me at nick_at_sonamine_dot_com or check out the website www.sonamine.com. I would love to hear from you and start a dialog.
Mayer-Schonberger, V. and Cukier, K. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt, Boston, New York, 2013.
El-Nasr, M. S., Drachen, A. and Canossa, A. (eds.). Game Analytics: Maximizing the Value of Player Data. Springer-Verlag, New York, 2013.