Intro to User Analytics
by Anders Drachen, Alessandro Canossa, Magy Seif El-Nasr [Business/Marketing, Design, Game Developer Magazine, Console/PC, Social/Online, Smartphone/Tablet, GD Mag, GD Mag Exclusive]
 
 
May 30, 2013 | Page 5 of 6
 

Strategies Driven by Designers' Knowledge 

During gameplay, a user creates a continual loop of actions and responses that keeps the game state changing. At any given moment, many features of user behavior may change value. A first step toward isolating which features to use in the analysis is a comprehensive, detailed list of all possible interactions between the game and its players. Designers know these interactions better than anyone, so it pays to harness that knowledge and involve designers from the beginning by asking them to compile such lists.

Second, considering the sheer number of variables involved in even the simplest game, it is necessary to reduce the complexity through knowledge-driven factor reduction. Designers can easily identify isomorphic interactions: groups of interactions, behaviors, and state changes that are essentially similar even if formally slightly different. For example, "restoring 5 HP with a bandage" and "healing 50 HP with a potion" are formally different but essentially the same behavior. These isomorphic interactions are then grouped into larger domains. Finally, measures must be identified that capture all the isomorphic interactions belonging to each domain. For the domain "healing," for example, it is not necessary to track the number of potions and bandages used; it is enough to record every state change to the variable "health."
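As a minimal sketch of how that grouping might look in a logging layer (in Python; the event and domain names here are hypothetical, not from the article), formally different events collapse into a shared domain, and only the resulting state change is recorded:

    # Hypothetical designer-supplied mapping from raw events to domains.
    DOMAIN_OF_EVENT = {
        "use_bandage": "healing",
        "drink_potion": "healing",
        "cast_heal_spell": "healing",
        "fire_pistol": "combat",
        "fire_rifle": "combat",
    }

    telemetry_log = []

    def record(event_name, state_delta):
        """Log the domain and the state change, not the raw event."""
        domain = DOMAIN_OF_EVENT.get(event_name, "other")
        telemetry_log.append({"domain": domain, "delta": state_delta})

    # "Restore 5 HP with a bandage" and "heal 50 HP with a potion" become
    # two entries in the same domain, differing only in the health delta:
    record("use_bandage", +5)
    record("drink_potion", +50)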



These domains are not derived through objective factor reduction; there is a clear interpretive bias any time humans are asked to group elements into categories, even when designers have exhaustive expert knowledge. Still, these larger domains can potentially contain all the behaviors that players can express in a game, and at the same time help select which game variables should be monitored, and how.

Strategies Driven by Machine Learning

Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed. More than an alternative to designer-driven strategies, automated feature selection is a complementary approach to reducing the complexity of the hundreds of state changes generated by player-game interactions. Traditionally, automated approaches are applied to existing datasets, relational databases, or data warehouses, meaning that the process of analyzing game systems, defining variables, and establishing measures for those variables falls outside the scope of automated strategies; humans have already defined which variables to track and how. Automated approaches therefore isolate only the most relevant and most discriminating features out of all the variables monitored.

Automated feature selection relies on algorithms to search the attribute space and drop features that are highly correlated with others; the algorithms range from simple to complex. Methods include clustering, classification, prediction, and sequence mining. These can be applied to find the most relevant features, since features that are irrelevant to the definition of player types distort the similarity measure and degrade the quality of the clusters the algorithm finds.
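At the simple end of that range sits a greedy correlation filter. The sketch below is a Python illustration with hypothetical per-player feature names, not a method prescribed by the article; it keeps a feature only if it is not strongly correlated with any feature already kept:

    import pandas as pd

    def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
        """Greedily keep features; drop any that correlate strongly with a kept one."""
        corr = df.corr().abs()
        kept = []
        for col in df.columns:
            if all(corr.loc[col, k] < threshold for k in kept):
                kept.append(col)
        return df[kept]

    # Hypothetical per-player features: playtime tracks sessions closely,
    # and deaths track kills closely, so only playtime_h and kills survive.
    features = pd.DataFrame({
        "playtime_h": [10, 25, 3, 40],
        "sessions":   [5, 12, 2, 20],
        "kills":      [300, 50, 120, 200],
        "deaths":     [280, 60, 130, 190],
    })
    print(list(drop_correlated(features, threshold=0.95).columns))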

Diminishing Returns

In a situation with infinite resources, it is possible to track, store, and analyze every user-initiated action -- all the server-side system information, every fractional movement of an avatar, every purchase, every chat message, every button press, even every keystroke. Doing so will likely cause bandwidth issues and will require substantial resources to add the message hooks into the game code, but in theory, this brute-force approach to game analytics is possible.
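A minimal sketch of what such a message hook could look like (a Python illustration; the decorator and action names are hypothetical, not from the article) instruments every decorated game action with a telemetry message:

    import functools
    import time

    EVENT_QUEUE = []  # in a real game this would be flushed to a collector

    def tracked(action):
        """Hook: emit a telemetry record every time the action runs."""
        @functools.wraps(action)
        def wrapper(*args, **kwargs):
            EVENT_QUEUE.append({"ts": time.time(),
                                "action": action.__name__,
                                "args": args, "kwargs": kwargs})
            return action(*args, **kwargs)
        return wrapper

    @tracked
    def fire_weapon(player_id, weapon, target):
        pass  # game logic goes here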

However, it leads to very large datasets, which in turn lead to huge resource requirements for transforming and analyzing them. For example, tracking weapon type, weapon modifications, range, damage, target, kills, player and target positions, bullet trajectory, and so on will enable a very in-depth analysis of weapon use in an FPS. However, the key metrics for evaluating weapon balance could be just range, damage done, and the frequency of use of each weapon. Adding further variables/features may not add any new relevant insight, and may even add noise or confusion to the analysis. Similarly, it may not be necessary to log behavioral telemetry from all players of a game, but only a percentage (this is of course not the case for sales records, because you will need to track all revenue).
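One common way to sample a fixed percentage of players is to hash a stable player identifier, so the same players stay in the sample across sessions. A minimal sketch in Python (the identifier scheme and transport stub are hypothetical):

    import hashlib

    SAMPLE_RATE = 0.10  # log behavioral telemetry for 10% of players

    def is_sampled(player_id: str) -> bool:
        """Stable decision: hashing the ID keeps a player in (or out of)
        the sample across sessions and devices."""
        digest = hashlib.sha1(player_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % 100 < SAMPLE_RATE * 100

    def send_to_collector(player_id: str, event: dict) -> None:
        """Hypothetical transport stub; a real game would batch and ship these."""
        print(player_id, event)

    def log_event(player_id: str, event: dict) -> None:
        # Revenue events bypass sampling: sales records must be complete.
        if event.get("type") == "purchase" or is_sampled(player_id):
            send_to_collector(player_id, event)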

In general, if selected correctly, the first variables/features that are tracked, collected, and analyzed will provide a lot of insight into user behavior. As more and more detailed aspects of user behavior are tracked, costs of storage, processing, and analysis increase, but the rate of added value from the information contained in the telemetry data diminishes.   

 
 
Comments

Taylor Stallman
Great article. Having just graduated from college last December with a focus on database marketing, I can say this article is spot on. It even taught me a number of things! Thanks a lot for the article; I'll be sure to use this the next time I need to analyze data.

Henrik Strandberg
Excellent article! I'd also be very curious to learn about your approach to, and experience with, QAing the data sets; after all, if you haven't verified (via QA) that the data is correctly generated, aggregated, transformed, and exposed, how can you trust the analysis?

In my experience that's one of the main constraints in gameplay-related feature selection; QAing thousands of data points is simply unrealistic.

Lukasz Twardowski
Henrik, that's a really good question. Most initial analytics instrumentations generate false numbers. One, the people who instrument analytics are not the same people who use it, and they rarely understand what data to collect, why, and how to check its integrity. Two, games are complex and usually full of small glitches or shortcuts that may not affect the player experience at all but can totally corrupt your data sets.

Developers who are aware that their analytics may produce false numbers trust the data only when the results are in line with their assumptions, which makes the analytics kind of pointless. Those who are unaware and trust their data usually end up making expensive mistakes. In both cases it would really be better to go without data and trust your team's experience.

Of course there are ways to deal with that problem and get reliable results from analytics:

1. Make sure the engineer instrumenting the analytics service works directly with someone experienced with that specific service - either someone in-house or a support engineer from the analytics vendor who can explain the process and audit the integration.

2. Having real-time analytics during integration really helps, as you can record your session and check the results instantly. It's also important to be able to clean up the database (or filter out your most recent activity) so that you're checking data from the last session only. If your analytics doesn't give you that comfort, log every outgoing data point on your side and do the math yourself (see the sketch after this list).

3. Even if you are very diligent about the integration, the chances that you will get it right from the beginning are low. If you don't want to get into trouble through data misinterpretation, double-check it using a qualitative approach. Some analytics services let you export data points by session ID to Excel, and some others* have a full set of features for analyzing individual sessions or users. This will help you identify mistakes in data collection, and also better understand correctly collected data, before you jump to conclusions.

*UseItBetter Analytics (disclosure: I'm a co-founder) is the only one I know of that does that for games, but I might be totally wrong about it. Maybe Mixpanel or Playnomics?
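For point 2, a minimal sketch of that kind of local shadow log (Python; the vendor SDK call is a hypothetical stub):

    import json
    import time

    def analytics_vendor_track(event_name, properties):
        """Stand-in for the real vendor SDK call (hypothetical)."""
        pass

    SHADOW_LOG = open("telemetry_shadow.jsonl", "a")

    def track(event_name, properties):
        record = {"ts": time.time(), "event": event_name, "props": properties}
        SHADOW_LOG.write(json.dumps(record) + "\n")  # local copy for auditing
        SHADOW_LOG.flush()
        analytics_vendor_track(event_name, properties)

    # Later: count events per name in telemetry_shadow.jsonl and diff the
    # totals against what the vendor dashboard reports for the same window.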

