A phrase I come across quite frequently with regard to game analytics is that "the simple stuff will get you 90% of the way there". Whenever I hear or read this phrase, my immediate thought is: "Do you personally know that? Have you personally gone as far as modern research in machine learning and statistics can take you, and concluded that the additional insight into your data provided by such tools was only worth 10% of your resulting report, and that bar charts, histograms and heatmaps told you 90% of what your company's stakeholders needed to know about your dataset? Or are you presuming this to be true from what you've read from other people? Furthermore, what do you mean by 'the simple stuff'?"
I'm not claiming that this statement isn't true in many cases: I've definitely seen games where there really wouldn't be much point in performing a logistic regression, for example. But when the games industry has some of the best data sources available, I'm regularly surprised at the contrast between how game companies treat data and how other industries do. What I'm trying to say is that, from my experience of talking with developers and producers, Applied Game Analytics, as a whole, is not done well.
Now, I completely understand that there are many constraints on the quality of an analytics investigation: time and money are extremely scarce, but developers still want insight into their data. The problem is that a bad report is a very dangerous thing. Sample bias, misuse of data-mining tools, misinterpretation of results and many other factors can lead to conclusions that actually harm the game design and production process. It's very rarely anyone's fault; doing "proper" data science is difficult, and it's incredibly important to strike the right balance of statistics and computer science whilst performing an analysis.
Having investigated Game Analytics quite extensively, I've found four recurring areas that are frequently overlooked when game data is analysed:
Gamers are highly variable creatures and regularly act in bizarre ways. As such, if you don't clean your data to remove outliers, the statistics describing the gamers you're actually trying to design for are likely to become distorted. Make sure you think very carefully about who and what you're interested in: getting good-quality data for your particular problem is paramount to reaching a meaningful conclusion.
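As a rough sketch of what this cleaning can look like, here is one common approach: Tukey's fences, which drop points beyond 1.5 times the interquartile range. The session lengths below are invented purely for illustration.

```python
import numpy as np

def remove_outliers_iqr(values, k=1.5):
    """Drop points outside k * IQR of the quartiles (Tukey's fences)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values >= lower) & (values <= upper)]

# Hypothetical session lengths in minutes: one marathon session would
# badly skew a naive mean, so it gets filtered out before analysis.
sessions = [12, 15, 9, 14, 11, 13, 10, 480]
cleaned = remove_outliers_iqr(sessions)
```

Whether a point is an "outlier" or a genuinely interesting player segment is itself a design question, which is why the who-and-what thinking above has to come first.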
Far too frequently, statistics in game analytics are computed on the assumption that the data is normally distributed. If your data isn't normally distributed, and you perform statistical tests based on this assumption, you're going to get duff results that could lead to bad design decisions and ultimately a flop game. Think carefully about the assumptions you're making about your data: can these assumptions be tested?
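They often can. A minimal sketch, using simulated playtime data (real playtime metrics are typically right-skewed, which the log-normal sample mimics): test the normality assumption with a Shapiro-Wilk test, and if it fails, prefer a non-parametric comparison such as Mann-Whitney U over a t-test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated, heavily right-skewed "playtimes" -- decidedly not normal.
playtimes = rng.lognormal(mean=3.0, sigma=1.0, size=500)

# Shapiro-Wilk tests the normality assumption itself.
stat, p = stats.shapiro(playtimes)
normal = p > 0.05  # a small p-value means we reject normality

# With normality rejected, compare two hypothetical A/B groups with a
# rank-based test instead of a t-test.
group_a = rng.lognormal(3.0, 1.0, 200)
group_b = rng.lognormal(3.2, 1.0, 200)
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)
```

The group names and parameters here are made up; the point is only that the assumption gets tested before it gets used.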
It's understandable to want to visualise data, especially when game development is such a visual process. Furthermore, data visualisation is an absolutely key component of the analytics process. However, if all the reporting you're performing can be boiled down to a pretty picture, then it's likely that you're missing out on a lot of potential insights into your dataset. In statistics, box plots, histograms and the like are all part of Exploratory Data Analysis, which is usually performed by a statistician to get a "feel" for the data they're working with before they do the proper work. It's quite likely that your dataset contains more insight than can be represented purely by graphs.
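To illustrate, here are a couple of plain numbers that a histogram or scatter plot only hints at, computed on simulated per-player metrics (the metric names and their relationship are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-player metrics: sessions played and coins spent,
# with spend loosely driven by engagement plus noise.
sessions = rng.poisson(8, size=300).astype(float)
coins = sessions * 5 + rng.normal(0, 10, size=300)

# Quantities a chart only gestures at: the sample skewness of spend,
# and the engagement-spend correlation as an actual number.
skewness = ((coins - coins.mean()) ** 3).mean() / coins.std() ** 3
r = np.corrcoef(sessions, coins)[0, 1]
```

A plot would show "these look related"; the correlation coefficient tells you how strongly, and is something you can track and compare across builds.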
Note: my opinion on this next area is biased, as I have a personal interest in behavioural modelling.
In game analytics academia, it's frequently stated that it's very difficult to infer the motivations of a user. Whilst this is true in many contexts, if you're willing and able to create a model of the player's behaviour, it's likely that you can do a fairly decent job of understanding why certain events took place in a playthrough. Obviously, having such motivational data would be extremely beneficial for designers and moneymen alike, yet it's a largely unexplored area in game analytics, in both academia and production. If a developer had the resources, modelling and analysing the behaviour of a player would go a long way towards explaining other game behaviours.
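As a minimal sketch of what "a model of the player's behaviour" could mean, and assuming nothing about any particular game: a first-order Markov chain fitted over coarse player states from hypothetical session logs. The states and logs below are entirely made up.

```python
from collections import Counter, defaultdict

def fit_markov_chain(playthroughs):
    """Estimate transition probabilities between observed player states."""
    counts = defaultdict(Counter)
    for states in playthroughs:
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
        for state, nxts in counts.items()
    }

# Hypothetical event logs: one coarse state sequence per session.
logs = [
    ["explore", "fight", "die", "quit"],
    ["explore", "fight", "win", "explore", "quit"],
    ["explore", "fight", "die", "fight", "win", "quit"],
]
model = fit_markov_chain(logs)
# model["die"]["quit"] is then the observed probability that a death
# is immediately followed by quitting -- a crude but actionable signal.
```

Even a toy model like this turns raw event streams into questions a designer can act on, such as "how often does dying push players out of the session?"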
Although this article may give you the impression that I'm not impressed with the use of game analytics in practice, that's not the case. On the contrary, I believe that the systems many companies have set up to collect and analyse data are state of the art. However, I do believe that more can be done to get the most out of the datasets that studios collect. Understanding what our users want is the essence of the game development problem: better analytics across the industry will make solving that problem a little bit easier.