Frankly, I thought we'd hit the ceiling for accurate prediction. New metrics and hypotheses do not contribute to precision; the models are stable. 78 percent precision / 16 percent false positives is enough to start working on churn prediction.
Motivating these players with free subscriptions or valuable items probably won't be efficient (taking into account the VAT associated with such gifts in Russia), but emailing these players couldn't hurt, right?
An unexpected gift: As we were in our third month of the data-mining project, we realized the data might become outdated, as the game received several patches during that time.
Reloading the new, larger dataset for all three months, I noticed some changes in the lift charts. The data was behaving slightly differently, although the precision/recall stayed the same.
ETL procedures were rewritten from scratch again, and the whole three-month dataset was fed to the hungry data-mining monster.
At that time, processing time per level was less than a minute, so the larger dataset resulted in an acceptable five-minute wait. Unfortunately, all manual fine-tuning had to be redone, but look at the picture:
Increasing the dataset, we've hugely boosted the efficiency of the models!
For the first level, unfortunately, we can't really do anything. As Avinash Kaushik would say, "I came, I puked, I left". Those players left the game right after creating their character and we have few, if any, actions logged for them.
All those numbers above were historical data and a learning dataset for our precious mining models. But as I'm a very skeptical person, I want battle-tested results! So we take fresh users, just registered today, and put them into the prediction model, saving the results. After seven days, we compare the churners predicted a week earlier with their real-life behavior. Did they actually leave the game or not?
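To make that check concrete, here is a minimal sketch of how such a comparison could be scored. It is not the code we ran; the function name `backtest`, the player IDs, and the use of plain Python sets are all assumptions made for illustration.

```python
def backtest(predicted_churners, actual_churners):
    """Score one cohort: IDs flagged at registration vs. IDs that really left."""
    predicted = set(predicted_churners)
    actual = set(actual_churners)
    hits = predicted & actual  # flagged players who did leave

    # Precision: share of flagged players who actually churned.
    precision = len(hits) / len(predicted) if predicted else 0.0
    # Recall: share of real churners the model caught in advance.
    recall = len(hits) / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical cohort: IDs saved on registration day, then checked a week later.
flagged_on_day_0 = [101, 102, 103, 104]
gone_by_day_7 = [101, 102, 103, 105]
precision, recall = backtest(flagged_on_day_0, gone_by_day_7)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The point of scoring it this way is that the saved predictions are never touched after day zero, so the model gets no chance to peek at the behavior it is being judged against.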
Our original goal -- to predict players about to churn out of the game -- was successfully achieved. With such high precision/recall we can be confident in our motivation and loyalty actions. And I remind you that these are just-in-time results; at 5:30 am, models get processed, new churners are detected, and they're ready to be incentivized the moment we come into the office in the morning.
Have we achieved our second goal, determining why players churn out? Nope. And that's the most amusing outcome for me -- knowing with very high accuracy when a player will leave, I still don't have a clue why she will leave. I started this article by listing hypotheses about why players leave the game early:
We have tested over 60 individual and game-specific metrics. None of them are critical enough to cause churn. None of them! We haven't found a silver bullet -- that magic barrier preventing players from enjoying the game.
The key metric in this research appears to be the number of levels gained during the first day of the trial. Fewer than seven levels -- which represents about three hours of play -- means a very high chance of churning out. The next metrics with high churn-prediction power are the overall activity ones:
It took us three months, two books, and a great deal of passion to build a data-mining project from scratch. Nobody on the team had ever touched the topic. On top of our robust but passive analytics system at Innova, we've built a proactive, future-predicting tool. We receive timely information on potential churners and we can give them highly personalized and relevant tips on improving their gameplay experience (all of those 60+ metrics provide us with loads of data).
The project was made for a specific MMORPG, Aion, but as you can see, a major contribution came from generic metrics and approaches applicable to other games, and even to general web services.
This was our very first data-mining project, finished in September 2011, and it has been rewritten completely since then, based on our current experiences with predicting the churn of experienced players, clustering and segmentation analysis, and a deeper understanding of our player base. So the data mining adventures are to be continued...