In the follow-up to his original article on predicting player churn, Dmitry Nozhnin, head of analytics and monetization at Russian MMO publisher Innova, shares his methodology for predicting when veteran players will quit the game -- identifying when players will drop two to three weeks before they do with 95 percent accuracy, all carried out in the live environment of the Russian version of NCSoft's Aion.
In my previous article, I showed the process we developed for predicting churn of our freshest users, who have just registered for the game, based on data collected during the first couple of days of their adventures. At the other end of the spectrum, however, are seasoned gamers who have spent months and months in the game, but for various reasons have decided to abandon it. Predicting their desire to leave the game is possible, and in this article, we're sharing our data mining methodology.
Nothing changed from the first data-mining project; we were still on two Dual Xeon E5630 blades with 32GB RAM, 10TB cold and 3TB hot storage RAID10 SAS units. Both blades were running MS SQL 2008R2 -- one as a data warehouse, and the other for MS Analysis Services. Only the standard Microsoft BI software stack was used.
Our dataset had up to six months of recorded gameplay for about 38,000 veteran players.
For new players, defining churn was dead simple -- they just leave the game after a couple of minutes or hours. That's it. The last day of play was clearly defined, and data mining models on such churn factors were already well established. However, for veterans, it took us several iterations to define churn correctly. Our first assumption was this: the player is enjoying the game for some time, but then he decides to quit and leaves. Marking his play days with green, we expected something like this:
Our guess was that defining the churn point would be straightforward -- the last game day. The reality, however, was more complex; the majority of players behave like this:
Is August 25th, when we saw the player for the last time, the churn point? Or in fact August 16th, the day we hadn't seen the player for seven consecutive days? Or July 31st, the first time she hadn't launched the game for more than seven days? We tried several hypotheses, and the simple ones didn't work out. Defining the churn in a simple way -- predicting that a particular play day will be the last one -- resulted in an unimpressive 65 percent precision.
Manual data investigation revealed that the majority of churners have a "long tail" of play days -- occasional activity days spread over several weeks, or even months, as shown in the second calendar example. They have effectively stopped actively playing the game, but still log in from time to time. In fact, they have already quit; the occasional logins are for auction sales or random chats, or perhaps indicate that the account has been passed on to guildmates.
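To make the ambiguity concrete, here is a minimal Python sketch that lists the candidate churn points in a play history: the last day seen, and every gap of more than seven idle days. The function name `long_gaps`, the seven-day default, and the play dates are hypothetical illustrations; they are not the calendar from the original figure or the queries we actually ran.

```python
from datetime import date

def long_gaps(play_days, min_gap_days=7):
    """Return (last_play_day_before_gap, next_play_day, idle_days) for every
    gap with more than `min_gap_days` idle calendar days between logins."""
    days = sorted(set(play_days))
    gaps = []
    for prev, nxt in zip(days, days[1:]):
        idle = (nxt - prev).days - 1  # days with no login between the two
        if idle > min_gap_days:
            gaps.append((prev, nxt, idle))
    return gaps

# Hypothetical play history: regular play in July, then sporadic logins.
play_days = (
    [date(2012, 7, d) for d in (1, 2, 3, 5, 8, 10, 14, 18, 23)]
    + [date(2012, 8, d) for d in (5, 8, 25)]
)

last_seen = max(play_days)          # candidate 1: the last day of play
candidate_gaps = long_gaps(play_days)  # candidates 2+: starts of long gaps
```

Each long gap offers its own plausible "last day", which is why a single simple definition performed poorly.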
The next step was to cut off this tail using some empirical thresholds in order to trace back to the day when the player's activity decline started. The most effective query was something like "the last day of play when total game days for the previous 30 calendar days were fewer than 9". Still, the precision was under 80 percent, and empirical rules didn't work for loyal but very casual players.
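The exact query is not reproduced here, so the following Python sketch shows only one possible reading of the rule: a play day belongs to the tail if fewer than 9 play days fall in the 30 calendar days ending on it, and the churn point is the last play day that is not in the tail. The names `trailing_play_days` and `cut_tail`, the inclusive window, and the sample histories are assumptions made for illustration.

```python
from datetime import date, timedelta

def trailing_play_days(days, as_of, window_days=30):
    """Count play days in the `window_days` calendar days ending on `as_of`."""
    start = as_of - timedelta(days=window_days - 1)
    return sum(1 for d in days if start <= d <= as_of)

def cut_tail(play_days, min_days=9, window_days=30):
    """Return the last play day whose trailing count is still >= `min_days`,
    or None if the player never reached that level of activity."""
    days = sorted(set(play_days))
    for day in reversed(days):
        if trailing_play_days(days, day, window_days) >= min_days:
            return day
    return None

# Hypothetical veteran: daily play for 20 days, then two stray logins.
veteran = [date(2012, 7, 1) + timedelta(days=k) for k in range(20)]
veteran += [date(2012, 8, 25), date(2012, 9, 10)]

# Hypothetical loyal but casual player: one login per week.
casual = [date(2012, 7, 1) + timedelta(weeks=k) for k in range(8)]

veteran_churn_day = cut_tail(veteran)  # the two stray logins are ignored
casual_churn_day = cut_tail(casual)    # never reaches 9 days in a window
```

In this sketch the weekly player never crosses the threshold, which mirrors the weakness noted above: a fixed empirical rule has nothing sensible to say about loyal but very casual players.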
The key success factor of this project was reframing the moment of churn from "the player has left the game" to "the player's activity has dropped below the churn threshold". We already store and widely use the Frequency metric, defined as "days with game logins in the last 30 calendar days". In short, it shows how often a player has been playing -- every day, every other day, on weekends, or just a few days a month. We segment players according to their play frequency:
The next step is redefining churn as the moment players fall into The Pit, an area of extreme inactivity with a very high probability of churn. This idea really makes sense from a business point of view -- instead of detecting churners on the day they leave the game forever, we're now focusing on early detection and prediction of disinterested players, and have several weeks to incentivize them to keep playing.
The new approach was to predict which players will fall into The Pit within two weeks for the 7-9, 10-15, and 16-20 cohorts, and within three weeks for the 21-25 cohort. So we're looking for players who are losing momentum, and whose activity will drop significantly over the next several weeks:
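The Frequency metric described above can be sketched in a few lines of Python. This is an illustration only: the function name `frequency` and the choice to count the observation day itself as part of the 30-day window are assumptions, and our production version lives in the SQL data warehouse rather than in Python.

```python
from datetime import date, timedelta

def frequency(play_days, as_of, window_days=30):
    """Distinct days with a login in the `window_days` calendar days
    ending on `as_of` (inclusive; the inclusive window is an assumption)."""
    start = as_of - timedelta(days=window_days - 1)
    return sum(1 for d in set(play_days) if start <= d <= as_of)

# Hypothetical player who logs in every other day from June 1 to June 29.
every_other_day = [date(2012, 6, 1) + timedelta(days=2 * k) for k in range(15)]

freq_end_of_june = frequency(every_other_day, date(2012, 6, 30))   # 15
freq_two_weeks_on = frequency(every_other_day, date(2012, 7, 14))  # 8
```

The second value shows how the metric decays on its own once a player stops logging in, without any explicit "last day" having to be chosen.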
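As a rough illustration of how such a prediction target could be labeled, the Python sketch below assigns a player to one of the four cohorts by current Frequency and checks whether Frequency at the end of the cohort's horizon has dropped into The Pit. The cohorts and horizons come from the text; the boundary `PIT_MAX_FREQUENCY = 6` is an assumption (the text does not state where The Pit begins), as are the function names and sample players.

```python
from datetime import date, timedelta

# Cohorts (low, high Frequency) and prediction horizons in weeks, per the text.
HORIZON_WEEKS = {(7, 9): 2, (10, 15): 2, (16, 20): 2, (21, 25): 3}
PIT_MAX_FREQUENCY = 6  # ASSUMPTION: The Pit's boundary is not given in the text

def frequency(play_days, as_of, window_days=30):
    """Distinct login days in the `window_days` days ending on `as_of`."""
    start = as_of - timedelta(days=window_days - 1)
    return sum(1 for d in set(play_days) if start <= d <= as_of)

def pit_label(play_days, as_of):
    """Return (cohort, falls_into_pit), or None outside the four cohorts."""
    freq = frequency(play_days, as_of)
    for (low, high), weeks in HORIZON_WEEKS.items():
        if low <= freq <= high:
            future = frequency(play_days, as_of + timedelta(weeks=weeks))
            return (low, high), future <= PIT_MAX_FREQUENCY
    return None

# Hypothetical players: one stops after June 10, one keeps a steady rhythm.
quitter = [date(2012, 6, 1) + timedelta(days=k) for k in range(10)]
steady = [date(2012, 6, 1) + timedelta(days=2 * k) for k in range(30)]

quitter_label = pit_label(quitter, date(2012, 6, 30))
steady_label = pit_label(steady, date(2012, 6, 30))
```

Labels like these, computed on historical data, are what a model would be trained to predict from the gameplay recorded up to the observation date.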