As an industry, we've increasingly turned our attention to the promise of big data and analytics. The concept is simple: when you operate at a large enough scale, collecting terabytes of data and billions of events, even one-in-a-million insights become predictable assets in our developer arsenal, allowing us to predict and repeat success. We've seen it work for companies like Zynga and Kabam, and thousands of blog posts, panels, and articles proclaim the bright future of the big data market. The sheer size of our modern marketplace has tilted the odds in our favor.
Twenty years ago, very few could dream of having tens of millions of active users, let alone having their behavior and customs recorded and stored. The market has changed and our attention has shifted, but in our heart of hearts we know that comparatively little innovation has followed our increased interest in big data. Data-driven design is still incredibly difficult to incorporate effectively into your workflow, and far from every company manages to get a return on its big data investment. Analytics is still difficult because we're stuck in a paradigm that is as old as I am.
"The McKinsey Global Institute projected that there will be a shortage of 190 000 data scientists by 2018."
For those who paid attention, the last 20 years of technological development have been truly remarkable. We've gone from floppy disks to the App Store, from dial-up to broadband, from DOS to iOS, from command line to touch screen - but SQL is still SQL.
Imagine a world where the graphical operating system never caught on, where business was still conducted primarily in DOS and other command line interfaces. Remember, or imagine, the frustration that came with needing specific training to use some of the most instrumental tools of your trade. Imagine hiring people to operate those machines, instead of buying machines to improve the productivity of the people you hire. It was a nightmare then, and it is a nightmare now, because that's exactly where we are today with analytics.
Last year the McKinsey Global Institute projected that there will be a shortage of 190,000 data scientists by 2018. There are already 1,000 job openings for data scientists in San Francisco alone. This tells us two things:
1) We want to make data-driven decisions.
2) We've made ourselves painfully reliant on data scientists, because data-driven decision making is hard.
So what exactly is a data scientist? I saw this humorous tweet the other week, and I thought I'd share it with you:
"Data Scientist (noun): Person who is better at statistics than any software engineer and better at software engineering than any statistician."
Another netizen quipped that a data scientist is a statistician living in San Francisco, much like a "growth hacker" is a marketing professional living in San Francisco. These data scientists will cost you at least $130,000 a year, and through our work on Traintracks we increasingly find that up to a tenth of the development workforce at a typical game studio is hired to wrangle, wrestle, and munge data. And why do we need them?
Without pointing any fingers, I lifted this sentence from the website of a company that is claiming to innovate in the field of analytics and big data:
"With its unmatched speed and familiar ANSI SQL interface, [this product] is a powerful drop-in solution to fix application performance issues."
Familiar ANSI SQL interface! How's that for a sentence? How many of the people best suited to make design decisions at your company are even proficient in SQL?
If we don't know how to use it, it's not our interface. The data scientist is the interface. The data scientist is a human keyboard, not unlike the telephone operators from the infancy of telephony, and they operate between you and your data, between you and your power to make data-driven decisions - and by the year 2018, there will be a shortage of 190,000 data scientists. This is not a sustainable state of affairs.
Based on the readership of this blog, it's a pretty safe bet that some of you work at a company that has already spent a million dollars on analytics-related expenses this year. Our industry is literally spending millions of dollars attempting to make data-driven decisions so that we can address the concern that we all share: How can we make our next investment successful?
Why is it so hard being a game studio? Because there are no guarantees, and no continuity. You can have a game with 10 million MAU today, and your next game might not even get 100,000 downloads. Game studios disappear seemingly overnight. We all worry about it. We worry about it as employees wanting job security, we worry about it as executives wanting to build better companies, and we worry about it as investors. The days of easy money for game studios are over. Investors have rightfully grown wary of the industry's track record. The industry itself, and its investors, are all looking for the same kinds of innovations and solutions.
Bruce Gibney at the Founders Fund captured it rather eloquently in a blog post entitled "What happened to the future" when he wrote:
"At the least grandiose level, we need analytical software much more powerful and much easier to use than the current state of the art. Most analytical platforms are exceedingly arcane, requiring lengthy experience with that exact platform to acquire mastery, and yet the quality of analysis remains fairly poor. It does society no good to collect huge amounts of data that only a small minority can analyze, and even then only partially."
Arcane is the right word. In 1994 the developers at my father's company built their own database, and their own programming language, to reduce the need for complicated SQL workflows. As misguided as that might seem, they felt it was the only way for them to really scale their business. That was 20 years ago, and not much has happened since. We are still offered that "familiar ANSI SQL interface", and we're asked to think of it as "familiar" rather than arcane.
There's this joke about SQL, you might have heard it:
A SQL query walks into a bar, confidently approaches two girls at two tables and asks "May I join you?"
If you're a nerd like me, it's a pretty funny joke, well worth a chuckle. Even funnier to consider is that the joke might be older than me.
The big data industry is still waiting for its "1984 moment". We're waiting for something like the Macintosh, with its graphical OS and commitment to ease of use, to come around and change the way we interact with data entirely, but it won't happen unless the industry realizes just how poorly served we are by the current paradigm.
There are a lot of differing opinions on where the industry has to go - or can go - from here, but my time and money are invested in a simple idea: Great games are not built by data scientists. They're built by great game studios that have a commitment to making good games, rather than chasing metrics. By empowering those studios to get to know their players better, and to connect with and understand their audience in a way that is currently reserved for analysts, I hope to have some small impact on the great games of tomorrow. With Traintracks, we hope to make the term "data scientist" obsolete as fast as it became trendy. No game designer should be forced to learn SQL to stay competitive. If the last 20 years have taught me anything, it is that a technology's full potential cannot be reached before it can be effectively used by someone without a degree. The video game industry is built on the foundation of one of the most impressive feats of tech democratization ever, and we should continue to innovate.
There is a bright future for the games industry, but your place in it is in no way guaranteed. Your future depends on how well you manage to predict and repeat success. So bring your data into play.