Postmortem: Building The Turing Test around a secret mechanic
The Turing Test is a first-person puzzle adventure game set on Jupiter’s moon Europa. It tells the story of a group of astronauts and an artificial intelligence called TOM. I am David Jones, design director and writer of The Turing Test.
As a product, The Turing Test is about to break £1M in revenue for Bulkhead Interactive. It was brilliantly published and marketed by Square Enix Collective. It has over 200,000 owners on Steam. It has been released on Xbox One, PlayStation 4 and Steam. It was developed on a budget of approximately £110,000. The team size was 1 person for 3 months, 6 people for 9 months and 9 people for a further 6 months. The project was 18 months from start to finish. It has a 'Very Positive' rating on Steam and scored an average of 76/100 on Metacritic across all platforms.
For us, this represented a huge success.
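As a rough sanity check on the staffing figures above, the three phases can be added up in a few lines of Python. This is only a sketch: it assumes every team member worked the full length of their phase and that the whole budget went on staff time, neither of which is stated here, so the cost per person-month is indicative at best.

```python
# Staffing phases as stated above: (people, months).
phases = [(1, 3), (6, 9), (9, 6)]
budget_gbp = 110_000  # approximate budget quoted above

# Total calendar length of the project in months.
total_months = sum(months for _, months in phases)

# Total effort, assuming everyone worked the full phase (an assumption).
person_months = sum(people * months for people, months in phases)

# Indicative cost per person-month, assuming the whole budget was staff time.
cost_per_person_month = budget_gbp / person_months

print(total_months, person_months, round(cost_per_person_month))
```

The phases sum to the 18 months quoted above and to 111 person-months of effort.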
At the start of The Turing Test’s development we had 4 core goals:
- Business: Survive as a business. Keep the budget low.
- Marketing: Have the main character serve as the game's icon.
- Design: One singular puzzle mechanic explored deeply.
- Story: A story with a great twist.
Keeping the budget low
We decided that, like Portal and other indie puzzle games, we would have a modular workflow. This meant the puzzles were not constrained by the art. The Turing Test has 77 puzzles. If each room had been individually modeled, the budget would have skyrocketed.
The downside to a modular workflow is that it creates a homogeneous look. We chose to break up the modular nature of the game with ‘story rooms’ and drastic lighting. Once per chapter, the player is rewarded with detailed interactive art set pieces that help tell the story.
We made several other decisions to keep the game’s budget lean:
- It's first person to save money on animation.
- It is in space to save money on environment art.
- We made use of only 4 voice actors despite there being 8 characters.
- Some areas of the space station are made of a clever mix of store bought assets and custom art.
The Turing Test is a game that is primarily a success of the production. Money was only spent where it would increase the quality of the product.
The game spent a large portion of time being developed by a one-man team to keep the costs low while puzzles and mechanics were produced. After that process, a larger team came on board as the game entered full production. It may sound cynical, but The Turing Test was designed first to be a ‘financially successful game’ and only after that to be well received by game critics.
The story portions of the game were also designed to be greater than the sum of their parts. The story was written to require only 2 lead voice actors and 2 supporting voice actors. We also used members of the studio for certain characters. There are only 2 scenes in which the characters deliver dialogue with facial animation. There are only 2 character models, both female. There is only one exterior area. Outside of cinematics, third-person characters only have idle animations. This, however, creates a focused pair of characters, which fits the game.
We did, however, invest large amounts of time into producing a very clear and deliberate game. In a puzzle game every color, every shape, and every sound has to be scrutinized. Despite being efficiently made, the game was never ‘cheap’. You see this widely reflected in critic and player reviews. We worked a lot of overtime to make sure the game had lots of content and would be a good purchase for our players. We cut costs where we could by cleverly designing a ‘small production,’ but we created extra optional content and depth for players who wanted to explore deeper into the world we created.
My single piece of advice to any puzzle game designer would be: come up with an interesting mechanic and explore it deeply. Exploring an idea deeply is difficult.
It took, perhaps, one year of work to create all the puzzles in The Turing Test (including implementation). That’s 77 puzzles in about 250 work days. So, about 1 puzzle every 3 days. In reality, more than 77 puzzles were created, but many were combined or removed. It is a very arduous process and sometimes weeks may pass with no fruit. I’d also note puzzle design is much easier in 2D than 3D. Most puzzle designers will tell you that it took them a long time to design all the puzzles in their video game. For the ethos behind The Turing Test’s level design please watch this video:
Matthew VanDevander: DOs and DON'Ts of Honest Puzzle Game Design
All of the puzzles in The Turing Test were ‘blocked out’ in white boxes first. They were then playtested with game design students at a local college before proceeding to full art production. The playtesters would play through the whole series of puzzles. At the end of each puzzle, each playtester was given the opportunity to rate the ‘fun’ and ‘difficulty’ of the puzzle. These ratings were recorded, along with their playtime, to analyze difficulty and smooth the difficulty curve. This method was inspired by Croteam’s process on The Talos Principle, where the team gathered similar data to order their puzzles by difficulty.
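The rating-and-playtime analysis described above could be sketched roughly as follows. This is not our actual tooling: the record layout, the 1 to 5 rating scale, the function names and the sample numbers are all hypothetical, and ordering by mean difficulty with mean playtime as a tie-breaker is just one plausible way to turn the data into a first-pass puzzle order.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PlaytestResult:
    puzzle: str        # puzzle identifier (hypothetical naming)
    fun: int           # playtester's 'fun' rating, assumed 1-5
    difficulty: int    # playtester's 'difficulty' rating, assumed 1-5
    playtime_s: float  # time spent on the puzzle, in seconds


def summarize(results):
    """Average fun, difficulty and playtime per puzzle."""
    by_puzzle = {}
    for r in results:
        by_puzzle.setdefault(r.puzzle, []).append(r)
    return {
        puzzle: {
            "fun": mean(r.fun for r in rs),
            "difficulty": mean(r.difficulty for r in rs),
            "playtime_s": mean(r.playtime_s for r in rs),
        }
        for puzzle, rs in by_puzzle.items()
    }


def order_by_difficulty(summary):
    """Easiest puzzle first; mean playtime breaks ties."""
    return sorted(
        summary,
        key=lambda p: (summary[p]["difficulty"], summary[p]["playtime_s"]),
    )


# Made-up sample data: two playtesters, three puzzles.
sample_results = [
    PlaytestResult("A", fun=4, difficulty=2, playtime_s=90),
    PlaytestResult("A", fun=5, difficulty=1, playtime_s=60),
    PlaytestResult("B", fun=3, difficulty=4, playtime_s=300),
    PlaytestResult("B", fun=4, difficulty=5, playtime_s=420),
    PlaytestResult("C", fun=5, difficulty=3, playtime_s=180),
    PlaytestResult("C", fun=4, difficulty=3, playtime_s=200),
]

suggested_order = order_by_difficulty(summarize(sample_results))
print(suggested_order)
```

An ordering like this is only a starting point. A puzzle with a low ‘fun’ average or an unusually long playtime is a candidate for a redesign or a cut rather than just a move later in the sequence.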
Data gathering is risky as it can neuter the flair and artistry of a game's design. So, more important to us was observing players’ behavior whilst playing the game. A combination of these two approaches allowed us to create a game with a somewhat reasonable difficulty curve. However, as The Turing Test is a linear puzzle game, there will always be players who find specific puzzles unsolvable. A more open puzzle design would have allowed players to return to puzzles that block their progress. However, that would have completely contradicted the game’s narrative.
The Turing Test’s design has one fatal flaw in my opinion. The game mechanic, though well developed and designed, is unoriginal. Great puzzle games have great mechanics: Portal explores portals, Braid explores time and Antichamber explores non-Euclidean worlds. However, The Turing Test, in my opinion, is a series of conventional transport puzzles: a visually abstracted Sokoban. Later in the game, more interesting mechanics such as character swapping are added to increase the complexity of the transport puzzles, but even good transport puzzles need novelty to reach the heights of Portal or Braid. This lack of novelty put a ceiling on the game’s success and can be seen in critics' reviews.
The largest problem with marketing The Turing Test was that the character swapping mechanic was designed to be a surprise. This surprise is not revealed until halfway through the game. This meant that our most distinctive mechanic could not be shown in marketing, reviews or previews because the mechanic is tightly integrated with the story. You start The Turing Test under the assumption that you are Ava Turing, an astronaut investigating an incident on Europa. However, as the plot progresses, you learn that you are really playing as TOM, the AI in charge of the Europa mission. TOM controls Ava via an implant and also controls several of his worker bots throughout the game. This allows the player to swap between different characters, changing their point of view throughout the game. The player has to use TOM's robots and Ava together to solve puzzles. Not being able to advertise this may have adversely affected the game's sales.
During development, the 'swapping' mechanic originally had the capability to swap between multiple human characters. At this point, we planned to tell a story that followed a group of astronauts solving puzzles together. This would have allowed exposition to be delivered in a more natural fashion between human characters. However, it would have required a high quality of artificial intelligence and animation that we could not afford. It seemed too ambitious for our small budget so we decided to keep it to a single human character and AI characters.
The idea of a surprise mechanic tightly integrated with the story was inspired by Oddworld: Stranger’s Wrath. Oddworld: Stranger’s Wrath has a plot twist in the middle of the game which fundamentally changes the player character, the player’s motivation, and the mechanics of the game. It is one of the best gameplay twists I have personally experienced, right up there with the introduction of the Flood in Halo: Combat Evolved.
However, Stranger’s Wrath sold poorly, and I believe part of the reason for that is that it hid its best content. Adopting the same strategy for The Turing Test was risky as we had to advertise the ‘mystery’ of Europa instead of our game mechanics. Fortunately, players bought into this idea, allowing us to surprise them.
“Write what you know” is a common refrain for new writers. However, as my old lecturer warned: people write uninteresting stories because they have uninteresting lives. I had done lots of reading around machine creativity as part of my bachelor's degree and this served as the basis for The Turing Test’s story. Questions around machine creativity quickly turn to philosophical conversations that can get preachy. As such, I decided to have the majority of the conversation take place between two characters who see the world in very different ways: a purely rational robot and an intuitive human.
We knew that The Turing Test would draw comparisons to Portal. As such, we wanted to subvert expectations. Portal's GLaDOS had already told the story of a malevolent AI. We asked what would happen if we inverted the story of 2001 and Portal. We decided to tell the story of a principled AI and a self-interested human. We thought the easiest way to make the player empathise with the AI character would be to make them star as the AI character. So, halfway through the story, it is revealed that the player is playing as TOM rather than Ava.
The player is conditioned, over the course of the last portion of the game, to consider killing Ava. We assumed the player would not kill Ava immediately as they had formed a bond over the first half of the game. As such, we slowly conditioned them to consider it. This was done by introducing threat in the music, forcing the player to use a gun to solve a puzzle and prompting them to consider murder through dialogue:
"Would you kill a few to save all of humanity?"
"I am permitted to use lethal force."
"Ava, the true test of a person's character is what they do when no one is watching."
We saw a gap in the market for a hard science fiction game story. Most video games explore the fantastical but we thought a more grounded story would suit the themes. The story was largely based on artificial intelligence research I had previously undertaken at university. That research focused on experiments from 1950 onwards exploring computer-generated music. (The radio that plays music in Chapter 4, sector 36 was originally meant to play the Illiac Suite, arguably the first piece of computer-composed music.) This research enabled me to write a story with a deeper understanding of “computer creativity”. This reading hopefully gives a more realistic look into the nature of AI.
What’s unfortunate to me is how little attention has been given to the other side of the story: the hard science fiction story about hypothetical organisms and gene therapy. I worked with my wife (a Biology postgraduate) and doctors at The University of Nottingham to come up with a plausible fictional life form that could deliver a form of 'eternal life'. I thought the interaction between a video game company and an educational institution could be really interesting. Though it was interesting, I have yet to read of anyone expounding on the very deep scientific notes in the environment that explain the organism. The biology of the infectious organism is very well developed but perhaps too obscure. However, I do believe the real problem was that we did not expose players to this information in a consumable way.
As with everything I’ve written, I cannot listen to The Turing Test. Just hearing a line of dialogue will make me ask for the television to be muted. I find it incredibly uncomfortable to listen to. The team felt similarly. But we believe the story is an interesting one, which is something players agree with.
Making games is hard
I was given a great piece of advice for starting a business: “When climbing a mountain you will see higher peaks emerge around you. Finish the mountain you are climbing before you start a new one.” As with all proverbial wisdom, it can be easily misapplied but it is an important proverb for us.
In March 2016 our Kickstarter for Battalion 1944 generated £317,281. At GDC 2016 the team arrived to a storm of developer and media interest surrounding the Kickstarter. The passion within the team for Battalion was at a high. Comparatively, the passion to finish The Turing Test was waning. Whereas The Turing Test was receiving below 100,000 views on its videos, Battalion was receiving over a million.
With 3 months of work left on The Turing Test, and a new passionate community requiring immediate progress on Battalion, the team was at breaking point. Conversations were had within the team to decide if work should continue on The Turing Test or if it should be abandoned immediately to support the accelerated development of Battalion. It was decided that The Turing Test should be finished and a large portion of the revenue generated should be invested into Battalion. A few members of the Bulkhead team reluctantly finished The Turing Test. It was a difficult time for morale. Fatigue hit the team hard. Eventually, all the work paid off.
When The Turing Test sold as well as it did, we were all surprised. At the time of writing we have invested a large majority of the revenue The Turing Test generated into Battalion 1944. We believe finishing The Turing Test was the right decision. However, the opportunity cost is impossible to measure. “Finish the mountain you are climbing” remains relevant. However, I would caution that sometimes abandoning your project is the right thing to do.
The Turing Test was well received by most players due to decisions we made to focus the game’s development around the core experience. If we had not made those decisions to narrow the scope of the game then the project would likely have failed. So far Bulkhead Interactive has survived by following this rule: spend your time working where players spend their time playing.