Testing and Iterating

I have a great idea: let’s throw away half of everything we do!

This is about the process of testing and iterating on a game until it works, or you decide that it will never work well enough. There are a huge number of ways to test a game, all with their own weaknesses and strengths. You should likely use a combination of several of these. The constant iteration and testing means that you will design and implement a lot of things that you end up throwing out. It’s frustrating, but it works.

Testing a game usually begins with a small number of other game experts discussing the high-level drafts. I am convinced that, at this point, you should already be talking to others about the idea. You are more likely to lose money because you made the wrong product than because someone heard about your idea, copied it and stole your market. Just talking to others in the industry might very well help you make a much stronger concept to start with.

Once you have a first prototype, you can start testing it out on friends, family and other unfortunate people you happen to meet. At this stage, they can give you some general pointers about how interesting they find the concept, and help you roughly figure out who might be the target audience and who definitely is not. Just remember that it is very, very rough at this stage. Do not assume that your friends are in any way a representative sample of your customer base.

When we are a little further along, we often test games in the lobby of our nearby university. Of course, the sample of people is again clearly skewed, but we can catch early UI misses this way.

We take a smartphone or tablet loaded with our latest game version in one hand, and our own smartphone in the other. Then we stop a random person in the lobby, and ask them if they would like to help us out by playing our game for a minute. We hand them the device with the game, and record a video of their fingers (and voice) with the other smartphone. Then we just say nothing, apart from encouraging them to speak their minds.

It is quite common for the first test to reveal that 7 out of 10 participants had trouble at the same spot in the tutorial. We fix that, and then go back to do 10 more such tests.

A more automated way to get such tests done – as well as going a bit deeper into the game – is available at PlaytestCloud.com. They don’t stop people in lobbies, but rather have people test play a game while recording what is happening on their screens and what they are saying. The game company then gets a video of the whole thing, and can watch and annotate that back at the office. It is a very useful service.

We have also done some more traditional user experience testing with several cameras, one-way mirrors and questionnaires. While these setups work, they are quite cumbersome and, in the end, no more useful than the lobby testing or PlaytestCloud.

The deepest qualitative testing we do is in collaboration with our nearby university. Here we fit people with an Emotiv EPOC brain scanner and place a Tobii eye-tracking device in front of them.

Together with videos, this allows us to see exactly what they are experiencing and where they are looking. It is useful for pinpointing some very specific problems in the game.

So far, it has been all about user experience testing. Of course, you should also test the game functionality technically. On the Apple side, there is a somewhat manageable set of devices. On the Android side, there is not. (Our Benji game has reported over X thousand device versions that it runs on).

TestDroid is a convenient service where you can test out your app on a huge number of different Android devices/versions. We simply make the game play itself and record it doing that. There are, of course, multiple other options as well for how you might outsource technical quality assurance, and a lot of companies offering such services.

At this point, we have tested the game out conceptually, technically, and with a limited number of players that we have listened carefully to. It is now time to go for larger numbers and start working statistically.

We try to go into pre-alpha soft launch with our games as soon as possible, and then develop the games in iterations, gathering feedback all the time. We release the game in some place on the other side of the world (to make sure our friends do not influence the data), and advertise to get small cohorts of users. Usually, we buy some 500-1000 users in each round we test.
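To get a feel for how much a cohort of this size can tell you, here is a minimal sketch that puts a confidence interval around a retention rate. The Wilson score interval is my choice for the illustration, not something our tooling prescribes, and the numbers in the example (200 of 500 players returning) are made up.

```python
import math


def wilson_interval(returned, cohort_size, z=1.96):
    """Approximate 95% Wilson score interval for a retention rate.

    returned    -- players in the cohort who came back (hypothetical input)
    cohort_size -- players acquired in the test round
    z           -- normal quantile; 1.96 corresponds to roughly 95%
    """
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    p = returned / cohort_size
    denom = 1 + z * z / cohort_size
    centre = (p + z * z / (2 * cohort_size)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / cohort_size + z * z / (4 * cohort_size**2))
        / denom
    )
    return centre - half_width, centre + half_width


# Made-up example: 200 of 500 bought users return the next day.
low, high = wilson_interval(200, 500)
print(f"Day-1 retention is plausibly between {low:.1%} and {high:.1%}")
```

With 500 users the interval spans several percentage points in each direction, so a small difference between two test rounds may well be noise.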

To have a look at what these users do in the game, we need some analytics software integrated. So far at Tribeflame, we have made our own bare-bones version, as well as integrated a number of others like Flurry, Game Analytics, Google (Play) Analytics, DeltaDNA, etc. Some solutions are very basic, while others are quite comprehensive. The important part is that you can see at least some basic numbers about where you lose players during the first sessions, and that you can track retention numbers over the first month.
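The two basic numbers mentioned here, first-session drop-off and day-N retention, can be sketched in a few lines. This is only an illustration under my own assumptions: the data layout, the step names and the player records are all hypothetical, and none of it reflects the API of any of the analytics services named above.

```python
from datetime import date

# Hypothetical data: install date and session dates per player.
installs = {
    "p1": date(2024, 1, 1),
    "p2": date(2024, 1, 1),
    "p3": date(2024, 1, 1),
    "p4": date(2024, 1, 1),
}
sessions = {
    "p1": [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 8)],
    "p2": [date(2024, 1, 1), date(2024, 1, 2)],
    "p3": [date(2024, 1, 1)],
    "p4": [date(2024, 1, 1), date(2024, 1, 8)],
}

# Hypothetical first-session steps, in order, and how far each player got.
FUNNEL_STEPS = ["launch", "tutorial_start", "tutorial_done", "level_1_done"]
furthest_step = {
    "p1": "level_1_done",
    "p2": "tutorial_done",
    "p3": "tutorial_start",
    "p4": "level_1_done",
}


def retention(installs, sessions, day_n):
    """Share of installed players with a session exactly day_n days after install."""
    if not installs:
        return 0.0
    returned = sum(
        1
        for player, installed in installs.items()
        if any((d - installed).days == day_n for d in sessions.get(player, []))
    )
    return returned / len(installs)


def funnel(furthest_step, steps):
    """For each step, count the players who reached it or any later step."""
    index = {step: i for i, step in enumerate(steps)}
    return [
        (step, sum(1 for s in furthest_step.values() if index[s] >= i))
        for i, step in enumerate(steps)
    ]


for step, count in funnel(furthest_step, FUNNEL_STEPS):
    print(f"{step}: {count} players")
print(f"Day-1 retention: {retention(installs, sessions, 1):.0%}")
print(f"Day-7 retention: {retention(installs, sessions, 7):.0%}")
```

The step where the count falls the most is the first place to look, and it is also a good candidate for a face-to-face test to find out why.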

The different forms of testing will each give you their own unique look into some aspect of how the game works. None of them will give you the complete story, but they complement each other nicely. The soft launch metrics from thousands of players show you, with good certainty, how people behave, but they do not tell you why they behave like that. In contrast, small groups of players that you meet face to face, or bring in through PlaytestCloud, will be able to describe the problems much, much better, but on their own, they are only a small biased sample. Together these two approaches give quite a good picture of how the game works.
