Using Machine Learning to tune a game

AB-tests are nice and all, but we need to go to the next step.

This is David. He’s a nice Colombian guy that somehow lost his way and ended up in Finland. He designs games for us, looks at data and tunes the games based on it. Here’s a sketch of how it works:

[Figure: Blog_1]

 

What happens here? First of all, the player plays the game. Each game session is reported back to our servers. Don’t worry, it’s all anonymous. As in player #485712454 failed at level #57, or player #20495 succeeded at level #127. What we use this data for is basically making the game better. We can see which levels are too hard or too easy. If a level is too hard, people will obviously get stuck, and then quit the game. If it is too easy, people get bored, and also quit the game. If the game is balanced just right, people feel challenged and then feel good about themselves and the game when they pass the challenge.
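As a rough illustration of the first thing one might compute from these session reports, here is a minimal sketch of a per-level fail rate. The tuple layout `(player_id, level, won)` and the function name `fail_rate_by_level` are assumptions made for this example, not our actual reporting format.

```python
from collections import defaultdict


def fail_rate_by_level(sessions):
    """Return {level: share of attempts that failed}.

    `sessions` is an iterable of (player_id, level, won) tuples,
    a hypothetical shape for the anonymous session reports.
    """
    attempts = defaultdict(int)
    fails = defaultdict(int)
    for _player_id, level, won in sessions:
        attempts[level] += 1
        if not won:
            fails[level] += 1
    return {level: fails[level] / attempts[level] for level in attempts}


# Made-up reports in the spirit of "player #485712454 failed at level #57".
sessions = [
    (485712454, 57, False),
    (485712454, 57, False),
    (20495, 57, True),
    (20495, 127, True),
]
rates = fail_rate_by_level(sessions)
print(rates)  # level 57 fails 2 out of 3 attempts, level 127 never fails
```

A level with a fail rate far above its neighbours is a candidate for being too hard, and one far below is a candidate for being too easy.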

To achieve the balance, David will look at the data on how players progressed in our game. He will then adjust the difficulty of the game by tweaking level data and various variables.

The second part of the server is where the rules of the game are kept. Here sits all level data as well as some globally configurable rules. At every startup, the game client will download a new set of rules and levels from this server. David will simply put his new configuration on this server, wait for players to play this new, tweaked version, and look at the analytics data again.

There is some lore in the game development community on how to build an optimal challenge curve. Please note that what follows are the “best practices” that are not exactly “scientifically proven” (I know, that’s an oxymoron).

The first thought is likely that you make the game increase in difficulty in sync with the players’ improving skill. That might work, but we can do better than that. It is not optimal to always have the player under equal pressure. In fact, that can be exhausting. So, let’s add some waves. Build up to a challenging level, and then when the player finally passes that level, let them rest and feel good about themselves with a few easier levels. Give them some time to breathe and recover from the challenge. Then you build up to the next challenge.

For F2P games, you likely want to increase the difficulty just a tad more every few waves. These tough levels are where you hope that the player will end up paying you. Get them 98% of the way to finish a level, and hope that they pay you for a booster that gets them the last 2%. What we end up with is a curve like this:

[Figure: Blog_2]
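The wave-shaped curve described above can be sketched in a few lines. This is only an illustration: the function name `wave_difficulty` and every parameter value (wave length, step sizes) are assumptions picked for the example, not numbers from any of our games.

```python
def wave_difficulty(num_levels, wave_length=10, base_step=0.5,
                    peak_boost=3.0, growth=0.2):
    """Return a target difficulty for each level.

    The curve is a slow upward trend (the player's improving skill)
    plus a ramp that builds to a peak and then resets, so a few easier
    levels follow every tough one. Each wave's ramp is a little taller
    than the previous one. `wave_length` must be at least 2.
    """
    curve = []
    for level in range(num_levels):
        wave = level // wave_length       # which wave this level is in
        position = level % wave_length    # how far into the wave we are
        trend = base_step * level
        ramp = (peak_boost + growth * wave) * position / (wave_length - 1)
        curve.append(trend + ramp)
    return curve


curve = wave_difficulty(30)
for level, difficulty in enumerate(curve):
    print(level, round(difficulty, 2))
```

With these made-up defaults, the level right after each peak is easier than the peak itself, and each peak is higher than the one before it.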

It is not only important to look at how often people win or lose, but also how people lose. If they always lose in exactly the same way, they will soon get frustrated and quit. Keep track of how close to winning they are when they lose. And make sure that varies, with a large percent of close calls where the player almost won. Again, that is when they most likely end up paying. When they play the level for the 20th time, and get 98% of the way to winning. Now they know that they can use a booster to get over the edge, or they will likely play another 20 times until they are again this close to winning. What an opportunity! The booster suddenly seems like really good value.
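To make "how close to winning" measurable, one option is to log a progress value for every failed attempt and summarise it per level. The sketch below assumes such a value exists as a number between 0 and 1; the 0.9 threshold for a "close call" and the name `close_call_stats` are arbitrary choices for illustration.

```python
import statistics


def close_call_stats(progress_on_loss, threshold=0.9):
    """Summarise failed attempts on one level.

    `progress_on_loss` holds one number per failed attempt: the share
    of the level completed when the player lost (hypothetical metric).
    Returns the share of losses that were close calls and the spread
    of the progress values, as a crude check that losses vary.
    """
    if not progress_on_loss:
        return {"close_call_share": 0.0, "spread": 0.0}
    close = sum(1 for p in progress_on_loss if p >= threshold)
    return {
        "close_call_share": close / len(progress_on_loss),
        "spread": statistics.pstdev(progress_on_loss),
    }


# Made-up losses: two near misses and two early failures.
stats = close_call_stats([0.98, 0.5, 0.95, 0.2])
print(stats)
```

A spread near zero would mean players always lose at the same point, which is the frustrating case described above.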

So far, the wisdom of the game development community. But are these guesses, or facts? And how do they apply to your specific game? This is where the machine learning algorithm comes in. Markus here is doing his PhD in Machine Learning at the University of Turku, right next to us. He’s looking at what can be done with the data generated by our games. We could try something like this:

[Figure: Blog_3]

The game reports to our servers, and the analytics engine looks at things like fail rates, how close to winning players got, who paid for boosters, and who stopped or kept playing.

What the machine learning algorithm does is this:

-it has been told to maximise long-term player retention

-it gets all the player behaviour data

-it modifies the rules, and looks at how the modifications affected the retention

-it automatically learns from this feedback loop which modifications were good and which ones were bad
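To give a feel for such a feedback loop, here is a toy sketch using an epsilon-greedy bandit. To be clear, this is a deliberately simple stand-in chosen for illustration, not the algorithm Markus is working on. The variant names and the "true" retention numbers in the simulation are invented.

```python
import random


class EpsilonGreedyTuner:
    """Pick a rule set, observe whether the player came back, learn."""

    def __init__(self, variants, epsilon=0.1, rng=None):
        self.variants = list(variants)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.shown = {v: 0 for v in self.variants}
        self.retained = {v: 0 for v in self.variants}

    def retention(self, variant):
        """Observed retention for a variant (0.0 if never shown)."""
        shown = self.shown[variant]
        return self.retained[variant] / shown if shown else 0.0

    def choose(self):
        """Return the rule set to serve to the next player."""
        untested = [v for v in self.variants if self.shown[v] == 0]
        if untested:
            return untested[0]                      # try everything once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.variants)   # explore
        return max(self.variants, key=self.retention)  # exploit

    def record(self, variant, came_back):
        """Feed back whether the player who got `variant` returned."""
        self.shown[variant] += 1
        if came_back:
            self.retained[variant] += 1


# Simulated players with invented retention probabilities per rule set.
true_retention = {"flat": 0.30, "waves": 0.45, "brutal": 0.20}
rng = random.Random(42)
tuner = EpsilonGreedyTuner(true_retention, epsilon=0.1, rng=rng)
for _ in range(5000):
    variant = tuner.choose()
    tuner.record(variant, rng.random() < true_retention[variant])

for variant in tuner.variants:
    print(variant, tuner.shown[variant], round(tuner.retention(variant), 3))
```

A real system would have to deal with things this sketch ignores, such as retention only being known days later and rule sets having many continuous parameters rather than three named variants.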

It is quite plausible that the machine learning algorithm will learn the same rules that the game development community has figured out. But that is by no means certain. We might have had it all wrong the whole time, since no one has really run a robust enough experiment.

Admittedly, it’s still early days for us in trying this out. I’ll let you know if it works once we get that far.

All About Retention

It’s all about keeping your players. Seriously, nothing is more important!

This week is about how different features of the game will affect how long your players will stay and enjoy your game. It’s a rough sketch based on what we have learned from the dozen or so games that we have made so far.

First some basics (skip the next two paragraphs if you know what retention is):

Retention is the share of players who come back to your game X days after downloading it. Say you get 1000 people to download your game on the 31st of January. Those 1000 people are called a “cohort”. If 400 people from that cohort play on the 1st of February, then your one-day retention is 40%. People who play on the 2nd of February do not count towards the one-day retention (those would count towards the two-day retention), and people who downloaded on the 30th of January also do not count (those are another cohort).

Obviously, people from this cohort who play on the 7th of February count towards your 7-day retention, and people who play on the 28th of February count towards your 28-day retention. The basic benchmark has been that your 1-day retention should be 40%, your 7-day retention should be 20%, and your 28-day retention 10%. Some people use 30-day retention instead of 28-day, but as retention is also dependent on the day of the week, you will get more stable data if you’re tracking exactly 4 weeks. This is because people have more time for playing games during weekends – which means that someone who downloads on Friday is more likely to play on Saturday than someone who downloads on Sunday is likely to play on Monday. The one-day retention varies by some 5 percentage points around weekends for our games.
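The cohort arithmetic above fits in a small function. This is a minimal sketch: the dictionary shapes, the name `day_n_retention` and the year 2016 are assumptions made for the example, and the cohort is scaled down to 10 players to keep it readable.

```python
from datetime import date, timedelta


def day_n_retention(installs, play_days, cohort_day, n):
    """Share of the cohort from `cohort_day` that played exactly n days later.

    installs:  {player_id: install date}
    play_days: {player_id: set of dates on which the player played}
    """
    cohort = {p for p, day in installs.items() if day == cohort_day}
    if not cohort:
        return 0.0
    target = cohort_day + timedelta(days=n)
    returned = {p for p in cohort if target in play_days.get(p, set())}
    return len(returned) / len(cohort)


jan31 = date(2016, 1, 31)  # the year is arbitrary
installs = {player: jan31 for player in range(10)}
installs[10] = date(2016, 1, 30)          # belongs to another cohort

play_days = {
    0: {date(2016, 2, 1)},
    1: {date(2016, 2, 1)},
    2: {date(2016, 2, 1), date(2016, 2, 7)},
    3: {date(2016, 2, 1)},
    5: {date(2016, 2, 2)},                # counts for day 2, not day 1
    10: {date(2016, 2, 1)},               # wrong cohort, not counted
}

d1 = day_n_retention(installs, play_days, jan31, 1)
d7 = day_n_retention(installs, play_days, jan31, 7)
print(d1, d7)  # 0.4 0.1
```

Four of the ten players in the cohort play on the 1st of February, which gives the same 40% one-day retention as the 400-out-of-1000 example.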

For some more nuanced data, here’s a cheat sheet for which Key Performance Indicators (KPIs) you should be aiming at with a mobile game. This is borrowed from a talk Ben Holmes from Index Ventures gave at Slush. It’s an incredibly useful summary.

 

[Figure: retention72]

According to this, our Benji Bananas game had a “Best in Class” 1-day retention, a good 7-day retention and a less-than-viable 30-day retention. And the monetisation? Let’s not talk about it! With some 75 million organic downloads, the CPI is just about 0, though.

 

This is all very easy to say, but incredibly hard to do in practice. At Tribeflame, we have a philosophy of “soft launching” very, very early. As soon as we can somehow claim that it’s a game, we start testing it. This is different from what most companies call a soft launch. “Pre-Alpha launch” might better describe it. At that point, there is basically just the core of the core loop. It’s not pretty, and you cannot play it for a long time. It’s also buggy, and players have no way of spending money in the game. The feedback is always that the game has lots of huge issues, but that’s fine at this point. Seeing the game improve and the metrics rise gives us a rough guide to how different features of the game influence retention.

As I already mentioned, we start testing with an ugly core game. With this, we should be able to get a one-day retention of around 20%. At times, we can even be below that, but we should be able to improve it quite quickly. There is not yet enough content such as upgrades or levels to keep players in the game. Even 7-day retention is going to be way below 10% at this point. The game might also be confusing, since there is not yet a good tutorial.

Let’s focus on the tutorial next. This will, obviously, help us keep more players as we are not losing people simply because they get confused. Getting the tutorial in order should clearly improve 1- to 3-day retention. Of course, it does not help that much later on – with the people who already understand how to play the game. However, there are the players who would have stayed for a very long time, had they only understood how to play at the beginning and had a good first impression. I’ll write another article about tutorials and the first session later on.

Another thing that helps early on is high production values. Pretty graphics, nice animations, etc. These things will help you get users more cheaply (lower your Cost Per Install, CPI). There’s a higher chance of featuring when your game looks amazing. People who go to your pages on Google Play or the App Store have a higher chance of downloading your game. And it will help with virality – people are more likely to show their friends pretty things. What it will not do, is help you with long term retention. Someone who has played the game for hundreds of sessions is most likely almost blind to your graphics already. They take the look for granted, and focus on their long-term goals in the game. To help you out here, you need a good meta game.

The meta game is about what happens between the game sessions. What is the player dreaming of achieving in between the daily grind of play sessions? This is what will lift the tail of your retention curve. In short, a good meta game will set the player up with a long-term goal – like building the perfect fortress, or assembling the perfect team or deck of cards. This long-term goal is then divided into short steps – making the player feel clear progression every step of the way.

Draw up the glorious mountaintop that the player will conquer, make it look like it’s closer than it really is, and then dangle a carrot in front of the player, and celebrate every single step they take on the way. Just one more turn and I will achieve that milestone, then one more to get to that one, etc. etc.

This is, for instance, why there is a huge number of achievements and missions in many successful F2P games. Just telling people what to do next is one surprisingly good way of actually getting them to stay in the game and do it. It’s also why upgrades to buildings, characters, vehicles, etc. are split into a lot of small steps.

I need to point out here that I, as a player of other companies’ games, actually enjoy this experience of feeling like I’m close to my big goal, only to realise that the journey was longer than I first expected. If I’m enjoying a good game (or a good book, TV series, etc.) I do not really want it to end.

 

[Figure: retention2]

Here’s a rough sketch of how the curves improve when adding features.

Most people will advise you to focus on retention first, and use great retention as a proxy for players’ commitment to the game, and therefore as a signal for how much they are eventually willing to pay. Some suggest going the other way and using early monetization (players paying in the first week) as a proxy for their commitment to the game, and eventually their long-term retention. Probably either will work. However, there is some data to suggest that the later players start paying in the game, the more they will end up spending.

Also note that you would rather keep the players even if they do not pay, than lose them completely. Therefore, there should be no hard pay gates. A free player is a free ad for the game. A walking, talking evangelist with a network of friends to influence. In games where you play against others (PvP), the free players are also free content generators for the game company.

For a specific game, retention is likely to go down over time. For instance, in our Benji game, the one-day retention at launch was at 54%. Now, a few years later, we are tracking 35-40% for one-day retention, with all numbers measured on organic users. Where you get your users will have a huge influence on the retention numbers – but that is also for another post.

The same trend has also been observed by others.

One last observation: it’s likely better not to demand very deep concentration in mobile games. Deep concentration limits when people can play, which in turn makes it harder for them to form a habit around your game. It’s better to make the game playable with one eye while watching TV or your kids with the other eye. Console, VR and PC games will brag about how “immersive” they are. On mobile, if you’re immersed while waiting for the bus, you will miss the bus!