This is the second part of Hypersonic game: programming a bot for fun (Part One), I recommend reading the previous post before.

Rewriting the Bot in Java was really interesting because I faced a problem I don’t usually have when building backend applications. Most of the time the GC works well and I don’t even have to worry about it.

As explained in the first article the bot has a max amount of time per turn, if it fails to provide an answer, the bot dies. So you have to make sure your bot is fast enough all the time. But before we start blaming the GC lets see what we are talking about.

The Model

We basically have a Map, that contains a matrix of Cells and each Cell has a type, an owner and a few other small attributes. To add some numbers each map contains 13*11=143 Cells. Not a big deal right? What happens if we now try to simulate different scenarios (maps) so we can pick the best movement for our bot? Remember the decision tree from the first post:

This means that simulating just one turn would require 10 new Maps, or 1430 Cells. And unfortunately simulating just one move won’t make your bot survive, if you want to be a clever bot you have to prepare for the next 6 or 7 moves maybe more. That is 10^6 Maps or 143.000.000 Cells for a simple bot that simulates 6 turns.

Okay okay, numbers are not exactly like that cause you can’t move left if there is an obstacle for example, so the decision tree gets smaller in most conditions. But anyways, we are just talking about simulating the next 6 turns, just put an 8 instead if the numbers are not big enough for you!

Creating millions of Cell objects is a problem, and deleting them before the next round is going to take time. We only have 100ms to do all the work… will that be enough?

The timeout

When I started working with this model I quickly noticed that after the first turns my bot was dying because of the 100ms time limit. I created a test to do a huge loop of simulations to see where the CPU was spending the time on and what the memory looked like with VisualVM, this is the result:

CPU sample

As you can see the Map.<init>() and Cell.<init>() methods are in the top 10, but that’s not a surprise cause we are creating lots of them all the time…

If we do a memory sample we can see that there are between 1 and 5 million Cell instances while doing simulations:

Memory sample

Okay, 5 million cells is a “lot”, but that is not a big deal for a modern machine with a couple GB of RAM if not more, come on, we are just using 300 or 400 megabytes of ram. Where is the problem? Who is wasting so much time? I tried to find the problem placing timers everywhere around my code, without success…

If it’s not me then its the Garbage Collector

If we run the test with the -XX:+PrintGC parameter we can see where the problem is. Here is just a small bit of what the Garbage Collector does while the test is running (these lines are just a couple seconds of execution):

See those numbers at the end of each line? That is the time the GC is consuming every time it does a garbage collection. Your program blocks while the GC is doing this.

It is pretty quick, but spending 0.05 seconds when you only have 0.10 seconds to give an answer is a big problem. And even if your bot is so fast it just needs 0.01 seconds for simulating you aren’t safe cause sometimes the GC does a full GC and your bot is dead, as you can see that took 0.28 seconds in my machine.

Taming the GC (spoiler: you can’t)

That’s what I tried to do. If the GC is the problem I will have to find out how to make it run faster, or run “later” or whatever, but please get out of my way you are killing my bot!

Unfortunately the problem is not the GC, the problem is that I am creating and deleting millions of objects in milliseconds. The only real solution is creating less objects… Yeah, so smart.

Lets think about the model again. Each Cell is an object. Why? If a Cell is just an empty cell, aren’t all empty cells the same? If two cells have a bomb exploding in 5 turns, why am I using two objects instead of one? How about an obstacle? Maybe I can use a limited set of objects instead of creating new ones all the time.

And this is how the CellFactory was born. There is just one empty cell for all maps that I create, and the same happens to walls. Boxes and bombs are a bit different: I create objects if needed but when there is already an object with the same parameters, I reuse it instead. For example bombs, there is a “bomb pool” where all bomb objects are stored once created so they can be reused.

The important change behind this new model is that now Cells are immutable. We don’t want anyone to modify a Cell that is being used in hundreds of Maps…

Forget about the GC

Now that we have less objects the GC should be less busy, lets run the test again and see what the GC is doing:

Now a full garbage collection takes 0.02 seconds, and minor garbage collection takes 0.00 seconds usually. This is way better isn’t it?

Also the amount of objects in memory has been reduced drastically:

Memory sample after optimization

So next time you see a strange performance issue check the GC and if it is taking too much time consider changing your model to reduce the number of instances being created and destroyed.