Perf Primer : CPU, GPU and your Android game

So let’s say you’ve got some awesome Android game with hundreds of thousands of downloads, but it seems like you can’t break that million-user mark. The reason? Users are unhappy with your game’s performance. You see, users notice bad performance before any other feature in your game: before your item stats, or your UI, or how, on level three, one of your characters is actually speaking Klingon. And guess what? Users unhappy with performance give bad reviews at a higher rate than for any other problem in your game. So you want to break the million-user mark? You’re going to have to fix the perf problems in your game.

For modern Android games, that starts with understanding that there exists a delicate dance, a tango if you will, of interaction between the CPU and GPU in your game that can have a drastic impact on your performance. And the dance works a little something like this.
Modern graphics APIs don’t allow the CPU to talk directly to the GPU hardware. Instead, communication between the two goes through an intermediate process known as the graphics driver. Draw calls from the CPU are cached by the driver in a command queue, and when the GPU hardware is ready for work, it starts consuming commands from that same queue and executing them.

The existence of this queue in the driver means the CPU can be pushing commands into the driver at a different rate than the GPU is reading them out. As such, your GPU can be consuming, processing, and drawing data that’s around one to two frames behind the CPU. Now, while alarming, don’t worry: this is actually really typical of modern hardware architectures.
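To make that producer-consumer relationship concrete, here’s a minimal sketch in C++ that models the driver queue as a bounded buffer. This is an illustration, not real driver code; the two-frame cap and all the names here are assumptions made for the example.

#include <condition_variable>
#include <mutex>
#include <queue>

struct FrameCommands { /* recorded draw calls for one frame */ };

class DriverQueueModel {
public:
    // CPU side: blocks when the queue is full. This is exactly the
    // stall you see when the GPU falls behind.
    void Submit(FrameCommands frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        cpuCanPush_.wait(lock, [this] { return queue_.size() < kMaxBufferedFrames; });
        queue_.push(std::move(frame));
        gpuCanPull_.notify_one();
    }

    // GPU side: blocks when the queue is empty, which is the "bubble"
    // you get when the game is CPU-bound.
    FrameCommands Consume() {
        std::unique_lock<std::mutex> lock(mutex_);
        gpuCanPull_.wait(lock, [this] { return !queue_.empty(); });
        FrameCommands frame = std::move(queue_.front());
        queue_.pop();
        cpuCanPush_.notify_one();
        return frame;
    }

private:
    static constexpr size_t kMaxBufferedFrames = 2;  // the one-to-two frames of latency
    std::queue<FrameCommands> queue_;
    std::mutex mutex_;
    std::condition_variable cpuCanPush_, gpuCanPull_;
};

Notice that a full queue blocks the producer (the GPU-bound stall described below), while an empty queue stalls the consumer (the CPU-bound bubble).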
But with this in mind, understand that hitting 60 frames a second in your game requires both your CPU frame and your GPU frame to complete their work in around 16 milliseconds. When either of these two systems goes out of sync and takes longer than that, bad things happen to your frame rate.
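Measuring your side of that budget is straightforward. Here’s a hedged sketch of timing the CPU frame with std::chrono; UpdateGame and SubmitDrawCalls are hypothetical stand-ins for your game’s own per-frame work.

#include <android/log.h>
#include <chrono>

void UpdateGame();        // hypothetical: simulation, animation, culling, etc.
void SubmitDrawCalls();   // hypothetical: push this frame's commands to the driver

constexpr double kFrameBudgetMs = 1000.0 / 60.0;  // ~16.6 ms for 60 fps

void RunFrame() {
    auto start = std::chrono::steady_clock::now();

    UpdateGame();
    SubmitDrawCalls();

    auto end = std::chrono::steady_clock::now();
    double cpuMs = std::chrono::duration<double, std::milli>(end - start).count();
    if (cpuMs > kFrameBudgetMs) {
        __android_log_print(ANDROID_LOG_WARN, "PerfPrimer",
                            "CPU frame took %.2f ms (budget %.2f ms)",
                            cpuMs, kFrameBudgetMs);
    }
}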
For example, what if your GPU has more work to do than your CPU? This means that your CPU is going to be submitting frames into the driver at a faster frequency than the GPU is actually going to be consuming them. Eventually, the driver queue is going to get filled, and any time the CPU wants to submit a draw call, it’ll have to block and wait for the GPU to clear out some commands and free up space, at which point the CPU can carry on and submit its new commands.
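One practical way to catch this from the CPU side is to time the buffer swap. This is a heuristic, not a guarantee, since vsync can also make the swap block; the threshold below is an assumption you’d tune for your game. Assuming an EGL-based renderer:

#include <EGL/egl.h>
#include <android/log.h>
#include <chrono>

void SwapWithTiming(EGLDisplay display, EGLSurface surface) {
    auto start = std::chrono::steady_clock::now();
    eglSwapBuffers(display, surface);
    auto end = std::chrono::steady_clock::now();

    double swapMs = std::chrono::duration<double, std::milli>(end - start).count();
    if (swapMs > 4.0) {  // rough heuristic threshold, not a hard rule
        __android_log_print(ANDROID_LOG_INFO, "PerfPrimer",
                            "swap blocked for %.2f ms; GPU may be the bottleneck",
                            swapMs);
    }
}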
On the other hand, if you’re CPU-bound, then you’re inserting bubbles into your GPU pipeline: every time the CPU submits a frame, the GPU consumes it immediately, processes it, and then sits around waiting for the next frame where it can do work again. The straightforward way to fix this is to simply do less work on the CPU. But in this case, it may be that the GPU simply doesn’t have enough work to do, and the correct action is actually to give it more work in order to fully maximize GPU throughput. There’s a great opportunity for your designers to put, like, 200 extra zombies on the screen.
You can get a general estimate of where your problem lies by taking a good look at your CPU frame time over a couple of seconds of simulation. For example, if your CPU frame time is around 16 milliseconds, then your game is running pretty well. Now, while your goal is to keep things under the 16-millisecond mark, your CPU load changes from frame to frame, so you should expect slight variances in the time. If, however, your frame time starts to spike suddenly and your CPU computation load hasn’t changed (you know, you haven’t put 200 zombies on the screen), then there’s a high probability that the GPU is getting backed up, forcing your CPU to stall in the driver. These tall spikes are actually pretty telling, since they last around 11 to 15 milliseconds, which is just about how long the GPU would need to flush a frame or so of work.
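A simple way to flag those stalls in a log is to compare each CPU frame time against a rolling baseline. This is an illustrative sketch; the moving-average weights and the 10 ms spike threshold are assumptions you’d tune for your game.

#include <android/log.h>

class SpikeDetector {
public:
    void AddSample(double cpuMs) {
        // An exponential moving average keeps a cheap baseline of "normal" load.
        average_ = (average_ == 0.0) ? cpuMs : average_ * 0.9 + cpuMs * 0.1;
        double spikeMs = cpuMs - average_;
        if (spikeMs > 10.0) {  // roughly one GPU frame of extra stall
            __android_log_print(ANDROID_LOG_WARN, "PerfPrimer",
                                "frame spiked %.2f ms over the %.2f ms baseline; "
                                "suspect a driver stall, not your game code",
                                spikeMs, average_);
        }
    }

private:
    double average_ = 0.0;
};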
In order to figure out what the culprit really is, it’s time to sit down and get proficient with mobile profiling tools. Some of the nuances of GPU performance require dedicated tools to discover, but thankfully, each Android chip manufacturer provides a great suite of tools to give you more information on how to optimize your game for their hardware.

Understanding and optimizing the dance between the CPU and GPU is the first step in maximizing your game’s performance, which, of course, results in happier users, which could result in better transactions. And that’s the whole point, right? So keep calm, profile your code, and as always, remember: perf matters.

11 thoughts to “Perf Primer : CPU, GPU and your Android game”

  1. The first #iobyte is here! Perf Primer : CPU, GPU and your Android game
     #perfmatters #io14 #android #games

    Quickly finding your game's performance bottleneck is critical so you can address development issues before they become a problem. In this video @Colt McAnlis covers 4 simple steps to follow in order to determine what part of the CPU / GPU pipeline your game is currently choking on. Find the problem, fix it quick, and be on your way.

  2. Google should just do a low-level API, like Apple did with Metal. Especially considering how "great" our monopolist Qualcomm is doing with their ES 3.0 drivers… glBufferSubData/glBufferData stall the driver

  3. I would like to be able to code, but I actually picked up some interesting things I never knew before.
