COLT MCANLIS: Hello everyone. Oh, come on guys. Look up from your pixels. Let’s try this again. Hello everyone. AUDIENCE: Hello. COLT MCANLIS: All right. That’s the type of pre-lunch
enthusiasm I would expect on a Thursday. I can’t see all of your
hangovers, that’s good. All right, hello everyone. My name is Colt McAnlis. I’m a developer advocate at
Google working on Chrome. And joining me today is the
amazingly talented Grace Kloba, who happens to be
the technical lead on Chrome for Android. So all of the cool Chrome on
mobile questions can go to her after the talk. But if you have questions about my shirt, those come to me. What we’re here today to talk
to you about is how to supercharge your websites, both
on mobile and on desktop, with the help of using the
GPU and a lot of amazing intrinsics that we’ve put
inside of Chrome. Quick show of hands, how many
of you attended the Jank Busters talk yesterday? Awesome, that’s a good set. That’s a good set. You can view the content that we’re going to talk about today as a pairing with what Nat talked about yesterday. Today we’re going to talk about
how to use the GPU to get awesome stuff done. Now before we begin, I want
to point you all to this beautiful perfmatters hashtag. Quick show of hands, how many
of you have seen this on the interwebs so far? You all are my favorite
people. Hugs for everyone after that. Come find me later. If you see something today
that inspires you, some statistic you didn’t know, or
some cool technique that you haven’t heard of before, please
feel free to go to your social media outlet
of choice and use the perfmatters hashtag. Spread the word. Spread the love. That’s what we’re all here
at Google to do. I’ll quit wasting time
at this point. Let’s dig into it. Before we can talk about how the
GPU is used to supercharge your website into awesome town,
we first need to start with a little bit of history so
that you can understand how Chrome actually draws
your web page. It all starts at the top with
a very complex series of algorithms known as software
rasterization. Effectively, this suite of
tools, or algorithms and computation, is responsible
for taking a high-order primitive, like this beautiful
little glyph we have here, subdividing it into
boundary pixels. And then finally, adding
color to it and pushing it to your screen. So we can actually subdivide it,
add some color to it, and you can see that this is
programmer art, so, of course, my at symbol is very,
very pixelated. Now Chrome will use this
concept of software rasterization as your
page is loaded. Chrome loads your page. It’ll actually go through and
software rasterize everything you see, so all of the glyphs
that you see, your images, the small text, the lines on the
screen, the borders, the rounded edges, the
drop shadows. This is all being pushed
through the software rasterization path into
a single large bitmap. And then what happens is, as you
scroll, what Chrome will do is it’ll do the smart thing
and it’ll actually go through a series of memory copy
operations and effectively take all of those lines, the
scan lines in the bitmap, and mem copy them to a position
that’s higher in the bitmap. And then it’ll go back through
and only software rasterize what hasn’t been seen
on the page. This, of course, means
that Chrome is doing the smart thing. It’s not spending all of its
time software rasterizing divs as they’re positioned
on the page. It’s only updating what is
new and what is awesome. GRACE KLOBA: So it’s 2013, right? Most sites have animations and transitions to make them look good. So what is an animation? It’s essentially bits updating on the screen constantly to make it feel like a movie. So let’s take a look
at this example. We have this rubber duck animating across the river to the other side. For every frame, we have to move the duck to a new location. This means for every frame, we have to paint the duck in its new location with a new background, and then go back to where the duck used to be, erase the duck, and leave just clean background. Essentially, for every frame, we have to paint these two regions. And there’s a new trend,
the retina display. It makes the display look pretty because it doubles the screen resolution, which quadruples the number of pixels pushed onto the screen. Colt mentioned earlier that during scrolling, for every frame we do a mem copy. That’s not free. It’s not even cheap with a retina display, because we’re pushing four times the number of pixels to the screen while memory bus speed hasn’t caught up. And in the animation case, for every frame we have to paint those two regions. Now each region is quadruple the size, which means much longer paint times. So if we look at the chart, the numbers keep going up. They’re not coming down anytime soon, at least as far as I’ve been told. So Colt, what do we do? COLT MCANLIS: Well, you see,
because these things keep getting larger and larger and
larger, we have to take advantage of the hardware that’s
resident on the system. So everyone should have Chromebook Pixels by now. I see most of you typing on them. And everyone should have one of these beautiful little phones in your pocket as well. All of these contain one common element: they have a new piece of hardware– well, an old piece of hardware, really– called a graphics
processing unit, or a GPU. Now, GPUs were actually created
in the mid to late ’90s to effectively help with
a concept of software rasterization needed for
games and boring stuff like CAD software. Effectively, architects came
together and created dedicated hardware to do software
rasterization. And then we started calling
it hardware rasterization after that. These GPUs excel at rasterization. They are amazingly capable of pushing pixels around on the screen. They do it better than anything
else we have out on the market today. So the cool thing is that if the
GPU is actually the best at moving pixels and doing
software rasterization, the question for Chrome is, how do
we utilize this piece of hardware to make your webpages
render faster? Now, as we already talked about, you can look at this diagram as a hierarchical view of the process. So we start with our
page layout. And that, of course, is
rasterized by the CPU. Then every frame that you make
small updates, the CPU is responsible for updating
and presenting those pixels to the screen. Now this means that there’s a
lot of work going on in the CPU as you do small scrolls,
big scrolls, and then animations over the page. Because the GPU is actually really good at pushing pixels, it makes
sense then that we can insert the GPU between the CPU
and the actual monitor. So this means that the CPU can do a single upload to the GPU and allow the GPU to handle the finalized positioning on the screen on your behalf. This reduces the overall amount of CPU work that has to be done, rendering your pages a lot faster. Clicky. Ah, there we go. GRACE KLOBA: So recalling the
CPU mode, we allocate one big bitmap to cover the entire visible region. In the GPU mode, we divide the page into smaller tiles, and each tile is cached in GPU memory as a texture. So let’s reexamine the case where we scroll the page. When the page is first loaded, we allocate enough tiles to cover the visible area. Then when the page is scrolled down, some of the tiles from the previous frame are still visible, but they are drawn in a different position relative to the window of the screen. So one key difference here is we can get rid of the mem copy. So retina displays– nice, we’re fine with them. Similar to the software rendering mode, there will be new content to show. We just allocate new tiles, render them on the CPU, and upload them to GPU textures. Some of the old tiles from the previous frame are now invisible. If there’s enough GPU memory, we leave them in the cache, so if the page is scrolled up, they’ll be visible and we have them right away. There’s no need to do a CPU paint. If you continue scrolling down, at some point we’re going to run out of GPU memory. What happens then is we go back to the oldest tiles, which the user hasn’t seen for a while, and we recycle them. This means if those tiles become visible again, we have to go back to the CPU to paint them and upload them to GPU textures. So the amount of GPU memory
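The tile recycling Grace describes– evict the tiles the user hasn’t seen for the longest time once GPU memory runs out– is essentially a least-recently-used (LRU) cache. A toy JavaScript sketch of the idea (purely illustrative; this is not Chrome’s actual implementation):

```javascript
// Toy LRU cache for tiles: `capacity` models the GPU texture memory budget.
// get() marks a tile as recently used; set() recycles the oldest tile when full.
class TileCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.tiles = new Map(); // Map preserves insertion order: oldest entry first.
  }
  get(key) {
    if (!this.tiles.has(key)) return null; // miss: tile must be re-painted on the CPU
    const tex = this.tiles.get(key);
    this.tiles.delete(key);
    this.tiles.set(key, tex); // move to the most-recently-used position
    return tex;
  }
  set(key, texture) {
    if (this.tiles.has(key)) this.tiles.delete(key);
    else if (this.tiles.size >= this.capacity) {
      // Recycle the tile the user hasn't seen for the longest time.
      this.tiles.delete(this.tiles.keys().next().value);
    }
    this.tiles.set(key, texture);
  }
}
```

Scrolling back to a tile still in the cache is a hit– no CPU paint needed; a tile that was recycled comes back null and has to be re-painted and re-uploaded.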
available is really device dependent. The goal for us is to keep as much as possible in GPU memory, so that when the user is interacting with a page, we can avoid going to the CPU to paint. Before we move on to the next topic, I want to mention that Chrome can also do pre-painting. Earlier, I mentioned that when the page is first loaded, Chrome allocates enough tiles for the visible area. During idle cycles, we proactively paint the area just outside of the visible region. And this prediction is also gesture-aware. So for example, if you’re scrolling down a page, we will pre-paint the area below the current visible area, because that’s the direction you’re going. COLT MCANLIS: So this is really
cool for scrolling, but it actually puts us in a little
bit of a bind when we start talking about
the same duck animation as we saw before. By the way, ducks should be
the new animation thing in presentations, in my
personal opinion. I’m going to start
this movement. Too many people put cats
in their slides. I really think ducks are ready
for a comeback, especially if you’ve ever been bitten
by a duck. Actually it hurts a lot. Anyhow, as we have the duck– that’s my aside, right? We got time. We got plenty of time. Story time with Colt. You can hashtag that. That’ll be fun. Anyhow, as the duck is actually
flowing across the pond here, we run
into a problem. Whereas before, in the software mode, we would only rasterize the regions that were dirtied by the duck moving around, we now have to rasterize more pixels, because we have to redraw the entire tile that’s touched by the duck. So as you can see here, we have
the duck moving from a one by two over to
a two by two. We have to redo all of
those tiles at once. So we’re touching a lot more
pixels during our animation. So this means the GPU tiles
can actually get us into a bit of a bind. Clicky. Clicky. Hashtag clicky. There you go. Can you do it for me? I think we just broke
the internet. Anyone? Can anyone fix the internet? You are all tweeting
right now. That’s the problem. You are all going
and tweeting. You’re hashtagging. AUDIENCE: Don’t you
mean Google+? COLT MCANLIS: Google+’ing,
thank you. Actually, thank you for that. OK. I can reboot it. That might work. Maybe I should install Vista. That’ll actually help too. Or did we freeze? I’m going to do a song
and dance while he tries to fix this. What do you want on here? AUDIENCE: What’s that? COLT MCANLIS: What do you
want me to click here? AUDIENCE: So go into
the movie. COLT MCANLIS: He’s going
to fix this. I’m going to tell some great
stories about rasterization real quick. So for those of you who don’t
know, who haven’t been over to the Chrome booth yet, we
actually have an entire area there dedicated to
performance. So a lot of you web developers
out here who are running into problems with compute processes,
rendering issues, or even network load times,
please come by and stop by the booth. Talk to us. We’ve got pretty much all of
the genius brains of Google who work on performance day and
night there waiting to ask you questions, and to answer
your problems, and to run your site through our tools,
and to solve all sorts of critical things. Still need more time? Cool. I can keep doing this. All right, so for too long, we’ve been spending a lot of time talking about network or web performance just in terms of page load. So we’re very concerned about load time. But as we’ve started getting more web apps on mobile, and web apps doing crazy things, we realized that the way the user experiences your app while they’re inside of it actually has a large– GRACE KLOBA: We’re back. COLT MCANLIS: We’re back? Oh, well I was in the
middle of something. Hold on. Hold on. Wait, no, I’m not done yet. GRACE KLOBA: OK. COLT MCANLIS: We realized that
how the user experiences your app inside of it actually has
a lot to do with how much money they’re willing to spend
and how much you get from that user in terms of retention. So this means that we have to
start worrying about the other factors too, things like compute
performance and render performance, which brings us
back to how we can utilize the GPU to get awesome stuff done. Did I break it again? Maybe it’s my clicker. There we go. OK, so to rehash, after a five minute soliloquy: we have this duck effectively moving across the screen. It would be ideal if we could know the context here– that we have the duck moving and the background static. It would be great if we could somehow separate these two items so that the GPU can handle positioning of the duck and we won’t have to use all of the extra CPU cycles painting the duck into position. And this is actually what we
can do inside of Chrome. We actually allow some
annotations and some intrinsics that you can add to
your page that allow you to separate page elements into
separate GPU layers. What this means is that
effectively each layer has its own set of tiles. They’re uploaded once
to the GPU. And then the GPU can position
these things around without any interaction from the CPU. Of course, this turbocharges the duck and its animation, allowing it to do awesome stuff across the screen. And it allows the CPU to sit
back and drink margaritas and basically chill out
while the GPU does all the heavy lifting. Sorry, we’re having fun
technical problems. What kind of conference would
this be if you don’t get a blue screen while installing
printer software? Anyhow, so this is how the
GPU stuff is working. Now let’s actually talk about
how you, as a developer, can control all of these things. First, it’s worth pointing out
that there’s a set of page elements that are auto promoted
to their own layer once your HTML is parsed and
your page is loaded. Depending on your platform,
and your build, and your hardware spec, things like
canvas, video, iframe, and plug-ins are all promoted
to their own layer on your behalf. You don’t have to do anything. And this is fantastic because most of the time, these types of page elements spend their entire lifetime updating large portions of the screen, and they could be overlapping with other tiles. Recall the animation that you saw with the duck moving around– without layers, we’d be spending lots of time updating and rasterizing pixels we don’t actually need to. GRACE KLOBA: There is also a
set of CSS properties which will manually promote a page element to its own layer– for example, the 3D transforms, like translateZ or translate3d. This makes sense because these transforms can be easily accelerated on the GPU. It’s worth noting that a 2D transform does not promote the element to its own layer, which means it will still be rendered in software mode.
examples are effectively load time modifications. So effectively, you parse the
page elements, or you parse the page, and these things are
auto promoted on your behalf. It’s worth pointing out that
there’s a set of CSS properties that allow you to
push things to a separate layer at run time. And the two tags you need to
be concerned with are CSS animations for opacity
and transform. How these effectively work is
that when the page loads, an element on the page will stay
static with the base layer. Once the animation begins, the
element in question will actually be promoted
to its own layer. So the CPU has to spin up,
create a new layer, paint the data into the new layer, as
well as go back and paint where the object used to
be in the base layer. Then as the animation occurs,
the GPU and the CPU are working in harmony creating
happy ducks through the universe in a plethora of all
the quacks that you hear through the cosmos. And then at the end of the
animation, the duck then has to be promoted back to the
base layer where it– there we go. Maybe it’s my clicker. The duck has to be demoted back
to the base layer, which means we have to kill the
original layer we had and then rerasterize the area where
the duck finally lands. I’m just going to let you click
everything from now on. Of course, this means
that we run into some interesting concepts. So if we start an animation and
all of a sudden we see a hitch in our performance, this
means that the CPU may be doing too much work and actually
firing up, doing rasterization, and moving on. Which means of course we have
to start talking about well, why isn’t everything in a
layer and what are the consequences and tradeoffs
of doing that? Well, first off, you shouldn’t put everything in its own layer. Even though the GPU can move
these things around and position them, there’s actually
some consequences you should be aware of. First off is the cost of too many layers. You need to know that each layer
you put on the screen effectively creates
more tiles. And as Grace mentioned before,
the GPU has a static non-growable memory resource
in its texture cache. So what happens is as these
tiles are invalidated and more tiles exist, if the cache is
full, we effectively have to push old tiles out of the cache
before we can actually put the new tiles in. This is going to put a lot of
pressure on your cache, which is going to result in more
evictions, which is going to result in a lot of additional
CPU overhead painting new tiles that would have previously
been resident inside of the cache. In addition to the tile
overhead, you also have to take into account just
the additional processing that’s involved. Now, this is minimal compared
to the cache thrashing. But effectively, each layer that
you add has to be sorted every frame. There has to be occlusion
determinations that occur. And then Chrome can actually
go through and do union processing to determine whether
or not it should actually draw something. So let’s say you have an object
and then you have an opaque layer in front of it, we
may be actually able to not draw that lower object because
it’s actually hidden. GRACE KLOBA: So there’s overhead, as Colt mentioned, when the animation starts. We pay an extra cost to promote the page element to its own layer. If there are a lot of pixels, that paint may take a long time, which means there’s a long delay before the animation starts. So what you can do, as we mentioned earlier, is use the CSS properties that permanently move the page element to its own layer– for example, translateZ(0) to promote it at page load time. Of course, the tradeoff is that it’s going to use more memory. So now it’s your job to make the decision: do you want the animation to start instantly, or do you want to preserve the memory? Let’s look at another example. So on a mobile site, it’s a
common case to try to slide in some content from off screen to on screen. So for example, we have this helpful duck. When the page loads, the duck is off screen and has display: none. When the user asks the duck for help, what you would probably do is change the display property from none to block and start the animation. Unfortunately, we cannot start the animation right away, because we first have to paint the duck. If this duck is fat– big, taking up a lot of pixels– it may take a bit longer. So what do we do? Remember, earlier we mentioned that Chrome does pre-paint. So if you keep the display: block property even when the duck is off screen, and also use the previous trick we mentioned– translateZ(0) to permanently promote it to its own layer– then after the page loads, before the animation starts, Chrome will paint the duck into its own layer. So when the user calls the duck for help, the duck will just jump out right away. Of course, everything here is a tradeoff. If the user never calls the duck for help, we still spent cycles on the CPU to paint the duck and allocated memory on the GPU to cache it. COLT MCANLIS: So what basically
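A minimal sketch of the slide-in pattern Grace describes (the .helper class and transition values are made up for illustration):

```css
/* Keep the helper in the layout (display: block) but parked off screen,
   with translateZ(0) so it is permanently promoted to its own layer.
   Chrome can then pre-paint it during idle time, before it is needed. */
.helper {
  display: block;
  transform: translateX(-100%) translateZ(0);
  transition: transform 0.3s ease-out;
}

/* Toggling this class slides the helper in instantly: the layer is
   already painted, so no CPU paint has to happen first. */
.helper.shown {
  transform: translateX(0) translateZ(0);
}
```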
we’re getting at here is that with these small
annotations and these CSS properties, you can actually
control how the GPU is being used to render your page. So you guys can make faster
loading websites, faster rendering websites, which make
your users happier, which means your websites
make more money. And then your boss recognizes
your contribution and gives you the bonus and promotion. So really, we’re here to help
you get promotions so you can drink margaritas and stuff. Now, with all of that we’re giving you, we’re asking you to understand that as we inject the GPU into the process, some parts of the architecture of how Chrome works change, which you need to be aware of. The most important of these is how input is handled. Because we’re actually taking what was a single-threaded process and adding some new components into it. GRACE KLOBA: So before we talk
about this new concept, threaded compositing, let’s take a look at how things work without it. Threaded compositing is a new concept that we added into Chrome. It’s working on Chrome for Android, Chrome for Chrome OS, and on Windows, and we’re soon going to turn it on for the Mac. So without it, when something changes, we do the paint and the composite on the BLINK thread, and then present the content through GPU commands to the driver. But the driver only consumes the GPU commands up to the refresh rate. For example, a 60 FPS display only consumes a frame every 16.6 milliseconds, which means that even when the paint and composite are a small amount of work, there’s no need to paint twice within a 16 millisecond frame. COLT MCANLIS: And to be fair,
that diagram is actually a little misleading. What’s going on under the hood in these browsers is not actually that simple. It’s actually chaos in a jar,
what’s really going on. Effectively, if you look at
our single-threaded model here, you have lots and lots
of resources and events fighting for time slices on this
single-threaded model. So things like parsing,
layout, paint, all of these things. So what happens is, if we get one of these VBLANKs coming along asking for updated data, there’s a good chance that, with all of the processing occurring, the thread has not had a chance to paint and give the VBLANK an updated view of the scene. Now, what’s really interesting
about this, when you dig a little bit deeper, is that
there’s some things that you, as a developer, can control. So if we look at the breakdown
of the blocks here, things like Touch Events, JavaScript
Events, onload callbacks, you as developers really are in
control of how long those events take on the JavaScript
processing thread. Now, there’s some other
things, though, that you really can’t control. You can wave your hands at them and maybe do some juju dances, but it’s really not going to change, because that’s controlled on the browser side of things– layout, paint, and compositing. Now, the good thing is that some
of these can actually be moved off onto different threads
because of modern architectures. Like most of the phones in your
pockets actually have multiple cores, which means we
can utilize modern techniques for threaded programming. Now, there’s a great technique
that Chrome does use called multi-threaded painting, which
actually allows us to take the paint block, which traditionally
was single threaded sharing time with all
of your JavaScript events, and actually thread that out off
to multiple threads, which reduces the amount
of time it takes. Now, we’re not going to talk
about that today because that’s not on the GPU. Instead, what we’re going to
talk about today is this other block here, which is actually
the composite operation. The best way to think about this
is after your painting is done, remember that Chrome has
to go back and actually create all of these graphics driver
commands, GPU commands, to take those textures and upload
them to the GPU and then actually do that submit. Of course, that isn’t
actually free. It actually cuts into
your frame time. So what we want to talk about
today is how we move that off onto a different thread and
what that means for you. So we now have the BLINK thread,
which is effectively responsible– if there’s one thing you should
all take away from this today is that there are tons of
technical problems on this side of the stage. Basically we have our BLINK
thread, which handles all of our JavaScript processing,
including layout and other sorts of things. And we can actually move the
composite operation down onto a separate thread. So this means that whenever the system determines that a composite is actually needed, we
can actually go through and kick off a composite onto
a separate thread. And one more. GRACE KLOBA: One more. OK. COLT MCANLIS: There we go. Too far. No, go back. The universe, you divided
by 0, no. GRACE KLOBA: OK. Calm down. So it’s good. We move the composite onto a separate thread, so we offload some work from the BLINK thread. But the key change we made is to let the compositor thread accept input events when possible. For example, scroll events can be received and processed on the compositor thread without the intervention of the BLINK thread. This allows us to scroll as fast as possible even when there is long-running JavaScript executing in the BLINK thread. It is for exactly this reason that we ask you to let the browser scroll. Simply put, if you’re using touch events or mousewheel events to drive the scroll, the benefit of threaded compositing is thrown out the window. Custom scroll libraries normally install a touch handler or mousewheel handler and then call preventDefault. The result is that no scroll events get sent to the compositor thread. All the compositor thread can do is wait for the BLINK thread to process the JavaScript, do the layout, and then paint. This essentially goes back to the single-threaded model. COLT MCANLIS: Now,
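To make that anti-pattern concrete, here’s a sketch of what such a custom scroll library does (the names are hypothetical)– calling preventDefault() in the touch handler is precisely what forces every scroll back through the BLINK thread:

```javascript
// ANTI-PATTERN (illustrative sketch): a custom scroll library that hijacks
// touch input. preventDefault() cancels the browser's native scroll, so the
// compositor thread can no longer scroll on its own– every frame now waits
// on the BLINK (main) thread.
function attachCustomScroller(el) {
  let lastY = null;
  el.addEventListener('touchmove', (e) => {
    e.preventDefault(); // <-- this line defeats threaded compositing
    const y = e.touches[0].clientY;
    if (lastY !== null) el.scrollTop += lastY - y; // drive the scroll manually
    lastY = y;
  });
}
```

If you don’t call preventDefault and let the browser handle the scroll itself, the compositor thread keeps scrolling smoothly even while your JavaScript is busy.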
this is actually pretty common on mobile. So how many of you guys actually
have a mobile website that has a static header
at the top of it? A good amount of hands. So OK, keep your hands up. Keep your hands up. How many of you, in order to
keep that banner at the top of the page, actually intercept
the scroll handler? Still a couple of you, OK. Well, I’m here to let
you know that that’s actually a bad practice. Just like Grace said, when you intercept the scroll handler on Chrome for Android, what you’re effectively doing is throwing out all of the cool stuff you get from multi-threaded compositing. And the result is that as the page scrolls, the scroll is going to happen and the header is going to move, whether you like it or not. Then your JavaScript code is going to execute and fire, and then you’ll see your header
snap to the position that you want it to go to. So if we have this beautiful
Duck Surfing 101 page here and you have the page scroll, you
actually want this thing to stay static. Now, instead of actually
intercepting the scroll handler, you can take advantage
of these CSS properties that we’ve been using
and actually put the header in its own layer. So if you actually add a
position: fixed as well as a z-index: 0 to this thing– or any z-index really, just as long as a z-index is there so you can create a proper stacking context. What this will do is actually
promote the header into its own layer and allow the page
itself to transition underneath it. So you will get the overhead of having your own layer, but you won’t get the ghost-chasing effect that you see on mobile. If you want a really great
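A rough sketch of that fixed-header approach (the class name is made up):

```css
/* The header stays put without any scroll-handler JavaScript:
   position: fixed plus an explicit z-index creates a stacking context,
   promotes the header to its own layer, and lets the compositor
   scroll the page underneath it. */
.site-header {
  position: fixed;
  top: 0;
  left: 0;
  right: 0;
  z-index: 1;
}
```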
example of who does this right, definitely check out
the Google I/O 2013 mobile site where we do this on
the session agenda. You can actually see as things
scroll, we have these static headers at the top and it
does it the right way. Look at the source code. It’s a great example. And with that, though, it’s worth pointing out that we just gave you an example we didn’t talk about before. We talked about translateZ. We talked about overflow. We talked about those other things that will promote items into their own layer as well. But in reality, you may have inadvertently put something into a layer without really knowing it– which means that we are not leaving you abandoned here. We actually have updated our
tools to tell you how the GPU is doing things. First off, there’s a fantastic
set of flags that you can turn on in Chrome’s dev tools
that will tell you where the layers are. If you open up dev tools, you’ll actually see this beautiful, awesome little flag there that says Show Composited Borders. What this effectively does is, for any of your layers, it will add an orange border around the
boundaries of the layer. So I randomly took a picture
of a random website on the internet here. And you can actually see that
what we have here is lots and lots and lots of layers. That’s what each one of those
orange borders means. So part of their feed here,
effectively, is each layer. If you look at the source code
of this, you’ll actually see that each one of those leaf
elements in the DOM has translateZ on it. So once again, you can see how this leads to the bad performance we’ve already talked about. Because each one of those
layers is going to have additional tiles. All of those tiles are going to
be fighting for residency inside of GPU memory. And of course, this is going to
slow things down on mobile. Now, the cool thing is that it’s not always axis-aligned like that one image there. You can see that they actually
have this loading icon that is going to be rotated with a CSS
transform as well as a little animated gif too. And you can see that the cool
borders match that as well. Now, it’s worth pointing out
that these borders, as they’re rendered, actually obey
occlusion ordering. So effectively, if you’ve got
that nice little spinning thing and then an image in
front of it, you won’t actually see that border. So take that into account
as you’re moving along. Another great tool that you can
actually dive into is one called chrome://tracing. Now chrome://tracing
personally– hey, I’m a beat box guy now. Chrome://tracing is actually
my favorite tool inside of Chrome because it will actually
tell you what Chrome is doing under the hood. Effectively, it gives you a
hierarchical view of the call stack that’s happening inside of
Chrome based upon what your website is doing. So I actually put together a
little test page to give you this graphic here. And effectively, all the test
page did was load a bunch of large images, like 1024’s and
2048’s, and actually resize them with a width and height
parameter down to 64 by 64’s. And this is what
chrome://tracing showed me. So first off, the 1024 by
1024 actually took 51 milliseconds to decode. And then after that,
it took another 29 milliseconds to resize. So think of it this way, for
all of you guys in here who are doing responsive web design,
and you’re sending your full resolution desktop
images down to your mobile client because you feel that
that’s the only way you can handle viewport differences,
you could be wasting a lot of time. Instead, you can see that the
next image that I had in the list was much smaller. It was about 128 by 128. So the decode took less
time, as well as the resize took less time. So the cool thing to take away
from this slide is that we can actually figure out how your
page is using the GPU by looking at some patterns that
you can see inside of chrome://tracing. GRACE KLOBA: So earlier we
GRACE KLOBA: So earlier we talked about browser scroll versus JavaScript scroll. Let's take a look in
chrome://tracing. So in this case, the top row is the compositor thread. The bottom row, called CrRendererMain, is essentially the Blink thread. The number on the left is a process ID; because the compositor and the Blink thread are both in the renderer process, they have the same number, the same process ID. chrome://tracing also comes with
this nice FPS guideline. So it's running at 60 FPS, repeating every 16.6 milliseconds. In this case, the guideline is lined up with the first composite in this window. As you can see, the scroll is updated nicely at 60 FPS, even when we have a long paint happening in the Blink thread. Let's take a look at
another trace. This is done on the same device
with the same version of Chrome. We just load a different page, of course; otherwise, it should be the same. So in this page, the JavaScript is hijacking the touch events and then driving
the scrolling of the page. The first thing we did is bring back our FPS guideline and line it up with the first composite. Immediately, you will notice
between the second and the third composite, it took 33
milliseconds instead of the 16 milliseconds. Why? Because in this case, we also have a long paint happening in the Blink thread. And because the scrolling is driven by JavaScript, the composite step in the compositor thread cannot start until the paint
is finished. So again, the thing you want to learn from this example is: let the browser scroll.
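The arithmetic behind those two traces is worth making concrete: at 60 FPS the compositor has a 16.6 millisecond budget per frame, and when the composite has to wait on a paint, any paint longer than that budget pushes the next composite out to a later frame boundary. A small sketch of the frame math (this is not Chrome's actual scheduler, just the arithmetic):

```javascript
// Frame budget at a given refresh rate, in milliseconds.
function frameBudgetMs(fps) {
  return 1000 / fps;
}

// When the composite must wait for a paint (JavaScript-driven
// scrolling), the next composite slips to the first frame boundary
// after the paint finishes.
function nextCompositeMs(paintMs, fps) {
  const budget = frameBudgetMs(fps);
  return Math.ceil(paintMs / budget) * budget;
}

// A 20 ms paint misses the 16.6 ms deadline, so the composite lands
// on the second frame boundary: ~33 ms after the previous one, which
// is exactly the gap visible in the trace -- a dropped frame.
const gap = nextCompositeMs(20, 60);
```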
COLT MCANLIS: And with that, we can recap what we've talked about today. Four main bullet points to
walk away from this talk. Number one, making a
presentation is a lot more technically difficult apparently
than writing a whole web browser. First off is actually
GPU plus layers equals faster rendering. By actually getting the GPU
into the mix and actually using layers, we can allow it
to alleviate a lot of the processing burden that would
previously exist in rasterization on the CPU. But that comes with a caveat: too many layers is going to be a seriously bad time. If you put too many layers out
there, everything’s going to be fighting for residency inside
of the GPU tile cache and that’s going to create
some problems. In addition to that, that’s
going to create some changes in how input is handled,
especially for mobile devices, which means that you guys
should really let Chrome handle the scrolling
of things. You’re going to have a
smoother frame rate. It’s going to give your
users a better impression of your site. You’re going to make lots of
money and get really cool promotions. And then you can send Grace and
I emails thanking us for all of our time and effort
that we put into this technically flawed talk. And then finally, if you’re not
really sure what’s going on, please use tooling to
verify what’s going on under the hood. Use things like show composited
borders, as well as about tracing and get
a really good deep dive of what’s happening. So with that, we thank you all
so much for giving us your time today. Before we take off, I want
to actually, once again, encourage all of you to use the
perfmatters hashtag for any of your web performance
related things. As well as invite you
to join the Google+ web performance community. You can see the short link there. It's a fantastic community. We've got lots of the big brains
in web performance all contributing there. It’s a community of awesome
web performance stuff. With that, it looks like we’ve
got about four minutes and 30 seconds for questions. The mics are here. Please give us your thoughts. Thank you. [APPLAUSE] AUDIENCE: I have three brief
questions if I can read my own handwriting. COLT MCANLIS: Wait, how
brief can they be? You wrote them down. AUDIENCE: That was so I
could pay attention. COLT MCANLIS: That’s not
a brief question. AUDIENCE: What is the limit
to the number of tiles? Is it 100 tiles, 1,000 tiles,
10 tiles, in the mem cache of the GPU? GRACE KLOBA: It really
depends. As we mentioned, the impact of the tiles is memory. So we try to cull– if a tile is fully behind something, we're not going to render it. But if there is overlap because
of the tile size, you of course pay the overhead. So that depends on how
much you overlap. AUDIENCE: So I guess the main
question is, how big is the memory that’s allocated
to the GPU? GRACE KLOBA: As I think I
mentioned earlier, it's really device dependent. So for example, on an Android mobile device, we're basically using the memory limit, which is– most of today's mobile devices are unified memory. So the total memory on the Galaxy Nexus is 1 gig, and on the Nexus 10 it's 2 gig. That's overall memory. For what's available to the GPU, we use that as one measurement and also use the screen resolution as a measurement. In these two cases, we probably allocate a limit of around 256 meg. AUDIENCE: OK, so it's actually
part of the RAM is– GRACE KLOBA: Yeah,
part of the RAM. And then we have to be
conservative because there’s other apps. So we basically allocate a
budget based on the device memory limit and the
screen resolution. AUDIENCE: OK, should I go to the
back of the line or ask– COLT MCANLIS: Yeah,
if you don’t mind. We’ll get some more
questions in here. This side over here. AUDIENCE: Hi, question is, are
there any good methods for syncing scrolling between
different panes, if you want to create a paned approach, that
don’t rely on JavaScript? GRACE KLOBA: Different page? AUDIENCE: Panes. So for example, your
iOS session example was a good one. I noticed on iOS on Safari
that while the vertical scrolling sticks nicely, the
horizontal scrolling doesn’t work at all. It doesn’t keep it in
sync as you scroll. So are there mechanisms to have
different divs within a web page where the scrolling in
one direction or the other is synced between them, and yet,
still have some of the advantages of layering? GRACE KLOBA: I think, if I understand the question, you want the vertical scroll handled by the browser, but the left and the right handled by the content?
spreadsheet where you have the ability to split panes
and go both directions. Well, when I scroll horizontally
I need both the header, or top and bottom
panes, to stay in sync. COLT MCANLIS: I think the– AUDIENCE: When I scroll
vertically I need– COLT MCANLIS: You have to invoke dark magic. That's a tricky one.
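For readers wondering about that question: the usual JavaScript approach is to mirror the scroll offset between panes from a scroll handler — with the caveat, per everything above, that the handler runs on the Blink thread, so the mirrored pane can lag behind the browser-scrolled one. A minimal sketch, with hypothetical pane elements:

```javascript
// Mirror the horizontal scroll offset of a source pane onto target
// panes -- e.g. keeping a frozen header row aligned with the body of
// a split spreadsheet. Works with any objects exposing scrollLeft
// and addEventListener (real DOM elements in a page).
function syncHorizontalScroll(source, targets) {
  source.addEventListener('scroll', function () {
    for (const t of targets) {
      t.scrollLeft = source.scrollLeft;
    }
  });
}

// Usage in a page (element ids are hypothetical):
//   syncHorizontalScroll(document.getElementById('body-pane'),
//                        [document.getElementById('header-pane')]);
```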
Let's dig deep on that if you guys wouldn't mind. We're actually running
out of time here. And I know there are lots of questions, so Grace and I will actually be at the Chrome
Office Hours for the next half hour. Please, everyone in line and
anyone else who has questions, please come meet us there. Let’s talk more about this stuff
and help you guys out. Once again, thank you
for your time today. Appreciate it. [APPLAUSE]

Google I/O 2013 – Web Page Design with the GPU in Mind