Paul Irish

Making the www great

Advanced Performance Audits With DevTools

Recently, I’ve spent some time recently profiling real-world mobile websites. Using the 1000/100/6 performance model1, and spelunking deep into each app, the findings have been fascinating.

I’ve written up case study documents for each, incorporating all the findings:

  1. Illustrated diagnoses for the poor performance
  2. What actions the developer should take
  3. How Chrome’s tooling should improve
  4. Questions and insights for the rendering engine (Blink)

➜ Perf Audits: CNet, Time, Google Play

In this doc, we look at the scrolling of CNET, input latency on CNET, some very interesting challenges on the responsive Time.com, and infinite scroll on Google Play’s desktop site.

The intended audience is browser engineers and performance-minded frontend developers. It’s fairly advanced, but I’m spelunking deep to identify how the sites butt heads with the browser APIs and architecture.

Lastly, we’re using this research to improve Chrome DevTools and what you hear from Chrome.

Wikipedia eng team scrutinizing their performance millisecond by millisecond. (Yes, it’s a long paper printout of the Chrome DevTools timeline flamechart :)

(BTW, use this link to view the same doc but with comments enabled)

1 - More on this performance model later. Stay tuned.

WebKit for Developers

Feb 2015: A lot’s happened since I wrote this post two years ago. Chrome forked WebKit and started Blink, Opera adopted Chromium, and node-webkit became nw.js. This post describes a complexity of defining WebKit that doesn’t exist much anymore; with Chrome’s departure the WebKit world is more simple and clear.

WebKit is deployed through iOS Safari and Mac Safari, and the active GTK community leverages WebKit inside the GNOME Platform. Some smaller mobile browsers use WebKit, some Chromium, some use forks of either, and many just use the system WebViews that are both powered by up-to-date version of iOS WebKit and Android Chromium.

The post below is kept intact and represents a snapshot of history in early 2013, rather than the modern WebKit landscape.

WebKit box logo For many of us developers, WebKit is a black box. We throw HTML, CSS, JS and a bunch of assets at it, and WebKit, somehow.. magically, gives us a webpage that looks and works well. But in fact, as my colleague Ilya Grigorik puts it

WebKit isn’t a black box. It’s a white box. And not just that, but an open, white box.

So let’s take a moment to understand some things:

What is WebKit?
What isn’t WebKit?
How is WebKit used by WebKit-based browsers?
Why are all WebKits not the same?

Now, especially with the news that Opera has moved to WebKit, we have a lot of WebKit browsers out there, but its pretty hard to know what they share and where they part ways. Below we’ll hopefully shine some light on this. As a result you’ll be able to diagnose browser differences better, report bugs at the right tracker, and understand how to develop against specific browsers more effectively.

Standard Web Browser Components

Let’s lay out a few components of the modern day web browser:

  • Parsing (HTML, XML, CSS, JavaScript)
  • Layout
  • Text and graphics rendering
  • Image decoding
  • GPU interaction
  • Network access
  • Hardware acceleration

Which of those are shared in WebKit-based browsers? Pretty much only the first two.

The others are handled by individual WebKit ports. Let’s review what that means…

The WebKit Ports

There are different “ports” of WebKit, but allow me to let Ariya Hidayat, WebKit hacker and eng director at Sencha to explain:

What is the popular reference to WebKit is usually Apple’s own flavor of WebKit which runs on Mac OS X (the first and the original WebKit library). As you can guess, the various interfaces are implemented using different native libraries on Mac OS X, mostly centered around CoreFoundation. For example, if you specify a flat colored button with specific border radius, well WebKit knows where and how to draw that button. However, the final actual responsibility of drawing the button (as pixels on the user’s monitor) falls into CoreGraphics.

As mentioned above, using CG is unique to the Mac port. Chrome on Mac uses Skia here.

With time, WebKit was “ported” into different platform, both desktop and mobile. Such flavor is often called “a WebKit port”. For Safari Windows, Apple themselves also ported WebKit to run on Windows, using the Windows version of its (limited implementation of) CoreFoundation library.

… though Safari for Windows is now dead.

Beside that, there were many other “ports” as well (see the full list). Google has created and continues to maintain its Chromium port. There is also WebKitGtk which is based on Gtk+. Nokia (through Trolltech, which it acquired) maintains the Qt port of WebKit, popular as its QtWebKit module.

Some of the ports of WebKit

  • Safari
    • Safari for OS X and Safari for Windows are two different ports
    • WebKit nightly is an edge build of the Mac port that’s used for Safari. More later…
  • Mobile Safari
    • Maintained in a private branch, but lately being upstreamed
    • Chrome on iOS (using Apple’s WebView; more on it’s differences later)
  • Chrome (Chromium)
  • Android Browser
    • Used the latest WebKit source at the time
  • Plenty more ports: Amazon Silk, Dolphin, Blackberry, QtWebKit, WebKitGTK+, The EFL port (Tizen), wxWebKit, WebKitWinCE, etc

Different ports can have different focuses. The Mac port’s focus is split between Browser and OS, and introduces Obj-C and C++ bindings to embed the renderer into native applications. Chromium’s focus is purely on the browser. QtWebKit offers its port for applications to use as a runtime or rendering engine within its cross-platform GUI application architecture.

What’s shared in all WebKit browsers

First, let’s review the commonalities shared by all WebKit ports.

  1. So first, WebKit parses HTML the same way. Well, except Chromium is the only port so far to enable threaded HTML parsing support.
  2. … Okay, but once parsed, the DOM tree is constructed the same. Well, actually Shadow DOM is only turned on for the Chromium port, so DOM construction varies. Same goes for custom elements.
  3. … Okay, well WebKit creates a window object and document object for everyone. True, though the properties and constructors it exposes can be conditional on the feature flags enabled.
  4. … CSS parsing is the same, though. Slurping up your CSS and turning it into CSSOM’s pretty standard. Yeah, though Chrome accepts just the -webkit- prefix whereas Apple and other ports accept legacy prefixes like -khtml- and -apple-.
  5. … Layout.. positioning? Those are the bread and butter. Same, right? Come on! Sub-pixel layout and saturated layout arithmetic is part of WebKit but differs from port to port.
  6. Super.

So, it’s complicated.

Just like how Flickr and Github implement features behind flags, WebKit does the same. This allows ports to enable/disable all sorts of functionality with WebKit’s compile-time Feature Flags. Features can also be exposed as run-time flags either through command line switches (Chromium’s here) or configuration like about:flags.

All right, well let’s try again to codify what’s the same in WebKit land…

What’s common to every WebKit port

  • The DOM, window, document
    • more or less
  • The CSSOM
  • CSS parsing, property/value handling
    • sans vendor prefix handling
  • HTML parsing and DOM construction
    • same if we suspended belief in Web Components
  • All layout and positioning
    • Flexbox, Floats, block formatting contexts… all shared
  • The UI and instrumentation for the Chrome DevTools aka WebKit Inspector.
    • Though since last April, Safari uses it’s own, non-WebKit, closed-source UI for Safari Inspector
  • Features like contenteditable, pushState, File API, most SVG, CSS Transform math, Web Audio API, localStorage
    • Though backends vary. Each port may use a different storage layer for localStorage and may use different audio APIs for Web Audio API.
  • Plenty of other features & functionality

It gets a little murky in those areas. Let’s try some differences that are much less murky.

Now, what’s not shared in WebKit ports:

  • Anything on the GPU
    • 3D Transforms
    • WebGL
    • Video decoding
  • 2D drawing to the screen
    • Antialiasing approaches
    • SVG & CSS gradient rendering
  • Text rendering & hyphenation
  • Network stack (SPDY, prerendering, WebSocket transport)
  • A JavaScript engine
    • JavaScriptCore is in the WebKit repo. There are bindings in WebKit for both it and V8
  • Rendering of form controls
  • <video> & <audio> element behavior (and codec support)
  • Image decoding
  • Navigating back/forward
    • The navigation parts of pushState()
  • SSL features like Strict Transport Security and Public Key Pins

Let’s take one of these: 2D graphics Depending on the port, we’re using completely different libraries to handle drawing to screen:

Graphics layer in WebKit

Or to go a little more micro… a recently landed feature: CSS.supports() was enabled for all ports except win and wincairo, which don’t have css3 conditional features enabled.

Now that we’ve gotten technical.. time to get pedantic. Even the above isn’t correct. It’s actually WebCore that’s shared. WebCore is a layout, rendering, and Document Object Model (DOM) library for HTML and SVG, and is generally what people think of when they say WebKit. In actuality “WebKit” is technically the binding layer between WebCore and the ports, though in casual conversation this distinction is mostly unimportant.

A diagram should help:

WebKit Diagram

Many of the components within WebKit are swappable (shown above in gray).

As an example, WebKit’s JavaScript engine, JavaScriptCore, is a default in WebKit. (It is based originally on KJS (from KDE) back when WebKit started as a fork of KHTML). Meanwhile, the Chromium port swaps in V8 here, and uses unique DOM bindings to map things over.

Fonts and Text rendering is a huge part of platform. There are 2 separate text paths in WebKit: Fast and Complex. Both require platform-specific (port-side) support, but Fast just needs to know how to blit glyphs (which WebKit caches for the platform) and complex actually hands the whole string off to the platform layer and says “draw this, plz”.

“WebKit is like a Sandwich. Though in Chromium’s case it’s more like a taco. A delicious web platform taco.” Dimitri Glazkov, Chrome WebKit hacker. Champion of Web Components and Shadow DOM

Now, let’s widen the lens and look at a few ports and a few subsystems. Below are five ports of WebKit; consider how varied their stacks are, despite sharing much of WebCore.

Chrome (OS X) Safari (OS X) QtWebKit Android Browser Chrome for iOS
Rendering Skia CoreGraphics QtGui Android stack/Skia CoreGraphics
Networking Chromium network stack CFNetwork QtNetwork Fork of Chromium’s network stack Chromium stack
Fonts CoreText via Skia CoreText Qt internals Android stack CoreText
JavaScript V8 JavaScriptCore JSC (V8 is used elsewhere in Qt) V8 JavaScriptCore (without JITting) *

* A side note on Chrome for IOS. It uses UIWebView, as you likely know. Due to the capabilities of UIWebView that means it can only use the same rendering layer as Mobile Safari, JavaScriptCore (instead of V8) and a single-process model. Still, considerable Chromium-side code is leveraged, such as the network layer, the sync and bookmarks infrastructure, omnibox, metrics and crash reporting. (Also, for what it’s worth, JavaScript is so rarely the bottleneck on mobile that the lack of JITting compiler has minimal impact.)

All right, so where does this leave us?

So, all WebKits are totally different now. I’m scared.

Don’t be! The layoutTest coverage in WebKit is enormous (28,000 layoutTests at last count), not only for existing features but for any found regressions. In fact, whenever you’re exploring some new or esoteric DOM/CSS/HTML5-y feature, the layoutTests often have fantastic minimal demos of the entire web platform.

In addition, the W3C is ramping up its effort for conformance suite testing. This means we can expect both different WebKit ports and all browsers to be testing against the same suite of tests, leading to fewer quirks and a more interoperable web. For all those who have assisted this effort by going to a Test The Web Forward event… thank you!

Opera just moved to WebKit. How does that play out?

Robert Nyman and Rob Hawkes touched on this, too, but I’ll add that one of the significant parts of Opera’s announcement was that Opera adopted Chromium. This means the WebGL, Canvas, HTML5 forms, 2D graphics implementations–all that stuff will be the same on Chrome and Opera now. Same APIs, and same backend implementation. Since Opera is Chromium-based, you can feel confident that your cutting-edge work will be compatible with Chrome and Opera simultaneously.

I should also point out that all Opera browsers will be adopting Chromium. So Opera for Windows, Mac and Linux and Opera Mobile (the fully fledged mobile browser). Even Opera Mini, the thin client, will be swapping out the current server-side rendering farm based on Presto with one based on Chromium.

.. and the WebKit Nightly. What is that?

It’s the mac port of WebKit, running inside of the same binary that Safari uses (though with a few underlying libraries swapped out). Apple mostly calls the shots in it, so its behavior and feature set is congruent with what you’ll find in Safari. In many cases Apple takes a conservative approach when enabling features that other ports may be implementing or experimenting with. Anyway, if you want to go back to middle-school analogies, think of it as… WebKit Nightly is to Safari what Chromium is to Chrome.

Chrome Canary also has the latest WebKit source within a day or so.

Tell me more about WebKit internals.

You got it, bucko.

Further reading:

Hopefully this article described a bit of the internals of WebKit browsers and gave some insight on where the WebKit ends and the ports begin.


Reviewed by Peter Beverloo (Chrome), Eric Seidel (WebKit).
I’ll update with any corrections or modifications.

  • 8:30am. Removing the slides embed because it’s making Firefox scroll to its position.
  • 9:50am. Chrome for Mac’s font rendering in Skia uses CoreText to draw glyphs, so it’s more like CoreText via Skia. (thx thakis!)
  • 10:49am. Fixed broken link and typo. (thx tim!) Fixed weird meta description choice.
  • 3:00pm: Mike West pointed to Levi Weintraub’s sweet & detailed What is WebKit?” presentation (video). slides
  • 4:00pm. tonikitoo, hacker on QtWebKit wanted to clear up some subtleties:
    Nokia (through Trolltech, which it acquired) maintains the Qt port of WebKit, popular as its QtWebKit module. It might be valuable to mentioned that Digia acquired the Trolltech/Qt division of Nokia and now maintains QtWebKit. Also, not to be too nit-picker, in your subtitles “Some of the ports of WebKit” I would have said “Some of the Products running on top of specific ports of WebKit” because strictly speaking Safari is not a port, but a “Browser that makes use of the Apple WebKit port on Mac”.
    True.
  • 5:00pm. This article has been translated into Japanese. Thanks Masataka Yakura!
  • 2013-03-17: This article has been translated into Russian
  • 2013-03-18: This article has been translated into Chinese (also this one!)
  • 2013-04-06: Chrome emeritus Ben Goodger gave more insight on the relationship between WebCore, WebKit, WebKit2 and the Chromium Content API that’s useful context.

Why Moving Elements With Translate() Is Better Than Pos:abs Top/left

Update (2016): The more canonical writeup of this technique is at High Performance Animations - HTML5 Rocks. TL;DR: Only transform & opacity; never top/left!

In modern days we have two primary options for moving an element across the screen:

  1. using CSS 2D transforms and translate()
  2. using position:absolute and top/left

Chris Coyier was asked why you should use translate. Go read his response which covers well why it’s more logical to move elements for design purposes (with transform) independent of your element layout (with position).

I wanted to answer this with Chrome and get some good evidence on what’s going on. I ended up recording a video on the experience:

It’s a good watch and dives into Chrome DevTools’ Timeline, compositing, paint cost, accelerated layers, and more… but if you want the abbreviated text version, read on:

First thing, Chris made some simple demos to try things out:

This sorta works, but there is such low complexity in this situation that it’s all going to look pretty great. We need something closer to a complex website to properly evaluate the two (thx joshua for the macbook!):

Now we’re starting to get closer, but immediately I get distracted by something.

Distraction: Pixel snapping

If you run the demo above you might notice the top edge of the MacBook looks a little bit better in the top/left one. (And here I am writing a post about why translate is better! Preposterous!) So this is due to the absolute positioned macbook sticks to pixel positions, whereas the translate()‘d one can interpolate at sub-pixel positions.

One of the Chrome GPU engineers, James Robinson, called this the “dubstep effect”, as these pixels appear to be getting pounded by bass. Here’s a closeup of the effect… watch the top white edge of the macbook in each:

On the left, you see stair-stepping down three pixels and back up again. This pixel snapping may result in a less distracting transition in this case, though this wouldn’t be so noticeable if you were moving objects with less of a high-contrast situation.

Back to performance

If you run DevTool’s Timeline Frames mode on these two examples here, they start to tell a very different story:

The top/left has very large time to paint each frame, which results in a choppier transition. All the CSS including some big box shadows, are computed on the CPU and composited against that gradient backdrop every frame. The translate version, on the other hand, gets the laptop element elevated onto it’s own layer on the GPU (called a RenderLayer). Now that it sits on its own layer, any 2D transform, 3D transform, or opacity changes can happen purely on the GPU which will stay extremely fast and still get us quick frame rates.

Watch the video above towards the end to see more on how to diagnose paint costs, see what areas are being repainted, and evaluate what elements are on the GPU.

Demo

Click through below, try adding more and see how the the framerate reacts. Feel free to open up DevTools, too and explore Timeline:

Guidelines for animation

  1. Use CSS keyframe animation or CSS transitions, if at all possible. The browser can optimize painting and compositing bigtime here.
  2. If needs to be it’s JS-based animation, use requestAnimationFrame. Avoid setTimeout, setInterval.
  3. Avoid changing inline styles on every frame (jQuery animate()-style) if you can, declarative animations in CSS can be optimized by the browser way more.
  4. Using 2D transforms instead of absolute positioning will typically provide better FPS by way of smaller paint times and smoother animation.
  5. Use Timeline Frame’s mode to investigate what is slowing down your behavior
  6. “Show Paint Rects” and “Render Composited Layer Borders” are good pro-moves to verify where your element is being rendered.