Browser

A browser could be one process with many different threads or many different processes with a few threads communicating over IPC. There is no standard specification on how one might build a web browser.

Chrome

Processes

Browser: Controls the "chrome" part of the application, including the address bar, bookmarks, and back and forward buttons. It also handles the invisible, privileged parts of a web browser, such as network requests and file access.
1. UI thread: draws buttons and input fields of the browser
2. Network thread: deals with network stack to receive data from the internet
3. Storage thread: controls access to the files and more
Renderer: A process for each tab, controls anything inside of the tab where a website is displayed.
Plugin: Controls any plugins used by the website, for example, Flash.
GPU: Handles GPU tasks in isolation from other processes. It is separated into its own process because the GPU handles requests from multiple apps and draws them on the same surface.

Since operating systems provide a way to restrict processes’ privileges, the browser can sandbox certain processes from certain features. For example, the Chrome browser restricts arbitrary file access for processes that handle arbitrary user input like the renderer process.

Because processes have their own private memory space, they often contain copies of common infrastructure (like V8, Chrome's JavaScript engine). This means more memory usage, because the copies can't be shared the way they would be if they were threads inside the same process.

To save memory, Chrome puts a limit on how many processes it can spin up. The limit varies depending on how much memory and CPU power your device has, but when Chrome hits the limit, it starts to run multiple tabs from the same site in one process.
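The sharing behaviour can be sketched as a toy allocator. This is not Chrome's actual policy: assignProcesses and softLimit are hypothetical names, and treating the limit as soft (a site with no existing process still gets a new one) is an assumption made for illustration.

```typescript
// Toy model: returns, for each tab's site, the index of the process it runs in.
function assignProcesses(sites: string[], softLimit: number): number[] {
  const processOfSite = new Map<string, number>(); // first process seen per site
  const assignment: number[] = [];
  let processCount = 0;

  for (const site of sites) {
    const existing = processOfSite.get(site);
    if (processCount >= softLimit && existing !== undefined) {
      // Limit reached: a tab from an already-open site shares that site's process.
      assignment.push(existing);
      continue;
    }
    // Under the limit (or no same-site process to share): start a new process.
    const fresh = processCount++;
    if (existing === undefined) processOfSite.set(site, fresh);
    assignment.push(fresh);
  }
  return assignment;
}
```

With a limit of 2, the tabs a.com, b.com, a.com end up in processes 0, 1, 0: the third tab shares the first tab's process.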

Servicification in Chrome

Chrome is undergoing architecture changes to run each part of the browser program as a service, so that the parts can easily be split into different processes or aggregated into one.

The general idea is that when Chrome is running on powerful hardware, it may split each service into a different process for more stability, but on a resource-constrained device, Chrome consolidates services into one process to reduce the memory footprint.

Site Isolation

Site Isolation is a recently introduced feature in Chrome that runs a separate renderer process for each cross-site iframe.

Even running a simple Ctrl+F to find a word in a page means searching across different renderer processes.

Navigations Lifecycle

The UI thread handles address bar input from the user and other browser UI.
When the user hits Enter, the UI thread initiates a network call to get the site content. The network thread goes through the appropriate protocols, such as DNS lookup and establishing a TLS connection for the request.
1. At this point, the network thread may receive a server redirect header like HTTP 301. In that case, the network thread tells the UI thread that the server is requesting a redirect.
2. Security checks such as MIME type sniffing and the Cross-Origin Read Blocking (CORB) check happen to make sure sensitive cross-site data does not make it to the renderer process.
3. The network thread checks the domain against registered service worker scopes. If a service worker is registered for that URL, the UI thread finds a renderer process to execute the service worker code. The service worker may load data from cache, eliminating the need to request data from the network, or it may request new resources from the network.
The network thread tells the UI thread that the data is ready.
The UI thread then finds a renderer process to carry on rendering the web page.
1. An IPC is sent from the browser process to the renderer process to commit the navigation. It also passes on the data stream so the renderer process can keep receiving HTML data.
2. Once the browser process hears confirmation that the commit has happened in the renderer process, the navigation is complete and the document loading phase begins.
Once the navigation is committed, the renderer process carries on loading resources and renders the page.
Once the renderer process "finishes" rendering, it sends an IPC back to the browser process (this is after all the onload events have fired on all frames in the page and have finished executing). At this point, the UI thread stops the loading spinner on the tab.

Render Lifecycle

The renderer process is responsible for everything that happens inside of a tab. The renderer process's core job is to turn HTML, CSS, and JavaScript into a web page that the user can interact with.

Main thread: handles most of the code you send to the user.
Worker threads: handle parts of your JavaScript if you use a web worker or a service worker.
Compositor and raster threads: run inside the renderer process to render a page efficiently and smoothly.

Parsing

Construction of a DOM

When the renderer process receives a commit message for a navigation and starts to receive HTML data, the main thread begins to parse the text string (HTML) and turn it into a Document Object Model (DOM).

The DOM is the browser's internal representation of the page, as well as the data structure and API that web developers interact with via JavaScript.

Parsing an HTML document into a DOM is defined by the HTML Standard. Feeding HTML to a browser never throws an error because the HTML specification is designed to handle errors gracefully.

Subresource loading

A website usually uses external resources like images, CSS, and JavaScript. Those files need to be loaded from the network or the cache. The main thread could request them one by one as it finds them while parsing to build the DOM, but to speed things up, the "preload scanner" runs concurrently. If there are things like <img> or <link> in the HTML document, the preload scanner peeks at the tokens generated by the HTML parser and sends requests to the network thread in the browser process.

When the HTML parser finds a <script> tag, it pauses parsing of the HTML document and has to load, parse, and execute the JavaScript code, because JavaScript can change the shape of the document using things like document.write(), which changes the entire DOM structure.

Resource Hints

If the JavaScript does not use document.write(), the async or defer attribute can be added to the <script> tag so that the file is downloaded in parallel without blocking parsing.

async

async scripts are executed as soon as they are downloaded, so the order of execution is not guaranteed, and execution may block parsing if the parser has not finished when a script finishes downloading.

defer

defer scripts are executed after parsing is completed, so neither downloading nor executing them blocks the parser. Each defer script also waits for the defer scripts before it, which guarantees that they execute in the order in which they appear in the page.
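The ordering rules for async and defer can be sketched as a small timeline simulation. It is a simplified model, not browser code: executionOrder, LoadedScript, and the millisecond timestamps are hypothetical, and script execution is assumed to take no time.

```typescript
type LoadedScript = { name: string; mode: "async" | "defer"; downloadedAt: number };

// Returns script names in the order they would execute under the simplified model.
function executionOrder(scripts: LoadedScript[], parseEndsAt: number): string[] {
  const timeline: { name: string; at: number; seq: number }[] = [];
  let earliestDefer = parseEndsAt; // defer scripts never run before parsing ends

  scripts.forEach((script, seq) => {
    if (script.mode === "async") {
      // async: runs as soon as its own download completes.
      timeline.push({ name: script.name, at: script.downloadedAt, seq });
    } else {
      // defer: waits for parsing, its own download, and every earlier defer script.
      earliestDefer = Math.max(earliestDefer, script.downloadedAt);
      timeline.push({ name: script.name, at: earliestDefer, seq });
    }
  });

  // Sort by time; ties fall back to document order.
  timeline.sort((a, b) => a.at - b.at || a.seq - b.seq);
  return timeline.map((entry) => entry.name);
}
```

For example, with parsing ending at 100 ms, a defer script downloaded at 50 ms still runs after an async script downloaded at 10 ms, and two defer scripts keep their document order regardless of which finished downloading first.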

dns-prefetch: indicates to the browser that it should perform the resolution of a given domain name (determining the IP to contact) before that domain is used to download resources.

preconnect: indicates to the browser that it should connect to a given origin before that origin is used to download resources. Like dns-prefetch, preconnecting involves the DNS resolution, but also the TCP handshake and the TLS negotiation (if the page is secure).

preload: tells the browser that the resource is needed for current navigation and it must be downloaded as soon as possible, with high priority.

prefetch: hints to the browser that the user is likely to need the target resource for future navigations. The resource is downloaded with a low priority.
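These four hints are written as <link> elements. The helper below is a minimal sketch that builds the markup as a string; hintTag and the example URLs are hypothetical, and it assumes href values are trusted (no HTML escaping is done). preload additionally takes an "as" attribute naming the resource type.

```typescript
type Hint =
  | { rel: "dns-prefetch" | "preconnect" | "prefetch"; href: string }
  | { rel: "preload"; href: string; as: string };

// Builds the <link> tag for one resource hint.
function hintTag(hint: Hint): string {
  // Only preload carries the resource type.
  const asAttr = hint.rel === "preload" ? ` as="${hint.as}"` : "";
  return `<link rel="${hint.rel}" href="${hint.href}"${asAttr}>`;
}

// Example hints, ordered from cheapest to most eager.
const exampleHints: string[] = [
  hintTag({ rel: "dns-prefetch", href: "https://cdn.example.com" }),
  hintTag({ rel: "preconnect", href: "https://cdn.example.com" }),
  hintTag({ rel: "preload", href: "/app.js", as: "script" }),
  hintTag({ rel: "prefetch", href: "/next-page.html" }),
];
```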

Styles

The main thread parses CSS and determines the computed style for each DOM node.

Layout Tree

Layout is the process of finding the geometry of elements. The main thread walks through the DOM and the computed styles and creates the layout tree, which has information like x and y coordinates and bounding box sizes. The layout tree may have a similar structure to the DOM tree, but it only contains information related to what is visible on the page.

If display: none is applied, that element is not part of the layout tree (however, an element with visibility: hidden is in the layout tree). Similarly, if a pseudo-element with content like p::before{content:"Hi!"} is applied, it is included in the layout tree even though it is not in the DOM.
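The inclusion rules above can be sketched as a recursive walk. This is a toy model, not how a real engine stores styles: DomNode, LayoutNode, and the "before" field (standing in for ::before content) are hypothetical.

```typescript
type DomNode = {
  tag: string;
  // Toy computed style; "before" stands in for the content of a ::before pseudo-element.
  style: { display?: string; visibility?: string; before?: string };
  children: DomNode[];
};

type LayoutNode = { box: string; children: LayoutNode[] };

function buildLayoutTree(node: DomNode): LayoutNode | null {
  // display: none removes the element and its whole subtree from layout.
  if (node.style.display === "none") return null;

  const children: LayoutNode[] = [];
  // Generated content gets a box even though it has no DOM node.
  if (node.style.before !== undefined) {
    children.push({ box: `${node.tag}::before("${node.style.before}")`, children: [] });
  }
  for (const child of node.children) {
    const box = buildLayoutTree(child);
    if (box !== null) children.push(box);
  }
  // visibility: hidden needs no special case: the box still takes up space.
  return { box: node.tag, children };
}
```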

Determining the layout of a page is a challenging task. Even the simplest page layout, like a block flow from top to bottom, has to consider how big the font is and where to break lines, because those affect the size and shape of a paragraph, which in turn affects where the following paragraph needs to be.

CSS can make an element float to one side, mask overflowing items, and change writing direction. As one can imagine, the layout stage has a mighty task.

Paint

This process determines the order of painting. For example, z-index might be set for certain elements; in that case, painting in the order in which elements are written in the HTML results in incorrect rendering.

At this paint step, the main thread walks the layout tree to create paint records. A paint record is a note of the painting process, like "background first, then text, then rectangle".
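A rough sketch of why document order alone is not enough: sort boxes by z-index and fall back to document order on ties. Real paint ordering also involves stacking contexts and several paint phases, which this ignores; PaintBox and paintOrder are hypothetical names.

```typescript
type PaintBox = { name: string; zIndex: number };

// Returns box names back to front: lower z-index first, document order on ties.
function paintOrder(boxesInDocumentOrder: PaintBox[]): string[] {
  return boxesInDocumentOrder
    .map((box, index) => ({ box, index }))
    .sort((a, b) => a.box.zIndex - b.box.zIndex || a.index - b.index)
    .map((entry) => entry.box.name);
}
```

A header with z-index 10 that comes first in the HTML is painted last, so it ends up on top of the content that follows it in the document.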

The most important thing to grasp in rendering pipeline is that at each step the result of the previous operation is used to create new data.

Even if your rendering operations are keeping up with screen refresh, these calculations are running on the main thread, which means it could be blocked when your application is running JavaScript.

A long JavaScript operation can be divided into small chunks that are scheduled to run on every frame using requestAnimationFrame().
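A minimal sketch of chunking work across frames. The scheduler is passed in as a parameter so the sketch also runs outside a browser; in a page it would be something like cb => requestAnimationFrame(cb). runInChunks and its parameters are hypothetical names.

```typescript
type Schedule = (callback: () => void) => void;

// Processes `items` in slices of `chunkSize`, yielding to the scheduler between slices.
function runInChunks<T>(
  items: T[],
  chunkSize: number,
  work: (item: T) => void,
  schedule: Schedule
): void {
  if (chunkSize < 1) throw new RangeError("chunkSize must be at least 1");
  let next = 0;

  const step = (): void => {
    const end = Math.min(next + chunkSize, items.length);
    items.slice(next, end).forEach((item) => work(item));
    next = end;
    // More work left: ask for another frame instead of blocking this one.
    if (next < items.length) schedule(step);
  };

  schedule(step);
}
```

Between two chunks the main thread is free to run style, layout, and paint, so the page can keep producing frames while the work continues.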

Turning this information into pixels on the screen is called rasterizing. A naive way to handle this would be to raster only the parts inside the viewport. If the user scrolls the page, the rastered frame is moved and the missing parts are filled in by rastering more.

Compositing

However, modern browsers run a more sophisticated process called compositing. Compositing is a technique that separates parts of a page into layers, rasterizes them separately, and composites them into a page in a separate thread called the compositor thread. If a scroll happens, the layers are already rasterized, so all the browser has to do is composite a new frame. Animation can be achieved in the same way, by moving layers and compositing a new frame.

To find out which elements need to be in which layers, the main thread walks through the layout tree to create the layer tree. If a part of a page that should be a separate layer (like a slide-in side menu) is not getting one, you can hint to the browser by using the will-change property in CSS.

One might be tempted to give a layer to every element, but compositing across an excessive number of layers could be slower than rasterizing small parts of a page every frame, so it is crucial to measure the rendering performance of your application.

Once the layer tree is created and paint orders are determined, the main thread commits that information to the compositor thread. The compositor thread then rasterizes each layer. A layer could be large like the entire length of a page, so the compositor thread divides them into tiles and sends each tile off to raster threads. Raster threads rasterize each tile and store them in GPU memory.

The compositor thread can prioritize different raster threads so that things within the viewport (or nearby) can be rastered first. A layer also has multiple tilings for different resolutions to handle things like zoom-in action.
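The tiling and prioritization steps can be sketched as two pure functions. This is an illustration under simplifying assumptions, not the compositor's real algorithm: only vertical scrolling is considered, and Tile, tileLayer, and rasterOrder are hypothetical names.

```typescript
type Tile = { x: number; y: number; width: number; height: number };

// Splits a layer into tiles of at most tileSize x tileSize pixels.
function tileLayer(layerWidth: number, layerHeight: number, tileSize: number): Tile[] {
  if (tileSize < 1) throw new RangeError("tileSize must be at least 1");
  const tiles: Tile[] = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return tiles;
}

// Vertical distance between a tile and the viewport; 0 when they overlap.
function distanceToViewport(tile: Tile, viewportTop: number, viewportHeight: number): number {
  const viewportBottom = viewportTop + viewportHeight;
  const tileBottom = tile.y + tile.height;
  if (tileBottom <= viewportTop) return viewportTop - tileBottom;
  if (tile.y >= viewportBottom) return tile.y - viewportBottom;
  return 0;
}

// Orders tiles so that those in or near the viewport are rastered first.
function rasterOrder(tiles: Tile[], viewportTop: number, viewportHeight: number): Tile[] {
  return tiles
    .map((tile, index) => ({ tile, index, distance: distanceToViewport(tile, viewportTop, viewportHeight) }))
    .sort((a, b) => a.distance - b.distance || a.index - b.index)
    .map((entry) => entry.tile);
}
```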

Once the tiles are rastered, the compositor thread gathers tile information called draw quads to create a compositor frame.

Draw quads: Contain information such as the tile's location in memory and where in the page to draw the tile, taking the page compositing into consideration.
Compositor frame: A collection of draw quads that represents a frame of a page.

A compositor frame is then submitted to the browser process via IPC. At this point, another compositor frame could be added from the UI thread for a browser UI change or from other renderer processes for extensions. These compositor frames are sent to the GPU to be displayed on the screen.

The benefit of compositing is that it is done without involving the main thread. The compositor thread does not need to wait on style calculation or JavaScript execution. This is why compositing-only animations are considered the best for smooth performance. If layout or paint needs to be calculated again, then the main thread has to be involved.

Input Events

When a user gesture like a touch on the screen occurs, the browser process is the one that receives the gesture first, and it sends the event type and its coordinates to the renderer process. The renderer process handles the event by finding the event target and running the event listeners that are attached.

Non-fast scrollable region

If no input event listeners are attached to the page, the compositor thread can create a new compositor frame completely independently of the main thread.

The compositor thread marks a region of the page that has event handlers attached as a 'Non-Fast Scrollable Region'. With this information, the compositor thread can make sure to send an input event to the main thread if the event occurs in that region. If an input event comes from outside of this region, the compositor thread carries on compositing a new frame without waiting for the main thread.
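The routing decision can be sketched as a point-in-rectangle test. The types and the routeInput function are hypothetical; a real compositor tracks regions per layer rather than as a flat list of rectangles.

```typescript
type Rect = { left: number; top: number; width: number; height: number };
type Route = "main-thread" | "compositor-only";

// Decides whether an input event at (x, y) must wait for the main thread.
function routeInput(x: number, y: number, nonFastRegions: Rect[]): Route {
  const hitsRegion = nonFastRegions.some(
    (r) => x >= r.left && x < r.left + r.width && y >= r.top && y < r.top + r.height
  );
  // Inside a region with handlers: the main thread has to run them first.
  return hitsRegion ? "main-thread" : "compositor-only";
}
```

If a handler is attached to the topmost element, the single region covers the whole page and every event takes the "main-thread" route.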

Be careful with event delegation, which attaches a handler to the topmost element and lets events bubble up from the event target. It causes the entire page to be marked as a non-fast scrollable region.

{ passive: true } hints to the browser that you still want to listen to the event in the main thread, but the compositor can go ahead and composite a new frame as well.
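A minimal sketch of registering a passive listener. ListenerTarget is a hypothetical structural type that mirrors only the part of addEventListener used here, so the sketch does not depend on DOM typings; in a page, the target would be document.body or another element.

```typescript
// Minimal shape of an object that accepts listeners (a DOM element fits this shape).
interface ListenerTarget {
  addEventListener(
    type: string,
    listener: (event: unknown) => void,
    options?: { passive?: boolean }
  ): void;
}

// Registers a listener that promises not to call preventDefault(),
// so scrolling does not have to wait for it.
function listenPassively(
  target: ListenerTarget,
  type: string,
  listener: (event: unknown) => void
): void {
  target.addEventListener(type, listener, { passive: true });
}
```

A passive listener still runs on the main thread; it simply cannot cancel the scroll, which is what lets the compositor proceed without waiting.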

Coalescing Events

When the compositor thread sends an input event to the main thread, the first thing to run is a hit test to find the event target. Hit test uses paint records data that was generated in the rendering process to find out what is underneath the point coordinates in which the event occurred.

Input devices can deliver events at a higher frequency than the screen refreshes.

If a continuous event like touchmove were sent to the main thread 120 times a second, it could trigger an excessive number of hit tests and JavaScript executions compared to how often the screen can refresh.

To minimize excessive calls to the main thread, Chrome coalesces continuous events (such as wheel, mousewheel, mousemove, pointermove, and touchmove) and delays dispatching until right before the next requestAnimationFrame.

Any discrete events like keydown, keyup, mouseup, mousedown, touchstart, and touchend are dispatched immediately.
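The coalescing rule can be sketched as a pass over the events queued during one frame. This is a simplified model rather than Chrome's implementation: RawInput, Dispatch, and coalesceForFrame are hypothetical, and only consecutive events of the same continuous type are merged so that ordering around discrete events is preserved.

```typescript
const CONTINUOUS_TYPES = ["wheel", "mousewheel", "mousemove", "pointermove", "touchmove"];

type RawInput = { type: string; x: number; y: number };
// One dispatched event plus the raw events that were merged into it.
type Dispatch = { event: RawInput; coalesced: RawInput[] };

function coalesceForFrame(pending: RawInput[]): Dispatch[] {
  const dispatches: Dispatch[] = [];
  for (const event of pending) {
    const last = dispatches[dispatches.length - 1];
    const continuous = CONTINUOUS_TYPES.indexOf(event.type) !== -1;
    if (continuous && last !== undefined && last.event.type === event.type) {
      // Same continuous type as the previous entry: keep only the newest position,
      // but remember the intermediate event.
      last.coalesced.push(event);
      last.event = event;
    } else {
      // Discrete events (and the first of a continuous run) start a new entry.
      dispatches.push({ event, coalesced: continuous ? [event] : [] });
    }
  }
  return dispatches;
}
```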

For most web applications, coalesced events should be enough to provide a good user experience. However, if you are building something like a drawing application that puts down a path based on touchmove coordinates, you may lose the in-between coordinates needed to draw a smooth line. In that case, you can call the getCoalescedEvents() method on the pointer event to get information about those coalesced events.
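A minimal sketch of recovering the in-between points in a pointermove handler. PointerSample is a hypothetical structural type covering only the fields used here (a real PointerEvent fits it), and getCoalescedEvents is treated as optional because not every environment provides it.

```typescript
interface PointerSample {
  clientX: number;
  clientY: number;
  getCoalescedEvents?: () => PointerSample[];
}

// Returns every point to add to the path for one dispatched pointer event.
function pathPoints(event: PointerSample): Array<[number, number]> {
  const coalesced = event.getCoalescedEvents ? event.getCoalescedEvents() : [];
  // Fall back to the event itself when nothing was coalesced.
  const samples = coalesced.length > 0 ? coalesced : [event];
  return samples.map((sample): [number, number] => [sample.clientX, sample.clientY]);
}
```

In a drawing application, a pointermove listener would append pathPoints(event) to the current stroke instead of only the event's own coordinates.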

Feature Policy

Feature Policy is a web platform feature that can act as a guardrail while you are building your project. Turning on a feature policy guarantees certain behavior in your app and prevents you from making mistakes.

The HTTP Feature-Policy header provides a mechanism to allow and deny the use of browser features in its own frame, and in content within any <iframe> elements in the document.
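The header value is a semicolon-separated list of directives, each a feature name followed by an allowlist. The helper below is a sketch that assembles such a value; featurePolicyHeader is a hypothetical name, the origin is a placeholder, and mapping an empty allowlist to 'none' is a convention chosen for this example.

```typescript
// Maps each feature name to its allowlist, e.g. ["'self'", "https://example.com"].
function featurePolicyHeader(policy: Record<string, string[]>): string {
  return Object.keys(policy)
    .map((feature) => {
      const allowlist = policy[feature];
      // An empty allowlist is written as 'none', which disables the feature.
      return `${feature} ${allowlist.length > 0 ? allowlist.join(" ") : "'none'"}`;
    })
    .join("; ");
}

// Value to send as: Feature-Policy: <value>
const exampleHeaderValue = featurePolicyHeader({
  geolocation: ["'self'", "https://example.com"],
  camera: [],
});
```

Here geolocation is allowed for the page's own origin and for https://example.com, while camera is disabled in the document and in any iframes it embeds.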