
How Does the Browser Parse HTML?

Answer

When you load a webpage, the browser turns the HTML text it receives into pixels on screen through a sequence of steps: parsing, style calculation, layout, paint, and compositing. This sequence is called the Critical Rendering Path.

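Parsing Flow

Bytes → Characters → Tokens → DOM Tree; DOM Tree + CSSOM → Render Tree → Layout → Paint → Composite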

Step-by-Step Process

1. Bytes → Characters → Tokens

Bytes: 3C 68 74 6D 6C 3E...

Characters: <html><head>...

Tokens: [StartTag: html] [StartTag: head] [EndTag: head]...
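The conversion above can be sketched in a few lines of JavaScript. This is a toy illustration, not the tokenizer state machine that browsers implement: `tokenize` is a hypothetical helper, UTF-8 input is assumed, and attributes, comments, and doctypes are ignored.

```javascript
// Toy tokenizer (hypothetical helper, not the real HTML tokenizer).
// Assumes UTF-8 input and ignores attributes, comments, and doctypes.
function tokenize(bytes) {
  // Bytes -> characters
  const html = new TextDecoder("utf-8").decode(bytes);

  // Characters -> tokens: end tag | start tag | run of text
  const pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)\s*>|<([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;
  const tokens = [];
  for (const match of html.matchAll(pattern)) {
    if (match[1] !== undefined) {
      tokens.push({ type: "EndTag", name: match[1].toLowerCase() });
    } else if (match[2] !== undefined) {
      tokens.push({ type: "StartTag", name: match[2].toLowerCase() });
    } else if (match[3].trim() !== "") {
      tokens.push({ type: "Character", data: match[3] });
    }
  }
  return tokens;
}

// The byte sequence for "<html><head></head>"
const bytes = new Uint8Array([
  0x3c, 0x68, 0x74, 0x6d, 0x6c, 0x3e,       // <html>
  0x3c, 0x68, 0x65, 0x61, 0x64, 0x3e,       // <head>
  0x3c, 0x2f, 0x68, 0x65, 0x61, 0x64, 0x3e, // </head>
]);
const tokens = tokenize(bytes);
console.log(tokens.map((t) => `${t.type}: ${t.name}`).join(" | "));
// prints "StartTag: html | StartTag: head | EndTag: head"
```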

2. Tokens → DOM Tree
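Tree construction can be pictured as a loop over the token stream that keeps a stack of open elements: a start tag appends a node to the element on top of the stack and pushes it, and an end tag pops back to the matching element. The sketch below is a minimal model under those assumptions. `buildTree` and the plain-object node shape are hypothetical, and real parsers also handle void elements, implied tags, and error recovery.

```javascript
// Minimal tree builder (hypothetical; no void elements or error recovery).
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // stack of currently open elements

  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "StartTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === "EndTag") {
      // Close the nearest open element with this name, plus anything nested in it.
      const index = stack.map((n) => n.name).lastIndexOf(token.name);
      if (index > 0) stack.length = index;
    } else if (token.type === "Character") {
      parent.children.push({ name: "#text", data: token.data });
    }
  }
  return root;
}

const sampleTokens = [
  { type: "StartTag", name: "html" },
  { type: "StartTag", name: "head" },
  { type: "EndTag", name: "head" },
  { type: "StartTag", name: "body" },
  { type: "StartTag", name: "p" },
  { type: "Character", data: "Hi" },
  { type: "EndTag", name: "p" },
  { type: "EndTag", name: "body" },
  { type: "EndTag", name: "html" },
];
const documentNode = buildTree(sampleTokens);
console.log(JSON.stringify(documentNode, null, 2));
```

Because nodes are appended as soon as their start tag is seen, the tree is usable before the input ends, which is why the DOM can be built incrementally while the document is still downloading.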

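Parser Blocking

A classic <script> without async or defer pauses the HTML parser until the script has been downloaded and executed, because the script may modify the document while it is being parsed.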

Script Loading Strategies

<!-- Parser blocking: parsing pauses while the script downloads and runs (avoid in <head>) -->
<script src="app.js"></script>

<!-- Async: downloads in parallel, runs as soon as it arrives; execution order is not guaranteed -->
<script async src="analytics.js"></script>

<!-- Defer: downloads in parallel, runs in document order after parsing finishes, before DOMContentLoaded -->
<script defer src="app.js"></script>
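To see how the three strategies affect execution order, here is a deliberately simplified timeline model. It is a sketch under loud assumptions, not a description of any real browser: a download starts only when the parser reaches the tag (no preload scanner), script execution takes no time, and `simulateScripts` and all file names are hypothetical.

```javascript
// Simplified timeline model. Assumptions: downloads start when the parser
// reaches the tag, and executing a script takes zero time.
function simulateScripts(items) {
  let clock = 0;       // parser's current time in ms
  const runs = [];     // { name, at } for every executed script
  const deferred = []; // defer scripts wait until parsing is done

  for (const item of items) {
    if (item.kind === "html") {
      clock += item.parseMs; // parsing plain markup
      continue;
    }
    const readyAt = clock + item.downloadMs;
    if (item.mode === "blocking") {
      clock = readyAt; // the parser waits for the download
      runs.push({ name: item.name, at: clock });
    } else if (item.mode === "async") {
      runs.push({ name: item.name, at: readyAt }); // runs whenever it arrives
    } else {
      deferred.push({ name: item.name, readyAt });
    }
  }

  const parseDoneAt = clock;
  // Deferred scripts run after parsing, in document order.
  for (const script of deferred) {
    clock = Math.max(clock, script.readyAt);
    runs.push({ name: script.name, at: clock });
  }
  runs.sort((a, b) => a.at - b.at); // stable sort keeps document order on ties
  return { parseDoneAt, order: runs.map((r) => r.name) };
}

const page = [
  { kind: "html", parseMs: 10 },
  { kind: "script", name: "blocking.js", mode: "blocking", downloadMs: 50 },
  { kind: "html", parseMs: 10 },
  { kind: "script", name: "slow-async.js", mode: "async", downloadMs: 200 },
  { kind: "script", name: "fast-async.js", mode: "async", downloadMs: 5 },
  { kind: "script", name: "defer-1.js", mode: "defer", downloadMs: 100 },
  { kind: "script", name: "defer-2.js", mode: "defer", downloadMs: 1 },
  { kind: "html", parseMs: 30 },
];
const timeline = simulateScripts(page);
console.log(timeline.parseDoneAt, timeline.order);
```

In this model the blocking script adds its full download time to the parse, the two async scripts run in arrival order rather than document order, and `defer-2.js` waits for `defer-1.js` even though it finished downloading first.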

Render Tree Construction

DOM Tree + CSSOM = Render Tree

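Note: the render tree contains only elements that are actually rendered
- <head> is not included
- display: none elements (and their descendants) are not included
- visibility: hidden elements ARE included (invisible, but they still take up space)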
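The filtering rules above can be sketched as a recursive walk over a styled tree. This is an illustrative model only: `buildRenderTree` is a hypothetical function, nodes are plain objects, and each node's `style` is assumed to already hold its computed values (inheritance is not modeled).

```javascript
// Hypothetical sketch: keep only nodes that would generate a box.
// Each node's `style` is assumed to be its already-computed style.
function buildRenderTree(node) {
  const style = node.style ?? {};
  // <head> and display: none subtrees are dropped entirely.
  if (node.name === "head" || style.display === "none") return null;

  const children = (node.children ?? [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);

  // visibility: hidden nodes stay in the tree; they are just not painted.
  return { name: node.name, visibility: style.visibility ?? "visible", children };
}

const styledDom = {
  name: "html",
  children: [
    { name: "head", children: [{ name: "title" }] },
    {
      name: "body",
      children: [
        { name: "div", style: { display: "none" }, children: [{ name: "span" }] },
        { name: "p", style: { visibility: "hidden" } },
        { name: "h1" },
      ],
    },
  ],
};
const renderTree = buildRenderTree(styledDom);
console.log(renderTree.children[0].children.map((c) => c.name)); // [ 'p', 'h1' ]
```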

Layout (Reflow)

The browser calculates:

  • Position of each element
  • Size (width, height)
  • Geometry relationships
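
// Reading geometry forces a synchronous layout if styles or the DOM changed since the last one
element.offsetWidth;
element.getBoundingClientRect();

// Writing a geometry-affecting style invalidates layout; the reflow runs on the next read or frame
element.style.width = "100px";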
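As a rough picture of what "calculating geometry" means, the sketch below lays out boxes the way simple block elements stack: each child fills the available width unless it has its own, children are placed one below another, and a parent's height is the sum of its children's heights. It is a toy model, not a CSS layout algorithm; `layoutBlock` is hypothetical, it mutates the boxes it is given, and margins, padding, and inline content are ignored.

```javascript
// Toy block layout (hypothetical): positions children top to bottom.
// Mutates each box, adding x, y, width, and height.
function layoutBlock(box, x, y, availableWidth) {
  box.x = x;
  box.y = y;
  box.width = box.width ?? availableWidth; // fill the container by default

  let cursorY = y;
  for (const child of box.children ?? []) {
    layoutBlock(child, x, cursorY, box.width);
    cursorY += child.height; // the next sibling starts below this one
  }

  // Without an explicit height, a box is as tall as its content.
  box.height = box.height ?? cursorY - y;
  return box;
}

const rootBox = {
  children: [
    { height: 50 },
    { height: 30, width: 200 },
    { children: [{ height: 10 }, { height: 20 }] },
  ],
};
layoutBlock(rootBox, 0, 0, 800);
console.log(rootBox.height, rootBox.children.map((c) => c.y)); // 110 [ 0, 50, 80 ]
```

The model also hints at why reflow is expensive: changing the height of one box moves every sibling after it and can change the height of every ancestor.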

Paint & Composite

Paint: Fill in pixels
- Colors
- Borders
- Shadows
- Text

Composite: Assemble painted layers into the final frame
- Stacking order (z-index)
- Transform layers (often GPU accelerated)
- Opacity layers

Optimization Tips

<!-- 1. CSS in <head>; JS with defer or at the end of <body> -->
<head>
<link rel="stylesheet" href="styles.css" />
</head>
<body>
<!-- Content -->
<script defer src="app.js"></script>
</body>

<!-- 2. Preload critical resources (font preloads need crossorigin, even for same-origin files) -->
<link rel="preload" href="font.woff2" as="font" type="font/woff2" crossorigin />

<!-- 3. Avoid layout thrashing -->
<script>
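// Assumes `items` (a list of elements) and `container` (an element) were selected earlier.

// ❌ Bad: each iteration reads layout right after a write, forcing a reflow per item
for (let i = 0; i < items.length; i++) {
  items[i].style.width = container.offsetWidth + "px";
}

// ✅ Good: read once, then do all the writes
const width = container.offsetWidth;
for (let i = 0; i < items.length; i++) {
  items[i].style.width = width + "px";
}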
</script>
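When reads and writes come from different parts of the code, the same "batch reads, then writes" idea can be applied with a small scheduler. The sketch below is one possible shape, not an established API: `createBatcher`, `measure`, and `mutate` are hypothetical names, and the scheduling function is passed in so that a browser can supply `requestAnimationFrame` while the demo here uses a manual queue.

```javascript
// Hypothetical read/write batcher: all queued reads run before all queued writes.
function createBatcher(schedule) {
  const reads = [];
  const writes = [];
  let scheduled = false;

  function flush() {
    scheduled = false;
    // Take snapshots so callbacks queued during the flush go to the next batch.
    const pendingReads = reads.splice(0);
    const pendingWrites = writes.splice(0);
    pendingReads.forEach((fn) => fn());  // layout reads first
    pendingWrites.forEach((fn) => fn()); // then style writes
  }

  function requestFlush() {
    if (!scheduled) {
      scheduled = true;
      schedule(flush);
    }
  }

  return {
    measure(fn) { reads.push(fn); requestFlush(); },
    mutate(fn) { writes.push(fn); requestFlush(); },
  };
}

// Demo with a manual queue. In a browser this could instead be:
//   const batcher = createBatcher((cb) => requestAnimationFrame(cb));
const frameQueue = [];
const batcher = createBatcher((cb) => frameQueue.push(cb));
const callLog = [];

batcher.mutate(() => callLog.push("write 1"));
batcher.measure(() => callLog.push("read 1"));
batcher.mutate(() => callLog.push("write 2"));
batcher.measure(() => callLog.push("read 2"));

frameQueue.shift()(); // simulate the next frame
console.log(callLog); // [ 'read 1', 'read 2', 'write 1', 'write 2' ]
```

Whatever order the calls were queued in, the reads see one consistent layout and the writes invalidate it only once per frame.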

Key Points

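  • HTML parsing builds the DOM tree incrementally, as bytes arrive
  • CSS blocks rendering, not parsing (but scripts wait for stylesheets that precede them)
  • Scripts block parsing unless they use async or defer
  • Render tree = visible DOM nodes + their computed styles from the CSSOM
  • Layout calculates the position and size of each box
  • Minimize forced reflows by batching layout reads and writes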