How Does the Browser Parse HTML?
Answer
When you load a webpage, the browser converts the HTML bytes it receives into pixels on screen through a fixed sequence of steps: parsing, render tree construction, layout, paint, and compositing. This sequence is called the Critical Rendering Path.
Parsing Flow
Step-by-Step Process
1. Bytes → Characters → Tokens
Bytes: 3C 68 74 6D 6C 3E...
↓
Characters: <html><head>...
↓
Tokens: [StartTag: html] [StartTag: head] [EndTag: head]...
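The flow above can be sketched in a few lines of JavaScript. This is a toy illustration, not how a browser implements it: it assumes the bytes are UTF-8, and `tokenize` is a hypothetical helper that only understands attribute-free tags and plain text. Real tokenizers follow the state machine defined in the HTML Standard.

```javascript
// Step 1: bytes -> characters (assumption: the document is UTF-8 encoded).
const bytes = new Uint8Array([0x3c, 0x68, 0x74, 0x6d, 0x6c, 0x3e]);
const characters = new TextDecoder("utf-8").decode(bytes);

// Step 2: characters -> tokens.
// Hypothetical helper: handles only <tag>, </tag>, and text. It ignores
// attributes, comments, character references, and malformed markup.
function tokenize(html) {
  const tokens = [];
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)/g;
  for (const match of html.matchAll(pattern)) {
    const [, slash, tagName, text] = match;
    if (tagName !== undefined) {
      tokens.push({ type: slash ? "EndTag" : "StartTag", name: tagName.toLowerCase() });
    } else if (text.trim() !== "") {
      tokens.push({ type: "Character", data: text });
    }
  }
  return tokens;
}
```

Under these assumptions, `characters` holds the string `<html>`, and `tokenize("<html><head></head>")` yields a StartTag token for html, a StartTag token for head, and an EndTag token for head, matching the token list shown above.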
2. Tokens → DOM Tree
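As a rough sketch of this step, the parser keeps a stack of open elements: a start tag appends a new node to the element on top of the stack and pushes it, and a matching end tag pops it. The `buildTree` helper and the token shape below are assumptions made for illustration. A real tree builder also handles implied tags, void elements, and error recovery.

```javascript
// Hypothetical helper: builds a plain-object tree from a list of tokens
// using a stack of open elements. No error recovery or implied tags.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const openElements = [root];
  for (const token of tokens) {
    const parent = openElements[openElements.length - 1];
    if (token.type === "StartTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element);
      openElements.push(element);
    } else if (token.type === "EndTag") {
      // Pop only when the end tag matches the innermost open element.
      if (openElements.length > 1 && parent.name === token.name) {
        openElements.pop();
      }
    } else if (token.type === "Character") {
      parent.children.push({ name: "#text", data: token.data });
    }
  }
  return root;
}

const tree = buildTree([
  { type: "StartTag", name: "html" },
  { type: "StartTag", name: "body" },
  { type: "StartTag", name: "p" },
  { type: "Character", data: "Hi" },
  { type: "EndTag", name: "p" },
  { type: "EndTag", name: "body" },
  { type: "EndTag", name: "html" },
]);
```

Because nodes are attached as soon as their start tag is seen, the tree grows incrementally while the rest of the document is still arriving.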
Parser Blocking
When the parser reaches a <script> tag without async or defer, it pauses until the script has been downloaded and executed, because the script may modify the document that is still being parsed.
Script Loading Strategies
<!-- Parser-blocking: parsing pauses while the script downloads and executes (avoid in <head>) -->
<script src="app.js"></script>
<!-- Async: downloads in parallel, executes as soon as it arrives (order not guaranteed) -->
<script async src="analytics.js"></script>
<!-- Defer: downloads in parallel, executes in document order after parsing finishes -->
<script defer src="app.js"></script>
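To make the ordering rules concrete, here is a simplified timing model. Everything in it is an assumption for illustration: `simulateExecutionOrder`, the `parseCost` and `downloadTime` fields, and the time units are made up, execution itself takes no time, and browser optimizations such as speculative preloading are ignored.

```javascript
// Simplified model of when scripts run. Each script has:
//   mode:         "classic" | "async" | "defer"
//   parseCost:    time spent parsing the HTML that precedes its tag
//   downloadTime: time needed to fetch the script
function simulateExecutionOrder(scripts, trailingParseCost = 0) {
  let clock = 0;
  const events = [];   // { name, time } for scripts whose run time is known
  const deferred = []; // defer scripts, kept in document order

  for (const script of scripts) {
    clock += script.parseCost; // the parser reaches this <script> tag
    if (script.mode === "classic") {
      clock += script.downloadTime; // the parser is paused while it downloads
      events.push({ name: script.name, time: clock });
    } else if (script.mode === "async") {
      // Runs as soon as the download finishes, wherever the parser is.
      events.push({ name: script.name, time: clock + script.downloadTime });
    } else {
      deferred.push({ name: script.name, readyAt: clock + script.downloadTime });
    }
  }

  // Deferred scripts run in document order once parsing has finished.
  let deferClock = clock + trailingParseCost;
  for (const script of deferred) {
    deferClock = Math.max(deferClock, script.readyAt);
    events.push({ name: script.name, time: deferClock });
  }

  return events.sort((a, b) => a.time - b.time).map((event) => event.name);
}

const order = simulateExecutionOrder(
  [
    { name: "a-defer", mode: "defer", parseCost: 1, downloadTime: 1 },
    { name: "b-async", mode: "async", parseCost: 1, downloadTime: 10 },
    { name: "c-classic", mode: "classic", parseCost: 1, downloadTime: 2 },
    { name: "d-async", mode: "async", parseCost: 1, downloadTime: 1 },
  ],
  3
);
```

In this model the deferred script waits for parsing to finish even though it downloaded first, and the two async scripts run in download order rather than document order.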
Render Tree Construction
DOM Tree + CSSOM = Render Tree
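Note: The render tree contains only elements that generate boxes on the page
- <head> and its children are not included
- display: none elements (and their descendants) are not included
- visibility: hidden elements ARE included (invisible, but they still take up space)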
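The filtering rules above can be sketched as a recursive walk over a toy DOM. The node shape (`tag`, `style`, `children`) and the `buildRenderTree` helper are assumptions for illustration. A real browser works from computed styles, where elements such as <head> are excluded because the default stylesheet gives them display: none.

```javascript
// Tags assumed never to produce a box (simplification of the browser's default styles).
const NON_RENDERED_TAGS = new Set(["head", "script", "style", "meta", "link", "title"]);

// Hypothetical helper: returns a render-tree node, or null if the subtree is skipped.
function buildRenderTree(node) {
  const style = node.style ?? {};
  if (NON_RENDERED_TAGS.has(node.tag) || style.display === "none") {
    return null; // the element and all of its descendants are left out
  }
  const children = (node.children ?? [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  // visibility: hidden nodes stay in the tree because they still occupy space.
  return { tag: node.tag, visibility: style.visibility ?? "visible", children };
}

const dom = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "title" }] },
    {
      tag: "body",
      children: [
        { tag: "div", style: { display: "none" }, children: [{ tag: "span" }] },
        { tag: "p", style: { visibility: "hidden" } },
        { tag: "h1" },
      ],
    },
  ],
};

const renderTree = buildRenderTree(dom);
```

Here `renderTree` keeps html, body, the hidden p, and h1, while head and the display: none div are dropped along with their children.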
Layout (Reflow)
The browser calculates:
- Position of each element
- Size (width, height)
- Geometry relationships
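// Reading geometry forces a synchronous layout if styles changed since the last one (expensive)
element.offsetWidth;
element.getBoundingClientRect();
// Writing a geometry-affecting style invalidates layout; the next read or frame recomputes it
element.style.width = "100px";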
Paint & Composite
Paint: Fill in pixels
- Colors
- Borders
- Shadows
- Text
Composite: Combine painted layers in the correct order
- Stacking order (z-index)
- Transform layers (can be handled on the GPU without layout or repaint)
- Opacity layers
Optimization Tips
<!-- 1. CSS in head, JS at end or defer -->
<head>
<link rel="stylesheet" href="styles.css" />
</head>
<body>
<!-- Content -->
<script defer src="app.js"></script>
</body>
<!-- 2. Preload critical resources (font preloads need crossorigin, even for same-origin files) -->
<link rel="preload" href="font.woff2" as="font" type="font/woff2" crossorigin />
<!-- 3. Avoid layout thrashing -->
<script>
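// Assumes `container` is an element and `items` is a list of elements defined elsewhere.
// ❌ Bad: reading offsetWidth after each write forces a layout on every iteration
for (let i = 0; i < items.length; i++) {
  items[i].style.width = container.offsetWidth + "px";
}
// ✅ Good: read once, then do all the writes
const width = container.offsetWidth;
for (let i = 0; i < items.length; i++) {
  items[i].style.width = width + "px";
}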
</script>
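When reads and writes are spread across many functions, the same batching idea can be applied with a small scheduler. The sketch below is one possible approach, not a standard API: `createDomBatcher`, `measure`, and `mutate` are hypothetical names, and the default scheduler assumes `requestAnimationFrame` is available in the page.

```javascript
// Hypothetical helper: queues DOM reads and writes, then runs all reads
// before all writes once per scheduled frame.
function createDomBatcher(schedule = (callback) => requestAnimationFrame(callback)) {
  const reads = [];
  const writes = [];
  let scheduled = false;

  function flush() {
    scheduled = false;
    // Take snapshots so tasks queued during the flush wait for the next frame.
    const pendingReads = reads.splice(0);
    const pendingWrites = writes.splice(0);
    pendingReads.forEach((task) => task());
    pendingWrites.forEach((task) => task());
  }

  function enqueue(queue, task) {
    queue.push(task);
    if (!scheduled) {
      scheduled = true;
      schedule(flush);
    }
  }

  return {
    measure: (task) => enqueue(reads, task), // for layout reads such as offsetWidth
    mutate: (task) => enqueue(writes, task), // for style or DOM writes
  };
}
```

A caller would wrap each read in `batcher.measure(() => { ... })` and each write in `batcher.mutate(() => { ... })`. Because every queued read runs before any queued write, layout is computed at most once per frame for the batched work.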
Key Points
- HTML parsing builds the DOM tree incrementally
- CSS blocks rendering, not parsing (but scripts wait for pending stylesheets before executing)
- Scripts block parsing (unless async/defer)
- Render tree = rendered DOM nodes + CSSOM (display: none is excluded)
- Layout calculates geometry
- Minimize reflows for performance