How Web Browsers Work

How web browsers work internally, from DNS lookup to page rendering.


Overview

Let’s explore how web browsers work internally, from the moment a user enters a URL until the page is displayed.

Browser Components

Modern web browsers consist of several main components:

  1. User Interface - Address bar, back/forward buttons, bookmarks, etc.
  2. Browser Engine - Marshals actions between the UI and the rendering engine
  3. Rendering Engine - Responsible for displaying requested content (parses HTML and CSS)
  4. Networking - Handles HTTP requests
  5. JavaScript Engine - Parses and executes JavaScript code (e.g., V8 in Chrome)
  6. UI Backend - Used for drawing basic widgets like combo boxes and windows
  7. Data Storage - Persistence layer (cookies, localStorage, IndexedDB, etc.)


Summary

  1. User accesses website through browser (www.a.com)
  2. Browser identifies server’s IP address through DNS
  3. Browser and server perform 3-Way Handshake
  4. Browser sends HTTP Request to server
  5. Server sends HTTP Response to browser
  6. Browser parses HTML to create DOM Tree
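  7. On encountering CSS (style tags or linked stylesheets), parses it into the CSSOM Tree; HTML parsing continues, but rendering waits for the CSSOM
  8. On encountering script tags without async/defer, pauses parsing and passes control to the JavaScript engine, which parses the code into an AST and executes it
  9. Creates Render Tree by combining DOM + CSSOM
  10. Steps 6 to 9 are together called Construction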
  11. Rendering engine performs Layout on Render Tree nodes
  12. UI backend draws UI by traversing Render Tree nodes (Painting)
  13. Finally, composes nodes in Render Tree in order (Composition)
  14. Steps 11 to 13 are together called Operation
  15. Displays final result to web user
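
Steps 2 to 5 of the summary can be sketched with Python's standard `socket` module. This is a minimal illustration over plain HTTP, not how a browser's network stack is actually built: the `toy-browser/0.1` User-Agent and the function names are made up for the example, TLS is left out, and the response body is returned as received (chunked transfer encoding is not decoded).

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Serialize a minimal HTTP/1.1 GET request (summary step 4)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "User-Agent: toy-browser/0.1",  # hypothetical client name
        "Accept: text/html",
        "Connection: close",
        "",  # the two empty entries produce the blank line
        "",  # that terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP response into status code, headers, and body (step 5)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(host: str, path: str = "/", port: int = 80):
    """Run steps 2 to 5 against a real server; needs network access."""
    # Step 2: DNS lookup through the operating system's resolver.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    # Step 3: connect() performs the TCP 3-way handshake.
    with socket.socket(family, socktype, proto) as conn:
        conn.connect(addr)
        conn.sendall(build_request(host, path))  # step 4
        chunks = []
        while chunk := conn.recv(4096):  # step 5: read until the server closes
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

Calling `fetch("example.com")` on a machine with network access should return a status code, a header dictionary, and the raw HTML bytes that the parsing steps would then consume.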



Web Browser Working Process Flow Chart


graph TD;
    A[User accesses website www.a.com] --> B[DNS Lookup: Resolve IP address];
    B --> C[3-Way Handshake SYN → SYN/ACK → ACK];
    C --> D[Send HTTP Request to Server];
    D --> E[Receive HTTP Response];
    E --> F[Parse HTML → Create DOM Tree];
    F --> G{Style tag detected?};
    G -- Yes --> H[Parse CSS → Create CSSOM Tree];
    H --> F;
    G -- No --> I{Script tag detected?};
    I -- Yes --> J[Parse JavaScript → Create AST];
    J --> F;
    I -- No --> K[Merge DOM + CSSOM → Render Tree];
    K --> L[Layout: Position Elements];
    L --> M[Painting: Render UI];
    M --> N[Composition: Organize Layers z-index];
    N --> O[👀 Display Rendered Page to User];
    %% Additional Explanation
    E -.-> P[⚡ Partial Rendering for Faster Display];
    P --> F;



🔍 Detailed Process


Construction Phase
  • STEP 1: Browser - DNS
    - User enters website URL (www.a.com).
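    - Browser checks its own DNS cache.
    - If not found, it asks the operating system, which checks its own cache and the hosts file.
    - If still not found, the OS queries the configured recursive DNS resolver, which answers from its cache or walks the root, TLD, and authoritative name servers.
    - The resolver returns the IP address (e.g., 1.1.1.1).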
  • STEP 2: Browser - Server
    - Browser opens a TCP connection to the server's IP address; each side picks a random initial sequence number.
    - Performs 3-Way Handshake (SYN → SYN/ACK → ACK).
    - For HTTPS, a TLS handshake follows (cipher suite negotiation, certificate validation, key exchange).
    - Browser sends HTTP Request with headers (User-Agent, Accept, etc.).
    - Server processes request and prepares response.
    - Server responds with HTTP Response (status code, headers, content).
  • STEP 3: Browser - Parsing
    - Browser parses received data according to the HTML Standard (WHATWG) and the CSS specifications (W3C).
    - Rendering engine creates DOM Tree from HTML (document object model).
    - When encountering Style tags or linked stylesheets:
      - Parses CSS to create CSSOM Tree (CSS object model).
      - HTML parsing continues, but rendering waits until the CSSOM is ready (CSS is render-blocking, not parser-blocking).
      - Scripts that follow wait for pending stylesheets, which can stall the parser indirectly.
      - Prioritizes render-critical CSS.
      - Resolves styles with cascade and specificity rules.
    - When encountering Script tags:
      - Pauses parsing (unless the script has an async or defer attribute).
      - Passes control to JS Engine.
      - Creates AST (Abstract Syntax Tree).
      - Compiles JavaScript to bytecode.
      - Executes JavaScript code which may modify DOM/CSSOM.
      - Resumes parsing.
    - Creates Render Tree by combining DOM + CSSOM.
    - Render Tree only includes visible elements (excludes display:none).
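
The last two points of the parsing step can be sketched in a few lines of Python. This is a toy model under loud assumptions: the DOM is built with the standard library's `html.parser` (which does none of the error recovery a real HTML parser performs), the "CSSOM" is a hypothetical dictionary keyed by tag name with styles already resolved, and `Node`, `DomBuilder`, `build_render_tree`, and `tags` are names invented for this example.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # no closing tag


class Node:
    def __init__(self, tag, text=""):
        self.tag = tag
        self.text = text
        self.children = []


class DomBuilder(HTMLParser):
    """Builds a toy DOM tree from well-formed HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # chain of currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", data.strip()))


def build_render_tree(node, cssom):
    """Copy the DOM, attaching styles and dropping display:none subtrees."""
    style = cssom.get(node.tag, {})
    if style.get("display") == "none":
        return None
    rendered = {"tag": node.tag, "text": node.text, "style": style, "children": []}
    for child in node.children:
        sub = build_render_tree(child, cssom)
        if sub is not None:
            rendered["children"].append(sub)
    return rendered


def tags(render_node):
    """Flatten a render tree into a pre-order list of tag names."""
    out = [render_node["tag"]]
    for child in render_node["children"]:
        out.extend(tags(child))
    return out


html = "<body><h1>Title</h1><aside>Hidden</aside><p>Text</p></body>"
# Hypothetical CSSOM: tag selectors only, already resolved to final values.
cssom = {"aside": {"display": "none"}, "h1": {"font-size": "32px"}}

builder = DomBuilder()
builder.feed(html)
builder.close()
render_tree = build_render_tree(builder.root, cssom)
print(tags(render_tree))  # ['#document', 'body', 'h1', '#text', 'p', '#text']
```

The `aside` element and its text stay in the DOM but never reach the render tree, which is the behavior described for `display:none`.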
Operation Phase
  • STEP 1: Layout
    - Rendering engine calculates exact position and size of each element.
    - Computes the geometry of all elements on the page (width, height, position).
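    - Determines how elements affect each other (e.g., a child's percentage width depends on its parent's dimensions).
    - Applies whichever media-query rules match the current viewport, which is how responsive layouts take effect.
    - Gecko calls this step "reflow"; the term is also widely used for any later re-layout triggered by DOM or style changes.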
  • STEP 2: Painting
    - UI Backend converts layout information into actual pixels on screen.
    - Draws every visual part of the elements (text, colors, borders, shadows, etc.).
    - Creates multiple layers when necessary for efficient updates.
    - Uses GPU acceleration for certain CSS properties when available.
  • STEP 3: Composition
    - Combines the painted layers into the final frame shown on screen.
    - Arranges layers in stacking order (based on z-index within each stacking context).
    - Lower z-index elements are drawn first, followed by higher ones.
    - Handles transparency and blending between layers.
    - Animations and scrolling that only touch compositing (e.g., transform, opacity) are the cheapest, because they skip layout and repainting.
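
The composition step can be illustrated with a single pixel. The sketch below assumes each layer has already been painted to a flat color with an alpha value, sorts layers by z-index, and blends them bottom to top with the standard "source over" formula. The `Layer` class and `composite` function are hypothetical names for this example; real compositors work on GPU textures and full stacking contexts rather than one pixel.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    z_index: int
    color: tuple[float, float, float]  # RGB, each channel in 0.0..1.0
    alpha: float                       # 0.0 transparent .. 1.0 opaque


def composite(layers, background=(1.0, 1.0, 1.0)):
    """Blend layers onto an opaque background, lowest z-index first."""
    pixel = background
    # sorted() is stable, so layers with equal z-index keep their input order.
    for layer in sorted(layers, key=lambda item: item.z_index):
        pixel = tuple(
            layer.alpha * src + (1.0 - layer.alpha) * dst
            for src, dst in zip(layer.color, pixel)
        )
    return pixel


layers = [
    Layer(z_index=2, color=(0.0, 0.0, 1.0), alpha=0.5),  # translucent blue on top
    Layer(z_index=1, color=(1.0, 0.0, 0.0), alpha=1.0),  # opaque red below
]
print(composite(layers))  # (0.5, 0.0, 0.5)
```

Swapping the two z-index values would put the opaque red layer on top and hide the blue one entirely, which is why drawing order matters.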



Critical Rendering Path

The Critical Rendering Path is the sequence of steps browsers take to convert HTML, CSS, and JavaScript into actual pixels on the screen:

  1. HTML Processing → DOM: Parse HTML to create the Document Object Model
  2. CSS Processing → CSSOM: Parse CSS to create the CSS Object Model
  3. JavaScript Execution: Execute JavaScript that might modify DOM and CSSOM
  4. Render Tree Construction: Combine DOM and CSSOM into a render tree
  5. Layout: Calculate the exact position and size of each element
  6. Paint: Fill in pixels for all visible content
  7. Composite: Draw the layers in the correct order

Optimizing the Critical Rendering Path:

  • Minimize number of critical resources (HTML, CSS, JS needed for initial render)
  • Minimize critical path length by optimizing the order resources are loaded
  • Minimize number of critical bytes by compressing and optimizing resources
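
Two of these three quantities are simple to compute once each resource is labeled as render-blocking or not. The sketch below uses made-up file names and sizes and a hypothetical `Resource` record; it does not model critical path length, which depends on network round trips.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    url: str
    size_bytes: int
    blocking: bool  # True if the first render has to wait for it


def critical_path_summary(resources):
    """Count the critical resources and add up their bytes."""
    critical = [r for r in resources if r.blocking]
    return {
        "critical_resources": len(critical),
        "critical_bytes": sum(r.size_bytes for r in critical),
    }


# Hypothetical page: sizes are illustrative, not measurements.
before = [
    Resource("index.html", 12_000, blocking=True),
    Resource("main.css", 30_000, blocking=True),
    Resource("app.js", 90_000, blocking=True),       # plain <script> in <head>
    Resource("analytics.js", 40_000, blocking=False),
    Resource("hero.jpg", 200_000, blocking=False),
]
# Same page after marking app.js with defer, so it no longer blocks rendering.
after = [
    Resource(r.url, r.size_bytes, r.blocking and r.url != "app.js")
    for r in before
]

print(critical_path_summary(before))  # {'critical_resources': 3, 'critical_bytes': 132000}
print(critical_path_summary(after))   # {'critical_resources': 2, 'critical_bytes': 42000}
```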


Performance Optimization Techniques

  • Resource Loading Optimization:
    • Use async and defer attributes for non-critical JavaScript
    • Load critical CSS inline and non-critical CSS asynchronously
    • Implement resource hints: preload, prefetch, preconnect
    • Use HTTP/2 for parallel loading of resources
  • Rendering Optimization:
    • Avoid layout thrashing (multiple forced reflows)
    • Use CSS will-change property for elements that will animate
    • Use hardware-accelerated CSS properties (transform, opacity) for animations
    • Implement code-splitting to reduce initial JavaScript load
  • Measurement Tools:
    • Lighthouse for performance auditing
    • Chrome DevTools Performance panel
    • WebPageTest for detailed waterfall analysis
    • Core Web Vitals metrics (LCP, INP, CLS); INP replaced FID in March 2024
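
Layout thrashing, listed under Rendering Optimization, is easier to see with a counter. The toy model below is an assumption-laden stand-in for a layout engine: writes only mark layout as stale, and a geometry read forces a synchronous layout pass whenever layout is stale. `ToyLayout`, `interleaved`, and `batched` are names invented for this sketch.

```python
class ToyLayout:
    """Counts layout passes; a stand-in for a browser's layout engine."""

    def __init__(self, widths):
        self.widths = list(widths)
        self.dirty = True       # nothing has been laid out yet
        self.layout_passes = 0

    def set_width(self, index, value):
        # A style write only marks layout as stale.
        self.widths[index] = value
        self.dirty = True

    def read_width(self, index):
        # A geometry read forces a synchronous layout when layout is stale.
        if self.dirty:
            self.layout_passes += 1
            self.dirty = False
        return self.widths[index]


def interleaved(layout):
    """Read then write per element: each read after a write forces layout."""
    for i in range(len(layout.widths)):
        layout.set_width(i, layout.read_width(i) * 2)


def batched(layout):
    """Read everything first, then write everything."""
    current = [layout.read_width(i) for i in range(len(layout.widths))]
    for i, width in enumerate(current):
        layout.set_width(i, width * 2)


thrashing = ToyLayout([10, 20, 30, 40])
interleaved(thrashing)
tidy = ToyLayout([10, 20, 30, 40])
batched(tidy)

print(thrashing.layout_passes, tidy.layout_passes)  # 4 1
```

Both versions end with the same widths; only the number of forced layout passes differs, and it grows with the number of elements in the interleaved case.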


Browser Differences

Different browsers use different rendering engines and JavaScript engines:

Browser            Rendering Engine   JavaScript Engine
Chrome             Blink              V8
Firefox            Gecko              SpiderMonkey
Safari             WebKit             JavaScriptCore (Nitro)
Edge (modern)      Blink              V8
Internet Explorer  Trident            Chakra

Key Differences:

  • Feature support (check caniuse.com for compatibility)
  • Performance characteristics
  • Developer tools capabilities
  • Implementation of standards
  • Security model and sandboxing


Additional Notes

The parsing, layout, and UI drawing processes don’t wait for all data to be received from the server. For a faster user experience, the browser parses and renders partial content as it arrives.

Real-World Example: Loading a Modern Web Page

When loading a typical website (e.g., an e-commerce site):

  1. Browser resolves DNS and establishes HTTPS connection
  2. Receives initial HTML (first contentful paint may occur)
  3. Requests CSS files referenced in HTML
  4. Requests JavaScript files (async/defer determines when they execute)
  5. Requests web fonts
  6. JavaScript might fetch additional data via AJAX/fetch API
  7. Single Page Applications (SPAs) perform client-side rendering after initial load
  8. Progressive Web Apps (PWAs) might use service workers to cache resources
  9. Third-party scripts (analytics, ads) might load additional resources
  10. Lazy-loading might defer images and other content until scrolled into view

🔑 Key Points

  1. CSS is a render-blocking resource, not a parsing-blocking resource
  2. JavaScript execution blocks parsing (unless async/defer is used)
  3. Progressive rendering improves perceived performance
  4. Render tree creation needs the CSSOM from all render-blocking stylesheets, plus whatever part of the DOM has been parsed so far
  5. Layout and painting are computationally expensive operations
  6. Compositing allows for efficient animations and scrolling
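
Key point 2 can be made concrete with a small timeline simulation. The model is deliberately simplified and its names are hypothetical: a page is a list of HTML chunks and scripts, a script's download starts when the parser reaches it (no preload scanner), only the "blocking" and "defer" modes are covered, and all timings are invented numbers in milliseconds.

```python
def simulate(tokens):
    """Return (parse_done_ms, dom_content_loaded_ms) for a toy page load.

    tokens: ("html", parse_ms) or ("script", mode, download_ms, exec_ms),
    where mode is "blocking" or "defer".
    """
    clock = 0      # the parser's clock, in milliseconds
    deferred = []  # (time the download finishes, exec_ms), in document order
    for token in tokens:
        if token[0] == "html":
            clock += token[1]
            continue
        _, mode, download_ms, exec_ms = token
        if mode == "blocking":
            # The parser stalls for both the download and the execution.
            clock += download_ms + exec_ms
        else:
            # The download runs in parallel; execution waits for end of parsing.
            deferred.append((clock + download_ms, exec_ms))
    parse_done = clock
    # Deferred scripts run in document order once parsing has finished.
    for ready_at, exec_ms in deferred:
        clock = max(clock, ready_at) + exec_ms
    return parse_done, clock


def page(mode):
    return [("html", 10), ("script", mode, 100, 20), ("html", 30)]


print(simulate(page("blocking")))  # (160, 160)
print(simulate(page("defer")))     # (40, 130)
```

In this model the deferred version finishes parsing four times sooner, so the browser has a complete DOM to render from while the script is still downloading.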


