Lighthouse and Core Web Vitals: What Each Score Actually Measures

Open Lighthouse on a fresh page and you get four colored arcs and a confident score out of 100. It feels authoritative — until you refresh and the score swings 12 points without anything changing. That gap between what Lighthouse reports and what your users actually feel is the whole story of Core Web Vitals. This post walks through what each metric actually measures, what numbers pass the bar, and how to make the score move in the right direction.

Lighthouse vs Core Web Vitals — Not the Same Thing

Lighthouse is a tool. Core Web Vitals are metrics. The confusion comes from the fact that Lighthouse reports Core Web Vitals (among many other things), so people use the names interchangeably.

The important distinction: Lighthouse runs a lab test. It loads your page in a simulated environment — throttled CPU, throttled network, headless Chrome — and measures what happens. It is reproducible, debuggable, and synthetic.

Core Web Vitals as Google uses them for ranking come from the Chrome User Experience Report (CrUX) — real Chrome users visiting your real site at the 75th percentile over a 28-day window. That is field data, not lab data. Your Lighthouse score can be green while your CrUX numbers fail, because real users have older phones, flaky LTE, and ad-blockers that lab tests do not simulate. The reverse also happens — Lighthouse penalizes things CrUX never sees.

Treat Lighthouse as your debugger and CrUX as the scoreboard. They agree most of the time, but when they disagree, CrUX wins.
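
If you want the scoreboard programmatically, the same CrUX field data is available through the CrUX API. A minimal sketch, assuming you have a CrUX API key from Google Cloud Console (the key and origin below are placeholders):

    // Query CrUX for an origin's 28-day, 75th-percentile field data.
    // CRUX_API_KEY is a placeholder for your own key.
    const resp = await fetch(
      'https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=' + CRUX_API_KEY,
      {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ origin: 'https://example.com', formFactor: 'PHONE' }),
      },
    );
    const { record } = await resp.json();
    // p75 LCP in milliseconds; same shape for interaction_to_next_paint
    // and cumulative_layout_shift.
    console.log(record.metrics.largest_contentful_paint.percentiles.p75);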

Largest Contentful Paint (LCP)

LCP is the time from navigation start until the largest visible element renders. That element is usually a hero image, a video poster, or a big block of headline text. The browser tracks every contender as the page paints and reports the largest one observed before the user interacts.

The thresholds are fixed and public:

LCP threshold     |  Verdict
------------------+-----------
0 - 2.5 s         |  Good (green)
2.5 - 4.0 s       |  Needs improvement
> 4.0 s           |  Poor (red)

LCP is dominated by three things: server response time, render-blocking resources, and the size of the LCP element itself. The big wins are almost always boring — preload the LCP image with <link rel="preload" as="image">, ship modern formats (WebP/AVIF), set explicit width and height so the browser can reserve space, and make sure no JavaScript needs to run before the hero is painted. If your LCP element is text, fonts and CSS delivery dominate. We unpack this further in How Browsers Render a Page and The Critical Rendering Path.
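
What those boring wins look like in markup, as a minimal sketch (the file name and dimensions are placeholders):

    <!-- In <head>: let the preload scanner fetch the hero early. -->
    <link rel="preload" as="image" href="/hero.avif" fetchpriority="high">

    <!-- Explicit dimensions reserve the space before the bytes arrive,
         and nothing here depends on JavaScript to paint. -->
    <img src="/hero.avif" width="1200" height="600" alt="Product hero">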

A common surprise: LCP includes time spent waiting for fonts when the LCP element is text and the font blocks rendering. The default, font-display: auto, behaves like block in most browsers, hiding the text until the font arrives or a roughly three-second timeout expires. Switch to swap, or preload the font subset.
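
For the text case, the fix is two parts: preload the subset and let the fallback paint immediately. A sketch, assuming a self-hosted WOFF2 subset (the path is hypothetical); note that font preloads need crossorigin even for same-origin files:

    <link rel="preload" as="font" type="font/woff2"
          href="/fonts/inter-subset.woff2" crossorigin>

    <style>
      @font-face {
        font-family: 'Inter';
        src: url('/fonts/inter-subset.woff2') format('woff2');
        font-display: swap; /* paint fallback text now, swap the font in later */
      }
    </style>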

Interaction to Next Paint (INP)

INP replaced First Input Delay (FID) as a Core Web Vital in March 2024. FID only measured the first input on a page; INP measures the worst interaction across the entire visit. That change exposed thousands of sites whose first click was fine but whose menu animations, type-ahead search, or modal openings stuttered for half a second.

INP threshold     |  Verdict
------------------+-----------
0 - 200 ms        |  Good (green)
200 - 500 ms      |  Needs improvement
> 500 ms          |  Poor (red)

INP is the time from user input (click, tap, keypress) to the next frame the browser actually paints. It captures the full pipeline: input delay → event handler runtime → rendering and layout. Long tasks on the main thread are the usual culprit. If a click handler runs a 300 ms loop, INP is at least 300 ms even if the visual result lands instantly afterwards.

The practical levers are well-defined. Break long tasks with scheduler.yield() or setTimeout(fn, 0). Move heavy parsing — markdown, syntax highlighting, JSON validation — to a Web Worker. Defer non-critical work with requestIdleCallback. And profile with the Performance panel in Chrome DevTools, looking specifically at the Interactions track, which highlights INP candidates with their full critical path.
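
A sketch of the task-splitting pattern, with processItem standing in for whatever unit of work you have; scheduler.yield() is feature-detected because it is not yet available in every browser:

    // Yield to the main thread between chunks so pending input can run.
    function yieldToMain() {
      if ('scheduler' in window && 'yield' in scheduler) {
        return scheduler.yield();
      }
      return new Promise((resolve) => setTimeout(resolve, 0));
    }

    async function processAll(items) {
      for (const item of items) {
        processItem(item);   // hypothetical unit of work
        await yieldToMain(); // one long task becomes many short ones
      }
    }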

Cumulative Layout Shift (CLS)

CLS measures unexpected movement of visible elements, and not only during load; it is tracked for the page's entire lifespan. Each shift is scored as a unitless product of impact fraction (how much of the viewport the moving elements occupy) and distance fraction (how far they moved). The reported CLS is the worst burst of shifts, where a burst is a window of shifts less than a second apart and at most five seconds long, rather than a raw sum across the session.

CLS threshold     |  Verdict
------------------+-----------
0.00 - 0.10       |  Good (green)
0.10 - 0.25       |  Needs improvement
> 0.25            |  Poor (red)

The classic CLS-killer is an image without dimensions. The browser starts rendering the article, the image loads two seconds later, content jumps down, the user clicks the wrong button, the score tanks. Always set width and height attributes on <img> and <video> elements, or use CSS aspect-ratio. Same logic for ads, embeds, and dynamically injected banners — reserve space.
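
Both forms of the space reservation, as a sketch (file names and ratios are placeholders):

    <!-- Attribute form: the browser derives the aspect ratio from these. -->
    <img src="/chart.png" width="800" height="450" alt="Traffic chart">

    <style>
      /* CSS form: reserve the box for an ad slot before it loads. */
      .ad-slot { width: 100%; aspect-ratio: 16 / 9; }
    </style>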

A subtle one: web fonts. If your fallback font is taller or shorter than the loaded font, every paragraph reflows when the font swaps in. The fix is size-adjust, ascent-override, and descent-override in @font-face, or simply matching your fallback to the loaded font's metrics. There is more on this in Web Fonts and Performance.
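
A sketch of the metric-matched fallback; the override percentages here are illustrative, so generate real values with Fontaine or Capsize:

    @font-face {
      font-family: 'Inter Fallback';
      src: local('Arial');
      size-adjust: 107%;    /* illustrative, not measured */
      ascent-override: 90%;
      descent-override: 22%;
    }

    body { font-family: 'Inter', 'Inter Fallback', sans-serif; }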

User-initiated shifts (clicking "load more") do not count against CLS. Animations using transform do not trigger layout, so they do not count either. CLS only catches involuntary movement.

Time to First Byte (TTFB)

TTFB is not officially a Core Web Vital, but it underpins them. It is the time from navigation start to the first byte of the response arriving at the browser — DNS, TCP, TLS, server thinking, and the start of the response.

TTFB threshold    |  Verdict
------------------+-----------
0 - 800 ms        |  Good
800 - 1800 ms     |  Needs improvement
> 1800 ms         |  Poor

TTFB caps everything downstream. If your server takes 2.5 seconds to respond, your LCP cannot be under 2.5 seconds even with a perfect frontend. The biggest TTFB wins come from caching at the edge — putting static HTML on a CDN so the request never reaches origin — or pre-rendering pages at build time so the server's only job is to send bytes. The second tier is database query speed, missing indexes, and N+1 ORM patterns that explode under traffic. Caching layers like Redis matter; so does HTTP caching (see How HTTP Caching Works).
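
TTFB is also trivial to read in the field from the Navigation Timing API, which is worth dropping into any RUM script:

    // responseStart marks the first byte; startTime is navigation start.
    const [nav] = performance.getEntriesByType('navigation');
    console.log(`TTFB: ${Math.round(nav.responseStart - nav.startTime)} ms`);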

The Lighthouse Performance Score

The 0-100 score on the Lighthouse report is not a Core Web Vital. It is a weighted blend of multiple lab metrics, currently:

  • First Contentful Paint (10%)
  • Speed Index (10%)
  • Largest Contentful Paint (25%)
  • Total Blocking Time (30%)
  • Cumulative Layout Shift (25%)

Total Blocking Time is Lighthouse's lab-data proxy for INP. For every main-thread task longer than 50 ms during page load, TBT sums the portion beyond 50 ms, so a single 300 ms task contributes 250 ms of blocking time. INP itself cannot be measured in a single lab run because it requires actual user interaction across a session, so Lighthouse uses TBT as a stand-in.
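
That arithmetic is easy to approximate with a Long Tasks observer. A rough sketch, not Lighthouse's exact computation (Lighthouse only counts tasks in a specific window of the load):

    // Sum each long task's excess over 50 ms -- the TBT definition.
    let blockingTime = 0;
    new PerformanceObserver((list) => {
      for (const entry of list.getEntries()) {
        blockingTime += Math.max(0, entry.duration - 50);
      }
    }).observe({ type: 'longtask', buffered: true });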

The score uses a log-normal curve. Hitting 90 is genuinely difficult — most production sites land in the 60-80 range. A 100 is achievable for static, content-only pages but vanishingly rare for interactive apps with third-party scripts. Do not chase 100 at the expense of features users actually want; the diminishing returns set in around the high 80s.

The other thing to know is run-to-run variance. The same URL, run three times in a row, can score 78, then 84, then 71. The reasons:

  • Network throttling is simulated, not real, and tiny scheduler timing differences cascade.
  • CPU throttling depends on what else your machine is doing — close Slack and your score jumps.
  • Third-party scripts (analytics, ads, A/B testing) load on different paths each run.
  • Cache state matters even with "disable cache" — DNS, TLS resumption, HTTP/3 0-RTT all affect TTFB.

The mitigation is to run multiple times and take the median. PageSpeed Insights does five runs server-side and reports the median, which is more stable than a single local run. For local debugging, the Chrome DevTools "Performance Insights" panel is more diagnostic than the Lighthouse panel because it shows the actual trace, not a derived score.
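
Scripting the median locally is straightforward with the lighthouse and chrome-launcher npm packages. A minimal sketch (the run count and URL are choices, not requirements):

    import lighthouse from 'lighthouse';
    import * as chromeLauncher from 'chrome-launcher';

    const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
    const scores = [];
    for (let i = 0; i < 5; i++) {
      const { lhr } = await lighthouse('https://example.com', {
        port: chrome.port,
        onlyCategories: ['performance'],
      });
      scores.push(Math.round(lhr.categories.performance.score * 100));
    }
    await chrome.kill();

    scores.sort((a, b) => a - b);
    console.log(`median of ${scores.length} runs: ${scores[2]}`);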

For production monitoring, ignore lab scores entirely and watch CrUX (or your own Real User Monitoring data). Lab tests are for finding the regression; field data tells you whether users notice.

A Practical Audit Workflow

Here is the workflow that actually moves scores up, in the order that matters:

  1. Pull CrUX data first. Check the Search Console Core Web Vitals report or PageSpeed Insights. If your field data is green, do not chase a green Lighthouse score — fix something else. If it is red, note which metric (LCP, INP, or CLS) and which device class (mobile vs desktop).
  2. Reproduce in the lab. Run Lighthouse on the worst page, mobile profile, throttled. The Lighthouse Lite Audit tool gives you a quick view of the four headline metrics without installing anything; for deeper traces use Chrome DevTools.
  3. Find the LCP element. In the Lighthouse report, "Largest Contentful Paint element" is a specific DOM node. Optimizing the wrong element is the single most common waste of effort.
  4. Audit render-blocking resources. Synchronous scripts in the <head> without async or defer, and stylesheets without a non-matching media attribute, block the first paint. Be ruthless; a sketch of both fixes follows this list.
  5. Profile interactions. Use the Performance panel's Interactions track on real flows — opening menus, scrolling long lists, typing in search boxes. Each long task is a candidate for scheduler.yield() or worker offload.
  6. Reserve space for everything. Images, ads, embeds, late-loading widgets. CLS rewards predictable layout above all else.
  7. Re-measure with the median of three runs. A single run is a sample, not a result.
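
The render-blocking fixes from step 4, as markup (file names are placeholders):

    <!-- The parser does not wait for deferred scripts. -->
    <script src="/app.js" defer></script>

    <!-- A non-matching media query loads without blocking render;
         onload flips the stylesheet live once it arrives. -->
    <link rel="stylesheet" href="/below-fold.css"
          media="print" onload="this.media='all'">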

While you are auditing, related crawler-facing checks are worth running alongside performance — a fast page that returns 5xx errors or has a broken canonical defeats the point. Pair Lighthouse with the Page Screenshot Service for above-the-fold visual regression, the Broken Link Checker for link rot, the SERP Preview Tool to see how your titles render in Google, and the Robots.txt Tester to confirm crawlers can reach your pages at all.

What "Passes" Mean for SEO

Google has stated that Core Web Vitals are a ranking signal, but a relatively small one — a tiebreaker between similar-quality pages, not a primary ranking factor. The official documentation at Google Search Central is clear that content relevance comes first.

That said, "passes" has a specific meaning: at the 75th percentile of page loads in CrUX over the last 28 days, all three metrics (LCP, INP, CLS) must be in the green range for that URL group. Failing one fails the whole assessment. Mobile and desktop are evaluated separately. Pages with insufficient CrUX traffic fall back to origin-level data. The full definitions are at web.dev's Core Web Vitals guide and the Lighthouse documentation.

The practical takeaway: chasing a perfect Lighthouse score is a vanity exercise. Pulling each Core Web Vital into the green at the 75th percentile of real users is the actual goal — and once you are green, time spent on content quality almost always outranks time spent shaving off another 200 ms.

FAQ

Why does my Lighthouse score swing 12 points between runs?

Run-to-run variance is built in. Lighthouse simulates network and CPU throttling, but the simulation depends on your actual machine's clock, scheduler, and what else is running. Background apps, OS updates, browser extensions, and even memory pressure all shift timing. The fix is to run 3–5 times and take the median; PageSpeed Insights does this server-side, which is why its results are more stable than local Lighthouse runs.

Is INP really replacing FID for ranking?

Yes — INP became a Core Web Vital in March 2024, replacing FID entirely. FID only measured the very first input on a page, which most sites passed easily. INP measures the worst interaction across the entire visit, exposing slow menu animations, type-ahead search, and modal openings that FID never caught. Sites that were green on FID often fail INP without any code changes.

Does a perfect Lighthouse score actually help SEO?

Marginally. Google has consistently stated Core Web Vitals are a tiebreaker between similar-quality pages, not a primary ranking factor. Content relevance, links, and topical authority all outweigh performance. A page with great content and a 75 Lighthouse score will outrank a thin page with 100. Use Lighthouse to fix regressions, not to chase 100 — the marginal SEO return on going from 90 to 100 is essentially zero.

What's the difference between TTFB and LCP?

TTFB is when the first byte of HTML arrives at the browser; LCP is when the largest visible element finishes painting. TTFB is a server-side concern (DNS, TLS, server processing); LCP includes TTFB plus everything after — HTML parsing, render-blocking resources, image download, layout. LCP is bounded below by TTFB — you can't have a 1s LCP if your TTFB is 1.5s.

Should I optimize for lab data or field data first?

Field data first, every time. CrUX data from real users is what Google uses for ranking, and lab tests can be green while field data fails. Pull the Search Console Core Web Vitals report, identify which metric and which device class is failing, then use Lighthouse to reproduce and fix. Optimizing for lab scores without checking field data is solving the wrong problem.

How do I fix CLS caused by web fonts?

Use size-adjust, ascent-override, and descent-override in your @font-face declaration to make the fallback font's metrics match the loaded font's. Tools like Fontaine or Capsize generate these values automatically. The other option is font-display: optional (only use the loaded font if it's available within ~100ms), which avoids the swap entirely but means most users never see your custom font.

Can I trust Chrome DevTools' Performance panel over Lighthouse?

For diagnosing specific issues, yes — DevTools shows the actual trace with main-thread tasks, network waterfalls, and rendering events at millisecond granularity. Lighthouse derives a score from that trace; DevTools shows the trace itself. For finding what to fix, DevTools wins. For tracking regressions over time, Lighthouse with consistent settings is more comparable.

Why is my LCP element a tiny logo image instead of the hero?

LCP tracks candidates as the page paints and stops updating at the first user interaction (tap, keypress, or scroll). A smaller element (logo, headline text, even a placeholder) ends up as the final LCP element when the hero paints after that cutoff, or never qualifies at all, for instance because it renders invisibly during a fade-in from opacity 0. The fix is to make the hero both the largest element and fast to paint: preload the image, use modern formats, and avoid render-blocking resources that delay it. Check the "Largest Contentful Paint element" audit in the Lighthouse report to see what the browser actually picked.