LeakAudit issues a forensic case file. A case file is only as defensible as the methodology behind it. This page documents the eight categories, the weights, the citations, the edge cases — everything we do when your storefront URL hits the queue. Forward this to a CFO. Forward this to a developer. The audit is only useful if the rubric is honest.
The 8 Categories
Each category is scored 0–100. The composite is a weighted average of the eight category scores. Below 50 = bleeding. 50–69 = stable. 70–84 = healthy. 85+ = top decile.
- First impression. What does a first-time visitor understand in 5 seconds? Promise, proof, next step.
- Copywriting. Specificity of value props, jargon density, command-action ratio, hierarchy.
- Navigation. Depth, search affordance, decision support, dead ends, confused funnels.
- SEO surface. Title tags, headings, structured data, internal links, canonical hygiene.
- Trust signals. Reviews above the fold, badges, return-policy visibility, founder presence, contact paths.
- Performance. LCP, INP, CLS measured against the Web Vitals threshold table.
- Mobile usability. Tap targets ≥ 44px (Apple HIG), a 16px font-size floor (prevents iOS input zoom), thumb reach.
- Conversion architecture. CTA prominence, cart-drawer hygiene, upsell density, checkout friction, abandonment vectors.
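The rubric above can be sketched in a few lines. The weights below are hypothetical placeholders chosen only so they sum to 1.0; the real weights are part of the versioned methodology and are not reproduced here.

```python
# Hypothetical weights, for illustration only. The actual weights are
# versioned with the methodology and may differ.
WEIGHTS = {
    "first_impression": 0.15,
    "copywriting": 0.10,
    "navigation": 0.10,
    "seo": 0.10,
    "trust": 0.15,
    "performance": 0.15,
    "mobile": 0.10,
    "conversion": 0.15,
}

def composite(scores: dict[str, float]) -> float:
    """Weighted average of the eight 0-100 category scores."""
    return sum(WEIGHTS[cat] * scores[cat] for cat in WEIGHTS)

def band(score: float) -> str:
    """Map a composite score to its health band."""
    if score < 50:
        return "bleeding"
    if score < 70:
        return "stable"
    if score < 85:
        return "healthy"
    return "top decile"
```

A store scoring 60 in every category lands squarely in the "stable" band; the weights only matter when category scores diverge.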
How a score becomes a dollar number
Two storefronts can score 60 and have wildly different leak amounts. The dollar leak is computed downstream of the score, not from the score. We use:
- Estimated monthly traffic from a Shopify-aware proxy of public signals (themes, product count, review count). Not Similarweb. Not GA. A defensible-but-honest range.
- Category-specific conversion deltas from the cited studies. Web Vitals failure = Google's published 7% conversion impact. Trust failure = Baymard's published 18%. Cart drawer friction = Littledata's published 4–11% range.
- An estimated AOV based on the niche proxy (apparel ≠ supplements ≠ furniture).
- A floor and a ceiling. The leak number we surface is the midpoint of a range — never a precise figure. If the range is too wide to be useful, we say "wide range" instead of pretending precision.
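Putting the four inputs together, the computation is a range product, not a point estimate. This is a minimal sketch under assumed inputs; the traffic proxy, delta sources, and the width threshold for declaring a "wide range" are illustrative, not our production values.

```python
def leak_range(monthly_traffic: tuple[float, float],
               conversion_delta: tuple[float, float],
               aov: float) -> dict:
    """Dollar leak as a floor/ceiling range; the surfaced number is the midpoint.

    monthly_traffic:  (low, high) estimated monthly sessions from the proxy
    conversion_delta: (low, high) published conversion impact, e.g. (0.04, 0.11)
    aov:              niche-proxy average order value in dollars
    """
    floor = monthly_traffic[0] * conversion_delta[0] * aov
    ceiling = monthly_traffic[1] * conversion_delta[1] * aov
    midpoint = (floor + ceiling) / 2
    # Hypothetical width rule: if the ceiling dwarfs the floor, say
    # "wide range" instead of pretending precision.
    too_wide = ceiling > 5 * floor
    return {"floor": floor, "ceiling": ceiling,
            "midpoint": midpoint, "wide_range": too_wide}
```

For example, a store with an estimated 8,000–12,000 monthly sessions, Littledata's published 4–11% cart-drawer range, and a $45 AOV proxy yields a floor near $14,400/month and a ceiling near $59,400/month.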
Edge cases we handle explicitly
- Geofenced or login-walled storefronts — we either request a public-route URL or report "insufficient surface" and refuse to score. We don't score what we can't see.
- Pre-launch / coming-soon stores — surfaced as "in development" and not scored against live storefronts.
- Headless / Hydrogen / Liquid customizations — handled identically because our capture is browser-based, not template-based.
- Cookie banners / pop-ups — bypassed in capture so they don't dominate the screenshot. The audit reflects the storefront, not your CMP modal.
- A/B tests — captured as one variant. Your AB infrastructure may show a different page to the next visitor; we score what we saw.
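The gating logic behind those edge cases can be sketched as a triage step that runs before scoring. The capture flags here are hypothetical names for whatever the browser-based capture step actually records.

```python
from enum import Enum

class Verdict(Enum):
    SCORE = "score"
    INSUFFICIENT_SURFACE = "insufficient surface"   # geofenced / login-walled
    IN_DEVELOPMENT = "in development"               # pre-launch / coming soon

def triage(capture: dict) -> Verdict:
    """Decide whether a captured storefront is scoreable.

    The `capture` keys are illustrative flags a browser-based capture
    step might set; they are not a documented API.
    """
    if capture.get("login_walled") or capture.get("geofenced"):
        return Verdict.INSUFFICIENT_SURFACE   # we don't score what we can't see
    if capture.get("coming_soon"):
        return Verdict.IN_DEVELOPMENT         # not compared against live stores
    # Headless / Hydrogen / Liquid storefronts all arrive here identically:
    # the capture is browser-based, so the rendering stack is irrelevant.
    return Verdict.SCORE
```

Cookie banners and A/B variants never change the verdict; they only shape what the single captured variant looks like.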
The Empty Autopsy Guarantee
[ COMMITMENT ]
If your storefront scores ≥85 across all 8 categories, the audit is on us — and we'll publish your store as a public benchmark for everyone else to chase.
A high-performing storefront is rare and useful. If you score in the top decile, your case file becomes a teaching artefact: anonymised by default, with named attribution if you opt in. You don't pay for the file. We don't charge to certify a healthy store.
Citations footnoted in every report
Every finding in every case file links back to the public source. No "trust the AI." If a finding cites Baymard, the line ends with the report name and figure. If it cites Web Vitals, the line ends with the threshold value. The methodology is the moat — not the model.
- Baymard Institute — UX research benchmarks (cart abandonment, trust signals, mobile usability).
- Google Web Vitals — Core Web Vitals thresholds (LCP < 2.5s, INP < 200ms, CLS < 0.1).
- Shopify Commerce Trends 2025 — sector-specific conversion benchmarks.
- Littledata Shopify Benchmarks — cohort conversion and AOV ranges.
- Nielsen Norman Group — copywriting and information architecture publications.
- Apple Human Interface Guidelines / Material 3 — mobile target and gesture standards.
- Google Search Central — SEO surface conventions and structured data.
What we are not
- Not a CRO consultancy. We surface the leaks. The fix is yours, your developer's, or your agency's. The Coroner does diagnosis, not surgery.
- Not a real-time monitoring tool. Use Monitor for that — weekly re-audits with regression alerts.
- Not a conversion tracker. GA4 / Heap / PostHog already track what converted. We score the architecture that decides whether it can.
- Not a substitute for taste. A storefront can pass on every metric and still feel wrong. Score is a floor, not a ceiling.
Versioning
This methodology is versioned. Material changes to weights or citations are logged with a date and a rationale. The current version is 1.0 (May 2026). When a citation gets updated (e.g. a new Baymard report), we re-run the affected calibration before publishing the change.
Questions, disputes, or replications welcome. Email the Coroner: hello@leakaudit.app.