Why PWAs Need a Dedicated SEO Checklist
Progressive Web Apps deliver app-like speed and offline capabilities, but their reliance on JavaScript, service workers, and client-side routing introduces SEO risks that traditional websites don't face. A single misconfigured service worker can serve cached 404 pages as 200 responses. A client-side router without server-side rendering can hide entire sections from Googlebot. And an aggressive cache-first strategy can prevent search engines from ever seeing updated content.
This checklist covers every technical requirement for making a PWA visible, crawlable, and performant in search. Work through each section before launch, or use it as a diagnostic tool for PWAs that are underperforming in organic search.
Rendering & Crawlability
Googlebot renders JavaScript, but it does so on a delayed schedule (sometimes days after the initial crawl). Content that only exists after JavaScript execution is at risk of being missed or indexed late.
- Server-side rendering (SSR) or static generation (SSG) is configured for all indexable pages. Googlebot should receive fully rendered HTML in the initial response. Frameworks like Next.js, Nuxt, and SvelteKit handle this natively.
- Content is visible with JavaScript disabled. Open your pages in Chrome with JS disabled (DevTools > Settings > Debugger > Disable JavaScript). If the main content disappears, crawlers may not see it reliably.
- No content is locked behind user interactions. Tabs, accordions, and modals that require clicks to reveal content may not be indexed. If the content matters for SEO, render it in the initial HTML.
- Dynamic imports don't block critical content. Lazy-loaded components below the fold are fine. Lazy-loading the H1 or main body content is not.
- The <noscript> tag provides a meaningful fallback. Not a replacement for SSR, but a safety net for edge cases.
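The JavaScript-disabled check above can be approximated in code. The following is a minimal sketch, assuming you already have the raw HTML of the server response as a string; the function name findMissingContent and the sample phrases are hypothetical. It ignores anything inside script tags, so text that exists only in a JavaScript payload counts as missing.

```typescript
// Report which must-have phrases are absent from the raw server HTML,
// i.e. the markup a crawler receives before any JavaScript has run.
function findMissingContent(rawHtml: string, required: string[]): string[] {
  const visibleText = rawHtml
    // Drop script bodies: text inside them is not rendered content.
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ")
    // Replace the remaining tags with spaces, then collapse whitespace.
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .toLowerCase();
  return required.filter(
    (phrase) => visibleText.indexOf(phrase.toLowerCase()) === -1
  );
}

// Hypothetical responses: an empty app shell and a server-rendered page.
const shellHtml =
  '<html><body><div id="root"></div>' +
  '<script>render("Pricing plans")</script></body></html>';
const ssrHtml = "<html><body><h1>Pricing plans</h1></body></html>";

const missingFromShell = findMissingContent(shellHtml, ["Pricing plans"]);
const missingFromSsr = findMissingContent(ssrHtml, ["Pricing plans"]);
```

In practice the HTML would come from a plain HTTP fetch of each indexable route, with the H1 and a sentence of body copy as the required phrases.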
Service Worker Configuration
Service workers intercept network requests between the browser and server. A poorly configured service worker can prevent Googlebot from fetching fresh content.
- Network-first strategy is used for HTML documents. Cache-first works for static assets (CSS, JS, images), but HTML pages should always attempt a network fetch first so crawlers get current content.
- Known bot user agents bypass the service worker entirely. Check for googlebot, bingbot, slurp, and other crawler signatures in the request's user-agent header and serve fresh responses.
- Cache versioning is implemented. Old cached responses are purged when a new service worker version activates. Use the activate event to clean outdated caches.
- Error pages are never cached as successful responses. If a 404 or 500 is cached by the service worker, it can end up being served as a 200 to both users and crawlers, creating phantom pages or hiding real content.
- The service worker scope covers only what it should. A service worker at /app/sw.js with scope: "/" can intercept requests for marketing pages that don't need it.
- Cache expiration is set for content pages. HTML caches should expire within hours, not days. Stale content served to Googlebot means stale content in the index.
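The routing decisions in this list can be kept as small pure functions that a fetch handler calls. This is a sketch under stated assumptions, not a complete service worker: the bot pattern, the strategy names, and the function names are all hypothetical, and the mode and destination fields mirror the same-named properties of the standard Request object.

```typescript
type Strategy = "bypass" | "network-first" | "cache-first";

// Hypothetical crawler signatures; extend the list for the bots you care about.
const BOT_PATTERN = /googlebot|bingbot|slurp|duckduckbot|baiduspider|yandex/i;

function isKnownBot(userAgent: string): boolean {
  return BOT_PATTERN.test(userAgent);
}

// Pick a caching strategy for one request.
function chooseStrategy(
  req: { mode: string; destination: string },
  userAgent: string
): Strategy {
  // Crawlers always get a fresh network response.
  if (isKnownBot(userAgent)) return "bypass";
  // HTML documents: try the network first, fall back to the cache.
  if (req.mode === "navigate" || req.destination === "document") {
    return "network-first";
  }
  // Static assets are safe to serve cache-first.
  if (["style", "script", "image", "font"].indexOf(req.destination) !== -1) {
    return "cache-first";
  }
  return "network-first";
}

// Cache names to delete while handling the activate event.
function staleCacheNames(existing: string[], current: string): string[] {
  return existing.filter((name) => name !== current);
}

// True when a cached HTML entry is older than the allowed age.
function isExpired(cachedAtMs: number, nowMs: number, maxAgeMs: number): boolean {
  return nowMs - cachedAtMs > maxAgeMs;
}
```

Inside a worker, the names passed to staleCacheNames would typically come from caches.keys(), with each result handed to caches.delete(). Keeping the decisions separate from the event wiring makes them easy to unit test outside a browser.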
How Should You Configure manifest.json for SEO?
The web app manifest doesn't directly affect rankings, but a misconfigured manifest breaks install prompts and degrades user experience metrics that Google measures.
- All required fields are present: name, short_name, start_url, display, and icons.
- Icons include both 192x192 and 512x512 PNG sizes. Missing either prevents Chrome's install prompt.
- start_url responds with a 200 status code. Test it directly; if it redirects, the install experience breaks.
- display is set to "standalone" or "minimal-ui". Using "browser" defeats the purpose of a PWA.
- theme_color matches the meta theme-color tag. Inconsistency causes visual glitches in the address bar during installation.
- The manifest is linked in the <head> of every page, not just the homepage: <link rel="manifest" href="/manifest.json" />
- The manifest is served with Content-Type: application/manifest+json.
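Most of the manifest checks above can be automated against the parsed JSON. The sketch below follows this checklist's rules rather than any official validator; the interface and function names are hypothetical, and the status code and Content-Type checks are left out because they need a real HTTP request.

```typescript
interface ManifestIcon {
  src: string;
  sizes?: string;
  type?: string;
}

interface WebManifest {
  name?: string;
  short_name?: string;
  start_url?: string;
  display?: string;
  theme_color?: string;
  icons?: ManifestIcon[];
}

// Return a list of human-readable problems; an empty list means all checks passed.
function validateManifest(m: WebManifest, metaThemeColor?: string): string[] {
  const problems: string[] = [];

  const required: (keyof WebManifest)[] = [
    "name", "short_name", "start_url", "display", "icons",
  ];
  for (const field of required) {
    if (m[field] === undefined) problems.push(`missing field: ${field}`);
  }

  // Collect every size declared on a PNG icon ("sizes" may list several).
  const pngSizes: string[] = [];
  for (const icon of m.icons || []) {
    if (icon.type === "image/png" && icon.sizes) {
      pngSizes.push(...icon.sizes.split(/\s+/));
    }
  }
  for (const size of ["192x192", "512x512"]) {
    if (pngSizes.indexOf(size) === -1) problems.push(`missing ${size} PNG icon`);
  }

  if (m.display !== undefined && m.display !== "standalone" && m.display !== "minimal-ui") {
    problems.push(`display is "${m.display}", expected standalone or minimal-ui`);
  }

  if (
    metaThemeColor !== undefined &&
    m.theme_color !== undefined &&
    m.theme_color.toLowerCase() !== metaThemeColor.toLowerCase()
  ) {
    problems.push("theme_color does not match the meta theme-color tag");
  }
  return problems;
}
```

Running this in CI against the built manifest catches a dropped icon or renamed field before it reaches production.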
URL Structure & Routing
Client-side routers can create URLs that search engines can't follow or that duplicate content across multiple URL patterns.
- No hash-based routing (/#/page). Googlebot ignores URL fragments. All routes must use the History API with clean paths (/page).
- Every client-side route has a corresponding server-side route. Hitting any URL directly (not via navigation) should return a fully rendered page with a 200 status.
- Canonical tags are present on every indexable page and point to the correct canonical URL.
- An XML sitemap includes all indexable PWA routes and is submitted in Google Search Console.
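- Trailing slash usage is consistent. Pick /page or /page/ and redirect the other with a 301.
- Pagination exposes every page through crawlable links (plain <a href> elements), whether it uses numbered pages or a load-more pattern. rel="next" and rel="prev" are harmless to keep, but Google stopped using them as an indexing signal in 2019.
- Query parameters that don't change content are handled with canonical tags or robots.txt rules.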
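The URL rules above reduce to one normalization function that can feed both the canonical tag and the redirect logic. This is a sketch under two assumptions that are not in the original checklist: the chosen policy is "no trailing slash", and the list of ignorable parameters is a hypothetical example.

```typescript
// Hypothetical list of parameters that never change page content.
const IGNORED_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"];

// Build the canonical form of a URL: no fragment, no tracking
// parameters, and no trailing slash except on the site root.
function canonicalUrl(input: string): string {
  const url = new URL(input);
  url.hash = "";
  for (const param of IGNORED_PARAMS) {
    url.searchParams.delete(param);
  }
  if (url.pathname.length > 1) {
    url.pathname = url.pathname.replace(/\/+$/, "");
  }
  return url.toString();
}

// Detect hash-based routes such as https://example.com/#/page.
function usesHashRouting(input: string): boolean {
  return new URL(input).hash.indexOf("#/") === 0;
}
```

If the requested URL differs from canonicalUrl(requestedUrl) only by the trailing slash, the server can answer with a 301 to the canonical form; for parameter differences, emitting the canonical tag is usually enough.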
What Core Web Vitals Thresholds Must a PWA Meet?
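Core Web Vitals are direct ranking signals. PWAs have a natural advantage here (service worker caching enables near-instant repeat loads), but the initial load must also meet thresholds. LCP, CLS, and INP are the three Core Web Vitals; TTFB and TBT are supporting metrics worth tracking as diagnostics for LCP and INP.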
- Largest Contentful Paint (LCP) is under 2.5 seconds on mobile with a 4G connection. The app shell should not delay the largest visible element.
- Cumulative Layout Shift (CLS) is under 0.1. Dynamic content injection after the shell loads is the #1 cause of CLS in PWAs. Reserve space for async content.
- Interaction to Next Paint (INP) is under 200 milliseconds. Heavy JavaScript execution on the main thread during hydration can fail this metric.
- Time to First Byte (TTFB) is under 800 milliseconds. Server-side rendering adds processing time — use edge caching or streaming SSR to compensate.
- Total Blocking Time (TBT) is under 200 milliseconds. Break long hydration tasks into smaller chunks using requestIdleCallback or framework-specific solutions.
- All above-the-fold images use fetchpriority="high" and are not lazy-loaded.
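The thresholds in this list can be encoded as a performance budget and checked against lab or field measurements. A minimal sketch, assuming the measurements have already been collected elsewhere; the Vitals shape and the overBudget name are hypothetical, and a value exactly at the threshold is treated as passing.

```typescript
interface Vitals {
  lcpMs: number;  // Largest Contentful Paint
  cls: number;    // Cumulative Layout Shift (unitless)
  inpMs: number;  // Interaction to Next Paint
  ttfbMs: number; // Time to First Byte
  tbtMs: number;  // Total Blocking Time
}

// Budget taken from the checklist above.
const BUDGET: Vitals = { lcpMs: 2500, cls: 0.1, inpMs: 200, ttfbMs: 800, tbtMs: 200 };

// Return the names of the metrics that exceed their budget.
function overBudget(measured: Vitals): (keyof Vitals)[] {
  const keys: (keyof Vitals)[] = ["lcpMs", "cls", "inpMs", "ttfbMs", "tbtMs"];
  return keys.filter((key) => measured[key] > BUDGET[key]);
}

// Hypothetical measurement for one page on a throttled mobile profile.
const failing = overBudget({ lcpMs: 3100, cls: 0.05, inpMs: 180, ttfbMs: 900, tbtMs: 150 });
```

Failing a build when the returned list is non-empty keeps regressions from shipping unnoticed.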
Offline & Error Handling
Offline functionality is a PWA feature, not an SEO feature. But incorrect offline handling can create SEO problems.
- The offline fallback page includes <meta name="robots" content="noindex">. If Googlebot somehow receives the offline page, it should not be indexed.
- 404 pages return actual 404 status codes, not 200 status codes with "page not found" content. This is critical with service workers that might intercept error responses.
- The service worker does not cache redirect responses. A cached 301/302 served as a 200 confuses crawlers.
- Background sync does not create duplicate content. If content is synced and published offline, ensure it doesn't create parallel URLs when connectivity returns.
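Two of the checks above lend themselves to small guard functions: deciding whether a response may be written to the cache, and spotting a soft 404. This is a sketch with hypothetical names; ResponseInfo mirrors the status, redirected, and type properties of the standard Response object, and the soft-404 test is a crude text heuristic rather than a reliable detector.

```typescript
// The subset of Response properties the cache guard needs.
interface ResponseInfo {
  status: number;
  redirected: boolean;
  type: string;
}

// Only plain, non-redirected 200 responses are stored.
function isSafeToCache(res: ResponseInfo): boolean {
  if (res.type === "opaqueredirect") return false; // unfollowed redirect
  if (res.redirected) return false;                // result of a followed redirect
  return res.status === 200;                       // excludes 404, 500, and other statuses
}

// Heuristic: an error message delivered with a success status.
function looksLikeSoft404(status: number, bodyText: string): boolean {
  return status === 200 && /\bnot found\b/i.test(bodyText);
}
```

A fetch handler would call isSafeToCache(response) before any cache write. The soft-404 check is better suited to a crawl script that requests a URL known not to exist and inspects what comes back.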
Structured Data & Metadata
Meta tags and structured data must be present in the initial server response, not injected via JavaScript after load.
- Title tags and meta descriptions are server-rendered. Do not rely on client-side JavaScript to set document.title, because Googlebot may not execute it in time.
- Open Graph and Twitter Card tags are in the initial HTML. Social crawlers (Facebook, LinkedIn, Twitter) do not execute JavaScript at all.
- JSON-LD structured data is included in the initial HTML response, inside a <script type="application/ld+json"> tag.
- Canonical URLs are server-rendered, not set via JavaScript.
- Hreflang tags (if multilingual) are in the initial HTML or the XML sitemap.
- The robots meta tag is server-rendered. A JavaScript-injected noindex may not be processed by Googlebot.
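One way to guarantee that these tags are in the initial response is to build the head markup as a string on the server. The following is a framework-agnostic sketch; PageMeta, escapeHtml, and renderHead are hypothetical names, and frameworks such as Next.js or Nuxt provide their own metadata APIs that would replace it.

```typescript
interface PageMeta {
  title: string;
  description: string;
  canonical: string;
  jsonLd: object;
  noindex?: boolean;
}

// Escape text for use in HTML element content and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the SEO-relevant part of <head> on the server.
function renderHead(meta: PageMeta): string {
  // Escape "<" so a "</script>" inside the data cannot close the tag early.
  const json = JSON.stringify(meta.jsonLd).replace(/</g, "\\u003c");
  const lines = [
    `<title>${escapeHtml(meta.title)}</title>`,
    `<meta name="description" content="${escapeHtml(meta.description)}">`,
    `<link rel="canonical" href="${escapeHtml(meta.canonical)}">`,
    `<meta property="og:title" content="${escapeHtml(meta.title)}">`,
    `<script type="application/ld+json">${json}</script>`,
  ];
  if (meta.noindex) {
    lines.push('<meta name="robots" content="noindex">');
  }
  return lines.join("\n");
}
```

Because the output is plain text, a test can assert that every indexable route's response contains the expected title and canonical URL without running a browser.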
Testing & Validation
Run these checks before launch and after any significant PWA architecture changes.
- Lighthouse PWA audit passes every check. The PWA category was removed in Lighthouse 12, so on newer versions verify installability in the Chrome DevTools Application panel instead.
- Lighthouse SEO audit scores 100. This catches missing meta tags, non-crawlable links, and viewport issues.
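- Mobile rendering is verified for at least 5 representative pages. Google retired its standalone Mobile-Friendly Test in December 2023; Lighthouse and Chrome DevTools device emulation are the usual replacements.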
- URL Inspection tool in Google Search Console shows fully rendered content. Use "Test Live URL" and check the rendered HTML.
- Rich Results Test validates all structured data without errors.
- site:yourdomain.com in Google shows all expected pages indexed. Compare against your sitemap count.
- Chrome DevTools Application panel shows the service worker registered with the correct scope and no errors.
- CrUX data (Chrome User Experience Report) shows "Good" for all Core Web Vitals after at least 28 days of real user data.
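The sitemap comparison in this list is easier with a list of URLs than with a raw count. The sketch below assumes you have the sitemap XML as a string and a list of indexed URLs exported from Search Console; both function names are hypothetical. It uses a simple regular expression instead of a full XML parser, so it does not decode entities such as &amp; inside URLs.

```typescript
// Extract every <loc> value from a sitemap document.
function sitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// URLs that are listed in the sitemap but absent from the indexed set.
function notIndexed(sitemap: string[], indexed: string[]): string[] {
  return sitemap.filter((url) => indexed.indexOf(url) === -1);
}

// Hypothetical two-page sitemap.
const sampleXml = [
  '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  "  <url><loc>https://example.com/</loc></url>",
  "  <url><loc>https://example.com/pricing</loc></url>",
  "</urlset>",
].join("\n");

const gaps = notIndexed(sitemapUrls(sampleXml), ["https://example.com/"]);
```

Each URL in the result is a candidate for the URL Inspection tool, which shows why that specific page is missing.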
Common PWA SEO Failures
Serving the app shell to Googlebot. If your SSR pipeline breaks and falls back to the empty app shell, Googlebot indexes a blank page with a loading spinner. Monitor server-side rendering errors aggressively.
Caching pre-rendered content indefinitely. ISR (Incremental Static Regeneration) or SSG pages cached by the service worker may serve outdated content for weeks. Set max-age headers and implement cache-busting for content updates.
Using display: "fullscreen" without navigation. Fullscreen mode removes the browser's back button. If your PWA's internal navigation breaks, users are trapped — and bounce rate spikes.
Ignoring the app install banner's effect on CLS. An install banner injected into the page can shift the content around it. Account for this in your CLS budget, reserve space for the banner, or defer the prompt until after the page is fully stable.
FAQ
Does Google treat PWAs differently than regular websites?
No. Google indexes PWAs the same way it indexes any website. The ranking factors are identical: content relevance, page experience signals (Core Web Vitals), backlinks, and mobile-friendliness. PWAs don't get a ranking bonus or penalty. The advantage is that well-built PWAs naturally score better on performance metrics because of service worker caching and optimized loading.
Can Googlebot execute service workers?
Google's documentation states that Googlebot can execute service workers in some cases, but behavior is not guaranteed. The safe approach is to bypass service workers for all known bot user agents and serve fresh, server-rendered HTML directly. Never rely on service worker caching for delivering content to search engines.
Should I use the App Shell architecture for a content-heavy site?
Use the app shell for navigation chrome (header, sidebar, footer) but not for main content. The app shell should load instantly, then content fills in via SSR or SSG — not via client-side API calls. For content-heavy sites like blogs or documentation, full SSR or SSG is safer than the app shell + API pattern.
How do I check if my PWA pages are being indexed correctly?
Use Google Search Console's URL Inspection tool. Enter any URL and click "Test Live URL." Compare the rendered HTML against what your server returns. If the rendered version is missing content that appears in the server response, your client-side hydration or routing has a bug. Also compare your sitemap page count against the "Pages" report in GSC — a large gap indicates indexation problems.