You deploy a critical fix. CDN cache is purged. Browser hard-refresh shows the new version. Users on Slack still report seeing the old broken page. Some users see it for hours. A few diehards see it for days. The culprit is almost always a service worker installed by your PWA, Workbox, or Next.js / Astro PWA plugin — it intercepted fetches before the network, served cached index.html, and that cached HTML references old hashed JS chunks that may not even exist on the server anymore (cue the white screen of doom). This is one of the most common “deployed but users see old” bugs and it requires both a deploy-time fix and a runtime escape hatch for users already trapped.
Common causes
Ordered by what causes most user reports.
1. Service worker uses CacheFirst for index.html
Workbox / vite-plugin-pwa / next-pwa default strategies sometimes cache HTML with CacheFirst. Once cached, the SW serves it forever from cache — the user never sees new HTML referencing new JS chunk hashes.
How to spot it: DevTools → Application → Service Workers → check the registered SW source. CacheFirst on navigationRoute or HTML routes is the smoking gun.
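If you would rather check the config than eyeball DevTools, the same audit can be roughed out in code. The sketch below is a hypothetical helper, not part of Workbox: `findCacheFirstNavigationRoutes` and the probe object are assumptions that only mimic the `{ url, request }` shape Workbox passes to a `urlPattern` callback, and it only handles string handler names.

```javascript
// Hypothetical audit helper (not a Workbox API): flags runtimeCaching entries
// that would answer a page navigation with the CacheFirst handler.
function findCacheFirstNavigationRoutes(runtimeCaching, sampleUrl = "https://example.com/") {
  const url = new URL(sampleUrl);
  // Minimal stand-in for the argument Workbox hands to a match callback.
  const probe = {
    url,
    sameOrigin: true,
    request: { mode: "navigate", destination: "document", url: url.href },
  };

  return runtimeCaching.filter((entry) => {
    if (entry.handler !== "CacheFirst") return false;
    const pattern = entry.urlPattern;
    if (typeof pattern === "function") return Boolean(pattern(probe));
    if (pattern instanceof RegExp) return pattern.test(url.href);
    if (typeof pattern === "string") return pattern === url.pathname || pattern === url.href;
    return false;
  });
}
```

Any entry this returns is a candidate for the NetworkFirst switch; an empty array means no CacheFirst rule matched the sample navigation URL.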
2. New SW registered but skipWaiting() not called
Browsers install the new SW silently. It sits in waiting state until all tabs of the site close (which for productivity tools may be days). The old SW keeps controlling pages until then.
How to spot it: DevTools → Application → Service Workers shows “waiting to activate” or two SW versions listed. Page banner asking “reload to update” never fires.
3. Cached chunks reference hashes that no longer exist
Workbox’s precache manifest from build N points at /assets/index.abc123.js. Build N+1 hashes it /assets/index.def456.js. SW serves cached index.html referencing abc123 — but the server returns 404. Result: blank page.
How to spot it: User sees blank page; DevTools Network shows 404 for a JS chunk; the SW Application tab shows precached entries pointing at the missing hash.
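The same check can be scripted against a copy of the cached HTML. This is a minimal sketch under stated assumptions: `findMissingAssets` is a hypothetical name, the regex only looks at `src`/`href` attributes on `script` and `link` tags, and `deployedPaths` is whatever list of root-relative files you can produce from your build output.

```javascript
// Hypothetical helper: lists root-relative script/link assets referenced by an
// HTML string that are absent from the currently deployed file list.
function findMissingAssets(html, deployedPaths) {
  const deployed = new Set(deployedPaths);
  const refs = new Set();
  const attrPattern = /<(?:script|link)\b[^>]*?\b(?:src|href)=["']([^"']+)["']/gi;

  for (const match of html.matchAll(attrPattern)) {
    const ref = match[1];
    // Only root-relative paths are checked; absolute and protocol-relative URLs are skipped.
    if (ref.startsWith("/") && !ref.startsWith("//")) {
      refs.add(ref.split(/[?#]/)[0]); // ignore query strings and fragments
    }
  }
  return [...refs].filter((ref) => !deployed.has(ref));
}
```

A non-empty result for HTML pulled from a user's cache means that page will request chunks the server no longer has.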
4. Cache scope set to root and never cleared
navigator.serviceWorker.register('/sw.js', { scope: '/' }) controls every URL under the root. If your SW was installed months ago and you have since changed paths / removed routes, the SW still claims them.
How to spot it: Routes that should 404 instead return cached responses. User has a SW registered from a previous version of the site that hasn’t been updated.
5. No clients.claim() in activate handler
Even after activation, the new SW doesn’t control already-open tabs without clients.claim(). Existing tabs continue with the old SW until reloaded.
How to spot it: New SW shows as “activated” in DevTools but old SW is still controlling the page. Refresh once and the new one takes over.
6. CDN caching sw.js itself with long max-age
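If sw.js is served with Cache-Control: max-age=31536000, a CDN or other intermediate cache can keep handing out the old file, and older browsers may reuse their HTTP-cached copy for up to 24 hours. Until a byte-different sw.js actually reaches the browser, the SW update lifecycle never starts.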
How to spot it: curl -I https://your-site/sw.js shows long max-age. Browser DevTools Network tab shows sw.js from disk cache rather than network.
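Once you have the header value from curl, the "is this too long?" judgment can be automated, for example in a post-deploy check. The function names below are hypothetical and the parsing is deliberately simple: it reads only `max-age`, treats `no-cache` and `no-store` as safe, and ignores `s-maxage`.

```javascript
// Returns the max-age value in seconds, or null if the directive is absent.
function maxAgeSeconds(cacheControl) {
  const match = /(?:^|[,\s])max-age\s*=\s*"?(\d+)/i.exec(cacheControl || "");
  return match ? Number(match[1]) : null;
}

// Hypothetical check: true when a sw.js response may be reused without revalidation
// for longer than limitSeconds.
function isServiceWorkerCachedTooLong(cacheControl, limitSeconds = 0) {
  const value = cacheControl || "";
  if (/\bno-cache\b|\bno-store\b/i.test(value)) return false;
  const age = maxAgeSeconds(value);
  return age !== null && age > limitSeconds;
}
```

Feeding it the `cache-control` value printed by `curl -I` gives a pass/fail signal you can wire into CI.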
7. SW registered conditionally and fails to install on production
With a guard such as if (window.location.hostname !== 'localhost') registerSW(), the SW only ever registers in production. A production-only install bug (e.g., a missing precache file) then fails without any visible symptom: the page works fine, but offline support and fast repeat loads are gone.
How to spot it: DevTools Console shows Failed to register a ServiceWorker or Service worker installation failed.
Before you start
- Confirm a service worker is actually installed: DevTools → Application → Service Workers, check for registered SW.
- Identify which library installed it: Workbox, next-pwa, vite-plugin-pwa, custom code.
- Capture a sample affected user’s browser console / network tab if possible.
- Know what your current cache strategy is per route type (HTML vs. JS vs. images).
- Have the ability to deploy a new SW immediately as part of the fix.
Information to collect
- Library + version used for SW generation (@vite-pwa/astro, workbox-webpack-plugin, next-pwa).
- Cache strategies configured per route (runtimeCaching array in config).
- Cache-Control header on sw.js itself.
- Whether skipWaiting() and clients.claim() are configured.
- A list of affected user agents / browsers (mostly Chrome desktop and Safari iOS for PWAs).
- Most recent deploy’s precache manifest contents.
Step-by-step fix
Ordered: stop the bleeding for new users first, then evict trapped users.
Step 1: Switch HTML to NetworkFirst immediately
In your SW config:
// workbox / vite-plugin-pwa
runtimeCaching: [
  {
    urlPattern: ({ request }) => request.mode === "navigate",
    handler: "NetworkFirst",
    options: {
      cacheName: "html-cache",
      networkTimeoutSeconds: 3,
      expiration: { maxEntries: 50, maxAgeSeconds: 60 * 60 * 24 },
    },
  },
],
New deploys will fetch fresh HTML from the network, falling back to cache only if offline. Static JS / CSS chunks can stay CacheFirst because they have hashed filenames.
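To make the behavior concrete, here is a rough sketch of what a network-first strategy with a timeout does. It is not Workbox's implementation: `networkFirst`, `fetchFn`, and `cache` are hypothetical names, and the fetch function and cache are injected so the logic can be exercised outside a service worker.

```javascript
// Sketch of network-first with a timeout. `cache` needs match(request) and
// put(request, response); `fetchFn` stands in for fetch().
async function networkFirst(request, { fetchFn, cache, timeoutMs = 3000 }) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });

  // A failed fetch resolves to null so the cache fallback can take over.
  const network = fetchFn(request)
    .then(async (response) => {
      if (response && response.ok) await cache.put(request, response.clone());
      return response;
    })
    .catch(() => null);

  const winner = await Promise.race([network, timeout]);
  clearTimeout(timer);
  if (winner) return winner; // the network answered in time

  const cached = await cache.match(request);
  if (cached) return cached; // offline or slow: fall back to the last good copy

  const late = await network; // nothing cached: wait for the network after all
  if (late) return late;
  throw new Error("Network failed and no cached response is available");
}
```

The key property is that a reachable server always wins, so a fresh deploy shows up on the next navigation; the cache only answers when the network is down or slower than the timeout.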
Step 2: Add skipWaiting and clientsClaim
In the SW source (or config):
// vite-plugin-pwa
VitePWA({
  registerType: "autoUpdate",
  workbox: {
    clientsClaim: true,
    skipWaiting: true,
  },
});
Or in custom sw.js:
self.addEventListener("install", (event) => {
  self.skipWaiting();
});
self.addEventListener("activate", (event) => {
  event.waitUntil(self.clients.claim());
});
Caveat: skipWaiting on a busy session can cause a momentary version mismatch between open tabs and the new SW. Tradeoff is fine for content sites; for apps with critical in-progress state, prompt the user instead.
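For the "prompt the user instead" route, one common arrangement is to leave the new SW in the waiting state and call `skipWaiting()` only when the page asks for it. The sketch below assumes a custom sw.js; `enablePromptedUpdate` is a hypothetical name and the `SKIP_WAITING` message type is a convention you choose, not something the browser defines.

```javascript
// Hypothetical helper for a custom sw.js: activate the waiting worker only
// when a page explicitly requests it via postMessage.
function enablePromptedUpdate(scope) {
  scope.addEventListener("message", (event) => {
    if (event.data && event.data.type === "SKIP_WAITING") {
      scope.skipWaiting();
    }
  });
}

// Only wire it up when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== "undefined" && self instanceof ServiceWorkerGlobalScope) {
  enablePromptedUpdate(self);
}
```

The page side would then send `registration.waiting.postMessage({ type: "SKIP_WAITING" })` after the user confirms, so the switch happens at a moment the user picked rather than mid-edit.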
Step 3: Prevent sw.js itself from being cached long
In Vercel/Netlify headers config:
// vercel.json
{
  "headers": [
    {
      "source": "/sw.js",
      "headers": [
        { "key": "Cache-Control", "value": "public, max-age=0, must-revalidate" },
        { "key": "Service-Worker-Allowed", "value": "/" }
      ]
    }
  ]
}
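The browser must be able to revalidate sw.js whenever it checks for an update; otherwise the update never happens. Modern browsers bypass the HTTP cache for the top-level SW script by default and cap its cache lifetime at 24 hours, but explicit headers also stop CDNs and older browsers from serving a stale copy.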
Step 4: Ship a kill switch for trapped users
Deploy a SW that immediately uninstalls itself for users who are stuck on a broken version:
// public/sw.js: emergency uninstall
self.addEventListener("install", () => self.skipWaiting());
self.addEventListener("activate", (event) => {
  event.waitUntil((async () => {
    // Drop every cache this origin's service workers created.
    const keys = await caches.keys();
    await Promise.all(keys.map((k) => caches.delete(k)));
    // navigator.serviceWorker does not exist inside a service worker;
    // unregister through this worker's own registration instead.
    await self.registration.unregister();
    // Reload open tabs so they fetch HTML straight from the network.
    const clients = await self.clients.matchAll({ type: "window" });
    clients.forEach((c) => c.navigate(c.url));
  })());
});
Deploy this temporarily. Any user with an active SW gets a self-uninstall on next page load, then refresh, then they see real HTML again. After 1-2 weeks, replace it with your normal (fixed) SW.
Step 5: Bump the SW filename or version to force re-registration
Browsers re-check sw.js byte-by-byte. If even one byte differs, install kicks in. To force this on every deploy:
// At the top of sw.js
const BUILD_ID = "__BUILD_ID__"; // replaced at build with commit SHA
Build step:
sed -i "s/__BUILD_ID__/$(git rev-parse --short HEAD)/g" dist/sw.js
Every deploy now produces a byte-different sw.js, triggering update flow.
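If the sed one-liner is awkward on your platform (the `-i` flag behaves differently on macOS and GNU sed), the same stamping can be done in a small Node step. This is a sketch, not a prescribed build script: `stampBuildId` is a hypothetical name and the `dist/sw.js` path in the commented usage simply mirrors the sed example.

```javascript
// Replaces every occurrence of the placeholder with the build ID and fails
// loudly if the placeholder is missing, so a silent no-op cannot ship.
function stampBuildId(source, buildId, placeholder = "__BUILD_ID__") {
  if (!source.includes(placeholder)) {
    throw new Error(`Placeholder ${placeholder} not found in service worker source`);
  }
  return source.split(placeholder).join(buildId);
}

// Example build-step usage (Node.js), left commented so the sketch has no side effects:
// const fs = require("node:fs");
// const { execSync } = require("node:child_process");
// const sha = execSync("git rev-parse --short HEAD").toString().trim();
// fs.writeFileSync("dist/sw.js", stampBuildId(fs.readFileSync("dist/sw.js", "utf8"), sha));
```

Throwing when the placeholder is absent is a deliberate choice: it catches the case where a bundler renamed or stripped the constant and the file would otherwise stay byte-identical.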
Step 6: Add an in-page update banner
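For builds that do not enable skipWaiting, prompt the user when a new SW is waiting (in vite-plugin-pwa this callback is intended for the "prompt" register type rather than "autoUpdate"):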
// register-sw.ts
import { registerSW } from "virtual:pwa-register";

const updateSW = registerSW({
  onNeedRefresh() {
    if (confirm("New version available. Reload?")) updateSW(true);
  },
});
Visible prompt converts “trapped users” into one-click recoveries without requiring the kill switch.
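If you are not using vite-plugin-pwa, the same prompt can be built on the standard registration events. The sketch below is one possible shape, with hypothetical names: `watchForWaitingWorker` takes the registration, a function that reports whether the page already has a controller (to tell an update from a first install), and a callback. The commented usage assumes your SW handles a `SKIP_WAITING` message.

```javascript
// Calls onWaiting(worker) when an updated service worker is installed and
// waiting behind an existing controller.
function watchForWaitingWorker(registration, hasController, onWaiting) {
  // An update may already be waiting from a previous visit.
  if (registration.waiting && hasController()) onWaiting(registration.waiting);

  registration.addEventListener("updatefound", () => {
    const installing = registration.installing;
    if (!installing) return;
    installing.addEventListener("statechange", () => {
      // "installed" with an existing controller means this worker is an update.
      if (installing.state === "installed" && hasController()) onWaiting(installing);
    });
  });
}

// Browser usage (commented out so the sketch also loads outside a page):
// navigator.serviceWorker.register("/sw.js").then((reg) =>
//   watchForWaitingWorker(reg, () => Boolean(navigator.serviceWorker.controller), (worker) => {
//     if (confirm("New version available. Reload?")) worker.postMessage({ type: "SKIP_WAITING" });
//   }));
```

Checking for an existing controller avoids showing "new version available" to a first-time visitor whose very first SW has just installed.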
Step 7: Audit Service-Worker-Allowed and scope
If your SW lives at /static/sw.js but you want it to control /:
Service-Worker-Allowed: /
Without that header, the SW only controls /static/... paths, which is rarely what you want. See robots txt not working for related header-delivery debug patterns.
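The scope rule is easy to get wrong by hand, so it can help to sketch it: without the header, the widest permitted scope is the directory the script lives in; with it, the header value becomes the limit. The helper names are hypothetical, and the sketch expects `scope` as an absolute path or full URL.

```javascript
// Widest scope path a service worker script may claim, given an optional
// Service-Worker-Allowed header value.
function maxScopePath(scriptUrl, serviceWorkerAllowed) {
  const script = new URL(scriptUrl);
  // Without the header, the limit is the script's own directory.
  return new URL(serviceWorkerAllowed ?? "./", script).pathname;
}

// True when the requested scope sits on the same origin and inside the limit.
function isScopeAllowed(scriptUrl, scope, serviceWorkerAllowed) {
  const script = new URL(scriptUrl);
  const scopeUrl = new URL(scope, script);
  return (
    scopeUrl.origin === script.origin &&
    scopeUrl.pathname.startsWith(maxScopePath(scriptUrl, serviceWorkerAllowed))
  );
}
```

For example, `isScopeAllowed("https://example.com/static/sw.js", "/")` is false, while passing `"/"` as the third argument (the header value) makes it true.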
Verify
- New deploy: clear browser SW + cache, load site, check Application → Service Workers shows the new SW activated immediately.
- curl -I https://your-site/sw.js shows cache-control: max-age=0.
- DevTools Network shows fresh HTML on every navigation, not “(from ServiceWorker)” with old content.
- For trapped users: have someone test the kill-switch SW and confirm a single refresh recovers them.
- After 1 week, no Slack reports of “still seeing old page”.
Long-term prevention
- Default HTML to NetworkFirst with short networkTimeoutSeconds; never CacheFirst HTML.
- Always pair skipWaiting with clientsClaim; one without the other leaves stale clients.
- Keep sw.js under max-age=0 in your hosting config so it is impossible to forget.
- Embed build ID / commit SHA in SW source so every deploy forces a byte change.
- Always ship an update banner so users have a way out without a kill switch.
- Keep an emergency “uninstall SW” branch ready to deploy; you will need it eventually.
Common pitfalls
- Disabling SW entirely “to fix the bug” without considering users with existing SW installs — they remain broken until you ship a SW that uninstalls itself.
- Using CacheFirst for HTML “for speed”: saves 200ms on first load, costs you a week of confused users on every deploy.
- Forgetting that iOS Safari has aggressive SW survival; an iPhone user can sit on a stale SW for weeks without triggering an update. See vercel 500 errors for related browser-state debugging.
- Calling skipWaiting() for an app with in-progress state: users lose unsaved edits when the new SW takes over mid-action.
- Versioning the SW by filename (sw-v2.js) but never deregistering the old one: both run, behavior gets weird.
FAQ
Q: How do I detect if a user is stuck on a stale SW from server logs?
Look at user agent + version of static assets requested. If a user is requesting a chunk hash that exists in build #88 while you are on build #112, they are stuck. Log a synthetic header x-build-id and compare to current.
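As a rough illustration of the log comparison, assume you archive each build's asset list (the `buildManifests` map below is a hypothetical structure keyed by build number). A requested path can then be mapped back to the newest build that shipped it:

```javascript
// Returns 0 if the path belongs to the current build, the number of builds the
// client is behind if it belongs to an older one, or null if it is unknown.
function buildsBehind(requestedPath, buildManifests, currentBuild) {
  if (buildManifests.get(currentBuild)?.has(requestedPath)) return 0;

  let newest = null;
  for (const [build, assets] of buildManifests) {
    if (build < currentBuild && assets.has(requestedPath) && (newest === null || build > newest)) {
      newest = build;
    }
  }
  return newest === null ? null : currentBuild - newest;
}
```

Run over access-log paths, any non-zero result points at a client still living on an old build, and the size of the number shows how long it has been trapped.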
Q: Should I just remove the service worker entirely?
If you don’t need offline support, fast repeat-visit loads from cache, or push notifications: yes. Most content sites don’t need a SW. Keep one only if it earns its complexity.
Q: My users on Safari iOS see the issue more. Why?
iOS Safari is more aggressive about keeping SWs alive between sessions and is slower to honor skipWaiting(). The kill-switch SW is especially important for iOS.
Q: Will the kill switch break a working PWA install?
It uninstalls the SW. The next time the user visits and you have a working SW deployed, the install lifecycle starts fresh. The “Add to Home Screen” shortcut still works; PWA features are restored. See static site blank page for related symptom debugging.
Tags: #Troubleshooting #service-worker #pwa #cache #Deployment