JWT 'Token Expired' Errors on Fresh Tokens (Clock Skew)

JWT verification fails intermittently with 'token expired' even on tokens issued seconds ago. Fix the server clock drift with NTP and add JWT leeway.

Users report sporadic 401s right after login. The token was minted three seconds ago and the verifier swears it expired. The cause is almost never the token itself. It is clock drift between the auth server that stamped iat/exp and the resource server that checks exp. The same drift produces nbf (“not before”) rejections immediately after issuance. Fix it by running NTP everywhere, adding a small leeway in the verifier, and shortening token TTL once clocks are reliable.

Common causes

Ordered by hit rate.

1. Resource server clock drifted forward

The verifying server is, say, 8 seconds ahead. Tokens that have not yet expired by the issuer’s clock are rejected during their last 8 seconds, and a token whose TTL is shorter than the drift is rejected on first use.

How to spot it: chronyc tracking shows a large System time offset, or timedatectl status reports System clock synchronized: no.

2. Container with no NTP

Containers share the host kernel’s clock, so time sync has to happen on the host. That is fine on a well-synced physical host, but virtualized hosts and Kubernetes nodes can drift, and stopped/resumed VMs can jump.

How to spot it: date -u inside two pods on different nodes differs by seconds.

3. JWT library has no leeway by default

jsonwebtoken (Node), python-jose, and golang-jwt apply zero leeway by default. Even 100 ms of skew between issuer and verifier can cause occasional false rejections at the exp boundary.

How to spot it: Errors cluster around the very last second of the token lifetime.

4. nbf set to issuance time on a slightly fast issuer

Issuer’s clock is ahead, so nbf is “in the future” from the verifier’s view for a few seconds.

How to spot it: Error message says “token not yet valid” or NotBeforeError.

5. Cached time on serverless cold start

Some FaaS runtimes lazily sync time on cold start. The first request after a long idle gap can see skew until the OS catches up.

How to spot it: Error spike correlates with cold starts.
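The causes above all come down to the same two comparisons against the verifier’s own clock. The sketch below makes the arithmetic concrete. checkTimeClaims is a hypothetical helper written for illustration, not the code of any JWT library, and the 10-second TTL and offsets are made-up numbers.

```javascript
// Hypothetical helper: mirrors the usual nbf/exp comparisons a verifier makes.
// All times are Unix seconds; leewaySec widens the valid window on both sides.
function checkTimeClaims(claims, nowSec, leewaySec = 0) {
  if (claims.nbf !== undefined && nowSec + leewaySec < claims.nbf) {
    return 'not yet valid';
  }
  if (claims.exp !== undefined && nowSec - leewaySec >= claims.exp) {
    return 'expired';
  }
  return 'ok';
}

// Issuer's clock at signing time (arbitrary example value) and a 10 s TTL.
const issuedAt = 1700000000;
const claims = { iat: issuedAt, nbf: issuedAt, exp: issuedAt + 10 };

// Verifier runs 8 s ahead; the request arrives 3 s after issuance.
const verifierAhead = issuedAt + 3 + 8;
console.log(checkTimeClaims(claims, verifierAhead));     // expired
console.log(checkTimeClaims(claims, verifierAhead, 5));  // ok

// Verifier runs 2 s behind the issuer; the request arrives 1 s after issuance.
const verifierBehind = issuedAt + 1 - 2;
console.log(checkTimeClaims(claims, verifierBehind));    // not yet valid
console.log(checkTimeClaims(claims, verifierBehind, 5)); // ok
```

The same token flips between rejected and accepted purely as a function of the verifier’s clock and leeway, which is why the fix targets clocks and tolerance rather than the token.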

Shortest path to fix

Step 1: Confirm and quantify the skew

# Compare wall clock to a known good source
curl -s --head https://www.google.com | grep -i ^date
date -u

# On the host
timedatectl status
chronyc tracking

If System time offset is anything over 100 ms, fix the clock first.
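To tie the skew to a specific failing token, it can help to decode the token’s time claims on the verifying host and compare them with the local clock. The following is a minimal sketch using only Node built-ins. decodeClaims and describeTimeClaims are hypothetical names, the sample token is fabricated with a placeholder signature, and nothing here verifies the signature, so treat it as a diagnostic only.

```javascript
// Decode the payload of a compact JWT WITHOUT verifying it. Diagnostics only.
function decodeClaims(token) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('expected header.payload.signature');
  return JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
}

// Express each time claim relative to the local clock (Unix seconds).
function describeTimeClaims(claims, nowSec) {
  return {
    ageSeconds: claims.iat !== undefined ? nowSec - claims.iat : null,
    secondsUntilExpiry: claims.exp !== undefined ? claims.exp - nowSec : null,
    secondsUntilValid: claims.nbf !== undefined ? claims.nbf - nowSec : null,
  };
}

// Fabricated sample token: real header/payload encoding, placeholder signature.
const b64url = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const sampleToken = [
  b64url({ alg: 'RS256', typ: 'JWT' }),
  b64url({ sub: 'user-1', iat: 1700000000, nbf: 1700000000, exp: 1700000600 }),
  'placeholder-signature',
].join('.');

// Pretend the local clock is 2 s behind the issuer at the moment of issuance.
const report = describeTimeClaims(decodeClaims(sampleToken), 1699999998);
console.log(report);
```

In real use, pass Date.now() / 1000 as nowSec. A negative ageSeconds or a positive secondsUntilValid on a token that was just issued points at the local clock running behind the issuer; a secondsUntilExpiry far smaller than the configured TTL points at it running ahead.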

Step 2: Run NTP everywhere

For systemd hosts, prefer systemd-timesyncd (simple) or chrony (richer).

# Ubuntu/Debian with chrony
sudo apt install -y chrony
# The unit is named chrony on Debian/Ubuntu (chronyd on RHEL/Fedora)
sudo systemctl enable --now chrony
chronyc sources -v
chronyc tracking

# /etc/chrony/chrony.conf
pool time.google.com iburst
pool time.cloudflare.com iburst
makestep 1.0 3
rtcsync

For Kubernetes nodes, ensure the node itself runs chrony or systemd-timesyncd, because pods inherit the node’s clock. For VMs after a snapshot/restore, force a step.

sudo chronyc makestep

Step 3: Add leeway in the JWT verifier

Node (jsonwebtoken):

import jwt from 'jsonwebtoken';

jwt.verify(token, publicKey, {
  algorithms: ['RS256'],
  clockTolerance: 5,   // seconds of leeway applied to the nbf and exp checks
});

Node (jose, modern):

import { jwtVerify } from 'jose';

const { payload } = await jwtVerify(token, key, {
  clockTolerance: '5s',
});

Python (python-jose):

from jose import jwt
jwt.decode(token, key, algorithms=['RS256'], options={'leeway': 5})

Go (golang-jwt/jwt/v5):

parser := jwt.NewParser(jwt.WithLeeway(5 * time.Second))
token, err := parser.Parse(tokenStr, keyFunc)

Five seconds is a safe default. Avoid going above 30: leeway extends every token’s effective lifetime by the same amount, which weakens the time bound.

Step 4: Shorten token TTL once clocks are trustworthy

With NTP in place and 5 s leeway, you can keep short access tokens (5-15 min) and use refresh tokens for the long tail. That limits the damage if a token leaks.

// Issue 10 min access, 30 day refresh
const access = jwt.sign({ sub }, key, { algorithm: 'RS256', expiresIn: '10m' });
const refresh = jwt.sign({ sub, typ: 'refresh' }, key, { algorithm: 'RS256', expiresIn: '30d' });

Step 5: Monitor for skew

Emit a metric of now() - iat at the verifier. If the distribution shifts, you have drift somewhere new.

const ageSeconds = Math.floor(Date.now() / 1000) - payload.iat;
metrics.histogram('jwt_age_at_verify_seconds').record(ageSeconds);

Alert if the low end of the distribution (the minimum or p1, not p99) drops below -2 s. A negative age means the verifier clock is behind the issuer.
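Negative ages sit at the low end of the distribution, so the alert condition has to look there. The sketch below assumes a hypothetical in-memory window of age samples instead of a real metrics backend, and keeps fractional seconds so sub-second skew stays visible. The function names and sample values are illustrative.

```javascript
// Nearest-rank percentile on an ascending array; p is in (0, 100].
function percentile(sortedValues, p) {
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(0, rank - 1)];
}

// Summarize a window of (verifier now - iat) samples, in seconds.
function summarizeTokenAges(agesSeconds) {
  const sorted = [...agesSeconds].sort((a, b) => a - b);
  return {
    min: sorted[0],
    p50: percentile(sorted, 50),
    p99: percentile(sorted, 99),
  };
}

// A negative age means the token looks issued in the future,
// i.e. the verifier clock is behind the issuer clock.
function skewAlert(summary, toleranceSeconds = 2) {
  return summary.min < -toleranceSeconds;
}

const healthy = summarizeTokenAges([0.4, 1.2, 3.0, 45.1, 120.7]);
const drifted = summarizeTokenAges([-3.5, -2.8, 0.1, 44.0, 119.9]);

console.log(skewAlert(healthy)); // false
console.log(skewAlert(drifted)); // true
```

A verifier that runs ahead shows up differently: every age shifts upward by the offset, so comparing the median between hosts is a better signal for that direction than any fixed threshold.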

Prevention

  • NTP (chrony) on every host, with at least two pools listed.
  • Every JWT verifier sets explicit clockTolerance of 5 s.
  • Bake a chronyc makestep into VM resume hooks and large maintenance windows.
  • Treat clock skew as an SLO — alert when nodes drift more than 100 ms.
  • For multi-region deployments, use the same NTP sources for every host within a region, so hosts that talk to each other do not follow sources that disagree.
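The 100 ms drift alert can be driven from the System time line that chronyc tracking prints. The sketch below parses that line into a signed offset. parseSystemTimeOffset and exceedsSkewSlo are hypothetical names, the sample output is hand-written for illustration, and the sign convention (positive means the local clock is fast) is a choice made here.

```javascript
// Extract the signed offset (seconds) from `chronyc tracking` text output.
// Positive: local clock is fast of NTP time. Negative: slow. null: line not found.
function parseSystemTimeOffset(trackingOutput) {
  const match = /^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time/m
    .exec(trackingOutput);
  if (!match) return null;
  const magnitude = Number(match[1]);
  return match[2] === 'fast' ? magnitude : -magnitude;
}

// True when the absolute offset is above the SLO (default 100 ms).
function exceedsSkewSlo(offsetSeconds, sloSeconds = 0.1) {
  return Math.abs(offsetSeconds) > sloSeconds;
}

// Hand-written sample in the shape chronyc prints; values are made up.
const sampleOutput = [
  'Reference ID    : A9FEA97B (time.example.net)',
  'Stratum         : 3',
  'System time     : 0.250000000 seconds fast of NTP time',
  'Last offset     : +0.000120000 seconds',
].join('\n');

const offset = parseSystemTimeOffset(sampleOutput);
console.log(offset);                 // 0.25
console.log(exceedsSkewSlo(offset)); // true
```

On a real node the input could come from child_process.execFileSync('chronyc', ['tracking'], { encoding: 'utf8' }), with the result exported to whatever metrics system already carries node health.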

Tags: #Backend #Troubleshooting #jwt