Vercel 500 Errors: Find the Real Cause Fast

Build is green but production 500s. Diagnose FUNCTION_INVOCATION_FAILED, missing env vars, Edge runtime, and timeouts with real vercel logs commands.

Published: May 17, 2026 Updated: Jun 21, 2026 Author: AI Productivity Guide Team

Deploy succeeded, build log is all green, but loading a page or hitting an API returns 500: INTERNAL_SERVER_ERROR, usually with a code like FUNCTION_INVOCATION_FAILED or EDGE_FUNCTION_INVOCATION_FAILED underneath. “Builds clean, runs broken” almost never means a syntax bug. It means runtime context is missing: an env var that wasn’t synced to the Production scope, an Edge function hitting a Node-only API, a database pool that’s full, or an upstream API call that never returns.

Fastest fix: run vercel logs --environment production --status-code 5xx --expand --since 1h and read the first two lines of the stack trace. They name the real cause. Everything below is ordered by how often each cause is the one you’ll find there.

Which bucket are you in?

Match the code shown on the error page (or in vercel logs) to the likely cause:

Error code on the 500 page	Most likely cause	Jump to
`FUNCTION_INVOCATION_FAILED`	Unhandled exception: missing env var, bad import, `catch` that doesn’t return	Causes 1, 2, 6
`FUNCTION_INVOCATION_TIMEOUT`	Upstream call or query never returned before `maxDuration`	Cause 3
`EDGE_FUNCTION_INVOCATION_FAILED`	Edge function used a Node-only API (`fs`, `crypto`, `Buffer`, `process`)	Cause 2
`NO_RESPONSE_FROM_FUNCTION`	Handler finished without returning a `Response`	Cause 6
500 with DB error in logs	Connection pool exhausted	Cause 5

The full list of platform codes lives in Vercel’s error reference.

Common causes

Ordered by hit rate, highest first.

1. Env var missing on Production (wrong scope)

Works fine in Preview, 500s on Production. In the Vercel dashboard, open Settings, then Environment Variables, and look at each row’s environment chips: Production, Preview, and Development are independent. Classic miss: a newly added OPENAI_API_KEY is enabled for Preview but not Production.

TypeError: Cannot read properties of undefined (reading 'startsWith')
  at new OpenAI (/var/task/node_modules/openai/index.js:42)

How to spot it: vercel env ls production lists what production actually receives. Diff that against every process.env.XXX reference in your code. A reference that resolves to undefined is the smoking gun.
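One way to make this failure obvious in the logs is to check required variables before any SDK is constructed, so the error names the missing key instead of surfacing as a downstream TypeError. A minimal sketch, assuming hypothetical helper names (`missingEnvKeys`, `assertEnv`) and a hypothetical list of required keys; none of this is part of Vercel or the OpenAI SDK:

```typescript
// Hypothetical helper: returns the names of required variables that are
// unset or blank in the given environment object.
function missingEnvKeys(
  required: readonly string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((key) => (env[key] ?? "").trim() === "");
}

// Hypothetical helper: throws a single error that lists every missing key.
function assertEnv(
  required: readonly string[],
  env: Record<string, string | undefined>,
): void {
  const missing = missingEnvKeys(required, env);
  if (missing.length > 0) {
    throw new Error(`Missing env vars: ${missing.join(", ")}`);
  }
}

// Possible usage at the top of a route module (the key list is an assumption):
// assertEnv(["OPENAI_API_KEY"], process.env);
```

With a check like this, the first line of the stack trace in the function logs would read "Missing env vars: OPENAI_API_KEY" rather than pointing into node_modules.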

2. Edge runtime uses a Node-only API

The function exports runtime = 'edge', but the code imports fs from 'fs', calls require('crypto').createHash, reads process.cwd(), or uses Node’s Buffer. Local dev runs on Node so it works; deployed to Edge, it fails because Edge only ships a subset of Web APIs.

Error: The package "fs" wasn't found on the file system but is built into node.

In a Next.js build you’ll often see this variant instead:

A Node.js API is used (process.cwd at line: 1451) which is not supported in the Edge Runtime.

How to spot it: search function logs (or build output) for not supported in the Edge Runtime or built into node. The 500 page typically shows EDGE_FUNCTION_INVOCATION_FAILED.

3. Upstream call has no timeout

You call OpenAI, Anthropic, or Stripe with no AbortController. The upstream hangs while your function runs out of time and is killed before the reply arrives.

Task timed out after 300.00 seconds
FUNCTION_INVOCATION_TIMEOUT

Note the duration: as of June 2026, Fluid Compute is enabled by default, so the default maxDuration is 300 seconds (5 minutes) on Hobby, Pro, and Enterprise unless you’ve set it lower. (Older Vercel docs and tutorials cite a 10s Hobby / 60s Pro default; that predates Fluid Compute.) A function that “hangs” for the full 300s is almost always a missing upstream timeout, not slow code.

How to spot it: function logs show Task timed out with a duration that equals your configured maxDuration.

4. Slow cold start + heavy dependencies

A large function bundle slows cold starts and can push first-byte time past your maxDuration. Usual culprit: importing all of aws-sdk or firebase-admin when you only need one submodule. The uncompressed bundle limit is 250 MB; you want to stay far below it.

How to spot it:

vercel inspect <deployment-url> --logs
# Check the Functions section for bundle size and cold-start duration

5. Database connection pool exhausted

Each serverless invocation can open a fresh connection. PostgreSQL defaults to 100 concurrent connections (max_connections) and MySQL to 151, and managed providers often set their own cap; a traffic burst uses up every slot and every subsequent function 500s.

Error: remaining connection slots are reserved for non-replication superuser connections

How to spot it: your DB provider’s dashboard (Supabase, Neon, or PlanetScale) shows active connections pegged at the limit.
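The arithmetic behind the exhaustion is worth spelling out. The sketch below assumes PostgreSQL defaults (max_connections = 100, superuser_reserved_connections = 3), node-postgres’s default pool size of 10, and a hypothetical burst of 60 concurrent function instances; the function names are illustrative only:

```typescript
// Worst case: every concurrent instance fills its own pool.
function peakConnections(instances: number, maxPerInstance: number): number {
  return instances * maxPerInstance;
}

// Slots left for the app after the database reserves some for superusers.
function usableSlots(maxConnections: number, reserved: number): number {
  return Math.max(0, maxConnections - reserved);
}

const usable = usableSlots(100, 3);        // 97 slots available to the app
const uncapped = peakConnections(60, 10);  // 600: each instance may open 10
const capped = peakConnections(60, 1);     // 60: pool max of 1 per instance

console.log(uncapped > usable); // true: the burst exhausts the database
console.log(capped > usable);   // false: the same burst fits
```

The reserved slots are why the error message mentions superuser connections: ordinary clients are refused once only the reserved slots remain.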

6. catch block logs but doesn’t return

try { ... } catch (e) { console.error(e) } with no return new Response(...). The function finishes without responding, and Vercel reports NO_RESPONSE_FROM_FUNCTION or a 500.

How to spot it: function logs show the stack trace printed by console.error, then the 5xx, with none of the log lines you would expect after the failing call. The handler fell out of the catch block without returning a response.
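One way to rule this cause out across a codebase is to wrap handlers so that a thrown error always becomes a response. A minimal sketch, assuming a hypothetical wrapper named `withErrorResponse` and handlers that take a standard `Request`; it is not a Vercel or Next.js API:

```typescript
type Handler = (req: Request) => Promise<Response>;

// Hypothetical wrapper: logs any thrown error and still returns a Response,
// so the function never finishes without responding.
function withErrorResponse(handler: Handler): Handler {
  return async (req: Request): Promise<Response> => {
    try {
      return await handler(req);
    } catch (e) {
      console.error(e); // keep the stack trace in the function logs
      return new Response("Internal error", { status: 500 });
    }
  };
}

// Possible usage in a route file (handler body is illustrative):
// export const POST = withErrorResponse(async (req) => {
//   const body = await req.json();
//   return Response.json({ received: body });
// });
```

The wrapper only covers errors thrown inside the handler; a handler that returns undefined on some branch still needs a type check or lint rule to catch it.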

Shortest path to fix

Step 1: Grab the real error with vercel logs

The vercel logs command was rebuilt in February 2026 and now queries historical logs with native filters, so you no longer have to pipe everything through grep. Filter straight to the production 500s:

# All production 5xx errors from the last hour, with full stack traces
vercel logs --environment production --status-code 5xx --expand --since 1h

# Narrow to just edge functions, or to one request
vercel logs --source edge-function --level error --since 1h
vercel logs --request-id <req_xxxxx> --expand

# Stream live while you reproduce (follows for up to 5 minutes)
vercel logs --follow

Prefer machine-readable output? Add --json and pipe to jq. You can still reach the same logs in the dashboard (Deployments, then the latest deployment, then Functions, then Logs), but the CLI is faster and fully searchable. Copy the full stack: the first two lines name the real cause.

Step 2: Reconcile env vars

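# List all env var keys in the production scope
vercel env ls production

# Pull the production values locally (the default target is development)
vercel env pull .env.production --environment=production

# Compare defined keys against keys referenced in code.
# Lines starting with ">" are referenced in code but missing from production.
diff <(grep -oE '^[A-Z0-9_]+' .env.production | sort -u) \
     <(grep -rhoE 'process\.env\.[A-Z0-9_]+' src/ | sed 's/^process\.env\.//' | sort -u)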

Newly added or rescoped env vars only take effect after a redeploy (click Redeploy in the dashboard, or push a new commit). Editing a variable does not retroactively patch the running deployment.

Step 3: Flip Edge back to Node to verify

If the stack mentions Edge Runtime or not supported, set the runtime back to Node:

// app/api/chat/route.ts
export const runtime = 'nodejs';  // was 'edge'
export const maxDuration = 30;    // explicit cap; default is 300 with Fluid Compute

Redeploy and watch for a few hours. If it’s stable, decide whether to rewrite Edge-safe (replace axios with fetch, replace Node crypto with Web Crypto’s crypto.subtle) or simply stay on Node. Edge only buys you lower latency; for most API routes that call an upstream model, Node is the safer default.
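If you do choose the Edge-safe rewrite, hashing is a common first case. The sketch below shows one way to replace `require('crypto').createHash('sha256')` with Web Crypto. It assumes a global `crypto` object (available in the Edge runtime, browsers, and Node 20 or later); the function name `sha256Hex` is illustrative:

```typescript
// Edge-safe SHA-256 using the Web Crypto API instead of Node's createHash.
async function sha256Hex(input: string): Promise<string> {
  const bytes = new TextEncoder().encode(input); // UTF-8 encode the string
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  // Convert the ArrayBuffer to a lowercase hex string.
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}
```

Note that `crypto.subtle.digest` is asynchronous, so callers that used the synchronous Node API need an `await`.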

Step 4: Wrap every upstream fetch with a timeout

export async function POST(): Promise<Response> {
  const controller = new AbortController();
  // Abort the upstream request if it has not answered within 8 seconds
  const timeout = setTimeout(() => controller.abort(), 8000);

  try {
    const res = await fetch('https://api.openai.com/v1/...', {
      signal: controller.signal,
    });
    return Response.json(await res.json());
  } catch (e) {
    // fetch rejects with an error named 'AbortError' when the signal fires
    if (e instanceof Error && e.name === 'AbortError') {
      return new Response('Upstream timeout', { status: 504 });
    }
    console.error(e);
    return new Response('Internal error', { status: 500 });
  } finally {
    clearTimeout(timeout);
  }
}

Critical: every catch must return a Response, not just console.error. An 8-second abort returns a clean 504 to the client instead of letting the function burn the full 300s and die with a timeout.
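If several routes call upstream services, the same pattern can be pulled into a helper. The following is a sketch under stated assumptions: `fetchWithTimeout`, `isTimeoutError`, and the injectable `fetchImpl` parameter are hypothetical names introduced here for illustration and testing, not part of any SDK:

```typescript
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// True for errors raised by an aborted request. 'TimeoutError' is included
// because AbortSignal.timeout() rejects with that name instead of 'AbortError'.
function isTimeoutError(e: unknown): boolean {
  if (typeof e !== "object" || e === null) return false;
  const name = (e as { name?: unknown }).name;
  return name === "AbortError" || name === "TimeoutError";
}

// Hypothetical helper: resolves with the upstream response, or with a 504
// if no response arrives within timeoutMs. Other errors are rethrown.
async function fetchWithTimeout(
  url: string,
  timeoutMs: number,
  init: RequestInit = {},
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } catch (e) {
    if (isTimeoutError(e)) {
      return new Response("Upstream timeout", { status: 504 });
    }
    throw e;
  } finally {
    clearTimeout(timer); // the timer stops once the response has arrived
  }
}
```

One limitation of this sketch: the timer is cleared as soon as the response object arrives, so a slow response body read after that point is not covered by the timeout.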

Step 5: Use a pooled DB connection

Don’t open raw Postgres connections from serverless. Use Prisma Accelerate, Neon’s @neondatabase/serverless, or Supabase’s transaction pooler URL:

// Use the pgBouncer pooler on port 6543 instead of 5432
import { Pool } from 'pg';
const pool = new Pool({
  connectionString: process.env.DATABASE_URL_POOLER,
  max: 1,  // one connection per serverless instance
});

How to confirm it’s fixed

Redeploy, then curl -i https://your-app.vercel.app/api/<route> and confirm HTTP/2 200.
Re-run vercel logs --environment production --status-code 5xx --since 10m. An empty result is what you want.
If the cause was a timeout or DB pool, generate a small burst (for example for i in {1..30}; do curl -s -o /dev/null -w "%{http_code}\n" https://your-app.vercel.app/api/<route> & done; wait) and confirm every response is 200.
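The burst check above can also be scripted so the result is a tally rather than 30 lines of output. A minimal sketch, assuming a hypothetical function name `burstStatuses` and an injectable `fetchImpl` parameter added here so the logic can be exercised without a network:

```typescript
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// Fires `count` concurrent requests and tallies the HTTP status codes.
// A status of 0 stands for a request that failed before any response arrived.
async function burstStatuses(
  url: string,
  count: number,
  fetchImpl: FetchLike = fetch,
): Promise<Map<number, number>> {
  const statuses = await Promise.all(
    Array.from({ length: count }, async () => {
      try {
        return (await fetchImpl(url)).status;
      } catch {
        return 0;
      }
    }),
  );
  const tally = new Map<number, number>();
  for (const status of statuses) {
    tally.set(status, (tally.get(status) ?? 0) + 1);
  }
  return tally;
}

// Possible usage (the URL is a placeholder):
// const tally = await burstStatuses("https://your-app.vercel.app/api/<route>", 30);
// A healthy route yields a single entry: 200 -> 30.
```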

Prevention

Add an env-var check to CI: grep every process.env.XXX reference and diff against vercel env ls production; fail if anything is missing.
Force a timeout (8s or below) on every upstream call; require every catch to return a Response.
Lint Edge functions to forbid Node-only imports (fs, path, crypto, net, process).
Run a post-deploy health check that curls critical API endpoints and asserts 200.
Always route DB traffic through a pooler URL; alert when active connections hit 80% of the cap.
Set an explicit maxDuration per function so a hung request fails fast instead of consuming the full 300s default.
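The first item in this list, the CI env-var check, could be sketched as follows. The function names are hypothetical, and the list of production keys is assumed to come from wherever your pipeline stores the output of vercel env ls production:

```typescript
// Collects the distinct names referenced as process.env.NAME in source text.
// Bracket access (process.env["NAME"]) and destructuring are not detected.
function referencedEnvKeys(source: string): string[] {
  const matches = source.match(/process\.env\.[A-Za-z_][A-Za-z0-9_]*/g) ?? [];
  const keys = new Set(matches.map((m) => m.slice("process.env.".length)));
  return Array.from(keys).sort();
}

// Returns referenced names that are absent from the production key list.
function undefinedInProduction(
  source: string,
  productionKeys: readonly string[],
): string[] {
  const defined = new Set(productionKeys);
  return referencedEnvKeys(source).filter((key) => !defined.has(key));
}

// Example with inline source text and an assumed production key list:
const sample = `
  const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  const db = connect(process.env.DATABASE_URL);
`;
console.log(undefinedInProduction(sample, ["DATABASE_URL"]));
// prints [ 'OPENAI_API_KEY' ]
```

A CI step would fail the build when the returned array is non-empty. Names the platform or framework provides on its own, such as NODE_ENV, would need an allowlist.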

FAQ

Why does it work in Preview but 500 only in Production? Almost always an env var scoped to Preview but not Production. Run vercel env ls production and compare against vercel env ls preview. Anything present in one and missing in the other is your candidate.

My function “times out after 300 seconds” but my code is fast. Why? You’re hitting the default maxDuration because Fluid Compute is on. The 300s is the ceiling, not your code’s real runtime; a fast function only burns 300s when it’s stuck waiting on an upstream call or query that never returns. Add an AbortController (Step 4) so the call fails in seconds.

Where do I see the actual stack trace, not just 500: INTERNAL_SERVER_ERROR? The browser only shows the generic code. The real exception is in runtime logs: vercel logs --environment production --status-code 5xx --expand, or the dashboard Logs page. The --expand flag prints the full message under each request line.

Do I have to redeploy after changing an env var? Yes. Env vars are injected at build/deploy time, so an existing deployment keeps the old values until you redeploy or push a new commit.

EDGE_FUNCTION_INVOCATION_FAILED vs FUNCTION_INVOCATION_FAILED, what’s the difference? EDGE_* means the failure happened in an Edge function, most often because it used a Node-only API. The plain version means a regular (Node) serverless function threw. The fix path differs: Edge failures usually mean switching runtime to 'nodejs' or removing the Node API (Cause 2).

Tags: #Hosting #Debug #Troubleshooting #Vercel
