A web page or API is slow — how do you find the cause?

Question

Accepted Answer

**Don't guess — measure first.** Slowness can come from the client, the network, the server, or the database. A methodical approach finds *where* the time goes, then fixes the **biggest contributor** instead of optimizing randomly.

## A methodical sequence

```text
1. MEASURE → where is the time spent? client render, network, server, DB?
2. REPRODUCE → confirm it reliably (same endpoint, payload, user)
3. TRACE → use APM/distributed traces to find the slow span
4. CHECK RECENT CHANGES → deploys, config, traffic, data growth
5. ISOLATE → layer by layer, narrow to one component
6. FIX the biggest contributor → re-measure to confirm
```

## Measure first: where is the time?

Use the browser **Network/Performance** tab and server-side **timing** (for example, a `Server-Timing` response header or per-request log timings) to split the total into stages. A useful breakdown:

```text
Total 1200ms =
  DNS/connect      20ms
  server TTFB     900ms   ← the bottleneck is server-side
  download         80ms
  client render   200ms
```
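On the server side, a breakdown like the one above can be collected with a small timer around each stage. The following is a minimal sketch, not a production profiler: the stage names (`auth`, `db`, `render`), the `handle_request` function, and the sleeps standing in for real work are all hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, sink):
    """Record how long the wrapped block takes, in milliseconds, under sink[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        sink[stage] = (time.perf_counter() - start) * 1000.0


def biggest_contributor(sink):
    """Return the name of the stage with the largest recorded duration."""
    return max(sink, key=sink.get)


def server_timing_header(sink):
    """Format the stages as a Server-Timing header value, e.g. 'db;dur=53.0'."""
    return ", ".join(f"{stage};dur={ms:.1f}" for stage, ms in sink.items())


def handle_request(sink):
    # Hypothetical request handler: each sleep stands in for real work.
    with timed("auth", sink):
        time.sleep(0.001)
    with timed("db", sink):
        time.sleep(0.02)
    with timed("render", sink):
        time.sleep(0.002)


timings = {}
handle_request(timings)
print(server_timing_header(timings))
print("slowest stage:", biggest_contributor(timings))
```

Sending the formatted value in a `Server-Timing` response header lets the browser's Network tab show the server stages next to its own connect and download timings, so client and server numbers can be read side by side.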

Look at **percentiles, not averages**: **p50** (the median, a typical request) vs **p99** (the tail, which only the slowest 1% of requests exceed). A fast p50 with a slow p99 points to intermittent issues such as lock contention, cold caches, a slow DB replica, or GC pauses, not a uniform problem.
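To see why the average hides this, here is a small sketch that computes the mean, p50, and p99 of a made-up latency sample. The numbers are invented for illustration, and `percentile` uses the simple nearest-rank definition; monitoring tools may interpolate differently.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms: 97 fast requests and 3 slow outliers.
latencies_ms = [100] * 97 + [2000] * 3

mean = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"mean={mean:.0f}ms p50={p50}ms p99={p99}ms")
# prints: mean=157ms p50=100ms p99=2000ms
```

The mean of 157ms describes no real request: most take 100ms and a few take 2000ms. Only the p99 makes the slow tail visible.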

## Trace the slow span

APM tools record a trace for each request: a tree of timed spans that shows where the time goes inside it.

```text
GET /orders                       950ms
  ├─ auth check                    10ms
  ├─ SELECT orders                 30ms
  └─ loop: SELECT user per order  900ms   ← N+1 query, the real cause
```

The trace points straight at the offending call. Then check **recent changes**: a deploy, a config change, a dropped index, or 10x data growth often explains a sudden regression.
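The N+1 pattern in the trace, and one common fix for it, can be sketched with an in-memory SQLite database. The `users` and `orders` schema and both function names are assumptions made up for this example; the real application's tables and ORM will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
""")


def orders_n_plus_one(conn):
    """One query for the orders, then one more per order: N+1 round trips."""
    queries = 1
    rows = conn.execute("SELECT id, user_id FROM orders ORDER BY id").fetchall()
    result = []
    for order_id, user_id in rows:
        name = conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()[0]
        queries += 1
        result.append((order_id, name))
    return result, queries


def orders_joined(conn):
    """The same rows from a single JOIN: one round trip regardless of order count."""
    rows = conn.execute(
        "SELECT o.id, u.name FROM orders o JOIN users u ON u.id = o.user_id ORDER BY o.id"
    ).fetchall()
    return rows, 1


slow_rows, slow_queries = orders_n_plus_one(conn)
fast_rows, fast_queries = orders_joined(conn)
print(slow_queries, fast_queries)
```

Both functions return the same rows, but the first issues a query count that grows with the number of orders. With an ORM, the equivalent fix is usually eager loading rather than a hand-written JOIN.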

## Why it matters

Guessing wastes hours optimizing the wrong layer. Measuring first, tracing the slow span, and looking at **p50 vs p99** turns a vague "it's slow" into a specific, fixable cause — and re-measuring proves the fix actually worked.