Don't guess — measure first. Slowness can come from the client, the network, the server, or the database. A methodical approach finds where the time goes, then fixes the biggest contributor instead of optimizing randomly.
1. MEASURE → where is the time spent? client render, network, server, DB?
2. REPRODUCE → confirm it reliably (same endpoint, payload, user)
3. TRACE → use APM/distributed traces to find the slow span
4. CHECK RECENT CHANGES → deploys, config, traffic, data growth
5. ISOLATE → layer by layer, narrow to one component
6. FIX the biggest contributor → re-measure to confirm
Use the browser Network/Performance tab and server timing to split the total. A useful breakdown:
Total 1200ms =
DNS/connect 20ms
server TTFB 900ms ← the bottleneck is server-side
download 80ms
client render 200ms
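As a rough sketch, the same breakdown can be computed in a few lines of Python. The numbers are the ones from the example above; the names `phases`, `share`, and `bottleneck` are illustrative and do not come from any particular tool.

```python
# Phase timings in milliseconds, copied from the breakdown above.
phases = {
    "DNS/connect": 20,
    "server TTFB": 900,
    "download": 80,
    "client render": 200,
}

total_ms = sum(phases.values())

# The phase with the largest timing is the first candidate to fix.
bottleneck = max(phases, key=phases.get)


def share(phase: str) -> float:
    """Return the phase's share of the total time as a percentage."""
    return 100 * phases[phase] / total_ms


# Print the phases from slowest to fastest with their share of the total.
for name, ms in sorted(phases.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {ms:>5} ms  {share(name):5.1f}%")
print(f"bottleneck: {bottleneck}")
```

Sorting by share makes the priority visible: here the server accounts for three quarters of the total, so shaving the client render would barely move the overall number.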
Look at percentiles, not averages: p50 (the typical request) vs p99 (the latency that only the slowest 1% of requests exceed). A fast p50 with a slow p99 points to intermittent issues, such as lock contention, cold caches, a slow DB replica, or GC pauses, rather than a uniform problem.
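To illustrate why the average hides the tail, here is a minimal sketch using the nearest-rank percentile definition (one of several common definitions; monitoring tools may interpolate differently). The latency samples are hypothetical, made up for this example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms: 98 fast requests and 2 slow outliers.
latencies_ms = [100] * 98 + [3000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms  p50={p50} ms  p99={p99} ms")
```

With these samples the mean lands between the two groups and describes no real request, while p50 and p99 show both the typical case and the tail.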
APM tools (traces) show exactly where time goes inside a request:
GET /orders 950ms
├─ auth check 10ms
├─ SELECT orders 30ms
└─ loop: SELECT user per order 900ms ← N+1 query, the real cause
The trace points straight at the offending call. Then check recent changes — a deploy, a missing index, or 10x data growth often explains a sudden regression.
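The N+1 pattern from the trace can be reproduced in miniature. The sketch below uses an in-memory SQLite database with a hypothetical `users`/`orders` schema and a hypothetical `run` helper that counts queries; the real application's schema and data access layer are not known, so treat it as an illustration of the pattern and of one common fix (a single JOIN), not as the actual code.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER REFERENCES users(id));
    INSERT INTO users VALUES (1, 'ana'), (2, 'bo'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

stats = {"queries": 0}


def run(sql, params=()):
    """Execute one query and count it, standing in for one span in a trace."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def orders_n_plus_one():
    # One query for the orders, then one more query per order for its user.
    orders = run("SELECT id, user_id FROM orders ORDER BY id")
    return [
        (order_id, run("SELECT name FROM users WHERE id = ?", (user_id,))[0][0])
        for order_id, user_id in orders
    ]


def orders_joined():
    # A single JOIN returns the same rows in one round trip.
    return run(
        "SELECT o.id, u.name FROM orders o "
        "JOIN users u ON u.id = o.user_id ORDER BY o.id"
    )


stats["queries"] = 0
slow = orders_n_plus_one()
n_plus_one_queries = stats["queries"]

stats["queries"] = 0
fast = orders_joined()
joined_queries = stats["queries"]

print(f"N+1: {n_plus_one_queries} queries, JOIN: {joined_queries} query")
```

The query count grows with the number of orders in the first version and stays constant in the second, which is why the problem often appears only after the data grows.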
Guessing wastes hours optimizing the wrong layer. Measuring first, tracing the slow span, and comparing p50 with p99 turn a vague "it's slow" into a specific, fixable cause, and re-measuring proves the fix actually worked.