Start top-down, not bottom-up. The most reliable host fleet is worthless if requests are failing, so start with user-facing SLIs (latency, errors, availability), then add the four golden signals, and add infrastructure metrics last.
1. USER-FACING SLIs → what the user experiences (latency, errors, availability)
2. GOLDEN SIGNALS → latency, traffic, errors, saturation per service
3. INFRA METRICS → CPU, memory, disk, network (causes, not symptoms)
If you only watch CPU and disk (bottom-up), every dashboard can be green while users are getting 500s. Watching SLIs first (top-down) means you alert on symptoms that users actually feel, then drill down into the golden signals and infra metrics to find the cause.
INSTRUMENT app emits metrics/logs/traces (e.g. request_duration_seconds histogram)
↓
COLLECT a TSDB scrapes/ingests them (Prometheus, Datadog agent)
↓
DASHBOARD visualize SLIs + golden signals (Grafana) for humans to read
↓
ALERT fire on SLO violations / burn rate, routed to on-call
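To make the INSTRUMENT step concrete, here is a minimal, standard-library-only sketch of what a latency histogram records. The class name `LatencyHistogram`, its default bucket bounds, and its methods are hypothetical and for illustration only; a real service would normally rely on its metrics client library rather than hand-rolled code like this.

```python
import bisect
import math


class LatencyHistogram:
    """Hypothetical in-process histogram with buckets keyed by upper bound (`le`)."""

    def __init__(self, bounds=(0.05, 0.1, 0.3, 1.0)):
        # Assumed bucket bounds in seconds; a final +Inf bucket catches everything else.
        self.bounds = sorted(bounds) + [math.inf]
        self.counts = [0] * len(self.bounds)  # per-bucket (non-cumulative) counts
        self.total_seconds = 0.0

    def observe(self, seconds: float) -> None:
        # bisect_left returns the first bucket whose upper bound is >= the value,
        # so an observation equal to a bound lands in that bound's bucket.
        idx = bisect.bisect_left(self.bounds, seconds)
        self.counts[idx] += 1
        self.total_seconds += seconds

    def cumulative(self):
        # Export as cumulative (upper_bound, count) pairs, the shape a collector
        # would scrape for a *_bucket series.
        running = 0
        out = []
        for bound, count in zip(self.bounds, self.counts):
            running += count
            out.append((bound, running))
        return out


hist = LatencyHistogram()
for latency in (0.02, 0.1, 0.2, 5.0):
    hist.observe(latency)
```

The cumulative form matters because the collection and alerting stages work on "how many requests finished within X seconds", not on individual timings.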
# Availability SLI: fraction of requests that succeed
sum(rate(http_requests_total{status!~"5.."}[5m]))
/ sum(rate(http_requests_total[5m]))
# Latency SLI: p99 request latency
histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
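The two queries above can be mirrored in plain Python to show the arithmetic involved. This is a sketch under stated assumptions: `availability` and `bucket_quantile` are hypothetical helper names, the bucket data is made up, and the quantile estimate uses linear interpolation inside the matching bucket, which approximates how bucket-based quantiles are usually estimated rather than reproducing any engine exactly.

```python
import math


def availability(total: float, errors_5xx: float) -> float:
    """Fraction of requests that did not return a 5xx response."""
    if total == 0:
        return 1.0  # assumption: no traffic is treated as fully available
    return (total - errors_5xx) / total


def bucket_quantile(q: float, buckets: list) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) pairs.

    `buckets` must be sorted by upper bound and end with a +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls beyond the last finite bound; report that bound.
                return prev_bound
            if count == prev_count:
                return bound  # empty bucket edge case (e.g. q == 0)
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up sample: 1000 requests, 1 of them a 5xx.
sample_buckets = [(0.1, 900), (0.3, 990), (1.0, 999), (math.inf, 1000)]
sli_availability = availability(total=1000, errors_5xx=1)
sli_p99 = bucket_quantile(0.99, sample_buckets)
```

Because the estimate is interpolated, its accuracy depends on bucket layout: placing a bucket bound at or near the latency target (for example 0.3 seconds) keeps the p99 estimate meaningful around the threshold that matters.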
Set an SLO on each SLI (for example 99.9% availability, p99 < 300ms), put them on a dashboard, and alert when the SLO is at risk, not on every blip.
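One common way to express "the SLO is at risk" is a burn rate: the observed error ratio divided by the error budget the SLO allows. The sketch below assumes a 99.9% availability SLO and a two-window check; the function names are hypothetical, and the 14.4 threshold is an assumption borrowed from common multi-window practice (it corresponds to spending 2% of a 30-day budget in one hour), not something the text above prescribes.

```python
def error_budget(slo_target: float) -> float:
    """Allowed failure fraction, e.g. 0.001 for a 99.9% SLO."""
    return 1.0 - slo_target


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    return error_ratio / error_budget(slo_target)


def should_page(long_window_ratio: float, short_window_ratio: float,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Page only if both windows burn fast.

    The long window (e.g. 1h) shows the problem is sustained; the short
    window (e.g. 5m) shows it is still happening right now.
    """
    return (burn_rate(long_window_ratio, slo_target) >= threshold
            and burn_rate(short_window_ratio, slo_target) >= threshold)


SLO = 0.999  # assumed availability target

# Sustained 2% errors in both windows: burn rate is about 20, so page.
sustained_outage = should_page(0.02, 0.02, SLO)

# A brief spike only in the short window: long-window burn is about 0.5, so no page.
brief_blip = should_page(0.0005, 0.05, SLO)
```

Requiring both windows to exceed the threshold is what filters out short blips while still catching incidents that would drain the error budget quickly.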
Monitoring built bottom-up tells you that a disk is 80% full, but not that customers cannot check out. Starting from user-facing SLIs ties every dashboard and alert back to real user impact, produces less noise, and gives you a clear drill-down path (symptom → golden signals → infra cause) when something breaks.