Observability rests on three pillars (logs, metrics, and traces), and the goal is to answer "what is wrong, and why" for a system too large to inspect by hand. At scale, the strategy comes down to correlation, sampling, and cost control.
| Pillar | Answers | Tools |
|---|---|---|
| Metrics | Is something wrong? (rates, trends) | Prometheus, Grafana |
| Traces | Where in the request path? | OpenTelemetry, Jaeger |
| Logs | What exactly happened? | ELK, Loki |
Metrics alert ─▶ trace pinpoints the slow service ─▶ logs explain the cause
(broad)          (path)                              (detail)
A correlation/trace ID should flow through metrics (typically as exemplars), log lines, and spans; only then can you pivot from one pillar to another.
log line: level=error trace_id=abc123 service=payments msg="gateway timeout"
                      ^^^^^^^^^^^^^^^ same id appears in the trace + metrics
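As a minimal sketch of how a trace ID can reach every log line, the example below uses only the Python standard library. The names `current_trace_id`, `JsonFormatter`, `build_logger`, and `handle_request` are hypothetical and chosen for illustration; in a real service the ID would normally come from the tracing library's active span rather than being passed in by hand.

```python
import contextvars
import io
import json
import logging

# Hypothetical context variable holding the current request's trace ID.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that includes the active trace ID."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        payload = {
            "level": record.levelname.lower(),
            "trace_id": current_trace_id.get(),
            "service": self.service,
            "msg": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(service, stream):
    """Return a logger that writes JSON lines for `service` to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger = logging.getLogger(f"demo.{service}")
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, trace_id):
    """Simulate a request: bind the trace ID, log a failure, then unbind."""
    token = current_trace_id.set(trace_id)
    try:
        logger.error("gateway timeout")
    finally:
        current_trace_id.reset(token)


# Usage: capture one log line in memory and parse it back.
buffer = io.StringIO()
handle_request(build_logger("payments", buffer), "abc123")
record = json.loads(buffer.getvalue())
print(record)
```

Because the ID lives in a context variable, code deeper in the call stack can log without having the ID threaded through every function signature.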
✓ Standardize: OpenTelemetry across all services
✓ Use structured (JSON) logs — queryable, not grep-only
✓ Sample traces (e.g. keep all errors + 1% of success) to control cost
✓ Define SLOs and alert on symptoms (latency/error rate), not noise
✓ RED/USE methods for dashboards (RED: Rate, Errors, Duration; USE: Utilization, Saturation, Errors)
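The sampling rule from the checklist (keep all errors plus roughly 1% of successes) could be sketched as below. This is an illustration, not a production sampler: `keep_trace` and its parameters are hypothetical names. The decision needs to know whether the trace failed, so it can only be made after the trace completes (tail-based sampling); with OpenTelemetry this is typically configured in the Collector rather than written by hand.

```python
import random


def keep_trace(is_error, success_rate=0.01, rng=random.random):
    """Decide whether to retain a finished trace.

    Errors are always kept; successful traces are kept with
    probability `success_rate`.
    """
    if is_error:
        return True
    return rng() < success_rate


# Usage: simulate 100,000 successful and 500 failed requests
# with a seeded generator so the run is repeatable.
seeded = random.Random(42)
kept_success = sum(keep_trace(False, rng=seeded.random) for _ in range(100_000))
kept_errors = sum(keep_trace(True, rng=seeded.random) for _ in range(500))
print(f"kept {kept_errors} error traces and {kept_success} successful traces")
```

Every failed trace survives, while stored successful traces shrink to about one hundredth of the original volume, which is where most of the cost saving comes from.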
Capturing everything at 100% is not feasible and drowns the signal. Sample, structure, and alert on SLOs instead.
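One common way to make "alert on SLOs" concrete is the error-budget burn rate: the observed error rate divided by the error rate the SLO allows. The sketch below assumes a request-based availability SLO; the function names, the 99.9% target, and the paging threshold of 10 are illustrative assumptions, not values from this document.

```python
def error_budget(slo_target):
    """Fraction of requests allowed to fail under the SLO (e.g. 0.001 for 99.9%)."""
    return 1.0 - slo_target


def burn_rate(errors, total, slo_target):
    """How fast the budget is being spent in a window; 1.0 means exactly on budget."""
    if total == 0:
        return 0.0
    return (errors / total) / error_budget(slo_target)


def should_page(errors, total, slo_target, threshold):
    """Page only when the budget is burning at `threshold` times the allowed rate."""
    return burn_rate(errors, total, slo_target) >= threshold


# Usage with hypothetical numbers: 120 failures out of 10,000 requests
# in the last hour, against a 99.9% SLO and an assumed threshold of 10.
rate = burn_rate(errors=120, total=10_000, slo_target=0.999)
print(f"burn rate: {rate:.1f}, page: {should_page(120, 10_000, 0.999, threshold=10)}")
```

Alerting on burn rate ties pages to user-visible symptoms: a brief blip that barely touches the budget stays quiet, while a sustained failure pages quickly.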
In a system this size, you cannot SSH into a machine and poke around; observability is the only way to understand production behavior.
The winning strategy is correlated, sampled, and SLO-driven: it surfaces real problems quickly without bankrupting you on telemetry storage or drowning the on-call engineer in alerts.