The observability tooling landscape splits by pillar (metrics, logs, traces), plus all-in-one managed platforms. The main choice is self-hosted versus managed, driven by team size, budget, and scale.
Tools by pillar
METRICS     Prometheus → pull-based scraping, time-series DB, PromQL query language
            Grafana    → dashboards on top of Prometheus (and many other sources)
LOGS        ELK        → Elasticsearch + Logstash + Kibana (powerful, heavy to run)
            Loki       → "Prometheus for logs": cheap, indexes labels not full text
TRACES      Jaeger     → distributed tracing, OpenTelemetry-compatible
            Tempo      → trace backend that pairs with Grafana/Loki
ALL-IN-ONE  Datadog    → managed metrics + logs + traces + APM in one product
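
To make the "pull-based scraping" entry for Prometheus concrete: the service exposes its current counter values over HTTP, and the collector fetches them on a schedule. Below is a minimal sketch using only the Python standard library, not a production setup. The metric name demo_http_requests_total, the REQUESTS dict, the serve() helper, and port 8000 are all hypothetical choices made for illustration; a real service would more likely use an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters, keyed by HTTP method (illustration only).
REQUESTS = {"GET": 0, "POST": 0}


def render_metrics(requests_by_method):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_http_requests_total Requests handled, by method.",
        "# TYPE demo_http_requests_total counter",
    ]
    # One sample line per label value, sorted for stable output.
    for method, count in sorted(requests_by_method.items()):
        lines.append(f'demo_http_requests_total{{method="{method}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            # Any other path counts as application traffic in this sketch.
            REQUESTS["GET"] += 1
            self.send_response(404)
            self.end_headers()
            return
        # The scraper pulls the current values; the service never pushes them.
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics on localhost (port is an assumption)."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling serve() and then fetching http://127.0.0.1:8000/metrics returns the counters as plain text. A Prometheus server configured with this address as a scrape target would poll the same endpoint at its scrape interval and store each sample as a time series, which is what PromQL then queries.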
