How would you design monitoring for an application from scratch? (Middle) #Observability #Monitoring #Sli #Sre #Reliability
What are metrics, logs, and traces, and when do you reach for each? (Middle) #Observability #Metrics #Logging #Tracing #Sre
How do you choose alert thresholds to avoid alert fatigue and false positives? (Middle) #Alerting #Monitoring #Slo #Sre #Reliability
What are the four golden signals of monitoring? (Middle) #Observability #Monitoring #Golden Signals #Sre #Metrics
What are the common monitoring tools and how do you choose between them? (Middle) #Observability #Monitoring #Tooling #Prometheus #Sre
How do you detect problems before users complain? (Middle) #Observability #Monitoring #Slo #Reliability #Sre
A web page or API is slow: how do you find the cause? (Middle) #Performance #Debugging #Profiling #Observability #Sre
How do you tell whether a bottleneck is CPU, memory, I/O, or network? (Middle) #Performance #Profiling #Linux #Troubleshooting #Sre
How do you decide what to cache and for how long (TTL)? (Middle) #Caching #Performance #Reliability #Sre
What is a cache stampede and how do you prevent it? (Middle) #Caching #Reliability #Performance #Sre
What is graceful degradation when a dependency fails? (Middle) #Resilience #Availability #Reliability #Sre
How do circuit breakers and retries with backoff work in distributed systems? (Middle) #Resilience #Reliability #Availability #Sre
How do you tell a DDoS attack apart from a natural traffic spike? (Middle) #Ddos #Security #Sre #Incident Response
What are the layers of DDoS defense, and what does each handle? (Middle) #Ddos #Security #Cdn #Waf #Rate Limiting #Network