Describe your role in a production incident.

Question

Accepted Answer

Interviewers want to see that you stay **calm, methodical, and blameless** under pressure: restore service first, diagnose second, and prevent recurrence third. Structure the answer with **STAR** (Situation, Task, Action, Result).

## Incident order

```text
INCIDENT ORDER
1. Stabilize — stop the bleeding (rollback, failover, mitigate)
2. Communicate — keep stakeholders updated on a clear channel
3. Diagnose — root cause once it's stable, not during
4. Prevent — a blameless post-mortem with action items
```

## Worked example

```text
S: A deploy caused checkout errors for ~15% of users.
T: I was on call and had to restore service fast.
A: I rolled back the deploy first (service recovered in minutes), posted updates
   every 10 minutes, then traced the cause to an unhandled null from a new API
   field. I added a guard and a contract test.
R: Downtime stayed under 20 minutes. The post-mortem added the missing test to
   CI so it can't recur.
```
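
To make the fix in the Action step more concrete, here is a minimal sketch of what "a guard and a contract test" could look like. The example above does not name the field or the code involved, so `discount`, `subtotal`, `check_contract`, and `checkout_total` are hypothetical names chosen purely for illustration.

```python
from typing import Any, List, Mapping

# Hypothetical contract for the upstream API response (assumed for illustration).
REQUIRED_FIELDS = {"subtotal": (int, float)}
OPTIONAL_FIELDS = {"discount": (int, float)}  # the new field; may be absent or null


def check_contract(payload: Mapping[str, Any]) -> List[str]:
    """Return a list of contract violations; an empty list means the payload conforms."""
    problems = []
    for name, types in REQUIRED_FIELDS.items():
        value = payload.get(name)
        if value is None:
            problems.append(f"missing required field: {name}")
        elif not isinstance(value, types):
            problems.append(f"wrong type for {name}")
    for name, types in OPTIONAL_FIELDS.items():
        value = payload.get(name)
        # Optional fields may be null, but if present they must have the right type.
        if value is not None and not isinstance(value, types):
            problems.append(f"wrong type for {name}")
    return problems


def checkout_total(payload: Mapping[str, Any]) -> float:
    """Compute the total, treating a null or absent discount as zero (the guard)."""
    discount = payload.get("discount")
    if discount is None:
        discount = 0
    return float(payload["subtotal"]) - float(discount)


def test_null_discount_is_handled() -> None:
    """Contract test: a null discount is valid and must not break checkout."""
    payload = {"subtotal": 100, "discount": None}
    assert check_contract(payload) == []
    assert checkout_total(payload) == 100.0
```

In an interview you would not recite code, but naming the two pieces helps: the guard fixes the immediate bug, while the contract test running in CI is what keeps the same class of failure from shipping again.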

## Good vs. weak

```text
✓ Mitigate first, communicate, blameless follow-up
✗ Debugging live while users are down
✗ Blaming the person who deployed
```

## Why this matters

Incidents test composure: when things break, the team needs steady hands, not panic.

A blameless approach keeps people honest about causes, which is the only way to actually prevent a recurrence.

How you handle the worst day says more about your seniority than how you handle the good ones.