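Flaky tests are tests whose results are inconsistent even though the code has not changed: sometimes they pass, and sometimes they fail on the same code. This is a serious problem because it erodes trust in the test suite. It is important to understand their causes and how to fix them.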
What flaky tests are and why they are harmful
A FLAKY test gives INCONSISTENT results (pass sometimes, fail other times) on the SAME code:
→ harmful: ERODES TRUST — people start ignoring failures ("oh, it's just flaky") →
real failures get missed too
→ waste time on false alarms / re-runs; break CI; reduce confidence in the whole suite
→ Flaky tests are worse than no tests at all if they make people distrust every test.
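To make the definition concrete, here is a minimal sketch of a check that gives inconsistent results on unchanged code. The names `pick_discount`, `flaky_check`, and `run_many` are hypothetical, invented for this illustration. Each run uses a different seed to stand in for the unseeded randomness a real flaky test would hit.

```python
import random


def pick_discount(rng: random.Random) -> int:
    """Hypothetical code under test: a discount between 0 and 30 percent."""
    return rng.randint(0, 30)


def flaky_check(rng: random.Random) -> bool:
    """A badly written check: it asserts something that is only usually true."""
    return pick_discount(rng) < 25


def run_many(times: int = 200) -> tuple:
    """Run the same check repeatedly on the same code and count the outcomes."""
    passes = 0
    failures = 0
    for seed in range(times):
        # A fresh seed per run simulates the randomness changing between CI runs.
        if flaky_check(random.Random(seed)):
            passes += 1
        else:
            failures += 1
    return passes, failures


if __name__ == "__main__":
    passes, failures = run_many()
    print(f"passed {passes} times, failed {failures} times, code never changed")
```

The check passes on most runs and fails on some, which is exactly the pattern that teaches people to re-run and ignore red builds.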
Common causes
✗ TIMING / async → race conditions; not waiting properly for async operations (a top cause
in UI/E2E tests); arbitrary sleeps
✗ ORDER DEPENDENCE → tests depending on each other / shared mutable state
✗ EXTERNAL dependencies → real network/services (network blips, rate limits, downtime)
✗ NON-DETERMINISM → time/dates, randomness, timezone, locale
✗ Test ENVIRONMENT → leftover state, uncleaned data, concurrency/parallelism issues
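Two of the causes above, arbitrary sleeps and dependence on the current date, can be sketched together with one possible remedy for each. This is an illustration under stated assumptions, not a prescribed fix: `wait_until`, `is_weekend`, and `fixed_clock` are hypothetical helpers invented here, and the timeout values are arbitrary.

```python
import time
from datetime import datetime, timezone
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 2.0,
               interval: float = 0.01) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have elapsed.

    Used in place of a fixed sleep: returns as soon as the condition holds,
    and reports failure explicitly instead of guessing how long to wait.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline


def is_weekend(now: Callable[[], datetime]) -> bool:
    """Hypothetical code under test: takes a clock so a test can pin the date."""
    return now().weekday() >= 5  # Monday is 0, so 5 and 6 are Saturday and Sunday


def fixed_clock(year: int, month: int, day: int) -> Callable[[], datetime]:
    """Build a clock that always returns the same UTC moment."""
    moment = datetime(year, month, day, 12, 0, tzinfo=timezone.utc)
    return lambda: moment


if __name__ == "__main__":
    # Deterministic regardless of the day the test actually runs.
    print(is_weekend(fixed_clock(2024, 1, 6)))  # a Saturday
    print(is_weekend(fixed_clock(2024, 1, 8)))  # a Monday
```

Passing the clock in as a parameter is one design choice among several; patching the time source in the test framework would serve the same goal of removing the dependency on the real date.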
