What Are Flaky Tests?

A flaky test is a test that passes and fails intermittently without any code changes. You run the test suite — it passes. You run it again on the same code — a test fails. You run it a third time — it passes again. This non-deterministic behavior destroys trust in the test suite and wastes enormous amounts of developer time investigating false failures.

Flaky tests are consistently among developers' top complaints about test automation. Google has reported that roughly 1.5% of its test runs produce a flaky result, and that dealing with flakiness consumed 2-16% of its testing compute resources through retries. At scale, even a small percentage of flaky tests has a massive impact.

Root Causes of Flakiness

1. Timing and Synchronization Issues (Most Common)

The test interacts with the UI before an element is ready, or checks a condition before an asynchronous operation completes.

Bad pattern:

await page.click('#submit');
// Grabs the text once; the assertion never retries, so it fails
// whenever the success message has not finished rendering yet
const message = await page.textContent('.success-message');
expect(message).toBe('Order placed');

Fixed pattern:

await page.click('#submit');
await expect(page.locator('.success-message')).toHaveText('Order placed');
// Playwright automatically waits for the element and retries

2. Test Order Dependencies

Tests that depend on other tests running first (shared state, data created by previous tests).

3. Shared Mutable State

Tests that modify global state (database, files, environment variables) without proper isolation.

4. External Service Dependencies

Tests that call real external services which may be slow, rate-limited, or occasionally unavailable.

5. Resource Contention

Tests competing for limited resources (ports, file handles, database connections) during parallel execution.
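
One general way to sidestep contention, sketched here without assuming any particular test framework, is to never hardcode shared resources: let the OS pick a free port and give every test its own temporary directory.

```python
import socket
import tempfile

def get_free_port() -> int:
    """Bind to port 0 and let the OS choose an unused port.
    Note: the port can be reclaimed after the socket closes, so in
    real tests, bind your server to it immediately."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def make_test_workspace() -> str:
    """Each test writes into a private directory, not a shared path."""
    return tempfile.mkdtemp(prefix="test-")

# Two parallel tests get independent resources instead of racing.
port_a, port_b = get_free_port(), get_free_port()
dir_a, dir_b = make_test_workspace(), make_test_workspace()
```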

6. Time-Dependent Logic

Tests that depend on the current time, day of week, or timezone.

Fixing Flaky Tests

Replace Sleep with Explicit Waits

// BAD — hardcoded sleep
await page.click('#submit');
await page.waitForTimeout(3000);
expect(await page.textContent('.result')).toBe('Success');

// GOOD — wait for condition
await page.click('#submit');
await expect(page.locator('.result')).toHaveText('Success', { timeout: 10000 });

Ensure Test Isolation

@BeforeEach
void isolateTest() {
    database.beginTransaction();
    // Each test gets a clean state
}

@AfterEach
void cleanupTest() {
    database.rollbackTransaction();
    // All changes are undone
}

Mock External Services

await page.route('**/api/external-service/**', route => {
    route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({ result: 'mocked response' })
    });
});

Flaky Test Detection Systems

Repeat Mode in PR Pipeline

Run new or modified tests multiple times before merging:

# GitHub Actions example
- name: Run new tests 20 times
  # --repeat-each is Playwright's built-in flag for repeating every
  # matched test; the job fails if any repetition fails
  run: npx playwright test --grep @new --repeat-each 20

Flakiness Tracking Dashboard

Track each test’s pass/fail history over time:

Test: testCheckoutFlow
  Last 100 runs: 96 pass, 4 fail (96% reliability)
  Status: FLAKY (below 99% threshold)
  Last failure: 2024-01-15 — TimeoutError on .payment-confirmation
  Assigned to: @developer-alice
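
The reliability number in such a dashboard is simply passes over total runs; a sketch of the computation, using the same 99% threshold as the entry above:

```python
def reliability(passes: int, failures: int) -> float:
    """Pass rate over the recorded run history."""
    total = passes + failures
    return passes / total if total else 1.0

def status(passes: int, failures: int, threshold: float = 0.99) -> str:
    return "FLAKY" if reliability(passes, failures) < threshold else "STABLE"

# Matches the dashboard entry: 96 passes, 4 failures -> 96%, flagged FLAKY.
assert reliability(96, 4) == 0.96
assert status(96, 4) == "FLAKY"
```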

Automatic Flaky Detection

# Analyze test results across CI runs
def detect_flaky_tests(results_last_30_days):
    for test in results_last_30_days:
        total = test.passes + test.failures
        if total == 0:
            continue  # no runs recorded, nothing to classify
        pass_rate = test.passes / total
        # Between 50% and 99%: intermittent, not consistently broken
        if 0.5 < pass_rate < 0.99:
            mark_as_flaky(test)
            notify_team(test)

Quarantine System

When a test is identified as flaky:

  1. Mark it: Add a @Flaky tag or move to a quarantine suite
  2. Isolate it: Remove from the blocking CI pipeline
  3. Monitor it: Continue running in a separate non-blocking job
  4. Fix it: Assign an owner and set a deadline
  5. Restore it: Once fixed and stable for N runs, move back to the main suite

@Tag("quarantine")
@Flaky(reason = "Intermittent timeout on slow CI runners", ticket = "BUG-123")
@Test
void testCheckoutWithCoupon() {
    // This test is quarantined - it runs but does not block deployment
}
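
Step 5 above (restore after N stable runs) can be automated. A sketch in Python; the newest-first run-history format is an assumption:

```python
def ready_to_restore(run_history: list[str], required_stable_runs: int = 50) -> bool:
    """run_history is newest-first, e.g. ["pass", "pass", "fail", ...].
    Restore only when the most recent N runs all passed."""
    recent = run_history[:required_stable_runs]
    return len(recent) >= required_stable_runs and all(r == "pass" for r in recent)

assert ready_to_restore(["pass"] * 50) is True
assert ready_to_restore(["fail"] + ["pass"] * 49) is False  # recent failure
assert ready_to_restore(["pass"] * 10) is False             # not enough history
```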

Prevention Best Practices

  1. Never use hardcoded waits — always use explicit conditions
  2. Run tests in random order — catches order dependencies early
  3. Repeat new tests — run 20-50 times before merging
  4. Mock external services — eliminate network variability
  5. Use unique test data — avoid conflicts between parallel tests
  6. Set realistic timeouts — long enough for slow CI, short enough to fail fast
  7. Review flaky metrics weekly — make flakiness visible to the team
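
Practice 5 (unique test data) usually means suffixing identifiers with a random token so parallel tests can never collide; a minimal sketch:

```python
import uuid

def unique_email(prefix: str = "testuser") -> str:
    """Generate a collision-free email address for each test run."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}@example.test"

# Two parallel tests get distinct accounts and cannot interfere.
a, b = unique_email(), unique_email()
assert a != b
```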

Exercises

Exercise 1: Diagnose and Fix

Take 3 intentionally flaky tests (timing-dependent, order-dependent, and shared-state) and fix each one. Document the root cause and the fix.

Exercise 2: Build a Quarantine System

  1. Create a @Quarantine tag/label mechanism in your test framework
  2. Configure CI to run quarantined tests separately
  3. Build a script that tracks flaky test history across runs
  4. Set up alerts when a test’s reliability drops below 99%

Exercise 3: Prevention Pipeline

  1. Add repeat-mode testing for new tests in your PR pipeline
  2. Configure random test ordering
  3. Set up a flakiness dashboard tracking reliability per test
  4. Create a team policy document for handling flaky tests