TL;DR

  • Selenium WebDriver automates real browsers (Chrome, Firefox, Safari, Edge) for testing web applications
  • Start with Python — simpler syntax, faster feedback loop for beginners
  • Master locators (ID, CSS, XPath) and explicit waits before writing complex tests
  • Use Page Object Model from day one — refactoring later is painful

Best for: QA engineers starting browser automation, developers writing E2E tests
Skip if: You only need API testing (use Postman) or already know Playwright/Cypress
Read time: 15 minutes

Your first Selenium test will probably fail. Not because Selenium is broken, but because web automation has timing issues you haven’t encountered before. Elements load asynchronously. Buttons become clickable after JavaScript executes. Forms validate on blur.

This tutorial teaches you Selenium the right way — handling these real-world challenges from the start.

What is Selenium WebDriver?

Selenium WebDriver is a browser automation tool that controls real browsers programmatically. Unlike tools that simulate browsers, Selenium drives actual Chrome, Firefox, Safari, and Edge instances.

Core components:

  • WebDriver API — language bindings (Python, Java, C#, JavaScript, Ruby)
  • Browser drivers — ChromeDriver, GeckoDriver, SafariDriver
  • Selenium Grid — distributed test execution across machines
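
To make the split between language bindings and browser drivers concrete: the bindings translate each call into an HTTP request defined by the W3C WebDriver protocol, and the browser driver carries it out. The sketch below only builds descriptions of two such requests and sends nothing. The helper names and the session id are illustrative assumptions, not part of Selenium's API.

```python
def navigate_command(session_id: str, url: str) -> dict:
    """Roughly what driver.get(url) asks the browser driver to do."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/url",
        "body": {"url": url},
    }


def find_element_command(session_id: str, using: str, value: str) -> dict:
    """Roughly what driver.find_element(using, value) sends."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/element",
        "body": {"using": using, "value": value},
    }


if __name__ == "__main__":
    # "abc123" is a made-up session id; a real one is issued by the driver.
    print(navigate_command("abc123", "https://example.com"))
    print(find_element_command("abc123", "css selector", "#login-button"))
```

Because every binding speaks this same protocol, the Python and Java examples in this tutorial map onto each other almost line for line.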

Selenium is free, open-source, and maintained by a large community. It’s the foundation that tools like Appium (mobile) and Selenide (Java wrapper) build upon.

Environment Setup

Python Setup

# Install Selenium
pip install selenium

# Install WebDriver Manager (auto-downloads drivers; optional on Selenium 4.6+, which bundles Selenium Manager)
pip install webdriver-manager

# test_first.py
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Automatic driver management
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

driver.get("https://example.com")
print(f"Page title: {driver.title}")

driver.quit()

Java Setup

<!-- pom.xml -->
<dependencies>
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>4.18.1</version>
    </dependency>
    <dependency>
        <groupId>io.github.bonigarcia</groupId>
        <artifactId>webdrivermanager</artifactId>
        <version>5.7.0</version>
    </dependency>
</dependencies>

// FirstTest.java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import io.github.bonigarcia.wdm.WebDriverManager;

public class FirstTest {
    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver driver = new ChromeDriver();

        driver.get("https://example.com");
        System.out.println("Page title: " + driver.getTitle());

        driver.quit();
    }
}

Locator Strategies

Locators find elements on the page. Choose the right strategy for maintainable tests.

Priority Order (Best to Worst)

| Strategy | Example | When to Use |
| --- | --- | --- |
| ID | #login-button | Unique IDs (best option) |
| Name | [name="email"] | Form inputs |
| CSS Selector | .btn.primary | Most elements |
| Link Text | Sign In | Links with stable text |
| XPath | //div[@class='card'] | Complex relationships |

Python Examples

from selenium.webdriver.common.by import By

# By ID (fastest, most reliable)
driver.find_element(By.ID, "username")

# By CSS Selector (flexible, readable)
driver.find_element(By.CSS_SELECTOR, "button.submit-btn")
driver.find_element(By.CSS_SELECTOR, "[data-testid='login']")

# By XPath (when CSS fails)
driver.find_element(By.XPATH, "//button[contains(text(), 'Submit')]")
driver.find_element(By.XPATH, "//div[@class='form']//input[@type='email']")

# By Link Text
driver.find_element(By.LINK_TEXT, "Forgot Password?")
driver.find_element(By.PARTIAL_LINK_TEXT, "Forgot")
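
One detail worth knowing alongside these examples: `find_element` raises `NoSuchElementException` when nothing matches, while `find_elements` returns an empty list. A minimal sketch of a presence check built on that difference follows. `count_matches` and `is_present` are hypothetical helper names, and the locator is the usual `(By.<strategy>, value)` tuple.

```python
def count_matches(driver, locator) -> int:
    """Number of elements currently matching the locator (0 if none)."""
    by, value = locator
    return len(driver.find_elements(by, value))


def is_present(driver, locator) -> bool:
    """True if at least one element matches right now; no exception for 'not found'."""
    return count_matches(driver, locator) > 0
```

A call might look like `is_present(driver, (By.ID, "username"))`. It checks the page only at that instant, so for content that loads asynchronously an explicit wait is still the better tool.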

Java Examples

import org.openqa.selenium.By;

// By ID
driver.findElement(By.id("username"));

// By CSS Selector
driver.findElement(By.cssSelector("button.submit-btn"));
driver.findElement(By.cssSelector("[data-testid='login']"));

// By XPath
driver.findElement(By.xpath("//button[contains(text(), 'Submit')]"));

// By Link Text
driver.findElement(By.linkText("Forgot Password?"));

CSS vs XPath Decision

Use CSS when:

  • Selecting by class, ID, or attribute
  • Element has data-testid or similar test attribute
  • Performance matters (CSS is usually slightly faster)

Use XPath when:

  • Finding element by text content
  • Navigating to parent elements
  • Complex conditions (and, or, contains)

# CSS: cleaner for attributes
driver.find_element(By.CSS_SELECTOR, "[data-testid='submit-btn']")

# XPath: necessary for text matching
driver.find_element(By.XPATH, "//button[text()='Submit Order']")

# XPath: parent navigation (CSS can't do this)
driver.find_element(By.XPATH, "//span[text()='Error']/parent::div")
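
Text matching in XPath has one practical trap: a label containing an apostrophe (for example "Don't have an account?") breaks a hand-written `text()='...'` expression. The sketch below shows one common way to quote arbitrary text safely with XPath's `concat()`. `xpath_literal` and `button_with_text` are hypothetical helpers, and the plain string "xpath" stands in for `By.XPATH` (they are the same value) so the snippet runs without Selenium installed.

```python
def xpath_literal(text: str) -> str:
    """Quote text as an XPath string literal, whatever quotes it contains."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: split on apostrophes and rejoin with concat(),
    # inserting each apostrophe as its own double-quoted piece.
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{part}'" for part in parts) + ")"


def button_with_text(text: str) -> tuple:
    """Locator for a <button> whose whitespace-trimmed text equals `text`."""
    return ("xpath", f"//button[normalize-space()={xpath_literal(text)}]")


if __name__ == "__main__":
    print(button_with_text("Submit Order"))
    print(button_with_text("Don't have an account?"))
```

The resulting tuple can be unpacked straight into a lookup, as in `driver.find_element(*button_with_text("Submit Order"))`.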

Waits: The Most Important Concept

Most flaky Selenium tests fail because of timing: elements aren’t ready when your code tries to interact with them.

Types of Waits

| Type | How It Works | When to Use |
| --- | --- | --- |
| Implicit | Polls DOM for N seconds | Never (global, hides real issues) |
| Explicit | Waits for specific condition | Always (precise, readable) |
| Fluent | Explicit + custom polling | Slow-loading elements |
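
Conceptually, an explicit wait is a polling loop: call a condition repeatedly until it returns something truthy or the timeout expires. The sketch below is a simplified stand-in written to show the idea, not Selenium's actual implementation. `wait_until` and its parameters are assumed names.

```python
import time


def wait_until(condition, timeout: float = 10.0, poll: float = 0.5):
    """Call condition() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand the truthy value back to the caller
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # pause before the next check


if __name__ == "__main__":
    attempts = []

    def ready_on_third_try():
        attempts.append(1)
        return len(attempts) >= 3

    wait_until(ready_on_third_try, timeout=2, poll=0.01)
    print(f"condition met after {len(attempts)} polls")
```

In real tests, use `WebDriverWait` rather than a hand-rolled loop; its `poll_frequency` and `ignored_exceptions` arguments give you the "Fluent" behaviour from the table in Python.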

Explicit Waits (Python)

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

wait = WebDriverWait(driver, 10)  # Max 10 seconds

# Wait for element to be clickable
button = wait.until(EC.element_to_be_clickable((By.ID, "submit")))
button.click()

# Wait for element to be visible
message = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, "success")))

# Wait for element to disappear
wait.until(EC.invisibility_of_element_located((By.ID, "loading")))

# Wait for text to appear
wait.until(EC.text_to_be_present_in_element((By.ID, "status"), "Complete"))

# Custom condition
wait.until(lambda d: len(d.find_elements(By.CLASS_NAME, "item")) > 5)

Explicit Waits (Java)

import org.openqa.selenium.support.ui.WebDriverWait;
import org.openqa.selenium.support.ui.ExpectedConditions;
import java.time.Duration;

WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));

// Wait for element to be clickable
WebElement button = wait.until(
    ExpectedConditions.elementToBeClickable(By.id("submit"))
);
button.click();

// Wait for visibility
WebElement message = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.className("success"))
);

// Wait for element to disappear
wait.until(ExpectedConditions.invisibilityOfElementLocated(By.id("loading")));

Common Wait Conditions

from selenium.webdriver.support import expected_conditions as EC

# Element state
EC.presence_of_element_located((By.ID, "elem"))      # In DOM
EC.visibility_of_element_located((By.ID, "elem"))    # Visible
EC.element_to_be_clickable((By.ID, "elem"))          # Clickable
EC.invisibility_of_element_located((By.ID, "elem"))  # Hidden/gone

# Page state
EC.title_contains("Dashboard")
EC.url_contains("/dashboard")
EC.alert_is_present()

# Multiple elements
EC.presence_of_all_elements_located((By.CLASS_NAME, "item"))
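
When none of the built-in conditions fit, a custom condition is just a callable that takes the driver and returns a truthy value once the wait should stop, like the lambda shown earlier. A reusable version might look like this sketch. `element_count_at_least` is a hypothetical name, not part of `expected_conditions`.

```python
def element_count_at_least(locator, minimum: int):
    """Build a wait condition that passes once `minimum` elements match `locator`."""
    if minimum < 1:
        raise ValueError("minimum must be at least 1")
    by, value = locator

    def condition(driver):
        elements = driver.find_elements(by, value)
        # A non-empty list is truthy, so the wait returns the elements themselves;
        # False tells the wait to keep polling.
        return elements if len(elements) >= minimum else False

    return condition
```

Used with the `wait` object from the earlier examples, this would read `items = wait.until(element_count_at_least((By.CLASS_NAME, "item"), 6))`.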

Complete Test Example

Login Test (Python)

# tests/test_login.py
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

class TestLogin:
    def setup_method(self):
        self.driver = webdriver.Chrome(
            service=Service(ChromeDriverManager().install())
        )
        self.driver.implicitly_wait(0)  # Disable implicit waits
        self.wait = WebDriverWait(self.driver, 10)

    def teardown_method(self):
        self.driver.quit()

    def test_successful_login(self):
        self.driver.get("https://example.com/login")

        # Fill login form
        email_input = self.wait.until(
            EC.visibility_of_element_located((By.ID, "email"))
        )
        email_input.send_keys("user@example.com")

        password_input = self.driver.find_element(By.ID, "password")
        password_input.send_keys("password123")

        # Submit form
        submit_btn = self.driver.find_element(By.CSS_SELECTOR, "[type='submit']")
        submit_btn.click()

        # Verify redirect to dashboard
        self.wait.until(EC.url_contains("/dashboard"))

        welcome_message = self.wait.until(
            EC.visibility_of_element_located((By.CLASS_NAME, "welcome"))
        )
        assert "Welcome" in welcome_message.text

    def test_invalid_credentials(self):
        self.driver.get("https://example.com/login")

        self.wait.until(
            EC.visibility_of_element_located((By.ID, "email"))
        ).send_keys("invalid@example.com")

        self.driver.find_element(By.ID, "password").send_keys("wrongpass")
        self.driver.find_element(By.CSS_SELECTOR, "[type='submit']").click()

        error_message = self.wait.until(
            EC.visibility_of_element_located((By.CLASS_NAME, "error"))
        )
        assert "Invalid credentials" in error_message.text

Login Test (Java)

// src/test/java/LoginTest.java
import org.junit.jupiter.api.*;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.*;
import io.github.bonigarcia.wdm.WebDriverManager;
import java.time.Duration;

import static org.junit.jupiter.api.Assertions.*;

class LoginTest {
    private WebDriver driver;
    private WebDriverWait wait;

    @BeforeEach
    void setup() {
        WebDriverManager.chromedriver().setup();
        driver = new ChromeDriver();
        wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

    @Test
    void testSuccessfulLogin() {
        driver.get("https://example.com/login");

        WebElement emailInput = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.id("email"))
        );
        emailInput.sendKeys("user@example.com");

        driver.findElement(By.id("password")).sendKeys("password123");
        driver.findElement(By.cssSelector("[type='submit']")).click();

        wait.until(ExpectedConditions.urlContains("/dashboard"));

        WebElement welcomeMessage = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.className("welcome"))
        );
        assertTrue(welcomeMessage.getText().contains("Welcome"));
    }
}

Page Object Model

Page Object Model (POM) separates page structure from test logic. Each page becomes a class with elements and actions.

Python Page Object

# pages/login_page.py
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class LoginPage:
    URL = "https://example.com/login"

    # Locators
    EMAIL_INPUT = (By.ID, "email")
    PASSWORD_INPUT = (By.ID, "password")
    SUBMIT_BUTTON = (By.CSS_SELECTOR, "[type='submit']")
    ERROR_MESSAGE = (By.CLASS_NAME, "error")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def open(self):
        self.driver.get(self.URL)
        self.wait.until(EC.visibility_of_element_located(self.EMAIL_INPUT))
        return self

    def login(self, email: str, password: str):
        self.driver.find_element(*self.EMAIL_INPUT).send_keys(email)
        self.driver.find_element(*self.PASSWORD_INPUT).send_keys(password)
        self.driver.find_element(*self.SUBMIT_BUTTON).click()

    def get_error_message(self) -> str:
        error = self.wait.until(
            EC.visibility_of_element_located(self.ERROR_MESSAGE)
        )
        return error.text

# pages/dashboard_page.py
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class DashboardPage:
    WELCOME_MESSAGE = (By.CLASS_NAME, "welcome")
    USER_MENU = (By.ID, "user-menu")
    LOGOUT_LINK = (By.LINK_TEXT, "Logout")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def is_loaded(self) -> bool:
        self.wait.until(EC.url_contains("/dashboard"))
        return True

    def get_welcome_text(self) -> str:
        message = self.wait.until(
            EC.visibility_of_element_located(self.WELCOME_MESSAGE)
        )
        return message.text

    def logout(self):
        self.driver.find_element(*self.USER_MENU).click()
        self.wait.until(
            EC.element_to_be_clickable(self.LOGOUT_LINK)
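        ).click()

# tests/test_login_pom.py
# The `driver` argument is a pytest fixture that yields a WebDriver (typically defined in conftest.py)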
import pytest
from pages.login_page import LoginPage
from pages.dashboard_page import DashboardPage

class TestLoginPOM:
    def test_successful_login(self, driver):
        login_page = LoginPage(driver)
        login_page.open()
        login_page.login("user@example.com", "password123")

        dashboard = DashboardPage(driver)
        assert dashboard.is_loaded()
        assert "Welcome" in dashboard.get_welcome_text()

    def test_invalid_login(self, driver):
        login_page = LoginPage(driver)
        login_page.open()
        login_page.login("invalid@example.com", "wrongpass")

        assert "Invalid credentials" in login_page.get_error_message()

Java Page Object

// pages/LoginPage.java
import org.openqa.selenium.*;
import org.openqa.selenium.support.ui.*;
import java.time.Duration;

public class LoginPage {
    private final WebDriver driver;
    private final WebDriverWait wait;
    private static final String URL = "https://example.com/login";

    // Locators
    private final By emailInput = By.id("email");
    private final By passwordInput = By.id("password");
    private final By submitButton = By.cssSelector("[type='submit']");
    private final By errorMessage = By.className("error");

    public LoginPage(WebDriver driver) {
        this.driver = driver;
        this.wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    public LoginPage open() {
        driver.get(URL);
        wait.until(ExpectedConditions.visibilityOfElementLocated(emailInput));
        return this;
    }

    public void login(String email, String password) {
        driver.findElement(emailInput).sendKeys(email);
        driver.findElement(passwordInput).sendKeys(password);
        driver.findElement(submitButton).click();
    }

    public String getErrorMessage() {
        return wait.until(
            ExpectedConditions.visibilityOfElementLocated(errorMessage)
        ).getText();
    }
}

Handling Common Scenarios

Dropdowns

from selenium.webdriver.support.ui import Select

# Standard HTML select
dropdown = Select(driver.find_element(By.ID, "country"))
dropdown.select_by_visible_text("United States")
dropdown.select_by_value("us")
dropdown.select_by_index(1)

# Custom dropdown (React, Vue, etc.)
driver.find_element(By.CSS_SELECTOR, ".dropdown-trigger").click()
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".dropdown-menu")))
driver.find_element(By.XPATH, "//li[text()='United States']").click()

Alerts

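# Wait for the alert, then read its text before closing it
alert = wait.until(EC.alert_is_present())
alert_text = alert.text

# Accept (OK) or dismiss (Cancel); call only one, since either closes the alert
alert.accept()
# alert.dismiss()

# Type into a prompt dialog, then confirm
prompt = wait.until(EC.alert_is_present())
prompt.send_keys("My input")
prompt.accept()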

Frames and Windows

# Switch to iframe
driver.switch_to.frame("iframe-name")
driver.switch_to.frame(driver.find_element(By.ID, "my-iframe"))

# Switch back to main content
driver.switch_to.default_content()

# Handle new window/tab
original_window = driver.current_window_handle
driver.find_element(By.LINK_TEXT, "Open New Tab").click()

wait.until(EC.number_of_windows_to_be(2))

for handle in driver.window_handles:
    if handle != original_window:
        driver.switch_to.window(handle)
        break

# Do something in new window
driver.close()
driver.switch_to.window(original_window)
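
Forgetting to switch back out of an iframe is a common cause of confusing "element not found" errors in later steps. One way to guard against it is a small context manager, sketched below. `inside_frame` is a hypothetical helper, and it relies only on the `switch_to.frame` and `switch_to.default_content` calls shown above.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a `with` block, then switch back."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the block raises. This returns to the top-level page;
        # for nested frames, switch_to.parent_frame() may be the better choice.
        driver.switch_to.default_content()
```

A test step then reads `with inside_frame(driver, "iframe-name"): ...`, and the code after the block is back in the main document whether or not the step failed.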

Screenshots

# Viewport screenshot (the visible area, not the full scrollable page)
driver.save_screenshot("screenshot.png")

# Element screenshot
element = driver.find_element(By.ID, "chart")
element.screenshot("chart.png")

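# Screenshot on failure (conftest.py): the hook stores each test report on the
# test item, and the fixture checks it during teardown
import os
from datetime import datetime
import pytest
from selenium import webdriver

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    setattr(item, f"rep_{report.when}", report)

@pytest.fixture
def driver(request):
    d = webdriver.Chrome()
    yield d
    report = getattr(request.node, "rep_call", None)
    if report is not None and report.failed:
        os.makedirs("screenshots", exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        d.save_screenshot(f"screenshots/failure_{stamp}.png")
    d.quit()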

Headless Mode and CI/CD

Headless Browser

from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--window-size=1920,1080")

driver = webdriver.Chrome(options=options)
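
In practice you usually want a visible browser locally and a headless one in CI. Below is a minimal sketch of that switch, assuming the `CI` environment variable is set to "true" by the CI system (GitHub Actions does this). `chrome_arguments` and `build_driver` are hypothetical helper names. Selenium is imported inside `build_driver` so the flag logic can be checked without a browser.

```python
import os


def chrome_arguments(env=None) -> list:
    """Chrome flags for the current environment (assumes CI=true marks a CI run)."""
    env = os.environ if env is None else env
    args = ["--window-size=1920,1080"]
    if env.get("CI", "").lower() == "true":
        # Same flags as the headless example above
        args += ["--headless=new", "--no-sandbox", "--disable-dev-shm-usage"]
    return args


def build_driver():
    """Create a Chrome driver configured for the current environment."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in chrome_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Keeping the flag selection in a plain function means the same test code runs unchanged on a laptop and in the pipeline.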

GitHub Actions Integration

# .github/workflows/selenium-tests.yml
name: Selenium Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install selenium pytest webdriver-manager

      - name: Run tests
        run: pytest tests/ -v --tb=short

      - name: Upload screenshots
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: screenshots
          path: screenshots/

AI-Assisted Selenium Development

AI tools can accelerate Selenium test development when used correctly.

What AI does well:

  • Generating locator strategies from HTML snippets
  • Converting manual test steps to Selenium code
  • Writing Page Object boilerplate
  • Explaining cryptic error messages
  • Suggesting wait strategies for specific scenarios

What still needs humans:

  • Deciding which tests to automate
  • Debugging timing-related flakiness
  • Choosing between locator strategies for ambiguous elements
  • Understanding application-specific behavior

Useful prompt:

I have this HTML form:
<form id="login">
  <input type="email" name="email" placeholder="Email">
  <input type="password" name="password">
  <button type="submit">Sign In</button>
</form>

Write a Selenium Python test that:
1. Fills in email and password
2. Submits the form
3. Waits for redirect to /dashboard
4. Verifies a welcome message appears

Use explicit waits and proper locators.

FAQ

Is Selenium hard to learn?

Selenium basics take 2-4 weeks to learn. The WebDriver API is straightforward — find_element(), click(), send_keys(). The challenge is mastering waits, choosing stable locators, and structuring tests with Page Object Model. Practice with a real application, not just tutorials.

Which language is best for Selenium?

Python for beginners — simple syntax, quick feedback, excellent documentation. Java for enterprise projects with existing Java infrastructure and CI/CD pipelines. JavaScript if your team already uses Node.js. All languages have mature Selenium support.

Is Selenium still relevant in 2026?

Yes. Selenium remains the industry standard with the largest community, most tutorials, and broadest browser support. While Playwright and Cypress offer better developer experience, Selenium integrates with more tools, supports more browsers, and has more enterprise adoption.

What is the difference between Selenium and Playwright?

Selenium controls browsers through the WebDriver protocol (a W3C standard). Playwright talks to browsers over its own connections: Chrome DevTools Protocol for Chromium, and comparable protocols in its patched Firefox and WebKit builds. Playwright has better auto-wait, built-in assertions, trace viewer, and parallel execution. Selenium has broader browser support, larger community, and more learning resources. For new projects with Chromium focus, Playwright is often the better choice. For existing projects or Safari/legacy browser needs, Selenium remains solid.

When to Choose Selenium

Choose Selenium when:

  • Team has existing Selenium expertise
  • Need Safari or legacy browser support
  • Corporate environment with Selenium infrastructure
  • Want maximum community support and resources
  • Using Appium for mobile (same API)

Consider alternatives when:

  • Starting fresh with Chromium-only needs (Playwright)
  • Component testing with JavaScript framework (Cypress)
  • Prefer auto-wait and simpler API (Playwright)
  • Need video recording and trace debugging (Playwright)
