Selenium Tutorial for Beginners 2026: Complete WebDriver Guide

Step-by-step Selenium tutorial for beginners. Learn WebDriver setup, locators, waits, and Page Object Model with Python and Java examples.

TL;DR
Selenium WebDriver automates real browsers (Chrome, Firefox, Safari, Edge) for testing web applications
Start with Python — simpler syntax, faster feedback loop for beginners
Master locators (ID, CSS, XPath) and explicit waits before writing complex tests
Use Page Object Model from day one — refactoring later is painful
Best for: QA engineers starting browser automation, developers writing E2E tests
Skip if: You only need API testing (use Postman) or already know Playwright/Cypress

Your first Selenium test will probably fail. Not because Selenium is broken, but because web automation has timing issues you haven’t encountered before. Elements load asynchronously. Buttons become clickable after JavaScript executes. Forms validate on blur.

This tutorial teaches you Selenium the right way — handling these real-world challenges from the start.

What is Selenium WebDriver?

Selenium WebDriver is a browser automation tool that controls real browsers programmatically. Unlike tools that simulate browsers, Selenium drives actual Chrome, Firefox, Safari, and Edge instances.

Core components:

WebDriver API — language bindings (Python, Java, C#, JavaScript, Ruby)
Browser drivers — ChromeDriver, GeckoDriver, SafariDriver
Selenium Grid — distributed test execution across machines

Selenium is free, open-source, and maintained by a large community. It’s the foundation that tools like Appium (mobile) and Selenide (Java wrapper) build upon.

Environment Setup

Python Setup

# Install Selenium
pip install selenium

# Install WebDriver Manager (auto-downloads drivers)
pip install webdriver-manager

# test_first.py
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

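# Automatic driver management via webdriver-manager.
# Selenium 4.6+ also ships Selenium Manager, so a plain webdriver.Chrome() resolves the driver too.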
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

driver.get("https://example.com")
print(f"Page title: {driver.title}")

driver.quit()

Java Setup

<!-- pom.xml -->
<dependencies>
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>4.18.1</version>
    </dependency>
    <dependency>
        <groupId>io.github.bonigarcia</groupId>
        <artifactId>webdrivermanager</artifactId>
        <version>5.7.0</version>
    </dependency>
</dependencies>

// FirstTest.java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import io.github.bonigarcia.wdm.WebDriverManager;

public class FirstTest {
    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver driver = new ChromeDriver();

        driver.get("https://example.com");
        System.out.println("Page title: " + driver.getTitle());

        driver.quit();
    }
}

Locator Strategies

Locators find elements on the page. Choose the right strategy for maintainable tests.

Priority Order (Best to Worst)

Strategy	Example	When to Use
ID	`login-button`	Unique IDs (best option)
Name	`email`	Form inputs
CSS Selector	`.btn.primary`	Most elements
Link Text	`Sign In`	Links with stable text
XPath	`//div[@class='card']`	Complex relationships
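
The priority order above can be sketched as a small helper that picks a locator from an element's attributes. This is only an illustration, not part of Selenium: choose_locator and the attribute dictionary are hypothetical, and the strategy strings mirror the values of Selenium's By constants (By.ID is "id", By.CSS_SELECTOR is "css selector", and so on) so the sketch runs without a browser.

```python
from typing import Optional

# Strategy strings match Selenium's By constants (By.ID == "id", etc.).
BY_ID = "id"
BY_NAME = "name"
BY_CSS = "css selector"
BY_LINK_TEXT = "link text"


def choose_locator(tag: str, attrs: dict, text: Optional[str] = None) -> tuple:
    """Return a (strategy, value) pair following the priority table."""
    if attrs.get("id"):
        return (BY_ID, attrs["id"])
    if attrs.get("name"):
        return (BY_NAME, attrs["name"])
    if attrs.get("data-testid"):
        return (BY_CSS, f"[data-testid='{attrs['data-testid']}']")
    if attrs.get("class"):
        # "btn primary" becomes "button.btn.primary"
        return (BY_CSS, tag + "".join(f".{c}" for c in attrs["class"].split()))
    if tag == "a" and text:
        return (BY_LINK_TEXT, text)
    # Nothing stable to hook into: XPath would be the last resort here
    raise ValueError("No stable attribute found; consider adding a data-testid")


print(choose_locator("input", {"name": "email", "class": "field"}))
print(choose_locator("button", {"class": "btn primary"}))
```

The returned tuple can be unpacked straight into driver.find_element(*locator). When the helper raises, that is usually a signal to ask developers for a test attribute rather than to write a brittle XPath.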

Python Examples

from selenium.webdriver.common.by import By

# By ID (fastest, most reliable)
driver.find_element(By.ID, "username")

# By CSS Selector (flexible, readable)
driver.find_element(By.CSS_SELECTOR, "button.submit-btn")
driver.find_element(By.CSS_SELECTOR, "[data-testid='login']")

# By XPath (when CSS fails)
driver.find_element(By.XPATH, "//button[contains(text(), 'Submit')]")
driver.find_element(By.XPATH, "//div[@class='form']//input[@type='email']")

# By Link Text
driver.find_element(By.LINK_TEXT, "Forgot Password?")
driver.find_element(By.PARTIAL_LINK_TEXT, "Forgot")

Java Examples

import org.openqa.selenium.By;

// By ID
driver.findElement(By.id("username"));

// By CSS Selector
driver.findElement(By.cssSelector("button.submit-btn"));
driver.findElement(By.cssSelector("[data-testid='login']"));

// By XPath
driver.findElement(By.xpath("//button[contains(text(), 'Submit')]"));

// By Link Text
driver.findElement(By.linkText("Forgot Password?"));

CSS vs XPath Decision

Use CSS when:

Selecting by class, ID, or attribute
Element has data-testid or similar test attribute
Performance matters (CSS is often slightly faster, though the gap is small in modern browsers)

Use XPath when:

Finding element by text content
Navigating to parent elements
Complex conditions (and, or, contains)

# CSS: cleaner for attributes
driver.find_element(By.CSS_SELECTOR, "[data-testid='submit-btn']")

# XPath: necessary for text matching
driver.find_element(By.XPATH, "//button[text()='Submit Order']")

# XPath: parent navigation from a text match (CSS has no text selector)
driver.find_element(By.XPATH, "//span[text()='Error']/parent::div")

Waits: The Most Important Concept

Most flaky Selenium tests fail because of timing. Elements aren’t ready when your code tries to interact with them.

Types of Waits

Type	How It Works	When to Use
Implicit	Polls DOM for N seconds	Never (global, hides real issues)
Explicit	Waits for specific condition	Always (precise, readable)
Fluent	Explicit + custom polling	Slow-loading elements
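
To make the table concrete, here is a rough, browser-free sketch of what an explicit or fluent wait does internally: poll a condition at a fixed interval until it succeeds or a timeout expires. The wait_until function and its parameters are hypothetical names for illustration. In real tests use Selenium's WebDriverWait, which accepts comparable poll_frequency and ignored_exceptions arguments.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5, ignored=()):
    """Call condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # treat these exceptions as "not ready yet"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll)


# Simulated element that only "appears" on the third poll
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("element not in DOM yet")
    return "element"


print(wait_until(element_ready, timeout=2, poll=0.01, ignored=(LookupError,)))
```

The difference from sleep() is that the loop returns as soon as the condition holds, so a 10-second timeout costs nothing when the element is ready immediately.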

Explicit Waits (Python)

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

wait = WebDriverWait(driver, 10)  # Max 10 seconds

# Wait for element to be clickable
button = wait.until(EC.element_to_be_clickable((By.ID, "submit")))
button.click()

# Wait for element to be visible
message = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, "success")))

# Wait for element to disappear
wait.until(EC.invisibility_of_element_located((By.ID, "loading")))

# Wait for text to appear
wait.until(EC.text_to_be_present_in_element((By.ID, "status"), "Complete"))

# Custom condition
wait.until(lambda d: len(d.find_elements(By.CLASS_NAME, "item")) > 5)
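
When a lambda like the one above is reused across tests, it can be wrapped in a small callable class. The sketch below is one possible shape, not a built-in Selenium condition: ElementCountAtLeast and FakeDriver are hypothetical names, and the fake driver stands in for a real WebDriver so the example runs without a browser.

```python
class ElementCountAtLeast:
    """Wait condition: succeeds once at least `minimum` elements match."""

    def __init__(self, locator, minimum):
        if minimum < 1:
            raise ValueError("minimum must be at least 1")
        self.locator = locator
        self.minimum = minimum

    def __call__(self, driver):
        elements = driver.find_elements(*self.locator)
        # A truthy return value ends the wait and is handed back to the caller;
        # False tells the wait to keep polling.
        return elements if len(elements) >= self.minimum else False


class FakeDriver:
    """Stand-in for a WebDriver: reports one more element on each call."""

    def __init__(self):
        self.calls = 0

    def find_elements(self, by, value):
        self.calls += 1
        return ["item"] * self.calls


condition = ElementCountAtLeast(("class name", "item"), 3)
fake = FakeDriver()
print(bool(condition(fake)))  # first poll finds 1 element
print(bool(condition(fake)))  # second poll finds 2 elements
print(len(condition(fake)))   # third poll finds 3 elements
```

With a real driver the same object would be passed to the wait, for example wait.until(ElementCountAtLeast((By.CLASS_NAME, "item"), 6)), which returns the matching elements once enough are present.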

Explicit Waits (Java)

import org.openqa.selenium.support.ui.WebDriverWait;
import org.openqa.selenium.support.ui.ExpectedConditions;
import java.time.Duration;

WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));

// Wait for element to be clickable
WebElement button = wait.until(
    ExpectedConditions.elementToBeClickable(By.id("submit"))
);
button.click();

// Wait for visibility
WebElement message = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.className("success"))
);

// Wait for element to disappear
wait.until(ExpectedConditions.invisibilityOfElementLocated(By.id("loading")));

Common Wait Conditions

from selenium.webdriver.support import expected_conditions as EC

# Element state
EC.presence_of_element_located((By.ID, "elem"))      # In DOM
EC.visibility_of_element_located((By.ID, "elem"))    # Visible
EC.element_to_be_clickable((By.ID, "elem"))          # Clickable
EC.invisibility_of_element_located((By.ID, "elem"))  # Hidden/gone

# Page state
EC.title_contains("Dashboard")
EC.url_contains("/dashboard")
EC.alert_is_present()

# Multiple elements
EC.presence_of_all_elements_located((By.CLASS_NAME, "item"))

Complete Test Example

# tests/test_login.py
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

class TestLogin:
    def setup_method(self):
        self.driver = webdriver.Chrome(
            service=Service(ChromeDriverManager().install())
        )
        self.driver.implicitly_wait(0)  # Disable implicit waits
        self.wait = WebDriverWait(self.driver, 10)

    def teardown_method(self):
        self.driver.quit()

    def test_successful_login(self):
        self.driver.get("https://example.com/login")

        # Fill login form
        email_input = self.wait.until(
            EC.visibility_of_element_located((By.ID, "email"))
        )
        email_input.send_keys("user@example.com")

        password_input = self.driver.find_element(By.ID, "password")
        password_input.send_keys("password123")

        # Submit form
        submit_btn = self.driver.find_element(By.CSS_SELECTOR, "[type='submit']")
        submit_btn.click()

        # Verify redirect to dashboard
        self.wait.until(EC.url_contains("/dashboard"))

        welcome_message = self.wait.until(
            EC.visibility_of_element_located((By.CLASS_NAME, "welcome"))
        )
        assert "Welcome" in welcome_message.text

    def test_invalid_credentials(self):
        self.driver.get("https://example.com/login")

        self.wait.until(
            EC.visibility_of_element_located((By.ID, "email"))
        ).send_keys("invalid@example.com")

        self.driver.find_element(By.ID, "password").send_keys("wrongpass")
        self.driver.find_element(By.CSS_SELECTOR, "[type='submit']").click()

        error_message = self.wait.until(
            EC.visibility_of_element_located((By.CLASS_NAME, "error"))
        )
        assert "Invalid credentials" in error_message.text

// src/test/java/LoginTest.java
import org.junit.jupiter.api.*;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.*;
import io.github.bonigarcia.wdm.WebDriverManager;
import java.time.Duration;

import static org.junit.jupiter.api.Assertions.*;

class LoginTest {
    private WebDriver driver;
    private WebDriverWait wait;

    @BeforeEach
    void setup() {
        WebDriverManager.chromedriver().setup();
        driver = new ChromeDriver();
        wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

    @Test
    void testSuccessfulLogin() {
        driver.get("https://example.com/login");

        WebElement emailInput = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.id("email"))
        );
        emailInput.sendKeys("user@example.com");

        driver.findElement(By.id("password")).sendKeys("password123");
        driver.findElement(By.cssSelector("[type='submit']")).click();

        wait.until(ExpectedConditions.urlContains("/dashboard"));

        WebElement welcomeMessage = wait.until(
            ExpectedConditions.visibilityOfElementLocated(By.className("welcome"))
        );
        assertTrue(welcomeMessage.getText().contains("Welcome"));
    }
}

Page Object Model

Page Object Model (POM) separates page structure from test logic. Each page becomes a class with elements and actions.

Python Page Object

# pages/login_page.py
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class LoginPage:
    URL = "https://example.com/login"

    # Locators
    EMAIL_INPUT = (By.ID, "email")
    PASSWORD_INPUT = (By.ID, "password")
    SUBMIT_BUTTON = (By.CSS_SELECTOR, "[type='submit']")
    ERROR_MESSAGE = (By.CLASS_NAME, "error")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def open(self):
        self.driver.get(self.URL)
        self.wait.until(EC.visibility_of_element_located(self.EMAIL_INPUT))
        return self

    def login(self, email: str, password: str):
        self.driver.find_element(*self.EMAIL_INPUT).send_keys(email)
        self.driver.find_element(*self.PASSWORD_INPUT).send_keys(password)
        self.driver.find_element(*self.SUBMIT_BUTTON).click()

    def get_error_message(self) -> str:
        error = self.wait.until(
            EC.visibility_of_element_located(self.ERROR_MESSAGE)
        )
        return error.text

# pages/dashboard_page.py
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class DashboardPage:
    WELCOME_MESSAGE = (By.CLASS_NAME, "welcome")
    USER_MENU = (By.ID, "user-menu")
    LOGOUT_LINK = (By.LINK_TEXT, "Logout")

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def is_loaded(self) -> bool:
        # wait.until() raises TimeoutException rather than returning False
        try:
            self.wait.until(EC.url_contains("/dashboard"))
            return True
        except TimeoutException:
            return False

    def get_welcome_text(self) -> str:
        message = self.wait.until(
            EC.visibility_of_element_located(self.WELCOME_MESSAGE)
        )
        return message.text

    def logout(self):
        self.driver.find_element(*self.USER_MENU).click()
        self.wait.until(
            EC.element_to_be_clickable(self.LOGOUT_LINK)
        ).click()

# tests/test_login_pom.py
# Assumes a `driver` pytest fixture (for example in conftest.py) that yields a WebDriver
from pages.login_page import LoginPage
from pages.dashboard_page import DashboardPage

class TestLoginPOM:
    def test_successful_login(self, driver):
        login_page = LoginPage(driver)
        login_page.open()
        login_page.login("user@example.com", "password123")

        dashboard = DashboardPage(driver)
        assert dashboard.is_loaded()
        assert "Welcome" in dashboard.get_welcome_text()

    def test_invalid_login(self, driver):
        login_page = LoginPage(driver)
        login_page.open()
        login_page.login("invalid@example.com", "wrongpass")

        assert "Invalid credentials" in login_page.get_error_message()
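
One optional variation, not used in the classes above, is to have each action return the page object the user lands on, so a test reads as a single chain. The sketch below uses stub objects so it runs without a browser. StubDriver, StubElement, and the chained return value of login() are assumptions for illustration, and the locator strings are the plain values behind By.ID and By.CSS_SELECTOR.

```python
class StubElement:
    """Records interactions instead of touching a real browser."""

    def __init__(self, log, name):
        self.log = log
        self.name = name

    def send_keys(self, text):
        self.log.append(f"type into {self.name}")

    def click(self):
        self.log.append(f"click {self.name}")


class StubDriver:
    """Minimal stand-in exposing only get() and find_element()."""

    def __init__(self):
        self.log = []

    def get(self, url):
        self.log.append(f"get {url}")

    def find_element(self, by, value):
        return StubElement(self.log, value)


class DashboardPage:
    def __init__(self, driver):
        self.driver = driver


class LoginPage:
    URL = "https://example.com/login"
    # Plain strings equal to Selenium's By.ID / By.CSS_SELECTOR values
    EMAIL_INPUT = ("id", "email")
    PASSWORD_INPUT = ("id", "password")
    SUBMIT_BUTTON = ("css selector", "[type='submit']")

    def __init__(self, driver):
        self.driver = driver

    def open(self):
        self.driver.get(self.URL)
        return self

    def login(self, email, password):
        self.driver.find_element(*self.EMAIL_INPUT).send_keys(email)
        self.driver.find_element(*self.PASSWORD_INPUT).send_keys(password)
        self.driver.find_element(*self.SUBMIT_BUTTON).click()
        # Return the page the user lands on so the test can keep chaining
        return DashboardPage(self.driver)


stub_driver = StubDriver()
dashboard = LoginPage(stub_driver).open().login("user@example.com", "password123")
print(type(dashboard).__name__)
print(stub_driver.log)
```

The trade-off is that login() now assumes success. A failing login would need a separate method, or the test would construct the page objects itself as in the examples above.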

Java Page Object

// pages/LoginPage.java
import org.openqa.selenium.*;
import org.openqa.selenium.support.ui.*;
import java.time.Duration;

public class LoginPage {
    private final WebDriver driver;
    private final WebDriverWait wait;
    private static final String URL = "https://example.com/login";

    // Locators
    private final By emailInput = By.id("email");
    private final By passwordInput = By.id("password");
    private final By submitButton = By.cssSelector("[type='submit']");
    private final By errorMessage = By.className("error");

    public LoginPage(WebDriver driver) {
        this.driver = driver;
        this.wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    }

    public LoginPage open() {
        driver.get(URL);
        wait.until(ExpectedConditions.visibilityOfElementLocated(emailInput));
        return this;
    }

    public void login(String email, String password) {
        driver.findElement(emailInput).sendKeys(email);
        driver.findElement(passwordInput).sendKeys(password);
        driver.findElement(submitButton).click();
    }

    public String getErrorMessage() {
        return wait.until(
            ExpectedConditions.visibilityOfElementLocated(errorMessage)
        ).getText();
    }
}

Handling Common Scenarios

Dropdowns

from selenium.webdriver.support.ui import Select

# Standard HTML select
dropdown = Select(driver.find_element(By.ID, "country"))
dropdown.select_by_visible_text("United States")
dropdown.select_by_value("us")
dropdown.select_by_index(1)

# Custom dropdown (React, Vue, etc.)
driver.find_element(By.CSS_SELECTOR, ".dropdown-trigger").click()
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".dropdown-menu")))
driver.find_element(By.XPATH, "//li[text()='United States']").click()

Alerts

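# Accept alert (OK)
alert = wait.until(EC.alert_is_present())
alert_text = alert.text  # Read the message before closing the alert
alert.accept()

# Dismiss a confirm dialog (Cancel); use instead of accept()
alert = wait.until(EC.alert_is_present())
alert.dismiss()

# Type into a prompt, then confirm
alert = wait.until(EC.alert_is_present())
alert.send_keys("My input")
alert.accept()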

Frames and Windows

# Switch to iframe
driver.switch_to.frame("iframe-name")
driver.switch_to.frame(driver.find_element(By.ID, "my-iframe"))

# Switch back to main content
driver.switch_to.default_content()

# Handle new window/tab
original_window = driver.current_window_handle
driver.find_element(By.LINK_TEXT, "Open New Tab").click()

wait.until(EC.number_of_windows_to_be(2))

for handle in driver.window_handles:
    if handle != original_window:
        driver.switch_to.window(handle)
        break

# Do something in new window
driver.close()
driver.switch_to.window(original_window)
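
A common mistake is forgetting to switch back after working inside an iframe, which makes every later locator fail. One way to guard against that is a small context manager. This is a sketch under assumptions: inside_frame is a hypothetical helper, and the stub classes only imitate the switch_to.frame() and switch_to.default_content() calls shown above so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame, then always return to the main document."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so later steps start from the main page
        driver.switch_to.default_content()


class StubSwitchTo:
    """Tracks which browsing context is active."""

    def __init__(self):
        self.context = "main"

    def frame(self, reference):
        self.context = f"frame:{reference}"

    def default_content(self):
        self.context = "main"


class StubDriver:
    def __init__(self):
        self.switch_to = StubSwitchTo()


stub = StubDriver()
with inside_frame(stub, "iframe-name"):
    print(stub.switch_to.context)
print(stub.switch_to.context)
```

With a real driver the body of the with block would hold the find_element calls that target elements inside the iframe.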

Screenshots

# Screenshot of the current viewport (not the full scrollable page)
driver.save_screenshot("screenshot.png")

# Element screenshot
element = driver.find_element(By.ID, "chart")
element.screenshot("chart.png")

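# Screenshot on failure (conftest.py)
from datetime import datetime

import pytest
from selenium import webdriver

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    # Store each phase's report on the test item so the fixture can read it
    outcome = yield
    report = outcome.get_result()
    setattr(item, f"rep_{report.when}", report)

@pytest.fixture
def driver(request):
    d = webdriver.Chrome()
    yield d
    report = getattr(request.node, "rep_call", None)
    if report is not None and report.failed:
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        d.save_screenshot(f"failure_{stamp}.png")
    d.quit()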

Headless Mode and CI/CD

Headless Browser

from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--window-size=1920,1080")

driver = webdriver.Chrome(options=options)
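
In practice you usually want a visible browser locally and headless mode only on the build server. One way to sketch that switch is to derive the flags from environment variables. chrome_arguments and the HEADLESS variable are hypothetical names chosen for this example. CI is a variable that GitHub Actions sets on its runners.

```python
import os


def chrome_arguments(env=None):
    """Return Chrome flags; headless flags are added when HEADLESS=1 or CI is set."""
    env = os.environ if env is None else env
    args = ["--window-size=1920,1080"]
    if env.get("HEADLESS") == "1" or env.get("CI"):
        args += ["--headless=new", "--no-sandbox", "--disable-dev-shm-usage"]
    return args


print(chrome_arguments({}))
print(chrome_arguments({"HEADLESS": "1"}))
```

Each returned flag would then be passed to options.add_argument() before creating the driver, exactly as in the block above.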

GitHub Actions Integration

# .github/workflows/selenium-tests.yml
name: Selenium Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install selenium pytest webdriver-manager

      - name: Run tests
        run: pytest tests/ -v --tb=short

      - name: Upload screenshots
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: screenshots
          path: screenshots/

AI-Assisted Selenium Development

AI tools can accelerate Selenium test development when used correctly.

What AI does well:

Generating locator strategies from HTML snippets
Converting manual test steps to Selenium code
Writing Page Object boilerplate
Explaining cryptic error messages
Suggesting wait strategies for specific scenarios

What still needs humans:

Deciding which tests to automate
Debugging timing-related flakiness
Choosing between locator strategies for ambiguous elements
Understanding application-specific behavior

Useful prompt:

I have this HTML form:
<form id="login">
  <input type="email" name="email" placeholder="Email">
  <input type="password" name="password">
  <button type="submit">Sign In</button>
</form>

Write a Selenium Python test that:
1. Fills in email and password
2. Submits the form
3. Waits for redirect to /dashboard
4. Verifies a welcome message appears

Use explicit waits and proper locators.

When to Choose Selenium

Choose Selenium when:

Team has existing Selenium expertise
Need Safari or legacy browser support
Corporate environment with Selenium infrastructure
Want maximum community support and resources
Using Appium for mobile (same API)

Consider alternatives when:

Starting fresh with Chromium-only needs (Playwright)
Component testing with JavaScript framework (Cypress)
Prefer auto-wait and simpler API (Playwright)
Need video recording and trace debugging (Playwright)

FAQ

Is Selenium hard to learn?

Selenium basics take 2-4 weeks to learn. The WebDriver API is straightforward — find_element(), click(), send_keys(). The challenge is mastering waits, choosing stable locators, and structuring tests with Page Object Model. Practice with a real application, not just tutorials.

Which language is best for Selenium?

Python for beginners — simple syntax, quick feedback, excellent documentation. Java for enterprise projects with existing Java infrastructure and CI/CD pipelines. JavaScript if your team already uses Node.js. All languages have mature Selenium support.

Is Selenium still relevant in 2026?

Yes. Selenium remains the industry standard with the largest community, most tutorials, and broadest browser support. While Playwright and Cypress offer better developer experience, Selenium integrates with more tools, supports more browsers, and has more enterprise adoption.

What is the difference between Selenium and Playwright?

Selenium controls browsers through the WebDriver protocol (a W3C standard). Playwright uses the Chrome DevTools Protocol for Chromium browsers. Playwright has better auto-wait, built-in assertions, a trace viewer, and parallel execution. Selenium has broader browser support, a larger community, and more learning resources. For new projects with a Chromium focus, Playwright is often the better choice. For existing projects or Safari/legacy browser needs, Selenium remains a solid choice.

Can Selenium test mobile apps?

Not directly. Selenium is designed for web applications in desktop browsers. For mobile native app testing, use Appium — it extends the Selenium WebDriver API to iOS and Android. Appium uses the same locator strategies and commands, so your Selenium knowledge transfers directly. For mobile web testing (websites on phones), Selenium works through mobile emulation in Chrome DevTools or remote WebDriver connections to real devices.

How do I handle dynamic elements in Selenium?

Use explicit waits with WebDriverWait and expected_conditions. Wait for specific conditions — element visible, element clickable, text present — instead of time.sleep() or implicit waits. For elements that load after AJAX calls, intercept the request or wait for a loading spinner to disappear. Custom wait conditions using lambda functions handle edge cases where standard conditions don’t fit.