What Is Selenium WebDriver?

Selenium WebDriver is one of the longest-established and most widely used browser automation tools. It provides a programming interface for controlling web browsers through the W3C WebDriver protocol.

Selenium Architecture

Test Code (Java/Python/JS/C#)
        ↓
   WebDriver API
        ↓
   Browser Driver (ChromeDriver, GeckoDriver)
        ↓
   Browser (Chrome, Firefox, Safari, Edge)

Your test code calls the WebDriver API, which sends commands to the browser-specific driver, which controls the actual browser.
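To make the layering concrete, the sketch below builds the HTTP requests a client library would hand to the browser driver for three common steps: start a session, open a page, and find an element. The endpoint paths and the "css selector" strategy name follow the W3C WebDriver specification. The function names, the session id, and the URL are made up for illustration, and nothing is sent over the network.

```python
import json


def new_session_request(browser_name):
    """Build the 'New Session' command that starts a browser."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return ("POST", "/session", body)


def navigate_request(session_id, url):
    """Build the 'Navigate To' command for an existing session."""
    return ("POST", f"/session/{session_id}/url", {"url": url})


def find_element_request(session_id, css):
    """Build the 'Find Element' command using the CSS selector strategy."""
    body = {"using": "css selector", "value": css}
    return ("POST", f"/session/{session_id}/element", body)


if __name__ == "__main__":
    # "abc123" is a placeholder; a real driver returns the session id
    # in its response to the New Session command.
    requests_to_send = [
        new_session_request("chrome"),
        navigate_request("abc123", "https://app.example.com/login"),
        find_element_request("abc123", "#email"),
    ]
    for method, path, body in requests_to_send:
        print(method, path, json.dumps(body))
```

A call such as driver.get(...) in the test code ends up as one of these requests; the browser driver translates it into browser-specific automation calls.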

Setting Up Selenium

JavaScript (WebdriverIO)

npm init -y
npm install webdriverio @wdio/cli @wdio/mocha-framework
npx wdio config

Java (Maven)

<dependencies>
  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.18.0</version>
  </dependency>
  <dependency>
    <groupId>org.testng</groupId>
    <artifactId>testng</artifactId>
    <version>7.9.0</version>
  </dependency>
</dependencies>

Python

pip install selenium pytest

Writing Your First Test

JavaScript (WebdriverIO)

describe('Login Page', () => {
  it('should login with valid credentials', async () => {
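    // A relative path is resolved against baseUrl in wdio.conf.js
    await browser.url('/login');
    await $('#email').setValue('admin@test.com');
    await $('#password').setValue('secret123');
    await $('button[type="submit"]').click();
    // toHaveUrl compares the full URL, so match on the path fragment
    await expect(browser).toHaveUrl(expect.stringContaining('/dashboard'));
    await expect($('.welcome')).toHaveText('Welcome, Admin');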
  });
});

Java

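import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

import static org.testng.Assert.assertEquals;

public class LoginTest {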
  WebDriver driver;

  @BeforeMethod
  public void setup() {
    driver = new ChromeDriver();
    driver.manage().window().maximize();
  }

  @Test
  public void testValidLogin() {
    driver.get("https://app.example.com/login");
    driver.findElement(By.id("email")).sendKeys("admin@test.com");
    driver.findElement(By.id("password")).sendKeys("secret123");
    driver.findElement(By.cssSelector("button[type='submit']")).click();

    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    wait.until(ExpectedConditions.urlContains("/dashboard"));

    String welcome = driver.findElement(By.className("welcome")).getText();
    assertEquals(welcome, "Welcome, Admin");
  }

  @AfterMethod
  public void teardown() {
    driver.quit();
  }
}

Python

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def test_valid_login():
    driver = webdriver.Chrome()
    try:
        driver.get("https://app.example.com/login")

        driver.find_element(By.ID, "email").send_keys("admin@test.com")
        driver.find_element(By.ID, "password").send_keys("secret123")
        driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()

        # Wait for the redirect before reading the dashboard
        wait = WebDriverWait(driver, 10)
        wait.until(EC.url_contains("/dashboard"))

        welcome = driver.find_element(By.CLASS_NAME, "welcome").text
        assert welcome == "Welcome, Admin"
    finally:
        # Close the browser even when a step or the assertion fails
        driver.quit()

Locator Strategies

Strategy        Example                                   Reliability
By.id           By.id("email")                            High (if unique)
By.cssSelector  By.cssSelector("[data-testid='email']")   High
By.xpath        By.xpath("//input[@name='email']")        Medium
By.name         By.name("email")                          Medium
By.className    By.className("input-email")               Low
By.tagName      By.tagName("input")                       Very low
By.linkText     By.linkText("Sign In")                    Medium

By.cssSelector is the Java name; the JavaScript bindings call it By.css.

Best Locator Practices

  1. Prefer data-testid attributes: [data-testid="login-submit"]
  2. Use CSS selectors over XPath when possible — they are faster
  3. Avoid absolute XPath: /html/body/div[3]/form/input[2] breaks easily
  4. Avoid classes used for styling: .btn-primary may change during redesign
  5. Use relative XPath when needed: //button[contains(text(), 'Submit')]
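Practices 1 and 3 above can be captured in two small helpers. This is a minimal sketch in Python: by_test_id and is_absolute_xpath are hypothetical names, not part of Selenium. The only Selenium fact relied on is that By.CSS_SELECTOR in the Python bindings is the string "css selector", so the returned tuple can be unpacked straight into find_element.

```python
# Same value as By.CSS_SELECTOR in Selenium's Python bindings.
CSS_SELECTOR = "css selector"


def by_test_id(test_id):
    """Return a (strategy, selector) pair that targets a data-testid attribute."""
    if '"' in test_id or "\\" in test_id:
        # Keep the sketch simple: refuse values that would need escaping.
        raise ValueError(f"unsupported character in test id: {test_id!r}")
    return (CSS_SELECTOR, f'[data-testid="{test_id}"]')


def is_absolute_xpath(xpath):
    """Return True for paths anchored at the document root, such as /html/body/..."""
    return xpath.startswith("/") and not xpath.startswith("//")
```

With a real driver, the first helper would be used as driver.find_element(*by_test_id("login-submit")). The second could run in a lint step or code review script to flag brittle locators.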

Waits

Implicit Wait

driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));

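Sets a global timeout that applies to every element lookup for the rest of the driver session. It is simple, but it can mask timing issues, and the Selenium documentation advises against mixing it with explicit waits because the combined wait times become unpredictable.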

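Explicit Wait

Waits for one specific condition and returns as soon as it holds, up to the given timeout.

WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));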

// Wait for element to be visible
WebElement element = wait.until(
  ExpectedConditions.visibilityOfElementLocated(By.id("dashboard"))
);

// Wait for element to be clickable
wait.until(ExpectedConditions.elementToBeClickable(By.id("submit")));

// Wait for text to appear
wait.until(ExpectedConditions.textToBePresentInElementLocated(
  By.className("status"), "Complete"
));

// Wait for URL change
wait.until(ExpectedConditions.urlContains("/dashboard"));

Fluent Wait

Wait<WebDriver> fluentWait = new FluentWait<>(driver)
  .withTimeout(Duration.ofSeconds(30))
  .pollingEvery(Duration.ofMillis(500))
  .ignoring(NoSuchElementException.class);

WebElement element = fluentWait.until(d -> d.findElement(By.id("dynamic-element")));
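The three FluentWait settings (timeout, polling interval, ignored exceptions) describe a plain polling loop. The sketch below, written in Python for brevity, shows that loop without any Selenium dependency. wait_until is a hypothetical helper that illustrates the idea; it is not how Selenium implements its waits.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5, ignored=()):
    """Call condition() repeatedly until it returns a truthy value.

    Exceptions whose types are listed in `ignored` count as "not ready yet".
    Raises TimeoutError once `timeout` seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored as exc:
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s") from last_error
        time.sleep(poll)
```

The condition is always evaluated at least once, and the loop returns the condition's own result, which is why a wait can hand back the element it was waiting for.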

Advanced Interactions

Actions API

Actions actions = new Actions(driver);

// Hover over element
actions.moveToElement(menuItem).perform();

// Drag and drop
actions.dragAndDrop(source, target).perform();

// Right-click
actions.contextClick(element).perform();

// Double-click
actions.doubleClick(element).perform();

// Keyboard shortcuts
actions.keyDown(Keys.CONTROL).click(link).keyUp(Keys.CONTROL).perform();

Handling Dropdowns

Select dropdown = new Select(driver.findElement(By.id("country")));
dropdown.selectByVisibleText("United States");
dropdown.selectByValue("US");
dropdown.selectByIndex(5);

Handling Alerts

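Alert alert = driver.switchTo().alert();
String alertText = alert.getText();

// For a prompt dialog, type the input before closing it
alert.sendKeys("input text");

// Close the dialog with exactly one of the following calls;
// once it is closed, further calls throw NoAlertPresentException
alert.accept();       // Click OK
// alert.dismiss();   // Click Cancel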

Handling Frames and Windows

// Switch to iframe
driver.switchTo().frame("frame-name");
driver.switchTo().frame(0); // by index
driver.switchTo().defaultContent(); // back to main page

// Handle new window/tab
String originalWindow = driver.getWindowHandle();
// ... action that opens new window
for (String handle : driver.getWindowHandles()) {
  if (!handle.equals(originalWindow)) {
    driver.switchTo().window(handle);
    break;
  }
}
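The loop above switches to the first handle that is not the original, which becomes ambiguous once more than two windows are open. One alternative, sketched here in Python, is to compare the handles before and after the action. new_window_handle is a hypothetical helper, not a Selenium API.

```python
def new_window_handle(handles_before, handles_after):
    """Return the one handle present after the action but not before it."""
    known = set(handles_before)
    opened = [handle for handle in handles_after if handle not in known]
    if len(opened) != 1:
        raise LookupError(f"expected exactly one new window, found {len(opened)}")
    return opened[0]
```

With the Python bindings this could be used by reading driver.window_handles before the click, then calling driver.switch_to.window(new_window_handle(before, driver.window_handles)) afterwards.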

Screenshots

File screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
// FileUtils comes from Apache Commons IO (org.apache.commons.io), which must be on the classpath
FileUtils.copyFile(screenshot, new File("screenshots/test-failure.png"));

JavaScript Execution

JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("window.scrollTo(0, document.body.scrollHeight)");
js.executeScript("arguments[0].click()", hiddenButton);
String title = (String) js.executeScript("return document.title");

Selenium Best Practices

  1. Always use explicit waits — never Thread.sleep()
  2. Quit the driver in teardown — prevent browser zombie processes
  3. Use the Page Object Model — separate test logic from page details
  4. Prefer CSS selectors — faster and more readable than XPath
  5. Run in headless mode for CI: options.addArguments("--headless")
  6. Set reasonable timeouts — 10-30 seconds for explicit waits
  7. Handle stale element exceptions — re-locate elements when the DOM changes
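For practice 7, the important detail is that the retry must look the element up again rather than reuse the old reference. A minimal sketch in Python follows; retry_on is a hypothetical helper, and the exception type is passed in so the function itself has no Selenium dependency.

```python
def retry_on(exc_type, action, attempts=3):
    """Run action() and retry when it raises exc_type, up to `attempts` tries.

    The action should locate the element and act on it in one step, so every
    retry works with a fresh element reference.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exc_type:
            if attempt == attempts:
                raise  # out of retries: surface the original error
```

With Selenium's Python bindings, a call might look like retry_on(StaleElementReferenceException, lambda: driver.find_element(By.ID, "submit").click()), where the lambda performs the lookup and the click together.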

Exercise: Build a Selenium Test Suite

Create a Selenium test suite for a web application:

  1. Set up a project with Selenium + your preferred language
  2. Write a BasePage class with common methods (navigate, wait, screenshot)
  3. Create page objects for Login, Dashboard, and Settings pages
  4. Write 5 test cases covering login, navigation, form submission, dropdown selection, and logout
  5. Add explicit waits for all dynamic elements
  6. Run tests in both headed and headless modes
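As a possible starting point for steps 2 and 3, here is a minimal page object sketch in Python. The class layout, method names, and locators are illustrative assumptions (the locators reuse the login example from earlier in this lesson). The wait and screenshot helpers are left for the exercise. The strategy strings "id" and "css selector" are the values of By.ID and By.CSS_SELECTOR in the Python bindings.

```python
class BasePage:
    """Shared behavior for all page objects."""

    def __init__(self, driver, base_url):
        self.driver = driver
        self.base_url = base_url.rstrip("/")

    def navigate(self, path):
        # Build an absolute URL from the site root and a page path.
        self.driver.get(self.base_url + path)

    def find(self, locator):
        # A locator is a (strategy, value) pair, e.g. ("id", "email").
        return self.driver.find_element(*locator)


class LoginPage(BasePage):
    """Keeps the login page's locators and actions in one place."""

    PATH = "/login"
    EMAIL = ("id", "email")
    PASSWORD = ("id", "password")
    SUBMIT = ("css selector", "button[type='submit']")

    def open(self):
        self.navigate(self.PATH)
        return self  # allows LoginPage(...).open().login(...)

    def login(self, email, password):
        self.find(self.EMAIL).send_keys(email)
        self.find(self.PASSWORD).send_keys(password)
        self.find(self.SUBMIT).click()
```

A test then reads as LoginPage(driver, "https://app.example.com").open().login("admin@test.com", "secret123"), and a change to the login form only requires updating the locators in LoginPage.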

Key Takeaways

  • Selenium WebDriver controls browsers via the W3C WebDriver protocol
  • Use explicit waits (WebDriverWait) instead of Thread.sleep() for reliable tests
  • CSS selectors and data-testid attributes are the best locator strategies
  • The Actions API handles complex interactions (hover, drag, keyboard)
  • Use the Page Object Model for maintainable test code
  • Selenium has official bindings for Java, Python, JavaScript, C#, and Ruby; Kotlin can use the Java bindings