6.1 Testing and TDD

Explanation

A test checks expected behavior. Test-driven development, or TDD, means using tests to drive the implementation order:

Choose one small behavior.
Write a test for that behavior before implementing it.
Run the test and check that it fails for the expected reason.
Implement the smallest code that should satisfy the test.
Run the test suite and check that it passes.
Refactor only after the tests pass, and run the tests again.

This order is useful because it guards against vague, after-the-fact checking. Without TDD, it is easy to implement first, run one convenient example, and decide that the code is probably correct. TDD forces you to state the expected behavior before the implementation exists.

In a small Rust project, create the library project first, then write the test file before the implementation. For this example, create the project with:

cargo new --lib mean_tools

By default, the package name in Cargo.toml is also the library crate name that integration tests import (Cargo replaces any hyphens in the name with underscores). Therefore the test below imports mean_tools::mean_value.

For example:

tests/mean.rs

use mean_tools::mean_value;

#[test]
fn mean_of_three_values() {
    assert_eq!(mean_value(&[2.0, 4.0, 6.0]), Some(4.0));
}

#[test]
fn mean_of_one_value() {
    assert_eq!(mean_value(&[5.0]), Some(5.0));
}

#[test]
fn mean_with_negative_values() {
    assert_eq!(mean_value(&[-1.0, 1.0]), Some(0.0));
}

#[test]
fn mean_of_repeated_values() {
    assert_eq!(mean_value(&[3.0, 3.0, 3.0]), Some(3.0));
}

#[test]
fn empty_input_has_no_mean() {
    assert_eq!(mean_value(&[]), None);
}

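cargo new --lib generates a placeholder src/lib.rs, but that file does not define mean_value yet, so this test file should fail. In Rust the first failure is a compile error about the unresolved import of mean_value, not a failed assertion. That is expected. The failure tells you that the test is actually connected to the missing implementation.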
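
To watch the tests fail at run time instead of stopping at a compile error, one option is a temporary placeholder with the intended signature and a deliberately wrong body. The sketch below is an assumption about how you might do this step, not a required part of TDD:

```rust
// src/lib.rs (temporary placeholder, written only to watch the tests fail)

/// Placeholder: has the intended signature but no real logic yet.
pub fn mean_value(_xs: &[f64]) -> Option<f64> {
    // Deliberately wrong for non-empty input, so tests that expect
    // Some(...) fail on their assertion instead of failing to compile.
    None
}
```

With this placeholder, the four tests that expect Some(...) should fail, while empty_input_has_no_mean passes by accident. That is a reminder that a passing test is only informative if it could have failed.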

Then write the source file:

src/lib.rs

pub fn mean_value(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }

    Some(xs.iter().sum::<f64>() / xs.len() as f64)
}

The function is easy to test because all data enter through the argument xs, and the result is returned directly. The tests are part of the specification: they say what should happen for normal inputs, one-element inputs, negative values, repeated values, and empty input.

Here is a small bad example:

fn mean_value_from_file() -> std::io::Result<f64> {
    let text = std::fs::read_to_string("data.txt")?;
    let xs: Vec<f64> = text
        .lines()
        .filter_map(|line| line.parse::<f64>().ok())
        .collect();

    Ok(xs.iter().sum::<f64>() / xs.len() as f64)
}

This function reads from a hidden file path instead of receiving its data as an argument. If some other part of the workflow changes data.txt, the same call can return a different answer. It also silently skips lines that fail to parse, and it returns NaN when no values are read, because it divides by a length of zero. Similar problems appear when functions depend on hidden global mutable state. A better design is mean_value(xs: &[f64]), where the input is explicit.
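
One way to repair the bad example is to split it into a parsing step that takes text as an argument and the pure mean_value function. The sketch below is a possible design, not the only one; parse_values is a hypothetical helper name introduced for illustration, and the decision to skip blank lines but report other unparsable lines is an assumption:

```rust
use std::num::ParseFloatError;

/// Hypothetical helper: turn text with one number per line into values.
/// Blank lines are skipped; any other unparsable line is returned as an
/// error instead of being silently dropped.
pub fn parse_values(text: &str) -> Result<Vec<f64>, ParseFloatError> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| line.parse::<f64>())
        .collect()
}

/// Same function as in src/lib.rs: all input arrives through `xs`.
pub fn mean_value(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }

    Some(xs.iter().sum::<f64>() / xs.len() as f64)
}
```

Both functions can now be tested with string and slice literals, without touching the file system. Only a thin caller needs to read the file and pass its contents to parse_values.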

Tests should be fast. Fast tests make it realistic to run them after every change, before handing work to another person, and before trusting code written by an AI agent. Long simulations and large validation runs are important, but they should usually be separate from the small test suite.
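
One way to keep slow checks out of the everyday test run is Rust's #[ignore] attribute. The sketch below is illustrative: the test names are hypothetical, and the large input only stands in for a real long-running validation:

```rust
pub fn mean_value(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }

    Some(xs.iter().sum::<f64>() / xs.len() as f64)
}

#[cfg(test)]
mod tests {
    use super::mean_value;

    // Fast: runs on every `cargo test`.
    #[test]
    fn mean_of_two_values() {
        assert_eq!(mean_value(&[1.0, 2.0]), Some(1.5));
    }

    // Stand-in for a slow validation run: skipped by default because of
    // #[ignore]. Every partial sum of 1.5 is exactly representable here,
    // so exact comparison is safe.
    #[test]
    #[ignore]
    fn mean_of_many_values() {
        let xs = vec![1.5; 10_000_000];
        assert_eq!(mean_value(&xs), Some(1.5));
    }
}
```

A plain cargo test skips the ignored test. Running cargo test -- --ignored executes only the ignored tests, which makes it practical to run them less often, for example before a release.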

When using an AI agent, review changes to tests before reviewing changes to the implementation. If the agent changes the tests only to make them pass, it has weakened the check. For important reference cases, store independent verification data separately from ordinary code changes.

For floating-point results that are not exactly representable, use approximate comparison:

let value = 2.0_f64.sqrt().powi(2);
assert!((value - 2.0).abs() < 1e-12);
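
If the same comparison appears in many tests, it can be wrapped in a small helper. The sketch below assumes an absolute tolerance is appropriate; approx_eq is a hypothetical name, not a standard library function:

```rust
/// Hypothetical helper: true when `a` and `b` differ by less than `tol`.
/// This is an absolute tolerance, chosen by the caller for each test.
pub fn approx_eq(a: f64, b: f64, tol: f64) -> bool {
    (a - b).abs() < tol
}

#[cfg(test)]
mod tests {
    use super::approx_eq;

    #[test]
    fn sum_of_tenths_is_close_to_expected() {
        let value = 0.1_f64 + 0.2;
        // Exact equality fails: 0.1, 0.2 and 0.3 are not exactly
        // representable in binary floating point.
        assert!(value != 0.3);
        // The approximate comparison passes with an explicit tolerance.
        assert!(approx_eq(value, 0.3, 1e-12));
    }
}
```

An absolute tolerance such as 1e-12 suits values of order one. For results with very large or very small magnitudes, a tolerance relative to the expected value may be the better choice.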

A bad test copies the implementation. A good test checks behavior that you can justify independently, for example by hand calculation or a known result.
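
The difference can be sketched with two tests of the same function. The test names are hypothetical and chosen only to label the two styles:

```rust
pub fn mean_value(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }

    Some(xs.iter().sum::<f64>() / xs.len() as f64)
}

#[cfg(test)]
mod tests {
    use super::mean_value;

    // Weak: the expected value is computed with the same formula as the
    // implementation, so a mistake shared by both goes unnoticed.
    #[test]
    fn mean_copies_the_implementation() {
        let xs = [1.0, 2.0, 3.0, 4.0];
        let expected = xs.iter().sum::<f64>() / xs.len() as f64;
        assert_eq!(mean_value(&xs), Some(expected));
    }

    // Stronger: the expected value comes from a hand calculation,
    // (1 + 2 + 3 + 4) / 4 = 2.5, independent of the code under test.
    #[test]
    fn mean_matches_hand_calculation() {
        assert_eq!(mean_value(&[1.0, 2.0, 3.0, 4.0]), Some(2.5));
    }
}
```

Both tests pass for a correct implementation. Only the second one would catch an error in the formula itself, because the first repeats whatever the formula does.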

For a very small teaching example, it is sometimes useful to put the function and the tests in one file so that the implementation and specification can be read together:

pub fn mean_value(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }

    Some(xs.iter().sum::<f64>() / xs.len() as f64)
}

#[cfg(test)]
mod tests {
    use super::mean_value;

    #[test]
    fn mean_examples() {
        assert_eq!(mean_value(&[2.0, 4.0, 6.0]), Some(4.0));
        assert_eq!(mean_value(&[5.0]), Some(5.0));
        assert_eq!(mean_value(&[-1.0, 1.0]), Some(0.0));
        assert_eq!(mean_value(&[]), None);
    }
}

Do not treat this inline layout as the recommended structure for real project code. In a real project, normal execution should run the calculation, and a separate test command should run the tests. Here the #[cfg(test)] attribute ensures that the tests module is compiled only under cargo test, not in a normal build. Project organization is discussed in Project structure, environment, and reproducibility.

Things to look up

Unit test
Test suite
Integration test
Regression test
Test-driven development
Modular design
Reference implementation
Rust test harness
Cargo
cargo test
src/lib.rs
tests/
Option

Exercise

In work/chapter_6.1/, create a small Cargo library project:

cargo new --lib mean_tools

Add a function mean_value(xs: &[f64]) -> Option<f64>. Put the source code in src/lib.rs and the integration tests in tests/mean.rs.

Write the tests before writing the implementation. Run the tests once while the implementation is still missing or incomplete, and record what failure you saw. Then implement the function and run cargo test again.

Include at least five tests. For each test, write one sentence explaining what property it checks.

Notes for the exercise

The source file should be src/lib.rs.
The integration test file should be tests/mean.rs.
The package should be named mean_tools so the integration test can import mean_tools::mean_value.
mean_value(xs: &[f64]) should receive data through the argument xs and return the result.
Do not make mean_value read from or write to a global variable.
Use Option<f64> so empty input can return None.
Include a one-element input.
Include repeated values.
Include negative values.
Include a case whose answer is known by hand.
Use an explicit tolerance when exact floating-point equality is not appropriate.
Tests should check the specification, not copy the implementation.
Keep the tests fast enough that you would actually run them after editing.
If an AI agent changes the tests, inspect that diff before trusting the implementation.
Run the tests with cargo test. If a test fails, read the failure message before changing either the function or the test.