6.2 Search and sort

Explanation

Searching and sorting are basic operations with important assumptions. Search asks where a value appears. Sorting rearranges owned data into an order that later operations can use.

A linear search over a slice checks values one by one:

fn find_i32(xs: &[i32], target: i32) -> Option<usize> {
    for (i, &x) in xs.iter().enumerate() {
        if x == target {
            return Some(i);
        }
    }
    None
}

For an unsorted slice, linear search is the basic model. It may need to inspect every element, so the cost is proportional to N.

The same idea works for floating-point data, but equality on f64 should be used only when exact equality is meaningful for the problem:

fn find_f64(xs: &[f64], target: f64) -> Option<usize> {
    xs.iter().position(|&x| x == target)
}

If the data are already sorted, binary search is possible. Binary search checks the middle value and discards half of the remaining range. The sorting cost is separate: binary search is cheap only after the sorted order is available.

Sorting in Rust usually happens on an owned Vec<T>:

let mut xs = vec![3, 1, 2];
xs.sort();
assert_eq!(xs, vec![1, 2, 3]);

Floating-point values need a comparison rule. total_cmp gives a total ordering, including for special values such as NaN, but you must decide whether that ordering is meaningful for the data:

let mut xs = vec![3.0, 1.0, 2.0];
xs.sort_by(|a, b| a.total_cmp(b));
assert_eq!(xs, vec![1.0, 2.0, 3.0]);

Tests should use small examples whose answers are known by hand. For a search function, include missing values, first position, last position, and repeated values. For a custom sorting exercise, the standard library sort can be used as a reference in tests, but it must not be called inside the implementation being tested.

Things to look up

  • Linear search
  • Binary search
  • Pre-sorted data
  • Option<usize>
  • Slice &[T]
  • Vec<T>
  • Iterator::position
  • Vec::sort
  • sort_by
  • f64::total_cmp
  • Stable sort
  • Insertion sort or selection sort

Exercise

Write a function my_find that searches a slice and returns the index of the first matching value:

fn my_find(xs: &[i32], target: i32) -> Option<usize> {
    todo!()
}

Add tests for these cases:

  1. The target is missing.
  2. The target is the first element.
  3. The target is the last element.
  4. The target appears more than once, and the first index should be returned.

Use small examples where the answer can be checked without running the code.

Notes for the exercise

  • Use &[i32] or &[f64] at the function boundary for search.
  • Return Option<usize> instead of a sentinel value such as -1.
  • Decide what repeated values mean before writing tests.
  • Tests should compare against known small examples.
  • If you later write a custom sort, do not use the standard sort inside the custom implementation. It is fine to use the standard sort only in tests.
  • Count how many elements the search inspects in the best case, worst case, and a repeated-value case.