6.2 Search and sort
Explanation
Searching and sorting are basic operations with important assumptions. Search asks where a value appears. Sorting rearranges owned data into an order that later operations can use.
A linear search over a slice checks values one by one:
fn find_i32(xs: &[i32], target: i32) -> Option<usize> {
for (i, &x) in xs.iter().enumerate() {
if x == target {
return Some(i);
}
}
None
}For an unsorted slice, linear search is the basic model. It may need to inspect every element, so the cost is proportional to N.
The same idea works for floating-point data, but equality on f64 should be used only when exact equality is meaningful for the problem:
fn find_f64(xs: &[f64], target: f64) -> Option<usize> {
xs.iter().position(|&x| x == target)
}If the data are already sorted, binary search is possible. Binary search checks the middle value and discards half of the remaining range. The sorting cost is separate: binary search is cheap only after the sorted order is available.
Sorting in Rust usually happens on an owned Vec<T>:
let mut xs = vec![3, 1, 2];
xs.sort();
assert_eq!(xs, vec![1, 2, 3]);Floating-point values need a comparison rule. total_cmp gives a total ordering, including for special values such as NaN, but you must decide whether that ordering is meaningful for the data:
let mut xs = vec![3.0, 1.0, 2.0];
xs.sort_by(|a, b| a.total_cmp(b));
assert_eq!(xs, vec![1.0, 2.0, 3.0]);Tests should use small examples whose answers are known by hand. For a search function, include missing values, first position, last position, and repeated values. For a custom sorting exercise, the standard library sort can be used as a reference in tests, but it must not be called inside the implementation being tested.
Things to look up
- Linear search
- Binary search
- Pre-sorted data
Option<usize>- Slice
&[T] Vec<T>Iterator::positionVec::sortsort_byf64::total_cmp- Stable sort
- Insertion sort or selection sort
Exercise
Write a function my_find that searches a slice and returns the index of the first matching value:
fn my_find(xs: &[i32], target: i32) -> Option<usize> {
todo!()
}Add tests for these cases:
- The target is missing.
- The target is the first element.
- The target is the last element.
- The target appears more than once, and the first index should be returned.
Use small examples where the answer can be checked without running the code.
Notes for the exercise
- Use
&[i32]or&[f64]at the function boundary for search. - Return
Option<usize>instead of a sentinel value such as-1. - Decide what repeated values mean before writing tests.
- Tests should compare against known small examples.
- If you later write a custom sort, do not use the standard sort inside the custom implementation. It is fine to use the standard sort only in tests.
- Count how many elements the search inspects in the best case, worst case, and a repeated-value case.