Linear Algebra: Vectors
Concept
A vector is an ordered list of numbers that represents a point or direction in space. Vectors are the fundamental data structure of machine learning: every input, every weight, every hidden state is a vector.
Why This Matters
Every ML model operates on vectors. An image is a vector of pixel values. A sentence is a sequence of word-embedding vectors. A layer's output is a vector. The operations you learn here (addition, scaling, dot products, norms) are the building blocks of every neural network, from a single neuron to a multi-modal transformer.
Mathematical Notation
A vector is written as a bold lowercase letter or with an arrow:
$$ \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \quad\text{or}\quad \vec{v} $$
The entries $v_i$ are called components, and the number of components $n$ is the dimension. The subscript in $v_i$ selects the $i$-th component; indexing is 1-based in math and 0-based in code.
A vector lives in $\mathbb{R}^n$, the set of $n$-dimensional real vectors (introduced in ch01).
Vector Addition
Add component-wise:
$$ \mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{pmatrix} $$
Scalar Multiplication
Multiply every component by a scalar $\alpha$ (alpha, introduced in ch01):
$$ \alpha \mathbf{v} = \begin{pmatrix} \alpha v_1 \\ \alpha v_2 \\ \vdots \\ \alpha v_n \end{pmatrix} $$
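Addition and scaling combine naturally. As a small worked example (the vectors $\mathbf{u} = (1, 2)$ and $\mathbf{v} = (3, 1)$ are chosen here purely for illustration), scale first and then add component-wise:

$$ 2\mathbf{u} + \mathbf{v} = \begin{pmatrix} 2 \cdot 1 \\ 2 \cdot 2 \end{pmatrix} + \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 + 3 \\ 4 + 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 5 \end{pmatrix} $$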
Dot Product (Inner Product)
The dot product takes two vectors and returns a single scalar. It is written with angle brackets or a dot:
$$ \langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{n} u_i v_i $$
This is the inner product (notation introduced in ch01). It measures how much one vector points in the direction of another.
Norms
The Euclidean norm (L2 norm) is the length of a vector:
$$ \|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2} $$
The L1 norm (Manhattan norm) sums absolute values:
$$ \|\mathbf{v}\|_1 = \sum_{i=1}^{n} |v_i| $$
When the subscript is omitted, $\|\mathbf{v}\|$ defaults to the Euclidean norm.
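A quick worked example may help compare the two norms. The vector $\mathbf{v} = (3, -4)$ is an illustrative choice, not one taken from later sections:

$$ \|\mathbf{v}\|_2 = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5, \qquad \|\mathbf{v}\|_1 = |3| + |-4| = 7 $$

The L1 norm is never smaller than the L2 norm of the same vector, which matches the picture of walking along grid lines versus taking the straight-line path.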
Unit Vectors
A unit vector has length 1. To normalize any vector, divide by its norm:
$$ \hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|} $$
The hat ( $\hat{\mathbf{v}}$ ) indicates a unit vector.
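As a worked example, take the illustrative vector $\mathbf{v} = (1, 2, 2)$. Its norm is a whole number, so the arithmetic stays clean:

$$ \|\mathbf{v}\| = \sqrt{1^2 + 2^2 + 2^2} = 3, \qquad \hat{\mathbf{v}} = \frac{1}{3}\begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}, \qquad \|\hat{\mathbf{v}}\| = \sqrt{\tfrac{1}{9} + \tfrac{4}{9} + \tfrac{4}{9}} = 1 $$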
Orthogonality
Two vectors are orthogonal (perpendicular) when their dot product is zero:
$$ \mathbf{u} \perp \mathbf{v} \quad \iff \quad \langle \mathbf{u}, \mathbf{v} \rangle = 0 $$
The symbol $\perp$ means "is orthogonal to."
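In floating-point code, an exact comparison with zero can miss vectors that are orthogonal on paper, because rounding may leave a tiny nonzero dot product. The sketch below shows one possible way to handle this. The helper `is_orthogonal` and its `tol` parameter are hypothetical names introduced here for illustration, and the functions work on plain slices so the snippet stands on its own.

```rust
/// Dot product of two equal-length slices.
pub fn dot(u: &[f64], v: &[f64]) -> f64 {
    assert_eq!(u.len(), v.len(), "Vectors must have the same dimension");
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Hypothetical helper (not part of the chapter's crate):
/// treat u and v as orthogonal when |<u, v>| <= tol.
pub fn is_orthogonal(u: &[f64], v: &[f64], tol: f64) -> bool {
    dot(u, v).abs() <= tol
}
```

For instance, with `u = [0.1 + 0.2, 1.0]` and `v = [1.0, -0.3]` the dot product is zero on paper, but in `f64` arithmetic it comes out as a tiny nonzero value, so only the tolerance-based check reports orthogonality. A fixed tolerance such as `1e-12` is an assumption that suits vectors of roughly unit size; larger magnitudes may call for a tolerance scaled by the norms.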
Intuition
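Geometric view: A vector is an arrow from the origin to a point. Adding vectors means placing them tip-to-tail. Scaling stretches or shrinks the arrow, and a negative scalar flips its direction. The sign of the dot product tells you about the angle between two vectors: positive means they point in a similar direction (angle below 90°), zero means perpendicular, and negative means they point in broadly opposite directions (angle above 90°).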
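The link between the dot product and the angle can be made concrete with the standard identity $\langle \mathbf{u}, \mathbf{v} \rangle = \|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$. The sketch below computes $\cos\theta$ on plain slices; the function name `cosine` is a hypothetical helper introduced for illustration and is not part of the chapter's crate.

```rust
/// Dot product of two equal-length slices.
pub fn dot(u: &[f64], v: &[f64]) -> f64 {
    assert_eq!(u.len(), v.len(), "Vectors must have the same dimension");
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Euclidean norm, computed as the square root of <v, v>.
pub fn norm_l2(v: &[f64]) -> f64 {
    dot(v, v).sqrt()
}

/// Hypothetical helper: cos(theta) = <u, v> / (||u|| * ||v||).
/// Panics if either vector has zero length, since the angle is undefined.
pub fn cosine(u: &[f64], v: &[f64]) -> f64 {
    let denom = norm_l2(u) * norm_l2(v);
    assert!(denom > 0.0, "cosine is undefined for a zero vector");
    dot(u, v) / denom
}
```

With `u = [1.0, 0.0]`, the value is 1 for `[2.0, 0.0]` (same direction), 0 for `[0.0, 3.0]` (perpendicular), and -1 for `[-1.0, 0.0]` (opposite), which lines up with the positive, zero, and negative cases described above.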
Algebraic view: A vector is just a list of numbers. All operations happen component-by-component. This is the view that translates directly to code.
Why dot products matter in ML: A neural network layer computes $f(\mathbf{W}\mathbf{x} + \mathbf{b})$. Each row of $\mathbf{W}$ is a weight vector; the dot product with the input $\mathbf{x}$ measures how well the input matches that row. This is the core computation of every dense layer, every attention head, every linear transformation.
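To see the "one dot product per row" idea in code, here is a minimal sketch of a dense layer on plain slices. It makes several assumptions: the name `dense_relu` is hypothetical, $\mathbf{W}$ is stored as one `Vec<f64>` per row, and ReLU stands in for the unspecified activation $f$.

```rust
/// Dot product of two equal-length slices.
pub fn dot(u: &[f64], v: &[f64]) -> f64 {
    assert_eq!(u.len(), v.len(), "Vectors must have the same dimension");
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Hypothetical sketch of a dense layer: y_i = relu(<w_i, x> + b_i).
/// `weights` holds one row of W per inner Vec; ReLU is an assumed choice for f.
pub fn dense_relu(weights: &[Vec<f64>], bias: &[f64], x: &[f64]) -> Vec<f64> {
    assert_eq!(weights.len(), bias.len(), "Need one bias entry per row of W");
    weights
        .iter()
        .zip(bias)
        // Each output component is one dot product plus a bias, clamped at zero.
        .map(|(row, b)| (dot(row, x) + b).max(0.0))
        .collect()
}
```

For example, with rows `[1, 0]`, `[0, -1]`, `[1, 1]`, bias `[0, 0, 0.5]`, and input `[2, 3]`, the pre-activations are 2, -3, and 5.5, so the layer outputs `[2.0, 0.0, 5.5]`. A production implementation would store $\mathbf{W}$ contiguously; the nested `Vec` is used here only to keep the rows visible.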
Rust Implementation
Create a new crate as a sibling to ch01 (the ch02 chapter is self-contained, so it does not depend on ch01 code):
cargo new --name ch02 --lib ch02
cd ch02
Replace src/lib.rs with:
/// A vector of f64 values.
/// We use a simple newtype wrapper around Vec<f64>.
#[derive(Debug, Clone, PartialEq)]
pub struct Vector(Vec<f64>);

impl Vector {
    /// Create a new vector from a list of components.
    pub fn new(components: Vec<f64>) -> Self {
        Vector(components)
    }

    /// The dimension (number of components) of this vector.
    pub fn dim(&self) -> usize {
        self.0.len()
    }

    /// Access the i-th component (0-indexed).
    pub fn get(&self, i: usize) -> f64 {
        self.0[i]
    }

    /// Add another vector component-wise.
    /// Panics if dimensions differ.
    pub fn add(&self, other: &Vector) -> Vector {
        assert_eq!(self.dim(), other.dim(), "Vectors must have the same dimension");
        let result: Vec<f64> = self.0.iter().zip(&other.0).map(|(a, b)| a + b).collect();
        Vector(result)
    }

    /// Multiply by a scalar (scale).
    pub fn scale(&self, alpha: f64) -> Vector {
        Vector(self.0.iter().map(|&x| alpha * x).collect())
    }

    /// Dot product with another vector.
    /// Corresponds to ⟨u, v⟩ = Σ u_i v_i.
    pub fn dot(&self, other: &Vector) -> f64 {
        assert_eq!(self.dim(), other.dim(), "Vectors must have the same dimension");
        self.0.iter().zip(&other.0).map(|(a, b)| a * b).sum()
    }

    /// Euclidean norm (L2): ‖v‖₂ = √(Σ v_i²).
    pub fn norm_l2(&self) -> f64 {
        self.dot(self).sqrt()
    }

    /// L1 norm: ‖v‖₁ = Σ |v_i|.
    pub fn norm_l1(&self) -> f64 {
        self.0.iter().map(|&x| x.abs()).sum()
    }

    /// Return a unit vector (normalized).
    /// Panics if the vector has zero length.
    pub fn normalize(&self) -> Vector {
        let norm = self.norm_l2();
        assert!(norm > 0.0, "Cannot normalize a zero vector");
        self.scale(1.0 / norm)
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        let u = Vector::new(vec![1.0, 2.0, 3.0]);
        let v = Vector::new(vec![4.0, 5.0, 6.0]);
        let result = u.add(&v);
        assert_eq!(result, Vector::new(vec![5.0, 7.0, 9.0]));
    }

    #[test]
    fn test_scale() {
        let v = Vector::new(vec![1.0, 2.0, 3.0]);
        let result = v.scale(2.0);
        assert_eq!(result, Vector::new(vec![2.0, 4.0, 6.0]));
    }

    #[test]
    fn test_dot() {
        let u = Vector::new(vec![1.0, 2.0, 3.0]);
        let v = Vector::new(vec![4.0, 5.0, 6.0]);
        // 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
        assert_eq!(u.dot(&v), 32.0);
    }

    #[test]
    fn test_norm_l2() {
        // 3-4-5 triangle: ‖[3, 4]‖ = 5
        let v = Vector::new(vec![3.0, 4.0]);
        assert_eq!(v.norm_l2(), 5.0);
    }

    #[test]
    fn test_norm_l1() {
        let v = Vector::new(vec![-1.0, 2.0, -3.0]);
        // |−1| + |2| + |−3| = 1 + 2 + 3 = 6
        assert_eq!(v.norm_l1(), 6.0);
    }

    #[test]
    fn test_normalize() {
        let v = Vector::new(vec![3.0, 4.0]);
        let unit = v.normalize();
        // Unit vector of [3, 4] is [3/5, 4/5] = [0.6, 0.8]
        assert!((unit.get(0) - 0.6).abs() < 1e-10);
        assert!((unit.get(1) - 0.8).abs() < 1e-10);
        // Its length should be 1
        assert!((unit.norm_l2() - 1.0).abs() < 1e-10);
    }

    #[test]
    fn test_orthogonal() {
        // [1, 0] and [0, 1] are orthogonal (dot = 0)
        let u = Vector::new(vec![1.0, 0.0]);
        let v = Vector::new(vec![0.0, 1.0]);
        assert_eq!(u.dot(&v), 0.0);
    }
}
Run the tests:
cargo test
You should see:
running 7 tests
test tests::test_add ... ok
test tests::test_dot ... ok
test tests::test_norm_l1 ... ok
test tests::test_norm_l2 ... ok
test tests::test_normalize ... ok
test tests::test_orthogonal ... ok
test tests::test_scale ... ok
test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Walkthrough
- `Vector(Vec<f64>)`: We use the newtype pattern, a tuple struct wrapping `Vec<f64>`. This gives us a distinct type with its own methods, while delegating storage to Rust's standard dynamic array.
- `dim()`: Returns the number of components by calling `len()` on the inner `Vec`.
- `zip` and `map`: Vector operations use Rust's iterator combinators. `self.0.iter().zip(&other.0)` pairs up corresponding components, then `map` applies the operation. This is the code equivalent of "component-wise."
- `dot` uses `.sum()`: After pairing and multiplying, we sum everything. This is $\sum_i u_i v_i$ expressed in one line.
- `norm_l2` calls `self.dot(self)`: The Euclidean norm is the square root of the dot product of a vector with itself. Reusing `dot` keeps the code DRY and reinforces the relationship between the two operations.
- `normalize` calls `scale`: Normalization is just scaling by the reciprocal of the norm. Again, reuse.
- Edge cases (not covered by the tests above): `sum()` on an empty iterator returns zero, so the dot product and the norm of an empty vector are both 0, which is the correct value. `normalize` on a zero vector panics, which is the intended behavior (a zero vector has no direction, so it cannot be normalized).
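If a panic is not acceptable for a caller, one alternative is to report the zero-vector case through the return type. The sketch below is an assumption-laden variant, not the chapter's API: `try_normalize` is a hypothetical name, it works on plain slices so it stands alone, and it divides by the norm directly instead of calling `scale`.

```rust
/// Hypothetical non-panicking variant of `normalize` on a plain slice.
/// Returns None when the norm is not strictly positive, which covers
/// zero vectors, empty vectors, and vectors containing NaN.
pub fn try_normalize(v: &[f64]) -> Option<Vec<f64>> {
    // Euclidean norm: square root of the sum of squared components.
    let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm > 0.0 {
        Some(v.iter().map(|x| x / norm).collect())
    } else {
        None
    }
}
```

Here `try_normalize(&[3.0, 4.0])` yields a vector close to `[0.6, 0.8]`, while `try_normalize(&[0.0, 0.0])` and an empty slice both yield `None`. Whether to panic or return `Option` is a design choice: the panic keeps call sites short, while `Option` forces callers to handle the degenerate case.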
Verification
The tests above verify:
| Test | What it checks | Mathematical invariant |
|---|---|---|
| `test_add` | $[1,2,3] + [4,5,6] = [5,7,9]$ | Component-wise addition |
| `test_scale` | $2 \cdot [1,2,3] = [2,4,6]$ | Scalar multiplication |
| `test_dot` | $\langle [1,2,3], [4,5,6] \rangle = 32$ | Dot product (sum of products) |
| `test_norm_l2` | $\lVert [3,4] \rVert_2 = 5$ | Euclidean norm (Pythagorean triple) |
| `test_norm_l1` | $\lVert [-1,2,-3] \rVert_1 = 6$ | L1 norm (sum of absolute values) |
| `test_normalize` | Unit vector has length 1 | $\lVert \hat{\mathbf{v}} \rVert = 1$ |
| `test_orthogonal` | $\langle [1,0], [0,1] \rangle = 0$ | Orthogonality |
Key Takeaways
- A vector is an ordered list of numbers, the fundamental data structure in ML.
- Vector addition and scalar multiplication are component-wise operations.
- The dot product $\langle \mathbf{u}, \mathbf{v} \rangle$ measures alignment between vectors; it is the core operation in neural networks.
- The Euclidean norm $\|\mathbf{v}\|_2$ is the length of a vector; dividing by it produces a unit vector.
- Orthogonal vectors have a dot product of zero: they point in perpendicular directions.
- Rust's iterator combinators (`zip`, `map`, `sum`) map naturally onto component-wise vector math.