Linear Algebra: Vectors

Concept

A vector is an ordered list of numbers that represents a point or direction in space. Vectors are the fundamental data structure of machine learning — every input, every weight, every hidden state is a vector.

Why This Matters

Every ML model operates on vectors. An image is a vector of pixel values. A sentence is a sequence of word-embedding vectors. A layer's output is a vector. The operations you learn here (addition, scaling, dot products, norms) are the building blocks of every neural network, from a single neuron to a multi-modal transformer.


Mathematical Notation

A vector is written as a bold lowercase letter or with an arrow:

$$ \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \quad\text{or}\quad \vec{v} $$

The entries $v_i$ are called components, and the number of components $n$ is the dimension. The subscript $i$ picks out the $i$-th component (1-indexed in math, 0-indexed in code).

A vector lives in $\mathbb{R}^n$ — the set of $n$-dimensional real vectors (introduced in ch01).
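The indexing convention above is easy to trip over when moving from math to code, so here is a minimal sketch of the mapping. The helpers `component` and `dimension` are hypothetical names introduced only for illustration; they are not part of the crate built later in this chapter.

```rust
/// Hypothetical helper: look up the math-style component v_i (1-indexed)
/// in a Rust slice, which is 0-indexed.
pub fn component(v: &[f64], i: usize) -> f64 {
    assert!(i >= 1 && i <= v.len(), "i must satisfy 1 <= i <= n");
    v[i - 1]
}

/// The dimension n is the number of components.
pub fn dimension(v: &[f64]) -> usize {
    v.len()
}
```

With `v = [3.0, 4.0, 5.0]`, the math component $v_1$ is `v[0]` in code, and the dimension is 3.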

Vector Addition

Add two vectors of the same dimension component-wise:

$$ \mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{pmatrix} $$
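
A small worked example may help; the values are illustrative and not taken from the text above:

$$ \begin{pmatrix} 1 \\ -2 \end{pmatrix} + \begin{pmatrix} 3 \\ 5 \end{pmatrix} = \begin{pmatrix} 1 + 3 \\ -2 + 5 \end{pmatrix} = \begin{pmatrix} 4 \\ 3 \end{pmatrix} $$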

Scalar Multiplication

Multiply every component by a scalar $\alpha$ (alpha — introduced in ch01):

$$ \alpha \mathbf{v} = \begin{pmatrix} \alpha v_1 \\ \alpha v_2 \\ \vdots \\ \alpha v_n \end{pmatrix} $$
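
For instance, with the illustrative choice $\alpha = -2$ and a three-component vector:

$$ -2 \begin{pmatrix} 1 \\ -3 \\ 0.5 \end{pmatrix} = \begin{pmatrix} -2 \\ 6 \\ -1 \end{pmatrix} $$

Because $\alpha$ is negative here, every component changes sign, so the result points in the opposite direction and is twice as long.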

Dot Product (Inner Product)

The dot product takes two vectors and returns a single scalar. It is written with angle brackets or a dot:

$$ \langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{n} u_i v_i $$

This is the inner product (notation introduced in ch01). It measures how much one vector points in the direction of another.
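
As a quick worked example with made-up values, take $\mathbf{u} = (2, -1, 3)$ and $\mathbf{v} = (1, 4, -2)$:

$$ \langle \mathbf{u}, \mathbf{v} \rangle = (2)(1) + (-1)(4) + (3)(-2) = 2 - 4 - 6 = -8 $$

The result is a single number, and here it is negative: the two vectors point in broadly opposing directions.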

Norms

The Euclidean norm (L2 norm) is the length of a vector:

$$ \|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2} $$

The L1 norm (Manhattan norm) sums absolute values:

$$ \|\mathbf{v}\|_1 = \sum_{i=1}^{n} |v_i| $$

When the subscript is omitted, $\|\mathbf{v}\|$ defaults to the Euclidean norm.
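
To see how the two norms differ on the same input, consider the illustrative vector $\mathbf{v} = (3, -4)$:

$$ \|\mathbf{v}\|_2 = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5, \qquad \|\mathbf{v}\|_1 = |3| + |-4| = 7 $$

The L1 norm is never smaller than the L2 norm; the two agree only when at most one component is nonzero.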

Unit Vectors

A unit vector has length 1. To normalize any nonzero vector, divide it by its norm:

$$ \hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|} $$

The hat ($\hat{\mathbf{v}}$) indicates a unit vector.
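
A worked example, using an illustrative vector chosen so that the norm comes out as a whole number:

$$ \mathbf{v} = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}, \quad \|\mathbf{v}\| = \sqrt{1 + 4 + 4} = 3, \quad \hat{\mathbf{v}} = \frac{1}{3} \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix} $$

Checking the length: $\|\hat{\mathbf{v}}\| = \sqrt{\tfrac{1}{9} + \tfrac{4}{9} + \tfrac{4}{9}} = 1$.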

Orthogonality

Two vectors are orthogonal (perpendicular) when their dot product is zero:

$$ \mathbf{u} \perp \mathbf{v} \quad \iff \quad \langle \mathbf{u}, \mathbf{v} \rangle = 0 $$

The symbol $\perp$ means "is orthogonal to."
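
For example, with illustrative values, $(2, 1)$ and $(-1, 2)$ are orthogonal even though neither lies along a coordinate axis:

$$ \left\langle \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \end{pmatrix} \right\rangle = (2)(-1) + (1)(2) = 0 $$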


Intuition

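Geometric view: A vector is an arrow from the origin to a point. Adding vectors means placing them tip-to-tail. Scaling stretches or shrinks the arrow, and a negative scalar also flips its direction. The sign of the dot product tells you about the angle between two vectors: positive means they point in a similar direction (less than 90° apart), zero means they are perpendicular, and negative means they point in broadly opposite directions (more than 90° apart).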
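
The link between the dot product and the angle can be made concrete with the standard identity $\langle \mathbf{u}, \mathbf{v} \rangle = \|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$. The sketch below rearranges it to recover $\cos\theta$. It works on plain slices, and `cos_angle` is a hypothetical helper introduced for illustration, not part of the chapter's crate.

```rust
/// Dot product of two equal-length slices.
fn dot(u: &[f64], v: &[f64]) -> f64 {
    assert_eq!(u.len(), v.len(), "Vectors must have the same dimension");
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Cosine of the angle between u and v: <u, v> / (|u| |v|).
/// Hypothetical helper; panics if either vector has zero length,
/// because the angle is undefined in that case.
pub fn cos_angle(u: &[f64], v: &[f64]) -> f64 {
    let norms = dot(u, u).sqrt() * dot(v, v).sqrt();
    assert!(norms > 0.0, "Angle is undefined for a zero vector");
    dot(u, v) / norms
}
```

Vectors pointing the same way give a cosine near 1, perpendicular vectors give 0, and opposite vectors give a value near -1, regardless of their lengths.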

Algebraic view: A vector is just a list of numbers. All operations happen component-by-component. This is the view that translates directly to code.

Why dot products matter in ML: A neural network layer computes $f(\mathbf{W}\mathbf{x} + \mathbf{b})$. Each row of $\mathbf{W}$ is a weight vector; the dot product with the input $\mathbf{x}$ measures how well the input matches that row. This is the core computation of every dense layer, every attention head, every linear transformation.
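
To make the "one dot product per row" reading concrete, here is a minimal sketch of a dense layer. Several choices are assumptions made for illustration only: $\mathbf{W}$ is stored as a list of rows, $f$ is taken to be ReLU (the text leaves $f$ unspecified), and `dense_relu` is a hypothetical name.

```rust
/// Dot product of two equal-length slices.
fn dot(u: &[f64], v: &[f64]) -> f64 {
    assert_eq!(u.len(), v.len(), "Vectors must have the same dimension");
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// One dense layer: output[j] = f(<row_j of W, x> + b[j]).
/// Assumption: f is ReLU, i.e. max(0, z).
pub fn dense_relu(w: &[Vec<f64>], x: &[f64], b: &[f64]) -> Vec<f64> {
    assert_eq!(w.len(), b.len(), "Need one bias per row of W");
    w.iter()
        .zip(b)
        // Each output is a single dot product plus a bias, then the activation.
        .map(|(row, &bias)| (dot(row, x) + bias).max(0.0))
        .collect()
}
```

With rows `[1, 0]`, `[-1, -1]`, `[0.5, 0.5]`, input `[2, 4]`, and biases `[0, 1, -1]`, the pre-activations are 2, -5, and 2, so the layer outputs `[2, 0, 2]`.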


Rust Implementation

Create a new crate as a sibling to ch01 (this chapter is self-contained, so it does not depend on ch01 code):

cargo new --name ch02 --lib ch02
cd ch02

Replace src/lib.rs with:

/// A vector of f64 values.
/// We use a simple newtype wrapper around Vec<f64>.
#[derive(Debug, Clone, PartialEq)]
pub struct Vector(Vec<f64>);

impl Vector {
    /// Create a new vector from a list of components.
    pub fn new(components: Vec<f64>) -> Self {
        Vector(components)
    }

    /// The dimension (number of components) of this vector.
    pub fn dim(&self) -> usize {
        self.0.len()
    }

    /// Access the i-th component (0-indexed).
    pub fn get(&self, i: usize) -> f64 {
        self.0[i]
    }

    /// Add another vector component-wise.
    /// Panics if dimensions differ.
    pub fn add(&self, other: &Vector) -> Vector {
        assert_eq!(self.dim(), other.dim(), "Vectors must have the same dimension");
        let result: Vec<f64> = self.0.iter().zip(&other.0).map(|(a, b)| a + b).collect();
        Vector(result)
    }

    /// Multiply by a scalar (scale).
    pub fn scale(&self, alpha: f64) -> Vector {
        Vector(self.0.iter().map(|&x| alpha * x).collect())
    }

    /// Dot product with another vector.
    /// Corresponds to ⟨u, v⟩ = ∑ u_i v_i.
    pub fn dot(&self, other: &Vector) -> f64 {
        assert_eq!(self.dim(), other.dim(), "Vectors must have the same dimension");
        self.0.iter().zip(&other.0).map(|(a, b)| a * b).sum()
    }

    /// Euclidean norm (L2): ‖v‖₂ = √(∑ v_i²).
    pub fn norm_l2(&self) -> f64 {
        self.dot(self).sqrt()
    }

    /// L1 norm: ‖v‖₁ = ∑ |v_i|.
    pub fn norm_l1(&self) -> f64 {
        self.0.iter().map(|&x| x.abs()).sum()
    }

    /// Return a unit vector (normalized).
    /// Panics if the vector has zero length.
    pub fn normalize(&self) -> Vector {
        let norm = self.norm_l2();
        assert!(norm > 0.0, "Cannot normalize a zero vector");
        self.scale(1.0 / norm)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        let u = Vector::new(vec![1.0, 2.0, 3.0]);
        let v = Vector::new(vec![4.0, 5.0, 6.0]);
        let result = u.add(&v);
        assert_eq!(result, Vector::new(vec![5.0, 7.0, 9.0]));
    }

    #[test]
    fn test_scale() {
        let v = Vector::new(vec![1.0, 2.0, 3.0]);
        let result = v.scale(2.0);
        assert_eq!(result, Vector::new(vec![2.0, 4.0, 6.0]));
    }

    #[test]
    fn test_dot() {
        let u = Vector::new(vec![1.0, 2.0, 3.0]);
        let v = Vector::new(vec![4.0, 5.0, 6.0]);
        // 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
        assert_eq!(u.dot(&v), 32.0);
    }

    #[test]
    fn test_norm_l2() {
        // 3-4-5 triangle: ‖[3, 4]‖ = 5
        let v = Vector::new(vec![3.0, 4.0]);
        assert_eq!(v.norm_l2(), 5.0);
    }

    #[test]
    fn test_norm_l1() {
        let v = Vector::new(vec![-1.0, 2.0, -3.0]);
        // |−1| + |2| + |−3| = 1 + 2 + 3 = 6
        assert_eq!(v.norm_l1(), 6.0);
    }

    #[test]
    fn test_normalize() {
        let v = Vector::new(vec![3.0, 4.0]);
        let unit = v.normalize();
        // Unit vector of [3, 4] is [3/5, 4/5] = [0.6, 0.8]
        assert!((unit.get(0) - 0.6).abs() < 1e-10);
        assert!((unit.get(1) - 0.8).abs() < 1e-10);
        // Its length should be 1
        assert!((unit.norm_l2() - 1.0).abs() < 1e-10);
    }

    #[test]
    fn test_orthogonal() {
        // [1, 0] and [0, 1] are orthogonal (dot = 0)
        let u = Vector::new(vec![1.0, 0.0]);
        let v = Vector::new(vec![0.0, 1.0]);
        assert_eq!(u.dot(&v), 0.0);
    }
}

Run the tests:

cargo test

You should see:

running 7 tests
test tests::test_add ... ok
test tests::test_dot ... ok
test tests::test_norm_l1 ... ok
test tests::test_norm_l2 ... ok
test tests::test_normalize ... ok
test tests::test_orthogonal ... ok
test tests::test_scale ... ok

test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Walkthrough


Verification

The tests above verify:

| Test | What it checks | Mathematical invariant |
| --- | --- | --- |
| test_add | $[1,2,3] + [4,5,6] = [5,7,9]$ | Component-wise addition |
| test_scale | $2 \cdot [1,2,3] = [2,4,6]$ | Scalar multiplication |
| test_dot | $\langle [1,2,3], [4,5,6] \rangle = 32$ | Dot product (sum of products) |
| test_norm_l2 | $\lVert [3,4] \rVert_2 = 5$ | Euclidean norm (Pythagorean triple) |
| test_norm_l1 | $\lVert [-1,2,-3] \rVert_1 = 6$ | L1 norm (sum of absolute values) |
| test_normalize | Unit vector has length 1 | $\lVert \hat{\mathbf{v}} \rVert = 1$ |
| test_orthogonal | $\langle [1,0], [0,1] \rangle = 0$ | Orthogonality |

Key Takeaways