Linear Algebra: Matrices

Concept

A matrix is a rectangular array of numbers that represents a linear transformation between vector spaces — it maps vectors to other vectors.

Why This Matters

A vector is a list of numbers. A matrix is a grid of numbers — like a spreadsheet, a table of data, or a grid of pixel values. Where vectors let you represent one thing (a point in space, a single image row), matrices let you represent many things at once: a batch of vectors, a collection of relationships, or a transformation that maps entire sets of inputs to outputs. Matrix multiplication is how you combine these grids together — it's the operation that makes matrices useful, and it will be the core computation of everything we build from here on.

Mathematical Notation

A matrix is written as an uppercase bold letter:

\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}

The dimensions are written as $m \times n$ (read "m by n"): $m$ rows and $n$ columns. We say $\mathbf{A} \in \mathbb{R}^{m \times n}$ — the set of $m \times n$ real matrices (notation introduced in ch01).

The entry in row $i$, column $j$ is written $a_{ij}$ or $\mathbf{A}_{ij}$. In math notation, indices are 1-based; in code they become 0-based.

Matrix Multiplication

For two matrices $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{B} \in \mathbb{R}^{n \times p}$, their product $\mathbf{C} = \mathbf{A}\mathbf{B}$ is an $m \times p$ matrix where each entry is:

\mathbf{C}_{ij} = \sum_{k=1}^{n} \mathbf{A}_{ik} \mathbf{B}_{kj}

The inner dimension ($n$) must match — the number of columns in $\mathbf{A}$ must equal the number of rows in $\mathbf{B}$. This is called the compatibility condition for matrix multiplication.

Each entry $\mathbf{C}_{ij}$ is the dot product (introduced in ch02) of row $i$ of $\mathbf{A}$ with column $j$ of $\mathbf{B}$.

Transpose

The transpose of a matrix flips rows and columns. If $\mathbf{A} \in \mathbb{R}^{m \times n}$, then $\mathbf{A}^T \in \mathbb{R}^{n \times m}$:

(\mathbf{A}^T)_{ij} = \mathbf{A}_{ji}

Visually:

\mathbf{A} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad\longrightarrow\quad \mathbf{A}^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}

Identity Matrix

The identity matrix $\mathbf{I}_n$ is an $n \times n$ matrix with 1s on the diagonal and 0s elsewhere. It acts like the number 1 in multiplication: $\mathbf{A}\mathbf{I} = \mathbf{A}$ and $\mathbf{I}\mathbf{A} = \mathbf{A}$ for any compatible $\mathbf{A}$.

\mathbf{I}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

The identity is written with the Kronecker delta (δ, delta — introduced in ch01):

(\mathbf{I}_n)_{ij} = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}

Worked Examples

Example 1: 2×2 times 2×2

\mathbf{A} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\qquad \mathbf{B} = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}

Each entry $C_{ij}$ is the dot product of row $i$ of $\mathbf{A}$ with column $j$ of $\mathbf{B}$:

\begin{aligned} C_{11} &= 1 \cdot 5 + 2 \cdot 7 = 5 + 14 = 19 \\ C_{12} &= 1 \cdot 6 + 2 \cdot 8 = 6 + 16 = 22 \\ C_{21} &= 3 \cdot 5 + 4 \cdot 7 = 15 + 28 = 43 \\ C_{22} &= 3 \cdot 6 + 4 \cdot 8 = 18 + 32 = 50 \end{aligned}

\mathbf{A}\mathbf{B} = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}

Example 2: 2×3 times 3×2

\mathbf{A} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix},\qquad \mathbf{B} = \begin{pmatrix} 7 & 8 \\ 9 & 10 \\ 11 & 12 \end{pmatrix}

The inner dimension is 3 (columns of $\mathbf{A}$ = rows of $\mathbf{B}$), so the result is $2 \times 2$:

\begin{aligned} C_{11} &= 1\cdot7 + 2\cdot9 + 3\cdot11 = 7 + 18 + 33 = 58 \\ C_{12} &= 1\cdot8 + 2\cdot10 + 3\cdot12 = 8 + 20 + 36 = 64 \\ C_{21} &= 4\cdot7 + 5\cdot9 + 6\cdot11 = 28 + 45 + 66 = 139 \\ C_{22} &= 4\cdot8 + 5\cdot10 + 6\cdot12 = 32 + 50 + 72 = 154 \end{aligned}

\mathbf{A}\mathbf{B} = \begin{pmatrix} 58 & 64 \\ 139 & 154 \end{pmatrix}

These are the same matrices used in test_multiply_2x3_by_3x2 — you can run the code and check.

How the triple loop matches the formula

The naive multiplication code is a direct translation of $C_{ij} = \sum_k A_{ik} B_{kj}$:

for i:                # for each row of A (and result row)
  for j:              # for each column of B (and result column)
    sum = 0
    for k:            # sum over the shared dimension
      sum += A[i][k] * B[k][j]
    C[i][j] = sum

The inner dimension $n$ (columns of $\mathbf{A}$ = rows of $\mathbf{B}$) is what the k loop iterates over. If the dimensions don't match, the multiplication is invalid — enforced by assert_eq! in the Rust code.

Rust Implementation

Add a new crate to your workspace:

cd code && cargo new --lib --edition 2024 ch03-linear-algebra-matrices

Open code/ch03-linear-algebra-matrices/src/lib.rs and replace its contents with:

/// A matrix is stored in row-major order as a flat Vec<f64>.
///
/// Row-major means the first row occupies indices 0..cols,
/// the second row occupies indices cols..2*cols, and so on.
/// The element at (row, col) is at index row * cols + col.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix {
    data: Vec<f64>,
    rows: usize,
    cols: usize,
}

impl Matrix {
    /// Create a new matrix from a flat Vec<f64> in row-major order.
    ///
    /// # Panics
    /// Panics if data.len() != rows * cols.
    pub fn new(data: Vec<f64>, rows: usize, cols: usize) -> Self {
        assert_eq!(
            data.len(),
            rows * cols,
            "Data length {} does not match dimensions {}×{}",
            data.len(),
            rows,
            cols
        );
        Matrix { data, rows, cols }
    }

    /// Number of rows.
    pub fn rows(&self) -> usize {
        self.rows
    }

    /// Number of columns.
    pub fn cols(&self) -> usize {
        self.cols
    }

    /// Shape as (rows, cols).
    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    /// Access the element at (row, col) — 0-indexed.
    ///
    /// # Panics
    /// Panics if row or col is out of bounds.
    pub fn get(&self, row: usize, col: usize) -> f64 {
        assert!(row < self.rows, "Row {} out of bounds (rows={})", row, self.rows);
        assert!(col < self.cols, "Col {} out of bounds (cols={})", col, self.cols);
        self.data[row * self.cols + col]
    }

    /// Matrix multiplication: self * other.
    ///
    /// For A ∈ ℝ^{m×n} and B ∈ ℝ^{n×p}, the result C ∈ ℝ^{m×p}.
    /// C_{ij} = Σ_{k=0}^{n-1} A_{ik} * B_{kj}
    ///
    /// # Panics
    /// Panics if self.cols != other.rows (incompatible dimensions).
    pub fn multiply(&self, other: &Matrix) -> Matrix {
        assert_eq!(
            self.cols, other.rows,
            "Incompatible dimensions: {}×{} vs {}×{}",
            self.rows, self.cols, other.rows, other.cols
        );

        let m = self.rows;
        let n = self.cols;
        let p = other.cols;

        let mut data = Vec::with_capacity(m * p);

        for i in 0..m {
            for j in 0..p {
                let mut sum = 0.0;
                for k in 0..n {
                    sum += self.get(i, k) * other.get(k, j);
                }
                data.push(sum);
            }
        }

        Matrix { data, rows: m, cols: p }
    }

    /// Transpose the matrix: flip rows and columns.
    ///
    /// If A ∈ ℝ^{m×n}, then A^T ∈ ℝ^{n×m} with (A^T)_{ij} = A_{ji}.
    pub fn transpose(&self) -> Matrix {
        let mut data = Vec::with_capacity(self.rows * self.cols);

        for j in 0..self.cols {
            for i in 0..self.rows {
                data.push(self.get(i, j));
            }
        }

        Matrix {
            data,
            rows: self.cols,
            cols: self.rows,
        }
    }

    /// Create an n×n identity matrix.
    pub fn identity(n: usize) -> Matrix {
        let mut data = vec![0.0; n * n];
        for i in 0..n {
            data[i * n + i] = 1.0;
        }
        Matrix {
            data,
            rows: n,
            cols: n,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_create_and_get() {
        let m = Matrix::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3);
        assert_eq!(m.shape(), (2, 3));
        assert!((m.get(0, 0) - 1.0).abs() < 1e-10);
        assert!((m.get(0, 2) - 3.0).abs() < 1e-10);
        assert!((m.get(1, 1) - 5.0).abs() < 1e-10);
    }

    #[test]
    fn test_multiply_2x3_by_3x2() {
        // A = [[1, 2, 3], [4, 5, 6]]
        // B = [[7, 8], [9, 10], [11, 12]]
        // C = A * B should be:
        //   C[0][0] = 1*7 + 2*9 + 3*11 = 7 + 18 + 33 = 58
        //   C[0][1] = 1*8 + 2*10 + 3*12 = 8 + 20 + 36 = 64
        //   C[1][0] = 4*7 + 5*9 + 6*11 = 28 + 45 + 66 = 139
        //   C[1][1] = 4*8 + 5*10 + 6*12 = 32 + 50 + 72 = 154
        let a = Matrix::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3);
        let b = Matrix::new(vec![7.0, 8.0, 9.0, 10.0, 11.0, 12.0], 3, 2);
        let c = a.multiply(&b);

        assert_eq!(c.shape(), (2, 2));
        assert!((c.get(0, 0) - 58.0).abs() < 1e-10);
        assert!((c.get(0, 1) - 64.0).abs() < 1e-10);
        assert!((c.get(1, 0) - 139.0).abs() < 1e-10);
        assert!((c.get(1, 1) - 154.0).abs() < 1e-10);
    }

    #[test]
    fn test_multiply_square() {
        // A = [[1, 2], [3, 4]]
        // B = [[5, 6], [7, 8]]
        // C = [[1*5+2*7, 1*6+2*8], [3*5+4*7, 3*6+4*8]]
        //   = [[5+14, 6+16], [15+28, 18+32]]
        //   = [[19, 22], [43, 50]]
        let a = Matrix::new(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
        let b = Matrix::new(vec![5.0, 6.0, 7.0, 8.0], 2, 2);
        let c = a.multiply(&b);

        assert!((c.get(0, 0) - 19.0).abs() < 1e-10);
        assert!((c.get(0, 1) - 22.0).abs() < 1e-10);
        assert!((c.get(1, 0) - 43.0).abs() < 1e-10);
        assert!((c.get(1, 1) - 50.0).abs() < 1e-10);
    }

    #[test]
    fn test_multiply_by_identity() {
        let a = Matrix::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3);
        let i = Matrix::identity(3);
        let result = a.multiply(&i);
        // A * I = A
        for row in 0..2 {
            for col in 0..3 {
                assert!(
                    (result.get(row, col) - a.get(row, col)).abs() < 1e-10,
                    "Mismatch at ({}, {}): expected {}, got {}",
                    row,
                    col,
                    a.get(row, col),
                    result.get(row, col)
                );
            }
        }
    }

    #[test]
    fn test_transpose() {
        // A = [[1, 2, 3], [4, 5, 6]]
        // A^T = [[1, 4], [2, 5], [3, 6]]
        let a = Matrix::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3);
        let t = a.transpose();

        assert_eq!(t.shape(), (3, 2));
        assert!((t.get(0, 0) - 1.0).abs() < 1e-10);
        assert!((t.get(0, 1) - 4.0).abs() < 1e-10);
        assert!((t.get(1, 0) - 2.0).abs() < 1e-10);
        assert!((t.get(1, 1) - 5.0).abs() < 1e-10);
        assert!((t.get(2, 0) - 3.0).abs() < 1e-10);
        assert!((t.get(2, 1) - 6.0).abs() < 1e-10);
    }

    #[test]
    fn test_transpose_twice_is_identity() {
        let a = Matrix::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3);
        let t = a.transpose();
        let back = t.transpose();
        // (A^T)^T = A
        assert_eq!(a.shape(), back.shape());
        for row in 0..2 {
            for col in 0..3 {
                assert!(
                    (back.get(row, col) - a.get(row, col)).abs() < 1e-10,
                    "Mismatch at ({}, {}): expected {}, got {}",
                    row,
                    col,
                    a.get(row, col),
                    back.get(row, col)
                );
            }
        }
    }

    #[test]
    #[should_panic(expected = "Incompatible dimensions")]
    fn test_multiply_incompatible_dimensions() {
        let a = Matrix::new(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
        let b = Matrix::new(vec![1.0, 2.0, 3.0], 3, 1); // a.cols=2 != b.rows=3
        a.multiply(&b);
    }
}

Run the tests:

cargo test -p ch03-linear-algebra-matrices

running 7 tests
test tests::test_create_and_get ... ok
test tests::test_multiply_2x3_by_3x2 ... ok
test tests::test_multiply_by_identity ... ok
test tests::test_multiply_incompatible_dimensions ... ok
test tests::test_multiply_square ... ok
test tests::test_transpose ... ok
test tests::test_transpose_twice_is_identity ... ok

test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered

Walkthrough

Matrix { data, rows, cols } — We store the matrix as a flat Vec<f64> with the shape stored alongside. Row-major means element $(i, j)$ lives at index $i \times \text{cols} + j$. This is the standard memory layout used by BLAS, numpy (row-major by default), and most ML frameworks. It also means a single Vec allocation holds all the data, which is cache-friendly.
rows() and cols() — These are public getters for the private rows and cols fields. In Rust, fields are private by default; exposing them through methods means we can change the internal representation later without breaking code that uses the Matrix struct.
shape() could call either — shape() returns (self.rows, self.cols) (the fields) rather than (self.rows(), self.cols()) (the getters). Both work inside the same impl block because the fields are visible here. However, if rows() or cols() were later changed to do something extra (a check, a transformation), shape() would silently bypass it by accessing the fields directly. For a simple case like this it doesn't matter, but it's worth knowing the difference: fields are the raw data, getters are the public interface, and which one you use inside the type is a design choice.
get(row, col) — Indexing into the flat array. We add bounds checks with descriptive panic messages, following the course style guide.
multiply — The naive triple-nested loop. The outer two loops iterate over every entry of the result matrix, and the inner loop sums the products. This is a direct translation of $C_{ij} = \sum_k A_{ik} B_{kj}$.
transpose — We iterate over columns first (outer loop), then rows (inner loop), pushing each element $A_{ij}$ into the new matrix's row-major order. After transposing, the matrix's rows and cols are swapped.
identity — Allocate a zero-filled vector, then set the diagonal elements to 1. This is a simple and efficient construction.
assert_eq! for dimensions — The constructor checks that the data length matches rows * cols, and multiply checks the compatibility condition. Both give descriptive error messages.
Transpose twice returns the original — test_transpose_twice_is_identity verifies the invariant $(\mathbf{A}^T)^T = \mathbf{A}$ by checking every element. This is a property-based test that catches bugs in both transpose and the indexing logic simultaneously.

Verification

The tests above verify:

Test	What it checks	Mathematical invariant
`test_create_and_get`	Element access and shape	Storage layout correctness
`test_multiply_2x3_by_3x2`	$C = AB$ with known result	Matrix multiplication formula
`test_multiply_square`	$2\times2$ multiplication	$C_{ij} = \sum_k A_{ik}B_{kj}$
`test_multiply_by_identity`	$AI = A$	Identity matrix property
`test_transpose`	Known transpose result	$(A^T)_{ij} = A_{ji}$
`test_transpose_twice_is_identity`	$(A^T)^T = A$	Involution property of transpose
`test_multiply_incompatible_dimensions`	Proper panic on mismatch	Dimension compatibility

Key Takeaways

A matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ is a rectangular array with $m$ rows and $n$ columns — the core data structure for linear transformations.
Matrix multiplication $\mathbf{C} = \mathbf{A}\mathbf{B}$ requires $\mathbf{A}$'s columns to match $\mathbf{B}$'s rows. Each entry $C_{ij}$ is the dot product of row $i$ of $\mathbf{A}$ with column $j$ of $\mathbf{B}$.
The transpose $\mathbf{A}^T$ swaps rows and columns; transposing twice recovers the original.
The identity matrix $\mathbf{I}$ is the multiplicative identity — it leaves any compatible matrix unchanged under multiplication.
Row-major flat storage (data[row * cols + col]) is the standard memory layout for matrices — cache-friendly and used by most numerical libraries.
Naive matrix multiplication is $O(n^3)$ — understanding this baseline is essential before exploring optimized algorithms.