Linear Transformations

Concept

A matrix is a linear transformation: it maps vectors to other vectors in a way that keeps straight lines straight and leaves the origin fixed. Every linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$ can be represented as a matrix, and every matrix represents a linear transformation.

Why This Matters

A matrix doesn't just sit there — it does something. Multiplying a matrix by a vector transforms that vector into a new one: it can stretch it, shrink it, rotate it, flip it, or skew it. Understanding what a matrix does geometrically — not just how to multiply it — is the bridge from algebra to intuition. Later this will let you understand why certain weight matrices in neural networks work the way they do, but for now the goal is simpler: when you see a matrix, you should be able to picture the transformation it represents.

Mathematical Notation

Linear Transformation Definition

A function $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if it satisfies two properties:

Additivity: $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$
Homogeneity: $T(\alpha \mathbf{v}) = \alpha T(\mathbf{v})$

Together these give linearity:

$$T(\alpha \mathbf{u} + \beta \mathbf{v}) = \alpha T(\mathbf{u}) + \beta T(\mathbf{v})$$
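One way to make the two properties concrete is to test the combined identity numerically on sample inputs. The sketch below is illustrative only: `Vec2`, `Mat2`, `apply`, `is_linear_on`, and `translate` are hypothetical names that are not part of this chapter's crate, and passing the check for one choice of inputs is evidence of linearity, not a proof.

```rust
/// Hypothetical helper types for this sketch: a 2D vector and a 2×2 matrix stored as rows.
pub type Vec2 = [f64; 2];
pub type Mat2 = [[f64; 2]; 2];

/// Matrix-vector product A * v for the 2×2 case.
pub fn apply(a: &Mat2, v: Vec2) -> Vec2 {
    [
        a[0][0] * v[0] + a[0][1] * v[1],
        a[1][0] * v[0] + a[1][1] * v[1],
    ]
}

/// Returns true if T(alpha*u + beta*v) and alpha*T(u) + beta*T(v)
/// agree within `tol` for the given inputs.
pub fn is_linear_on(
    t: impl Fn(Vec2) -> Vec2,
    alpha: f64,
    u: Vec2,
    beta: f64,
    v: Vec2,
    tol: f64,
) -> bool {
    let lhs = t([alpha * u[0] + beta * v[0], alpha * u[1] + beta * v[1]]);
    let tu = t(u);
    let tv = t(v);
    let rhs = [alpha * tu[0] + beta * tv[0], alpha * tu[1] + beta * tv[1]];
    (lhs[0] - rhs[0]).abs() < tol && (lhs[1] - rhs[1]).abs() < tol
}

/// A translation by (1, 0). It is not linear because it moves the origin.
pub fn translate(v: Vec2) -> Vec2 {
    [v[0] + 1.0, v[1]]
}
```

With `a = [[2.0, 1.0], [0.0, 3.0]]`, the call `is_linear_on(|x| apply(&a, x), 2.0, [1.0, -1.0], -0.5, [4.0, 2.0], 1e-12)` succeeds, while the same call with `translate` in place of the closure fails.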

Matrix of a Transformation

Every linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ is represented by an $m \times n$ matrix $\mathbf{A}$ where the columns are the images of the standard basis vectors:

$$\mathbf{A} = \begin{pmatrix} | & | & & | \\ T(\mathbf{e}_1) & T(\mathbf{e}_2) & \dots & T(\mathbf{e}_n) \\ | & | & & | \end{pmatrix}$$

where $\mathbf{e}_i$ is the $i$-th standard basis vector (1 in position $i$, 0 elsewhere).

Applying the transformation to a vector $\mathbf{x}$ is matrix-vector multiplication:

$$T(\mathbf{x}) = \mathbf{A}\mathbf{x} = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \dots + x_n T(\mathbf{e}_n)$$

This is a linear combination of the columns — the vector $\mathbf{x}$ provides the weights.
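The column construction can be sketched in a few lines for the 2D case. Everything here is an assumption made for illustration: `Vec2`, `Mat2`, `matrix_of`, `apply_by_columns`, and `example_map` are hypothetical names, and the matrix is stored as an array of rows rather than as the chapter's Matrix struct.

```rust
/// Hypothetical helper types for this sketch: a 2D vector and a 2×2 matrix stored as rows.
pub type Vec2 = [f64; 2];
pub type Mat2 = [[f64; 2]; 2];

/// Build the matrix of `t` by evaluating it on e1 and e2
/// and placing the two images in the columns.
pub fn matrix_of(t: impl Fn(Vec2) -> Vec2) -> Mat2 {
    let c1 = t([1.0, 0.0]); // image of e1, becomes column 1
    let c2 = t([0.0, 1.0]); // image of e2, becomes column 2
    [[c1[0], c2[0]], [c1[1], c2[1]]]
}

/// Compute A x as x1 * (column 1) + x2 * (column 2).
pub fn apply_by_columns(a: &Mat2, x: Vec2) -> Vec2 {
    let col1 = [a[0][0], a[1][0]];
    let col2 = [a[0][1], a[1][1]];
    [
        x[0] * col1[0] + x[1] * col2[0],
        x[0] * col1[1] + x[1] * col2[1],
    ]
}

/// An example linear map chosen for this sketch: T(x, y) = (x + 2y, 3y).
pub fn example_map(v: Vec2) -> Vec2 {
    [v[0] + 2.0 * v[1], 3.0 * v[1]]
}
```

Here `matrix_of(example_map)` yields the rows `[1, 2]` and `[0, 3]`, and applying that matrix to `(4, 5)` by columns gives `(14, 15)`, the same value `example_map` returns directly.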

Composition

If $S: \mathbb{R}^n \to \mathbb{R}^m$ has matrix $\mathbf{A}$ and $T: \mathbb{R}^m \to \mathbb{R}^p$ has matrix $\mathbf{B}$, then the composition $T \circ S: \mathbb{R}^n \to \mathbb{R}^p$ has matrix $\mathbf{B}\mathbf{A}$:

$$(T \circ S)(\mathbf{x}) = T(S(\mathbf{x})) = \mathbf{B}(\mathbf{A}\mathbf{x}) = (\mathbf{B}\mathbf{A})\mathbf{x}$$

Composition is matrix multiplication — this is why matrix multiplication is defined the way it is.
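A small numeric check can make this identity tangible. The sketch below is restricted to 2×2 matrices for brevity (the statement above covers general shapes), and `Vec2`, `Mat2`, `apply`, and `multiply` are hypothetical helpers rather than the chapter's Matrix API.

```rust
/// Hypothetical helper types for this sketch: a 2D vector and a 2×2 matrix stored as rows.
pub type Vec2 = [f64; 2];
pub type Mat2 = [[f64; 2]; 2];

/// Matrix-vector product A * v.
pub fn apply(a: &Mat2, v: Vec2) -> Vec2 {
    [
        a[0][0] * v[0] + a[0][1] * v[1],
        a[1][0] * v[0] + a[1][1] * v[1],
    ]
}

/// Matrix product B * A: entry (i, j) is row i of B dotted with column j of A.
pub fn multiply(b: &Mat2, a: &Mat2) -> Mat2 {
    let mut c = [[0.0; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                c[i][j] += b[i][k] * a[k][j];
            }
        }
    }
    c
}
```

Taking `a = [[1, 2], [3, 4]]` for $S$, `b = [[0, 1], [1, 0]]` for $T$, and `x = (5, 6)`, both `apply(&b, apply(&a, x))` and `apply(&multiply(&b, &a), x)` evaluate to `(39, 17)`.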

Intuition

Basis vectors as anchors

The columns of a matrix tell you exactly what the transformation does. If you know where the basis vectors $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n$ go, you know where every vector goes. Everything else is just a linear combination.

Think of the basis vectors as the grid lines of the coordinate system. The transformation moves the grid — stretches it, rotates it, flips it — and every other point goes along for the ride.

Common transformations (2D)

Scaling — stretches or shrinks along each axis:

$$\begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} s_x x \\ s_y y \end{pmatrix}$$

The basis vector $\mathbf{e}_1 = (1, 0)$ goes to $(s_x, 0)$, and $\mathbf{e}_2 = (0, 1)$ goes to $(0, s_y)$. The grid gets stretched.

Rotation — rotates counter-clockwise by angle $\theta$ (theta, introduced in ch01):

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}$$

The first column $(\cos\theta, \sin\theta)$ is where $\mathbf{e}_1$ lands — it's the unit vector at angle $\theta$. The second column $(-\sin\theta, \cos\theta)$ is $\mathbf{e}_2$ rotated by $\theta$.

Reflection — flips across the $x$-axis:

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ -y \end{pmatrix}$$

The $x$-axis stays put (first column is $\mathbf{e}_1$), but the $y$-axis flips (second column is $-\mathbf{e}_2$).

Shear (horizontal) shifts each point sideways in proportion to its $y$-coordinate:

$$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + a y \\ y \end{pmatrix}$$

The $x$-axis stays put (first column is $\mathbf{e}_1$), but the $y$-axis tilts: $\mathbf{e}_2$ lands at $(a, 1)$. The parameter $a$ controls how much.

Visual summary

Each transformation is fully described by where it sends the basis vectors (these are the columns of the matrix):

                Scale(2,1)    Rotate(45°)        Reflect x    Shear x(1)
e1 = (1,0)  ->  (2, 0)        (0.707, 0.707)     (1, 0)       (1, 0)
e2 = (0,1)  ->  (0, 1)        (-0.707, 0.707)    (0, -1)      (1, 1)

The unit square with corners (0,0), (1,0), (1,1), (0,1) goes to the parallelogram spanned by the two image vectors.
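To see the parallelogram emerge, one can push the four corners of the unit square through a matrix. This is a minimal sketch under assumed names (`Vec2`, `Mat2`, `apply`, and `transform_unit_square` are hypothetical and not part of the chapter's crate), with the matrix stored as an array of rows.

```rust
/// Hypothetical helper types for this sketch: a 2D vector and a 2×2 matrix stored as rows.
pub type Vec2 = [f64; 2];
pub type Mat2 = [[f64; 2]; 2];

/// Matrix-vector product A * v.
pub fn apply(a: &Mat2, v: Vec2) -> Vec2 {
    [
        a[0][0] * v[0] + a[0][1] * v[1],
        a[1][0] * v[0] + a[1][1] * v[1],
    ]
}

/// Map the unit square's corners (0,0), (1,0), (1,1), (0,1) through `a`,
/// returning the images in the same order.
pub fn transform_unit_square(a: &Mat2) -> [Vec2; 4] {
    let corners = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]];
    corners.map(|c| apply(a, c))
}
```

For the shear `[[1, 1], [0, 1]]` the corners land at (0,0), (1,0), (2,1), (1,1). The second and fourth images are the two columns of the matrix, and the third is their sum, which is exactly the parallelogram spanned by the image vectors.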

Key insight: Reading down each column tells you the entire transformation. No need to visualize the grid — just look at where the basis vectors land. This is what transform(&[1.0, 0.0]) and transform(&[0.0, 1.0]) compute in code.

Rust Implementation

Add a new crate to your workspace:

cd code && cargo new --lib --edition 2024 ch05-linear-transformations

This crate builds on the Matrix struct from ch03. Open code/ch05-linear-transformations/src/lib.rs and start by copying the Matrix struct definition and its full impl block from ch03-linear-algebra-matrices/src/lib.rs. Then add the following methods and free functions:

// ── Linear transformations ──
// (A plain `//` comment: a `///` here would attach to the impl block as documentation.)

impl Matrix {
    /// Apply this transformation to a vector (as a column matrix).
    ///
    /// If this matrix is m×n, the input vector must have n components,
    /// and the result has m components.
    ///
    /// Mathematically: T(v) = A * v  where A is this matrix.
    pub fn transform(&self, v: &[f64]) -> Vec<f64> {
        assert_eq!(
            self.cols,
            v.len(),
            "Matrix cols {} doesn't match vector length {}",
            self.cols,
            v.len()
        );

        let mut result = vec![0.0; self.rows];
        for i in 0..self.rows {
            let mut sum = 0.0;
            for j in 0..self.cols {
                sum += self.get(i, j) * v[j];
            }
            result[i] = sum;
        }
        result
    }
}

// ── 2D transformation factory functions ──
// (A plain `//` comment: a `///` here would be merged into the docs of `scale_2d`.)

/// Create a 2D scaling matrix: Scale(sx, sy).
///
/// [ sx   0 ]
/// [ 0   sy ]
pub fn scale_2d(sx: f64, sy: f64) -> Matrix {
    Matrix::new(vec![sx, 0.0, 0.0, sy], 2, 2)
}

/// Create a 2D rotation matrix (counter-clockwise by `angle` radians).
///
/// [ cosθ  -sinθ ]
/// [ sinθ   cosθ ]
pub fn rotate_2d(angle: f64) -> Matrix {
    let c = angle.cos();
    let s = angle.sin();
    Matrix::new(vec![c, -s, s, c], 2, 2)
}

/// Create a 2D reflection matrix across the x-axis.
///
/// [ 1   0 ]
/// [ 0  -1 ]
pub fn reflect_x_2d() -> Matrix {
    Matrix::new(vec![1.0, 0.0, 0.0, -1.0], 2, 2)
}

/// Create a 2D reflection matrix across the y-axis.
///
/// [ -1   0 ]
/// [  0   1 ]
pub fn reflect_y_2d() -> Matrix {
    Matrix::new(vec![-1.0, 0.0, 0.0, 1.0], 2, 2)
}

/// Create a 2D shear matrix (horizontal shear).
///
/// [ 1   shx ]
/// [ 0    1  ]
pub fn shear_x_2d(shx: f64) -> Matrix {
    Matrix::new(vec![1.0, shx, 0.0, 1.0], 2, 2)
}

/// Create a 2D shear matrix (vertical shear).
///
/// [ 1    0  ]
/// [ shy  1  ]
pub fn shear_y_2d(shy: f64) -> Matrix {
    Matrix::new(vec![1.0, 0.0, shy, 1.0], 2, 2)
}

#[cfg(test)]
mod tests {
    use super::*;

    fn approx_eq(a: f64, b: f64) -> bool {
        (a - b).abs() < 1e-10
    }

    #[test]
    fn test_scale_basis_vectors() {
        let s = scale_2d(2.0, 3.0);
        // e1 = (1, 0) → (2, 0)
        let r1 = s.transform(&[1.0, 0.0]);
        assert!(approx_eq(r1[0], 2.0) && approx_eq(r1[1], 0.0));
        // e2 = (0, 1) → (0, 3)
        let r2 = s.transform(&[0.0, 1.0]);
        assert!(approx_eq(r2[0], 0.0) && approx_eq(r2[1], 3.0));
    }

    #[test]
    fn test_scale_point() {
        let s = scale_2d(2.0, 3.0);
        let r = s.transform(&[4.0, 5.0]);
        // (4, 5) → (8, 15)
        assert!(approx_eq(r[0], 8.0) && approx_eq(r[1], 15.0));
    }

    #[test]
    fn test_rotation_90_degrees() {
        // Rotating (1, 0) by 90° CCW should give (0, 1)
        let r = rotate_2d(std::f64::consts::FRAC_PI_2);
        let v = r.transform(&[1.0, 0.0]);
        assert!(approx_eq(v[0], 0.0) && approx_eq(v[1], 1.0));
    }

    #[test]
    fn test_rotation_180_degrees() {
        // Rotating (1, 0) by 180° should give (-1, 0)
        let r = rotate_2d(std::f64::consts::PI);
        let v = r.transform(&[1.0, 0.0]);
        assert!(approx_eq(v[0], -1.0) && approx_eq(v[1], 0.0));
    }

    #[test]
    fn test_reflect_x() {
        let r = reflect_x_2d();
        let v = r.transform(&[3.0, 4.0]);
        // (3, 4) → (3, -4)
        assert!(approx_eq(v[0], 3.0) && approx_eq(v[1], -4.0));
    }

    #[test]
    fn test_reflect_y() {
        let r = reflect_y_2d();
        let v = r.transform(&[3.0, 4.0]);
        // (3, 4) → (-3, 4)
        assert!(approx_eq(v[0], -3.0) && approx_eq(v[1], 4.0));
    }

    #[test]
    fn test_composition_rotate_then_reflect() {
        let rotate = rotate_2d(std::f64::consts::FRAC_PI_2);  // 90° CCW
        let reflect = reflect_x_2d();

        // Compose: reflect ∘ rotate  (rotate first, then reflect)
        // Composition matrix = reflect * rotate
        let composed = reflect.multiply(&rotate);

        // Apply to (1, 0):
        //   rotate(1, 0) = (0, 1)
        //   reflect(0, 1) = (0, -1)
        let v = composed.transform(&[1.0, 0.0]);
        assert!(approx_eq(v[0], 0.0) && approx_eq(v[1], -1.0));

        // Apply sequentially to verify
        let step1 = rotate.transform(&[1.0, 0.0]);
        let step2 = reflect.transform(&step1);
        assert!(approx_eq(step2[0], 0.0) && approx_eq(step2[1], -1.0));
    }

    #[test]
    fn test_composition_scale_then_rotate() {
        // Scale by (2, 1) then rotate 45°
        let s = scale_2d(2.0, 1.0);
        let r = rotate_2d(std::f64::consts::FRAC_PI_4);

        let composed = r.multiply(&s);

        // Apply to (1, 0):
        //   scale(1, 0) = (2, 0)
        //   rotate(2, 0) = (2cos45°, 2sin45°) = (√2, √2) ≈ (1.4142, 1.4142)
        let v = composed.transform(&[1.0, 0.0]);
        let sqrt2 = std::f64::consts::SQRT_2; // 2·cos45° = 2/√2 = √2
        assert!(approx_eq(v[0], sqrt2) && approx_eq(v[1], sqrt2));
    }

    #[test]
    fn test_shear_x() {
        let sh = shear_x_2d(2.0);
        let v = sh.transform(&[3.0, 4.0]);
        // (3, 4) → (3 + 2*4, 4) = (11, 4)
        assert!(approx_eq(v[0], 11.0) && approx_eq(v[1], 4.0));
    }

    #[test]
    fn test_identity_transformation() {
        let i = Matrix::identity(3);
        let v = i.transform(&[5.0, 6.0, 7.0]);
        assert!(approx_eq(v[0], 5.0) && approx_eq(v[1], 6.0) && approx_eq(v[2], 7.0));
    }

    #[test]
    fn test_double_reflection_is_identity() {
        // Reflect across x-axis twice → back to original
        let r = reflect_x_2d();
        let double = r.multiply(&r); // (reflect ∘ reflect) = identity
        let v = double.transform(&[123.0, -456.0]);
        assert!(approx_eq(v[0], 123.0) && approx_eq(v[1], -456.0));
    }

    #[test]
    fn test_rotation_then_inverse() {
        // Rotating by θ then by -θ gives identity
        let theta = 0.7;
        let r = rotate_2d(theta);
        let r_inv = rotate_2d(-theta);
        // Compose: r_inv ∘ r = identity
        // (since rotation matrices satisfy R(-θ) = R(θ)^T = R(θ)^{-1})
        let composed = r_inv.multiply(&r);
        let v = composed.transform(&[3.0, 4.0]);
        assert!(approx_eq(v[0], 3.0) && approx_eq(v[1], 4.0));
    }
}

Run the tests:

cargo test -p ch05-linear-transformations

You should see:

running 12 tests
test tests::test_composition_rotate_then_reflect ... ok
test tests::test_composition_scale_then_rotate ... ok
test tests::test_double_reflection_is_identity ... ok
test tests::test_identity_transformation ... ok
test tests::test_reflect_x ... ok
test tests::test_reflect_y ... ok
test tests::test_rotation_180_degrees ... ok
test tests::test_rotation_90_degrees ... ok
test tests::test_rotation_then_inverse ... ok
test tests::test_scale_basis_vectors ... ok
test tests::test_scale_point ... ok
test tests::test_shear_x ... ok

test result: ok. 12 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Walkthrough

transform(&self, v: &[f64]) — Matrix-vector multiplication. This is the core operation of a linear transformation: for each row of the matrix, compute the dot product with the input vector. The result is a new vector.
Factory functions — scale_2d, rotate_2d, reflect_x_2d, reflect_y_2d, shear_x_2d, shear_y_2d are free functions (not methods) that construct 2×2 matrices. They're defined outside impl Matrix because they don't operate on an existing matrix.
Composition is multiplication — Applying transformation $\mathbf{B}$ after $\mathbf{A}$ means computing $\mathbf{B}(\mathbf{A}\mathbf{x})$, which is $(\mathbf{B}\mathbf{A})\mathbf{x}$. The matrix for the composed transformation is $\mathbf{B}\mathbf{A}$ — matrix multiplication in action. This is tested in test_composition_rotate_then_reflect and test_composition_scale_then_rotate.
Column interpretation — The first column of a transformation matrix is where $\mathbf{e}_1 = (1, 0)$ goes. For scale_2d(2, 3), the first column is $(2, 0)$ — $\mathbf{e}_1$ is scaled by 2 along x. The second column is $(0, 3)$ — $\mathbf{e}_2$ is scaled by 3 along y. This is verified in test_scale_basis_vectors.
Inverse transformations — Some transformations can be undone. Rotating by $-\theta$ undoes a rotation by $\theta$. Reflecting twice gives back the original. These are tested in test_double_reflection_is_identity and test_rotation_then_inverse.
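The composition item above fixes an order: the rightmost matrix acts first. A quick way to see that the order matters is to multiply the same two matrices both ways. This sketch is not the chapter's Matrix API; `Mat2`, `multiply`, `rotate_90`, and `reflect_x` are hypothetical helpers, and the 90° rotation is written with exact entries instead of `cos`/`sin` so the comparison avoids rounding.

```rust
/// Hypothetical helper type for this sketch: a 2×2 matrix stored as rows.
pub type Mat2 = [[f64; 2]; 2];

/// Matrix product B * A (apply A first, then B).
pub fn multiply(b: &Mat2, a: &Mat2) -> Mat2 {
    let mut c = [[0.0; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                c[i][j] += b[i][k] * a[k][j];
            }
        }
    }
    c
}

/// Rotation by 90° counter-clockwise, written with exact entries.
pub fn rotate_90() -> Mat2 {
    [[0.0, -1.0], [1.0, 0.0]]
}

/// Reflection across the x-axis.
pub fn reflect_x() -> Mat2 {
    [[1.0, 0.0], [0.0, -1.0]]
}
```

`multiply(&reflect_x(), &rotate_90())` (rotate first, then reflect) has rows `[0, -1]` and `[-1, 0]`, whereas `multiply(&rotate_90(), &reflect_x())` (reflect first, then rotate) has rows `[0, 1]` and `[1, 0]`. The first sends $\mathbf{e}_1$ to $(0, -1)$, matching test_composition_rotate_then_reflect; the second sends it to $(0, 1)$.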

Verification

| Test | What it checks | Invariant |
| --- | --- | --- |
| `test_scale_basis_vectors` | Columns of scale matrix = scaled basis vectors | $\mathbf{A}\mathbf{e}_i = i\text{-th column of } \mathbf{A}$ |
| `test_scale_point` | Scale transforms a point correctly | $(x, y) \to (s_x x, s_y y)$ |
| `test_rotation_90_degrees` | 90° rotation of (1, 0) → (0, 1) | $\mathbf{e}_1 \to \mathbf{e}_2$ |
| `test_rotation_180_degrees` | 180° rotation of (1, 0) → (-1, 0) | $\mathbf{e}_1 \to -\mathbf{e}_1$ |
| `test_reflect_x` | Reflection across x-axis flips y | $(x, y) \to (x, -y)$ |
| `test_composition_rotate_then_reflect` | Composed matrix = sequential application | $(\mathbf{B}\mathbf{A})\mathbf{x} = \mathbf{B}(\mathbf{A}\mathbf{x})$ |
| `test_double_reflection_is_identity` | Reflect twice = do nothing | $R_x \circ R_x = I$ |
| `test_rotation_then_inverse` | Rotate by θ then -θ = identity | $R_{-\theta} R_{\theta} = I$ |

Key Takeaways

A linear transformation is any function $T$ satisfying $T(\alpha\mathbf{u} + \beta\mathbf{v}) = \alpha T(\mathbf{u}) + \beta T(\mathbf{v})$ — it preserves lines and the origin.
Every linear transformation is represented by a matrix whose columns are where the basis vectors land.
Applying a transformation is matrix-vector multiplication — a linear combination of the columns.
Composing transformations is matrix multiplication — the order matters (rightmost is applied first).
Common 2D transformations (scale, rotate, reflect, shear) are simple 2×2 matrices with clear geometric meaning.
The identity matrix does nothing; an inverse transformation undoes a transformation.