Numerical Literacy
Concept
Read mathematical notation fluently — Greek letters, summation, products, sets, and common operators.
Why This Matters
Every ML paper is written in the language of mathematics. If the notation is a barrier, the paper is opaque. This chapter removes that barrier by teaching you to read notation the same way you read code: one symbol at a time.
Full reference: A complete Greek alphabet table and a mathematical notation cheat sheet are available at greek-glossary.html. This chapter covers the subset you'll see most often in papers.
The Greek Alphabet — Essential Five
These five letters appear in nearly every ML paper. Learn them first.
| Letter | Name | Pronounced | Used for |
|---|---|---|---|
| $\alpha$ | alpha | al-fah | Learning rate |
| $\beta$ | beta | bay-tah | Momentum, regularization coefficient |
| $\theta$ | theta | thay-tah | Model parameters / weights |
| $\sigma$, $\Sigma$ | sigma | sig-mah | Standard deviation ($\sigma$), summation ($\Sigma$) |
| $\lambda$ | lambda | lam-dah | Regularization strength, eigenvalues |
Common Eight: You'll See Them Soon
| Letter | Name | Pronounced | Used for |
|---|---|---|---|
| $\gamma$ | gamma | gam-ah | Learning rate schedule, discount factor |
| $\delta$, $\Delta$ | delta | del-tah | Small change ($\delta$), difference between two values ($\Delta$) |
| $\epsilon$ | epsilon | ep-sih-lon | Small constant (avoiding division by zero) |
| $\mu$ | mu | myoo | Mean of a distribution |
| $\nu$ | nu | nyoo | Degrees of freedom |
| $\rho$ | rho | roe | Correlation coefficient |
| $\phi$ | phi | fye / fee | Activation function, feature map |
| $\omega$ | omega | oh-may-gah | Angular frequency |
Rare but Memorable
| Letter | Name | Pronounced | Used for |
|---|---|---|---|
| $\zeta$ | zeta | zay-tah | Riemann zeta function (rare in ML, notable in theory) |
| $\xi$ | xi | ksee | Random noise variable, latent variable |
| $\psi$ | psi | sigh / psigh | Wavefunction, state representation |
Note: $\Sigma$ (uppercase sigma, "sig-mah") and $\Delta$ (uppercase delta, "del-tah") are the two Greek letters you will meet most often in both cases with different meanings. Lowercase $\sigma$ = standard deviation; uppercase $\Sigma$ = summation. Lowercase $\delta$ = small change; uppercase $\Delta$ = difference between two values (for example, $\Delta x = x_2 - x_1$).
Summation Notation
The notation $\sum_{i=1}^{n} x_i$ means "sum all $x_i$ from $i=1$ to $i=n$":
$$ \sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n $$
Read it aloud as: "sum from i equals 1 to n of x-sub-i."
Example
If $x = [3, 7, 2, 9]$, then:
$$ \sum_{i=1}^{4} x_i = 3 + 7 + 2 + 9 = 21 $$
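The notation counts from $i=1$, while Rust slices count from 0. The sketch below shows one way to mirror the index range literally. The function name `sum_indexed` is hypothetical, chosen only for this illustration.

```rust
/// Computes ∑_{i=1}^{n} x_i by walking the same index range as the notation.
/// `sum_indexed` is a hypothetical name used only for this illustration.
pub fn sum_indexed(x: &[f64]) -> f64 {
    let n = x.len();
    let mut total = 0.0;
    // The math counts i = 1, ..., n; Rust slices count 0, ..., n - 1,
    // so the math's x_i is stored at x[i - 1].
    for i in 1..=n {
        total += x[i - 1];
    }
    total
}
```

For $x = [3, 7, 2, 9]$ this returns $21$, matching the example. The explicit index is only there to show the off-by-one translation; iterating over the elements directly is usually preferable.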
Double Summation
To sum every entry of an $m \times n$ matrix $A$ with entries $a_{ij}$, sum over both rows and columns:
$$ \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} $$
This means: for each row $i$, sum across all columns $j$, then sum those row totals.
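A double summation maps naturally onto two nested loops. The following is a minimal sketch, assuming the matrix is stored as a slice of rows. The name `sum_matrix` and the `Vec<f64>` row layout are illustrative choices, not something this chapter prescribes.

```rust
/// Computes ∑_{i=1}^{m} ∑_{j=1}^{n} a_{ij} for a matrix stored row by row.
/// `sum_matrix` and the row layout are assumptions made for this sketch.
pub fn sum_matrix(a: &[Vec<f64>]) -> f64 {
    let mut total = 0.0;
    // Outer sum: one pass per row i.
    for row in a {
        let mut row_total = 0.0;
        // Inner sum: across all columns j of this row.
        for &entry in row {
            row_total += entry;
        }
        // Add this row's total to the running sum of row totals.
        total += row_total;
    }
    total
}
```

For the $2 \times 3$ matrix with rows $[1, 2, 3]$ and $[4, 5, 6]$, the row totals are $6$ and $15$, so the function returns $21$.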
Product Notation
Similarly, $\prod_{i=1}^{n} x_i$ means "multiply all $x_i$ from $i=1$ to $i=n$":
$$ \prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdot \dots \cdot x_n $$
Read it aloud as: "product from i equals 1 to n of x-sub-i."
The symbol $\prod$ is the Greek capital letter pi (pronounced "pie").
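As a worked example, take the same $x = [3, 7, 2, 9]$ used for the summation example:

$$ \prod_{i=1}^{4} x_i = 3 \cdot 7 \cdot 2 \cdot 9 = 378 $$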
Set Notation
A set is a collection of distinct elements. In ML, sets are used to describe what kind of values a variable can take.
| Notation | Read as | Meaning |
|---|---|---|
| $x \in \mathbb{R}$ | "x is in R" | $x$ is a real number |
| $x \notin \mathbb{R}$ | "x is not in R" | $x$ is not a real number |
| $A \subseteq B$ | "A is a subset of B" | Every element of $A$ is also in $B$ |
| $\mathbb{R}^n$ | "R n" | The set of $n$-dimensional real vectors |
| $\mathbb{R}^{m \times n}$ | "R m by n" | The set of $m \times n$ real matrices |
| $\{x \mid x > 0\}$ | "the set of x such that x is greater than 0" | Set-builder notation |
| $\emptyset$ | "empty set" | The set with no elements |
| $\mathbb{N}$ | "N" | Natural numbers $\{1, 2, 3, \dots\}$ (some authors include $0$) |
| $\mathbb{Z}$ | "Z" | Integers $\{\dots, -2, -1, 0, 1, 2, \dots\}$ |
Why this matters for ML
When a paper says "let $W \in \mathbb{R}^{d \times k}$," it means: $W$ is a matrix with $d$ rows and $k$ columns, and every entry is a real number. This tells you the shape of the weights before you see a diagram.
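One way to see how "$W \in \mathbb{R}^{d \times k}$" pins down a shape is to encode it in a type. The sketch below uses Rust const generics. The names `Weights`, `zeros`, and `shape` are hypothetical and exist only for this illustration.

```rust
/// A weight matrix W ∈ ℝ^{d×k}: D rows, K columns, every entry an f64.
/// `Weights` is a hypothetical type used only for this illustration.
pub struct Weights<const D: usize, const K: usize> {
    pub entries: [[f64; K]; D],
}

impl<const D: usize, const K: usize> Weights<D, K> {
    /// A matrix of zeros whose shape is fixed by the type parameters.
    pub fn zeros() -> Self {
        Weights { entries: [[0.0; K]; D] }
    }

    /// Returns (rows, columns), i.e. (d, k).
    pub fn shape(&self) -> (usize, usize) {
        (D, K)
    }
}
```

With this encoding, `Weights::<3, 2>::zeros()` plays the role of some $W \in \mathbb{R}^{3 \times 2}$: the shape is known before any value is inspected, just as it is when reading the notation.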
Common Operators
| Symbol | Name | Meaning | First use in course |
|---|---|---|---|
| $\nabla$ | Nabla / del | Gradient — vector of partial derivatives | Optimization |
| $\partial$ | Partial derivative | Derivative w.r.t. one variable | Calculus |
| $\langle u, v \rangle$ | Inner product | Dot product of vectors | Linear algebra |
| $\lVert v \rVert$ | Norm | Length of a vector (default: Euclidean) | Linear algebra |
| $\lVert v \rVert_p$ | p-norm | Generalized length ($\lVert v \rVert_2$ = Euclidean) | Linear algebra |
| $\otimes$ | Tensor product | Kronecker / outer product | Deep learning |
| $\odot$ | Hadamard product | Element-wise multiplication | Neural networks |
| $\propto$ | Proportional to | Equals up to a constant factor | Probability |
| $\sim$ | Distributed as | $x \sim \mathcal{N}(0,1)$ means $x$ follows a normal distribution with mean 0 and variance 1 | Probability |
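Three of these operators reduce to short loops over vector entries. The sketch below covers the inner product, the Euclidean norm, and the Hadamard product on plain slices. The function names and the choice to panic on mismatched lengths are assumptions made for this illustration.

```rust
/// Inner product ⟨u, v⟩ = ∑_i u_i v_i.
/// Panics if the lengths differ (a design choice of this sketch).
pub fn inner(u: &[f64], v: &[f64]) -> f64 {
    assert_eq!(u.len(), v.len(), "vectors must have the same length");
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Euclidean norm: the square root of ⟨v, v⟩.
pub fn norm(v: &[f64]) -> f64 {
    inner(v, v).sqrt()
}

/// Hadamard product u ⊙ v: multiply matching entries, keep a vector.
/// Panics if the lengths differ (a design choice of this sketch).
pub fn hadamard(u: &[f64], v: &[f64]) -> Vec<f64> {
    assert_eq!(u.len(), v.len(), "vectors must have the same length");
    u.iter().zip(v).map(|(a, b)| a * b).collect()
}
```

Note the difference in return types: the inner product collapses two vectors into one number, while the Hadamard product returns a vector of the same length. For $u = [1, 2, 3]$ and $v = [4, 5, 6]$, the inner product is $32$ and the Hadamard product is $[4, 10, 18]$.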
Rust Implementation
Let's translate summation and product into Rust. Open your terminal and create a new crate:
cargo new --name ch01 --lib ch01
cd ch01
Replace the contents of src/lib.rs with:
/// Sum of elements in a slice.
/// Corresponds to ∑_{i} x_i.
pub fn sum(x: &[f64]) -> f64 {
    let mut total = 0.0;
    for &val in x {
        total += val;
    }
    total
}

/// Product of elements in a slice.
/// Corresponds to ∏_{i} x_i.
pub fn product(x: &[f64]) -> f64 {
    let mut total = 1.0;
    for &val in x {
        total *= val;
    }
    total
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_sum() {
        assert_eq!(sum(&[1.0, 2.0, 3.0]), 6.0);
        assert_eq!(sum(&[]), 0.0);
    }

    #[test]
    fn test_product() {
        assert_eq!(product(&[2.0, 3.0, 4.0]), 24.0);
        assert_eq!(product(&[]), 1.0);
    }
}
Run the tests to verify:
cargo test
You should see output similar to the following (test order and timing may vary):
running 2 tests
test tests::test_sum ... ok
test tests::test_product ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Walkthrough
- `&[f64]`: a slice of 64-bit floats. The `&` means we borrow the data without taking ownership.
- `for &val in x`: iterate over each element. Iterating over a slice yields references (`&f64`); the `&val` pattern destructures each reference so that `val` is a plain `f64`.
- `0.0` and `1.0`: the identity elements for addition and multiplication. Adding 0 or multiplying by 1 leaves the value unchanged. Notice that `sum(&[])` returns `0.0` and `product(&[])` returns `1.0`. This is mathematically correct (the empty sum is 0, the empty product is 1).
- `#[cfg(test)]`: compiles this module only when running `cargo test`.
Verification
The tests above verify that:
- `sum` of $[1, 2, 3]$ equals $6$, which matches $1 + 2 + 3 = 6$.
- `sum` of an empty slice equals $0$, the identity element for addition.
- `product` of $[2, 3, 4]$ equals $24$, which matches $2 \cdot 3 \cdot 4 = 24$.
- `product` of an empty slice equals $1$, the identity element for multiplication.
Crucial habit: Always test empty inputs. If your function panics on an empty slice, that's a bug — ML data can be empty, missing, or malformed.
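The same empty-input check applies if you swap the explicit loops for the standard library's `Iterator::sum` and `Iterator::product`. The sketch below shows that variant. The names `sum_iter` and `product_iter` are hypothetical and chosen here only to avoid clashing with the functions above.

```rust
/// Same result as the explicit loop, using the standard iterator adapter.
/// `sum_iter` is a hypothetical name for this sketch.
pub fn sum_iter(x: &[f64]) -> f64 {
    x.iter().sum()
}

/// Same result as the explicit loop, using the standard iterator adapter.
/// `product_iter` is a hypothetical name for this sketch.
pub fn product_iter(x: &[f64]) -> f64 {
    x.iter().product()
}
```

Both adapters return the identity element on an empty slice, so the four assertions from the test module hold for these versions as well.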
Key Takeaways
- Greek letters are just names: learn the 5 essential ones ($\alpha, \beta, \theta, \sigma, \lambda$) and you can read much of the notation in a typical ML paper.
- Pronounce them aloud: saying "theta" or "epsilon" locks it in your memory.
- $\sum$ means add, $\prod$ means multiply: everything else is just details about the range.
- Set notation tells you the shape: $\mathbb{R}^{d \times k}$ immediately tells you "d by k matrix of real numbers."
- Every operator has a name: when you forget, the glossary is at greek-glossary.html.
- Implement the math in Rust: writing code that matches the notation is the fastest way to internalize it.