Transformations in Image Processing
📐 Overview
Geometric transformations change where pixels are located. Unlike arithmetic (point) operations, they act on pixel coordinates rather than intensity values; intensities change only as a side effect of interpolation during resampling.
The most fundamental transforms are:
- Translation (Move)
- Rotation
- Scaling
All are usually expressed using matrix multiplication.
➡️ Translation (Move)
Definition
Move every pixel by a fixed offset \((t_x, t_y)\).
Coordinate Form
\(\begin{aligned} x' &= x + t_x \\ y' &= y + t_y \end{aligned}\)
Homogeneous Matrix Form
\(\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\)
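As a minimal sketch of the homogeneous form above, the following plain-Python snippet builds the 3×3 translation matrix and applies it to one point. The helper names `translation_matrix` and `apply_homogeneous` are hypothetical, chosen only for illustration, and the matrix is stored as nested lists rather than with any particular library.

```python
def translation_matrix(tx, ty):
    """Return the 3x3 homogeneous translation matrix for offset (tx, ty)."""
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]

def apply_homogeneous(m, x, y):
    """Multiply a 3x3 matrix by the column vector (x, y, 1)."""
    v = (x, y, 1)
    xh, yh, w = (sum(m[r][c] * v[c] for c in range(3)) for r in range(3))
    # Divide by w to return to ordinary coordinates (w is 1 for affine maps).
    return xh / w, yh / w

moved = apply_homogeneous(translation_matrix(5, -2), 10, 20)
print(moved)  # (15.0, 18.0)
```

The third coordinate, fixed at 1, is what lets the offsets in the last column be added through an ordinary matrix product.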
Usage
- ROI shifting
- Camera motion compensation
🔄 Rotation
Definition
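Rotate around the origin by angle \(\theta\). Positive \(\theta\) is counterclockwise when the y-axis points up; in image coordinates, where y points down, the same formula appears clockwise on screen.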
Coordinate Form
\(\begin{aligned} x' &= x\cos\theta - y\sin\theta \\ y' &= x\sin\theta + y\cos\theta \end{aligned}\)
Matrix Form
\(\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}\)
Rotation About an Arbitrary Center \((c_x, c_y)\)
\(\begin{aligned} x' &= (x-c_x)\cos\theta - (y-c_y)\sin\theta + c_x \\ y' &= (x-c_x)\sin\theta + (y-c_y)\cos\theta + c_y \end{aligned}\)
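The formula above is equivalent to three steps in homogeneous form: translate the center to the origin, rotate, and translate back. The sketch below checks this numerically for one point. All helper names (`matmul3`, `translation`, `rotation`, `rotation_about`, `apply`) are hypothetical, and the matrices are plain nested lists.

```python
import math

def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_about(theta, cx, cy):
    # The rightmost matrix acts first: move the center to the origin,
    # rotate about the origin, then move the center back.
    return matmul3(translation(cx, cy),
                   matmul3(rotation(theta), translation(-cx, -cy)))

def apply(m, x, y):
    """Apply an affine 3x3 matrix (last row 0 0 1) to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

m = rotation_about(math.pi / 2, 2.0, 3.0)
x_new, y_new = apply(m, 3.0, 3.0)       # approximately (2.0, 4.0)
cx_new, cy_new = apply(m, 2.0, 3.0)     # the center itself does not move
```

Here the point (3, 3) sits one unit to the right of the center (2, 3); a 90° rotation carries that offset (1, 0) to (0, 1), giving (2, 4) up to floating-point rounding.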
Usage
- Object alignment
- Orientation normalization
🔍 Scaling
Definition
Resize coordinates by scale factors \(s_x, s_y\).
Coordinate Form
\(\begin{aligned} x' &= s_x \cdot x \\ y' &= s_y \cdot y \end{aligned}\)
Matrix Form
\(\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}\)
Uniform Scaling
\(s_x = s_y = s\)
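A small illustration of the coordinate form, assuming a rectangle given by its four corner points (the function name `scale_point` and the sample values are made up for this example):

```python
def scale_point(x, y, sx, sy):
    """Scale a point about the origin by factors sx and sy."""
    return sx * x, sy * y

corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]

# Non-uniform scaling: width doubles, height halves, aspect ratio changes.
stretched = [scale_point(x, y, 2.0, 0.5) for x, y in corners]
print(stretched)  # [(0.0, 0.0), (8.0, 0.0), (8.0, 1.0), (0.0, 1.0)]

# Uniform scaling: both axes use the same factor, aspect ratio is kept.
doubled = [scale_point(x, y, 2.0, 2.0) for x, y in corners]
print(doubled)    # [(0.0, 0.0), (8.0, 0.0), (8.0, 4.0), (0.0, 4.0)]
```

Because the scaling is about the origin, the corner at (0, 0) stays fixed while every other point moves toward or away from it.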
Usage
- Image pyramid
- Resolution normalization
🔗 Combined Transform (Affine)
Translation, rotation, and scaling can be combined:
\[\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\]
where \(a, b, c, d\) encode rotation and scaling (and, in a general affine transform, shear), and \((t_x, t_y)\) is the translation.
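One way to see how the combined matrix arises is to multiply the individual 3×3 matrices. The sketch below does this in plain Python and also shows that the order of multiplication matters. The helper names are hypothetical, and the chosen values (scale 2, rotate 90°, translate by (10, 5)) are arbitrary.

```python
import math

def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def scaling(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def apply(m, x, y):
    """Apply an affine 3x3 matrix (last row 0 0 1) to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# The rightmost factor acts first: scale, then rotate, then translate.
affine = matmul3(translation(10.0, 5.0),
                 matmul3(rotation(math.pi / 2), scaling(2.0, 2.0)))
p = apply(affine, 1.0, 0.0)                  # approximately (10.0, 7.0)

# Swapping the order of rotation and translation changes the result.
rotate_then_translate = matmul3(translation(10.0, 5.0), rotation(math.pi / 2))
translate_then_rotate = matmul3(rotation(math.pi / 2), translation(10.0, 5.0))
q1 = apply(rotate_then_translate, 1.0, 0.0)  # approximately (10.0, 6.0)
q2 = apply(translate_then_rotate, 1.0, 0.0)  # approximately (-5.0, 11.0)
```

In `affine`, the upper-left 2×2 block is the product of the rotation and scaling blocks, and the last column holds the translation, matching the layout of the matrix above.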
⚠️ Forward vs Backward Mapping
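- Forward mapping: each source pixel is sent to its transformed position. Targets rarely fall exactly on the output grid, so some output pixels are never written (holes) and others are written more than once.
- Backward mapping: preferred. Each output pixel is mapped through the inverse transform to a source position, and its value is interpolated there, so every output pixel receives a value.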
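The hole problem is easy to reproduce in one dimension. The toy sketch below forward-maps a row of three pixels with an integer scale factor of 2; the function name `forward_scale_row` is hypothetical, and `None` is used here only as a marker for output pixels that were never written.

```python
def forward_scale_row(row, s):
    """Forward-map a 1-D row by integer scale s; None marks unwritten pixels."""
    out = [None] * (len(row) * s)
    for x, value in enumerate(row):
        out[x * s] = value  # each source pixel lands on exactly one target
    return out

holes = forward_scale_row([7, 8, 9], 2)
print(holes)  # [7, None, 8, None, 9, None]
```

Half of the output positions receive no value, because no source pixel maps onto them.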
Interpolation methods:
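- Nearest neighbor: copy the closest source pixel (fast, but blocky)
- Bilinear: weighted average of the 4 surrounding pixels
- Bicubic: cubic weighting over a 4×4 neighborhood (smoother, but slower)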
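Backward mapping with bilinear interpolation can be sketched in a few lines of plain Python. This is an illustrative toy, not how any particular library implements it: the names `bilinear` and `warp_backward` are hypothetical, the image is a nested list of gray values, the inverse transform is passed as the top two rows of an affine matrix, and samples that fall outside the source are set to 0 (an assumption; real libraries offer several border modes).

```python
import math

def bilinear(img, x, y):
    """Sample img at real-valued (x, y); returns 0.0 outside the image."""
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return 0.0
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally on the two neighboring rows, then vertically.
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom

def warp_backward(img, inv, out_w, out_h):
    """For each output pixel, apply the inverse map and sample the source."""
    out = []
    for yo in range(out_h):
        row = []
        for xo in range(out_w):
            xs = inv[0][0] * xo + inv[0][1] * yo + inv[0][2]
            ys = inv[1][0] * xo + inv[1][1] * yo + inv[1][2]
            row.append(bilinear(img, xs, ys))
        out.append(row)
    return out

src = [[0, 10],
       [20, 30]]
# Forward transform: scale by 2. Its inverse scales by 0.5.
inv_scale = [[0.5, 0.0, 0.0],
             [0.0, 0.5, 0.0]]
result = warp_backward(src, inv_scale, 3, 3)
print(result)  # [[0.0, 5.0, 10.0], [10.0, 15.0, 20.0], [20.0, 25.0, 30.0]]
```

Every output pixel is computed exactly once, so there are no holes; in-between positions such as the center receive blended values (15.0 is the mean of the four source pixels).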
🧠 Practical Notes
- Most CV libraries use backward mapping
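- Use homogeneous coordinates so that every transform is a 3×3 matrix and a chain of transforms collapses into a single matrix product
- Rotation and scaling are linear maps, so 2×2 matrices are enough for them
- Translation is not linear (it moves the origin), so it needs the homogeneous form to be written as a matrix product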
🎯 Takeaway
Geometric transforms change where pixels are, not what they are.
Understanding the math behind translation, rotation, and scaling is essential for:
- Registration
- Alignment
- Tracking
- Industrial inspection