Transformations in Image Processing

📐 Overview

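Geometric transformations change where pixels are located in the image plane. Unlike arithmetic (point) operations, which alter intensity values, they act on pixel coordinates; intensities change only as a side effect of resampling at the new positions.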

The most fundamental transforms are:

  • Translation (Move)
  • Rotation
  • Scaling

Each can be written as a matrix multiplication; translation needs homogeneous coordinates to fit that form.


➡️ Translation (Move)

Definition

Move every pixel by a fixed offset \((t_x, t_y)\).

Coordinate Form

\(\begin{aligned} x' &= x + t_x \\ y' &= y + t_y \end{aligned}\)

Homogeneous Matrix Form

\(\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\)
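The homogeneous form above can be tried on a few points. This is a minimal sketch using NumPy; the helper name `translation_matrix` and the sample points are made up for illustration.

```python
import numpy as np

def translation_matrix(tx, ty):
    """Return the 3x3 homogeneous translation matrix for offset (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

# Three sample points stored as columns of homogeneous coordinates [x, y, 1]^T.
points = np.array([[0.0, 2.0, 5.0],   # x
                   [0.0, 3.0, 1.0],   # y
                   [1.0, 1.0, 1.0]])  # homogeneous 1

moved = translation_matrix(10, -4) @ points

# Drop the homogeneous row; each row of the result is one (x', y') pair.
print(moved[:2].T)
```

Every point is shifted by the same offset, and the last row stays at 1, so the result can be fed straight into another homogeneous transform.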

Usage

  • ROI shifting
  • Camera motion compensation

🔄 Rotation

Definition

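Rotate around the origin by angle \(\theta\). Positive \(\theta\) is counterclockwise when the y-axis points up; in image coordinates, where y points down, the same matrix turns the image clockwise on screen.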

Coordinate Form

\(\begin{aligned} x' &= x\cos\theta - y\sin\theta \\ y' &= x\sin\theta + y\cos\theta \end{aligned}\)

Matrix Form

\(\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}\)

Rotation About an Arbitrary Center \((c_x, c_y)\)

\(\begin{aligned} x' &= (x-c_x)\cos\theta - (y-c_y)\sin\theta + c_x \\ y' &= (x-c_x)\sin\theta + (y-c_y)\cos\theta + c_y \end{aligned}\)
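One way to read the formula above is as three steps: shift the center to the origin, rotate, then shift back. The sketch below composes those steps as homogeneous matrices with NumPy; `rotation_about` is a hypothetical helper name, and the test point is chosen only for illustration.

```python
import numpy as np

def rotation_about(theta, cx, cy):
    """3x3 matrix that rotates by theta (radians) around the center (cx, cy)."""
    c, s = np.cos(theta), np.sin(theta)
    to_origin = np.array([[1.0, 0.0, -cx],
                          [0.0, 1.0, -cy],
                          [0.0, 0.0, 1.0]])
    rotate = np.array([[c, -s, 0.0],
                       [s,  c, 0.0],
                       [0.0, 0.0, 1.0]])
    back = np.array([[1.0, 0.0, cx],
                     [0.0, 1.0, cy],
                     [0.0, 0.0, 1.0]])
    # Matrices apply right to left: shift to origin, rotate, shift back.
    return back @ rotate @ to_origin

M = rotation_about(np.pi / 2, 2.0, 1.0)

# The point (3, 1) sits one unit to the right of the center (2, 1).
p = M @ np.array([3.0, 1.0, 1.0])

# The center itself must not move.
center = M @ np.array([2.0, 1.0, 1.0])
```

Here `p` comes out as (2, 2), which matches plugging \(\theta = 90^\circ\) into the coordinate form, and `center` stays at (2, 1).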

Usage

  • Object alignment
  • Orientation normalization

🔍 Scaling

Definition

Resize coordinates by scale factors \(s_x\) and \(s_y\).

Coordinate Form

\(\begin{aligned} x' &= s_x \cdot x \\ y' &= s_y \cdot y \end{aligned}\)

Matrix Form

\(\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}\)

Uniform Scaling

\(s_x = s_y = s\)
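As a quick check of the scaling matrix, the sketch below applies it to the corners of a small rectangle. The helper `scaling_matrix` and the 4×3 rectangle are assumptions made for the example.

```python
import numpy as np

def scaling_matrix(sx, sy):
    """2x2 scaling matrix with factors sx (horizontal) and sy (vertical)."""
    return np.array([[sx, 0.0],
                     [0.0, sy]])

# Corners of a 4-wide, 3-tall rectangle, one (x, y) point per column.
corners = np.array([[0.0, 4.0, 4.0, 0.0],
                    [0.0, 0.0, 3.0, 3.0]])

# Non-uniform: stretch x by 2, squash y by half.
stretched = scaling_matrix(2.0, 0.5) @ corners

# Uniform: s_x = s_y = 2 keeps the aspect ratio.
doubled = scaling_matrix(2.0, 2.0) @ corners
```

Because the scaling is about the origin, the corner at (0, 0) stays put while every other point moves away from or toward it.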

Usage

  • Image pyramids
  • Resolution normalization

🔗 Combined Transform (Affine)

Translation, rotation, and scaling can be combined:

\[\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\]

Here \(a, b, c, d\) form the linear part, which encodes rotation and scaling (and, in a general affine transform, shear).
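To see where \(a, b, c, d\) and \(t_x, t_y\) come from, the sketch below builds the combined matrix as scale, then rotate, then translate. That ordering is one common choice, not the only one, and the helper name `affine` and the parameter values are illustrative assumptions.

```python
import numpy as np

def affine(theta, sx, sy, tx, ty):
    """Compose scale -> rotate -> translate into one 3x3 affine matrix."""
    c, s = np.cos(theta), np.sin(theta)
    S = np.diag([sx, sy, 1.0])
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    T = np.array([[1.0, 0.0, tx],
                  [0.0, 1.0, ty],
                  [0.0, 0.0, 1.0]])
    # Applied right to left: S first, then R, then T.
    # A different order generally gives a different matrix.
    return T @ R @ S

theta = np.pi / 6  # 30 degrees
M = affine(theta, 2.0, 2.0, 5.0, -3.0)

linear_part = M[:2, :2]   # the a, b, c, d block
offset = M[:2, 2]         # the t_x, t_y column

# Transform the point (1, 0).
p = M @ np.array([1.0, 0.0, 1.0])
```

With uniform scale 2, the linear part is simply twice the rotation matrix, and the offset column is the translation unchanged because translation is applied last.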


⚠️ Forward vs Backward Mapping

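  • Forward mapping: push each source pixel to its transformed position. Targets rarely land exactly on the output grid, so holes and overlaps may appear.
  • Backward mapping: preferred. For each output pixel, apply the inverse transform to find its source location and interpolate there, so every output pixel gets a value.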
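The difference shows up even in one dimension. The toy sketch below upscales a 4-pixel row by a factor of 2 in both directions; the pixel values, the `-1` "never written" marker, and the floor-based lookup (a crude stand-in for real interpolation) are all assumptions of the example.

```python
import numpy as np

src = np.array([10, 20, 30, 40])
scale = 2

# Forward mapping: push each source pixel x to x' = scale * x.
dst_forward = np.full(src.size * scale, -1)  # -1 marks "never written"
for x in range(src.size):
    dst_forward[scale * x] = src[x]

# Backward mapping: for each output pixel x', look up x = x' / scale.
dst_backward = np.empty(src.size * scale, dtype=src.dtype)
for xp in range(dst_backward.size):
    x = xp / scale
    dst_backward[xp] = src[int(x)]  # take the source pixel that contains x

print(dst_forward)
print(dst_backward)
```

Forward mapping leaves every second output pixel unwritten, while backward mapping fills the whole row because it iterates over output pixels rather than source pixels.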

Interpolation methods:

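  • Nearest neighbor: fastest, but blocky
  • Bilinear: weighted average of the 4 surrounding pixels
  • Bicubic: uses a 4×4 neighborhood; smoother, but slower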
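Backward mapping with bilinear interpolation can be sketched in a few lines of NumPy. This is an illustrative implementation, not how any particular library does it: the function name `warp_affine_backward`, the zero fill for out-of-range samples, and the grayscale-only input (at least 2×2 pixels) are assumptions.

```python
import numpy as np

def warp_affine_backward(src, M, out_shape):
    """Warp a grayscale image with the 3x3 forward matrix M.

    Each output pixel is mapped back through inv(M) and sampled
    bilinearly; samples that fall outside the source are set to 0.
    """
    h, w = src.shape
    h_out, w_out = out_shape
    M_inv = np.linalg.inv(M)

    # Homogeneous coordinates of every output pixel, one per column.
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    sx, sy, _ = M_inv @ dst  # source location for each output pixel

    valid = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)

    # Top-left neighbor, clipped so that x0 + 1 and y0 + 1 stay in range.
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    fx, fy = sx - x0, sy - y0

    # Blend horizontally along the two rows, then vertically between them.
    top = src[y0, x0] * (1 - fx) + src[y0, x0 + 1] * fx
    bottom = src[y0 + 1, x0] * (1 - fx) + src[y0 + 1, x0 + 1] * fx
    out = np.where(valid, top * (1 - fy) + bottom * fy, 0.0)
    return out.reshape(h_out, w_out)

src = np.array([[0.0, 10.0, 20.0],
                [30.0, 40.0, 50.0]])

# Shift right by half a pixel: outputs fall halfway between source pixels.
shift = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
out = warp_affine_backward(src, shift, src.shape)
```

In this example the first output column maps to x = -0.5, which lies outside the source and is filled with 0; the other columns are averages of two horizontal neighbors.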

🧠 Practical Notes

  • Most CV libraries use backward mapping
  • Use homogeneous coordinates to unify transforms
  • Rotation and scaling are linear maps, so a 2×2 matrix is enough
  • Translation is not linear; it becomes a matrix product only in homogeneous form

🎯 Takeaway

Geometric transforms change where pixels are, not what they are.

Understanding the math behind:

  • Translation
  • Rotation
  • Scaling

is essential for:

  • Registration
  • Alignment
  • Tracking
  • Industrial inspection
This post is licensed under CC BY 4.0 by the author.