Components - CPU

🧠 CPU Execution Flow — A Visual & Intuitive Guide

A clean, readable, engineer‑friendly walkthrough of how a modern CPU executes instructions — from prediction to commit.


📌 At a Glance — Execution Pipeline

Predict → Fetch → Decode → Rename → Dispatch → Execute → Writeback → Commit

Design philosophy:
> 🚀 Speculate far ahead → ✅ Commit safely in order


🏎 PART 1 — Runtime Execution Flow (Story Mode)

1️⃣ Instruction Fetch & Branch Prediction

The CPU predicts where execution will go next before knowing for sure.

🔮 Prediction Engines

| Predictor | Purpose | Example |
|---|---|---|
| BTB | Predicts branch target address | if / loop |
| GHB | Tracks taken/not-taken history | T T T T N |
| Indirect Predictor | Predicts multi-target jumps | switch |
| Return Stack | Predicts function return address | return |
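To make the taken/not-taken history concrete, here is a minimal sketch of a 2-bit saturating-counter predictor, a common textbook scheme. It is an illustration only, not the predictor of any particular core: the class name, table size, indexing by PC, and the starting counter value are all assumptions.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counters, indexed by PC (hypothetical layout)."""

    def __init__(self, entries=1024):
        self.entries = entries
        # Counter values 0-1 predict not-taken, 2-3 predict taken.
        self.counters = [1] * entries  # assumed start: weakly not-taken

    def _index(self, pc):
        # Drop the low bits, assuming 4-byte instructions.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def run(history, pc=0x4000):
    """Replay a taken/not-taken history and count mispredictions."""
    predictor = TwoBitPredictor()
    mispredicts = 0
    for taken in history:
        if predictor.predict(pc) != taken:
            mispredicts += 1
        predictor.update(pc, taken)
    return mispredicts


# The loop-style history from the table: T T T T N
loop_history = [True, True, True, True, False]
print(run(loop_history))  # 2: one miss while warming up, one at the loop exit
```

The counter needs two wrong outcomes in a row to flip its prediction, which is why a single loop exit does not disturb the prediction for the next run of the loop.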

📥 Fetch Path

PC → ITLB → L1 Instruction Cache → Fetch Queue → Decode

❗ If the prediction was wrong → flush the pipeline & refetch from the correct address


2️⃣ Decode → Micro‑Operations

Instructions become internal micro‑operations (uOps) describing:

  • Execution unit type
  • Register reads / writes
  • Memory access
  • Flag dependencies

Complex instructions → multiple uOps
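As a rough illustration of one instruction turning into several uOps, the sketch below decodes a toy read-modify-write instruction into load, add, and store uOps. The `UOp` fields, the `add_mem` opcode, and the `tmp0` temporary are hypothetical, and flag dependencies are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UOp:
    unit: str          # execution unit type
    reads: tuple       # source registers
    writes: tuple      # destination registers
    memory: str = ""   # "load", "store", or "" for no memory access


def decode(instruction):
    """Split a toy (opcode, dst, src) instruction into uOps (hypothetical encoding)."""
    op, dst, src = instruction
    if op == "add":
        # Simple register add: a single ALU uOp.
        return [UOp("alu", reads=(dst, src), writes=(dst,))]
    if op == "add_mem":
        # Read-modify-write on memory at [dst]: load + add + store.
        return [
            UOp("load_store", reads=(dst,), writes=("tmp0",), memory="load"),
            UOp("alu", reads=("tmp0", src), writes=("tmp0",)),
            UOp("load_store", reads=(dst, "tmp0"), writes=(), memory="store"),
        ]
    raise ValueError(f"unknown opcode: {op}")


for uop in decode(("add_mem", "r1", "r2")):
    print(uop)
```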


3️⃣ Register Rename — Removing Fake Dependencies

Renaming maps architectural registers → physical registers to avoid false dependencies:

  • WAR (write‑after‑read)
  • WAW (write‑after‑write)

🎯 Result

  • Higher parallelism
  • Fewer stalls
  • ROB entries allocated
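A small sketch of the renaming idea, assuming a simple register alias table and an unlimited supply of physical registers (a real design would use a free list and reclaim registers at commit). The `rename` function and the tuple format are made up for illustration.

```python
def rename(program, num_arch_regs=4):
    """Rename (dst, src1, src2) architectural registers to physical ones."""
    # Initial mapping: architectural register rN lives in physical register pN.
    rat = {f"r{i}": f"p{i}" for i in range(num_arch_regs)}
    next_phys = num_arch_regs
    renamed = []
    for dst, src1, src2 in program:
        # Sources read the current mapping, so true (RAW) dependencies survive.
        p_src1, p_src2 = rat[src1], rat[src2]
        # Every write gets a fresh physical register, which removes WAR and WAW.
        p_dst = f"p{next_phys}"
        next_phys += 1
        rat[dst] = p_dst
        renamed.append((p_dst, p_src1, p_src2))
    return renamed


program = [
    ("r1", "r2", "r3"),  # r1 = r2 op r3
    ("r2", "r0", "r0"),  # WAR on r2 with the instruction above
    ("r1", "r0", "r0"),  # WAW on r1 with the first instruction
    ("r3", "r1", "r2"),  # RAW on r1 and r2: must see the newest values
]
for line in rename(program):
    print(line)
```

After renaming, the second and third instructions write `p5` and `p6` instead of reusing `r2` and `r1`, so they no longer have to wait for the first instruction.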

4️⃣ Dispatch & Out‑of‑Order Scheduling

uOps wait in execution queues:

  • Reservation Stations
  • Issue Queue
  • Load / Store Queue

📌 Execution starts when operands are ready, not by original order.
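The "ready operands, not program order" rule can be sketched as a tiny cycle-by-cycle scheduler. This is a simplified model under stated assumptions: registers are already renamed (each destination is written once), the issue width and latencies are invented numbers, and the `schedule` function is hypothetical.

```python
def schedule(uops, latency, issue_width=2):
    """Return {name: issue cycle}; a uOp issues once all its sources are available.

    uops: list of (name, dest, sources) in program order, already renamed.
    latency: {name: cycles until the result becomes available}.
    """
    produced = {dest for _, dest, _ in uops}  # registers written by in-flight uOps
    ready_at = {}                             # register -> cycle its value is ready
    issue_cycle = {}
    waiting = list(uops)
    cycle = 0
    while waiting:
        picked = []
        for uop in waiting:  # oldest-first among the ready uOps
            _, _, sources = uop
            operands_ready = all(
                s not in produced or ready_at.get(s, float("inf")) <= cycle
                for s in sources
            )
            if operands_ready and len(picked) < issue_width:
                picked.append(uop)
        for name, dest, _ in picked:
            issue_cycle[name] = cycle
            ready_at[dest] = cycle + latency[name]
        waiting = [u for u in waiting if u not in picked]
        cycle += 1
    return issue_cycle


uops = [
    ("load", "p4", ("p0",)),
    ("add", "p5", ("p4", "p1")),   # waits for the load
    ("mul", "p6", ("p2", "p3")),   # independent of the load
    ("sub", "p7", ("p6", "p1")),   # waits for the mul
]
latency = {"load": 4, "add": 1, "mul": 3, "sub": 1}
print(schedule(uops, latency))
```

In this example `sub` issues at cycle 3 while the older `add` waits until cycle 4 for the load result, which is the out-of-order behavior described above.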


5️⃣ Execution Units — Parallel Engines

| Unit | Role |
|---|---|
| 🧮 ALU / Shift | Integer arithmetic |
| ✖ IMAC / DIV | Multiply & divide |
| 🧬 SIMD / Crypto | Vector & crypto |
| 🔀 Branch Unit | Flow control |
| 📍 AGU | Address generation |
| 💾 Load / Store | Memory access |


6️⃣ Memory Access & Cache Flow

📤 Load Path

AGU → DTLB → L1 → L2 → L3 → DRAM
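The level-by-level lookup can be sketched as follows. The latencies are illustrative placeholders rather than figures for any real CPU, address translation (DTLB) is not modeled, and cache capacity and eviction are ignored; `LEVELS`, `DRAM_LATENCY`, and `load` are names made up for this example.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)

# Illustrative latencies in cycles; real values vary by design.
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40)]
DRAM_LATENCY = 200


def load(address, caches):
    """Look up an address level by level; return (where it hit, cycles spent).

    caches: {"L1": set_of_lines, "L2": ..., "L3": ...}; it is mutated to model
    the line being filled into every level on the way back.
    """
    line = address // LINE_SIZE
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency
        if line in caches[name]:
            hit = name
            break
    else:
        # Missed in every cache level: go to DRAM.
        cycles += DRAM_LATENCY
        hit = "DRAM"
    # Fill the line into all levels (capacity and eviction are not modeled).
    for name, _ in LEVELS:
        caches[name].add(line)
    return hit, cycles


caches = {"L1": set(), "L2": set(), "L3": set()}
print(load(0x1000, caches))  # cold miss goes all the way to DRAM
print(load(0x1008, caches))  # same 64-byte line, now an L1 hit
```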

📥 Store Path

Stores wait in the Store Buffer → written to the cache, and so visible to other cores, only after commit
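A minimal sketch of buffering stores until commit, assuming a plain dictionary stands in for the cache and memory. The `StoreBuffer` class and its method names are hypothetical, and store-to-load forwarding is not modeled.

```python
class StoreBuffer:
    """Holds speculative stores in program order until they commit."""

    def __init__(self, memory):
        self.memory = memory   # dict standing in for the cache / memory
        self.pending = []      # (address, value), oldest first

    def store(self, address, value):
        # Executed but not yet committed: memory is left untouched.
        self.pending.append((address, value))

    def commit_oldest(self):
        # The oldest store has committed, so it may now update memory.
        address, value = self.pending.pop(0)
        self.memory[address] = value

    def squash(self):
        # Wrong-path stores are dropped without ever reaching memory.
        self.pending.clear()


memory = {0x100: 1}
buffer = StoreBuffer(memory)
buffer.store(0x100, 42)
print(memory[0x100])   # still the old value
buffer.commit_oldest()
print(memory[0x100])   # updated after commit
```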

📦 Miss Handling

  • Fill Buffer = incoming cache lines
  • Evict Buffer = outgoing cache lines
  • Dirty → Writeback

7️⃣ Multi‑Core Cache Coherency

When a cache miss happens:

1. The requesting core becomes the Master
2. Other cores are snooped
3. A peer cache supplies the data, OR it is fetched from RAM
4. Fill Buffer → Cache

Protocols: MESI / MOESI
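The snoop flow can be sketched with a simplified MESI model for a single cache line. This is a teaching sketch, not a full protocol: whether a peer in the Shared state supplies data (rather than memory) varies between implementations and is assumed here, write-backs are not modeled, and the `read` and `write` helpers are hypothetical.

```python
def read(states, core):
    """Core loads the line; states holds one MESI letter per core."""
    if states[core] != "I":
        return "hit"
    holders = [c for c in range(len(states)) if c != core and states[c] != "I"]
    if holders:
        # A snooped peer holds the line: it supplies the data (assumed),
        # and every copy, including the new one, ends up Shared.
        for c in holders:
            states[c] = "S"
        states[core] = "S"
        return "peer"
    # No other cache has the line: fetch from RAM and hold it Exclusive.
    states[core] = "E"
    return "memory"


def write(states, core):
    """Core stores to the line: other copies are invalidated, writer is Modified."""
    for c in range(len(states)):
        if c != core:
            states[c] = "I"
    states[core] = "M"


states = ["I", "I"]
print(read(states, 0), states)   # miss, no peer has it
print(read(states, 1), states)   # miss, peer supplies it
write(states, 1)
print(states)                    # writer Modified, other copy Invalid
```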


8️⃣ Writeback

Results written to Physical Register File (PRF)
Dependent instructions wake up


9️⃣ Commit — Making Results Official

✔ In‑order architectural update
✔ Precise exceptions
✔ Correct interrupt timing

Execution is chaotic — commit is disciplined
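In-order commit on top of out-of-order completion can be sketched with a queue that only retires from its head. The `ReorderBuffer` class and its entry format are assumptions for illustration; a real ROB also tracks destination registers, exceptions, and more.

```python
from collections import deque


class ReorderBuffer:
    """Entries are kept in program order, oldest on the left."""

    def __init__(self):
        self.entries = deque()

    def allocate(self, name):
        entry = {"name": name, "done": False}
        self.entries.append(entry)
        return entry

    def commit(self):
        """Retire finished instructions from the head; stop at the first unfinished one."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            retired.append(self.entries.popleft()["name"])
        return retired


rob = ReorderBuffer()
a, b, c = (rob.allocate(n) for n in ("a", "b", "c"))
c["done"] = True          # youngest finishes first
b["done"] = True
print(rob.commit())       # nothing retires: "a" is still executing
a["done"] = True
print(rob.commit())       # all three retire, in program order
```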


🔟 Branch Misprediction Recovery

❌ Flush wrong‑path instructions
🔁 Restore correct PC
🚨 Among the costliest pipeline events: all wrong‑path work in flight is thrown away
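Recovery can be sketched as dropping everything younger than the mispredicted branch and redirecting fetch. The `recover` helper and the list-based ROB are hypothetical simplifications; restoring the rename map and other checkpointed state is not shown.

```python
def recover(rob, branch_index, correct_target):
    """Flush every entry younger than the mispredicted branch.

    rob: list of in-flight instruction names in program order.
    Returns (new PC, flushed entries); rob is trimmed in place.
    """
    flushed = rob[branch_index + 1:]
    del rob[branch_index + 1:]
    return correct_target, flushed


rob = ["add", "beq", "mul", "load"]   # "mul" and "load" were fetched down the wrong path
pc, flushed = recover(rob, branch_index=1, correct_target=0x2000)
print(hex(pc), rob, flushed)
```

The longer the flushed list, the more fetch, decode, and execute work was wasted, which is why deep speculation makes mispredictions costly.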


1️⃣1️⃣ Interrupts, Timers & Debug

| Component | Function |
|---|---|
| 🛎 GIC | Interrupt routing |
| ⏱ Timer | OS scheduling |
| 🧪 ETM | Execution trace |
| 🔗 CTI | Debug trigger sync |

Interrupts occur at safe commit points


📚 PART 2 — CPU Terminology Glossary (Reference Mode)

🔮 Branch Prediction

BTB — Branch target cache
GHB — Branch history tracker
Indirect Predictor — Multi‑target predictor
Return Stack Buffer — Predicts return addresses


🧭 Fetch & Address Translation

PC — Next instruction pointer
ITLB — Virtual → physical instruction translation
L1 I‑Cache — Fast instruction cache
Instruction Fetch — Loads next instruction


🧩 Pipeline Stages

Decode — Instruction → uOps
Rename — Removes false register hazards
Dispatch — Sends uOps to execution units
ROB — Tracks in‑flight instructions
Commit — Makes results architecturally visible


⚙ Execution Units

ALU — Integer arithmetic
IMAC — Multiply‑accumulate
DIV — Division
CRC — Checksum
SIMD / ASIMD — Vector compute
Crypto Unit — Cryptography
AGU — Address calculation
DTLB — Data address translation


🧱 Cache Architecture

TagRAM — Stores cache tags
DataRAM — Stores cached data
L2 Cache — Secondary cache
Prefetcher — Predicts memory access


🔁 Memory Coherency

Master — Core initiating request
Slave — Responding cache/memory
Snoop — Check other caches
Fill Buffer — Incoming cache storage
Evict Buffer — Outgoing cache storage
ACP — Coherent accelerator port


🐞 Debug & Tracing

ETM — Execution tracing hardware
CTI — Cross Trigger Interface; synchronizes debug triggers
GIC — Interrupt controller
Timer — Hardware clock


🎯 Why This Guide Works

✔ Visual, clean, and readable
✔ Blog‑ready & portfolio‑ready
✔ Interview‑grade CPU knowledge
✔ Designed for long‑term reference



This post is licensed under CC BY 4.0 by the author.