Components - CPU
🧠 CPU Execution Flow — A Visual & Intuitive Guide
A clean, readable, engineer‑friendly walkthrough of how a modern CPU executes instructions — from prediction to commit.
📌 At a Glance — Execution Pipeline
Predict → Fetch → Decode → Rename → Dispatch → Execute → Writeback → Commit
Design philosophy:
> 🚀 Speculate far ahead → ✅ Commit safely in order
🏎 PART 1 — Runtime Execution Flow (Story Mode)
1️⃣ Instruction Fetch & Branch Prediction
The CPU predicts where execution will go next before knowing for sure.
🔮 Prediction Engines
| Predictor | Purpose | Example |
| --- | --- | --- |
| BTB | Predicts branch target address | if / loop |
| GHB | Tracks taken/not‑taken history | T T T T N |
| Indirect Predictor | Predicts multi‑target jumps | switch |
| Return Stack | Predicts function return address | return |
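To make the history-based idea concrete, here is a minimal sketch of a direction predictor, assuming a table of 2-bit saturating counters indexed by the last few branch outcomes. The class name `GlobalHistoryPredictor`, the helper `run`, and the 4-bit history length are illustrative assumptions, not a description of any real core.

```python
class GlobalHistoryPredictor:
    """Toy direction predictor: 2-bit counters indexed by global history."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.history = 0                           # last N outcomes, 1 = taken
        self.counters = [1] * (1 << history_bits)  # 0..3, start weakly not-taken

    def predict(self):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self.history] >= 2

    def update(self, taken):
        # Train the counter for the current history, then shift in the outcome.
        c = self.counters[self.history]
        self.counters[self.history] = min(3, c + 1) if taken else max(0, c - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def run(pattern, repeats, history_bits=4):
    """Count mispredictions while replaying `pattern` `repeats` times."""
    predictor = GlobalHistoryPredictor(history_bits)
    misses = 0
    for _ in range(repeats):
        for taken in pattern:
            if predictor.predict() != taken:
                misses += 1
            predictor.update(taken)
    return misses
```

With the `T T T T N` loop pattern from the table, the mispredictions all happen during warm-up; once each history value has been seen, the not-taken loop exit is predicted as well.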
📥 Fetch Path
PC → ITLB → L1 Instruction Cache → Fetch Queue → Decode
❗ If wrong → Flush pipeline & refetch
2️⃣ Decode → Micro‑Operations
Instructions become internal micro‑operations (uOps) describing:
- Execution unit type
- Register reads / writes
- Memory access
- Flag dependencies
Complex instructions → multiple uOps
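As a rough illustration of one instruction turning into several uOps, the sketch below decodes a hypothetical memory-destination `ADD [r1], r2` into load, add, and store uOps, while a register-only `ADD` stays a single uOp. The `UOp` fields, the `tmp0` temporary, and the instruction syntax are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UOp:
    unit: str                 # execution unit type: "LOAD", "ALU" or "STORE"
    dst: Optional[str]        # register written, None for stores
    srcs: Tuple[str, ...]     # registers read


def decode(instr):
    """Split one toy instruction into a list of uOps."""
    op, *args = instr.replace(",", " ").split()
    if op == "ADD" and args[0].startswith("["):
        addr, src = args[0].strip("[]"), args[1]
        return [
            UOp("LOAD", "tmp0", (addr,)),          # read memory at [addr]
            UOp("ALU", "tmp0", ("tmp0", src)),     # add the register operand
            UOp("STORE", None, (addr, "tmp0")),    # write the sum back
        ]
    if op == "ADD":
        dst, a, b = args
        return [UOp("ALU", dst, (a, b))]
    raise ValueError(f"unsupported instruction: {instr}")
```

Here `decode("ADD [r1], r2")` yields three uOps, and `decode("ADD r3, r1, r2")` yields one.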
3️⃣ Register Rename — Removing Fake Dependencies
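Renaming maps architectural registers → physical registers to avoid false dependencies:
- WAR (write‑after‑read)
- WAW (write‑after‑write)

True RAW (read‑after‑write) dependencies remain, because the consumer really needs the producer's value.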
🎯 Result
- Higher parallelism
- Fewer stalls
- ROB entries allocated
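The renaming step can be sketched as follows, assuming a simple map from architectural to physical registers and an ever-growing supply of physical registers (a real core recycles them through a free list, which is omitted here). The function name `rename` and the `r`/`p` naming are illustrative.

```python
def rename(instrs, num_arch_regs=4):
    """Rename (dst, src1, src2, ...) tuples of architectural registers.

    Every write gets a fresh physical register, so WAR and WAW hazards
    disappear while true RAW dependencies are preserved.
    """
    mapping = {f"r{i}": f"p{i}" for i in range(num_arch_regs)}
    next_phys = num_arch_regs
    renamed = []
    for dst, *srcs in instrs:
        # Sources are looked up before the destination is remapped.
        phys_srcs = tuple(mapping[s] for s in srcs)
        mapping[dst] = f"p{next_phys}"
        next_phys += 1
        renamed.append((mapping[dst], *phys_srcs))
    return renamed


program = [
    ("r1", "r2", "r3"),   # r1 = r2 + r3
    ("r2", "r1", "r1"),   # r2 = r1 + r1  (WAR on r2, RAW on r1)
    ("r1", "r3", "r3"),   # r1 = r3 + r3  (WAW on r1)
]
renamed_program = rename(program)
```

After renaming, the third instruction writes `p6` instead of reusing the first instruction's destination, so it no longer has to wait for the earlier reader and writer of `r1`.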
4️⃣ Dispatch & Out‑of‑Order Scheduling
uOps wait in execution queues:
- Reservation Stations
- Issue Queue
- Load / Store Queue
📌 Execution starts when operands are ready, not by original order.
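A minimal sketch of operand-ready issue, assuming already-renamed registers, unlimited execution units, and made-up latencies. The function `schedule` and the uOp tuple layout are hypothetical.

```python
def schedule(uops, initially_ready, max_cycles=1000):
    """Return {name: issue_cycle} for uops given as (name, dst, srcs, latency).

    A uOp issues in the first cycle in which all of its sources are ready,
    regardless of its position in program order.
    """
    ready_at = {reg: 0 for reg in initially_ready}  # register -> cycle value is ready
    waiting = list(uops)
    issue_cycle = {}
    cycle = 0
    while waiting:
        if cycle > max_cycles:
            raise RuntimeError("some uOp never became ready")
        produced = {}
        still_waiting = []
        for name, dst, srcs, latency in waiting:
            if all(ready_at.get(s, float("inf")) <= cycle for s in srcs):
                issue_cycle[name] = cycle
                produced[dst] = cycle + latency     # result wakes dependents later
            else:
                still_waiting.append((name, dst, srcs, latency))
        ready_at.update(produced)
        waiting = still_waiting
        cycle += 1
    return issue_cycle


uops = [
    ("load", "p4", ("p1",), 4),        # p4 = [p1]
    ("add", "p5", ("p4", "p2"), 1),    # depends on the load
    ("mul", "p6", ("p2", "p3"), 3),    # independent, younger than the add
]
issued = schedule(uops, initially_ready=["p1", "p2", "p3"])
```

In this run the `mul` issues in cycle 0 alongside the `load`, while the older `add` waits until cycle 4 for the load result.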
5️⃣ Execution Units — Parallel Engines
Unit Role ———————- ——————– 🧮 ALU / Shift Integer arithmetic ✖ IMAC / DIV Multiply & divide 🧬 SIMD / Crypto Vector & crypto 🔀 Branch Unit Flow control 📍 AGU Address generation 💾 Load / Store Memory access
6️⃣ Memory Access & Cache Flow
📤 Load Path
AGU → DTLB → L1 → L2 → L3 → DRAM
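The walk down the hierarchy can be sketched like this, assuming each level is just a set of cached line numbers and that a miss fills every level on the way back. The latencies (4, 12, 40, and 200 cycles) are placeholder values chosen for illustration, not measurements, and address translation is left out.

```python
LINE_BYTES = 64
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40)]   # (name, assumed lookup latency)
DRAM_LATENCY = 200                             # assumed, in cycles


def load(addr, caches):
    """Return (level that supplied the data, total cycles) for one load."""
    line = addr // LINE_BYTES
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency
        if line in caches[name]:
            hit = name
            break
    else:
        # Missed in every cache level: go to DRAM.
        cycles += DRAM_LATENCY
        hit = "DRAM"
    for name, _ in LEVELS:
        caches[name].add(line)   # fill all levels (simplifying assumption)
    return hit, cycles


caches = {"L1": set(), "L2": set(), "L3": set()}
first = load(0x1000, caches)
second = load(0x1008, caches)    # same 64-byte line as 0x1000
```

The first access misses everywhere and pays the full path; the second touches the same line and hits in L1.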
📥 Store Path
Stores enter Store Buffer → drain to cache after commit (only then visible to other cores)
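A small sketch of that behavior, assuming a dictionary stands in for the cache and that each committed store drains immediately. The class `StoreBuffer` and its method names are hypothetical; the forwarding of a pending store to a later load on the same core is a common technique added here as an assumption, not something the text above states.

```python
class StoreBuffer:
    """Holds speculative stores until they commit."""

    def __init__(self, memory):
        self.memory = memory   # dict addr -> value, stands in for the cache
        self.pending = []      # (addr, value) in program order, oldest first

    def store(self, addr, value):
        # Executed but not committed: memory is not touched yet.
        self.pending.append((addr, value))

    def load(self, addr):
        # The same core sees its own youngest pending store (forwarding).
        for a, v in reversed(self.pending):
            if a == addr:
                return v
        return self.memory.get(addr)

    def commit_oldest(self):
        # At commit the oldest store drains and becomes globally visible.
        addr, value = self.pending.pop(0)
        self.memory[addr] = value

    def squash(self):
        # Wrong-path stores are dropped without ever reaching memory.
        self.pending.clear()
```

Because nothing reaches `memory` before `commit_oldest`, a squashed wrong-path store leaves no trace.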
📦 Miss Handling
- Fill Buffer = incoming cache lines
- Evict Buffer = outgoing cache lines
- Dirty → Writeback
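The fill and evict behavior above can be sketched with a tiny direct-mapped cache, assuming one line per set and addresses already expressed as line numbers. `TinyCache`, `evict_buffer`, and `fills` are illustrative names.

```python
class TinyCache:
    """Direct-mapped cache that tracks dirty lines and writebacks."""

    def __init__(self, num_sets=4):
        self.num_sets = num_sets
        self.lines = {}          # set index -> (tag, dirty)
        self.evict_buffer = []   # dirty line numbers waiting for writeback
        self.fills = 0           # lines brought in through the fill path

    def access(self, line_addr, write=False):
        """Return True on a hit, False on a miss."""
        idx, tag = line_addr % self.num_sets, line_addr // self.num_sets
        entry = self.lines.get(idx)
        hit = entry is not None and entry[0] == tag
        if not hit:
            if entry is not None and entry[1]:
                # Victim is dirty: queue it for writeback before replacing it.
                self.evict_buffer.append(entry[0] * self.num_sets + idx)
            self.fills += 1
            entry = (tag, False)
        # A write marks the line dirty; a read keeps the existing dirty bit.
        self.lines[idx] = (tag, entry[1] or write)
        return hit
```

Only dirty victims land in `evict_buffer`; a clean victim is simply overwritten by the incoming fill.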
7️⃣ Multi‑Core Cache Coherency
When a cache miss happens:
1. The requesting core becomes the Master
2. Other cores are snooped
3. A peer cache supplies the data, or it is fetched from RAM
4. Fill Buffer → Cache
Protocols: MESI / MOESI
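For a feel of how MESI states move, here is a simplified sketch that tracks one cache line across several cores. It assumes a peer holding the line always supplies the data and that a Modified copy is written back when it is downgraded; real implementations differ in these details. The class `CoherentLine` is hypothetical.

```python
class CoherentLine:
    """MESI state of a single cache line in each core's cache."""

    def __init__(self, n_cores):
        self.state = ["I"] * n_cores   # one of "M", "E", "S", "I" per core

    def read(self, core):
        """Return where the data came from: 'hit', 'peer' or 'memory'."""
        if self.state[core] != "I":
            return "hit"
        holders = [c for c, s in enumerate(self.state) if c != core and s != "I"]
        for c in holders:
            self.state[c] = "S"        # M (after writeback), E and S all end up Shared
        self.state[core] = "S" if holders else "E"
        return "peer" if holders else "memory"

    def write(self, core):
        # Gaining ownership invalidates every other copy.
        for c in range(len(self.state)):
            if c != core:
                self.state[c] = "I"
        self.state[core] = "M"
```

A first read lands in Exclusive, a second core's read moves both copies to Shared, and any write leaves exactly one Modified copy.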
8️⃣ Writeback
Results written to Physical Register File (PRF)
Dependent instructions wake up
9️⃣ Commit — Making Results Official
✔ In‑order architectural update
✔ Precise exceptions
✔ Correct interrupt timing
Execution is chaotic — commit is disciplined
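In-order commit can be sketched with a queue that only retires from its head, assuming each entry carries a simple "done" flag set at writeback. The `ReorderBuffer` class and its dictionary entries are illustrative.

```python
from collections import deque


class ReorderBuffer:
    """Tracks in-flight instructions and retires them in program order."""

    def __init__(self):
        self.entries = deque()   # oldest instruction at the left

    def allocate(self, name):
        entry = {"name": name, "done": False}
        self.entries.append(entry)
        return entry

    def commit(self):
        """Retire finished instructions from the head; stop at the first unfinished one."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            retired.append(self.entries.popleft()["name"])
        return retired
```

If the youngest instruction finishes first, `commit()` returns nothing until every older instruction has finished too, which is what keeps exceptions precise.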
🔟 Branch Misprediction Recovery
❌ Flush wrong‑path instructions
🔁 Restore correct PC
🚨 Among the largest CPU performance penalties: all wrong‑path work is discarded and the front end must refill
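The recovery steps can be sketched as below, assuming the ROB is a program-ordered list and that the rename map was checkpointed when the branch was renamed. The function `recover` and the checkpoint dictionary are hypothetical simplifications.

```python
def recover(rob, branch_index, checkpoints, correct_pc):
    """Undo speculation past a mispredicted branch.

    rob          : list of instruction names in program order
    branch_index : position of the mispredicted branch in `rob`
    checkpoints  : {branch_index: rename map saved at that branch}
    Returns (surviving ROB, flushed instructions, restored rename map, next PC).
    """
    survivors = rob[:branch_index + 1]              # the branch and everything older
    flushed = rob[branch_index + 1:]                # wrong-path instructions
    rename_map = dict(checkpoints[branch_index])    # copy of the saved mapping
    return survivors, flushed, rename_map, correct_pc


rob = ["i0", "br", "wrong1", "wrong2"]
checkpoints = {1: {"r1": "p4"}}
survivors, flushed, rename_map, pc = recover(rob, 1, checkpoints, 0x400)
```

Everything younger than the branch is dropped, and fetch restarts from the corrected PC with the rename state the branch originally saw.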
1️⃣1️⃣ Interrupts, Timers & Debug
| Component | Function |
| --- | --- |
| 🛎 GIC | Interrupt routing |
| ⏱ Timer | OS scheduling |
| 🧪 ETM | Execution trace |
| 🔗 CTI | Debug trigger sync |
Interrupts are taken at safe commit points, on an instruction boundary, so the architectural state is precise
📚 PART 2 — CPU Terminology Glossary (Reference Mode)
🔮 Branch Prediction
BTB — Branch target cache
GHB — Branch history tracker
Indirect Predictor — Multi‑target predictor
Return Stack Buffer — Predicts return addresses
🧭 Fetch & Address Translation
PC — Next instruction pointer
ITLB — Virtual → physical instruction translation
L1 I‑Cache — Fast instruction cache
Instruction Fetch — Loads next instruction
🧩 Pipeline Stages
Decode — Instruction → uOps
Rename — Removes false register hazards
Dispatch — Sends uOps to execution units
ROB — Tracks in‑flight instructions
Commit — Makes results architecturally visible
⚙ Execution Units
ALU — Integer arithmetic
IMAC — Multiply‑accumulate
DIV — Division
CRC — Checksum
SIMD / ASIMD — Vector compute
Crypto Unit — Cryptography
AGU — Address calculation
DTLB — Data address translation
🧱 Cache Architecture
TagRAM — Stores cache tags
DataRAM — Stores cached data
L2 Cache — Secondary cache
Prefetcher — Predicts memory access
🔁 Memory Coherency
Master — Core initiating request
Slave — Responding cache/memory
Snoop — Check other caches
Fill Buffer — Incoming cache storage
Evict Buffer — Outgoing cache storage
ACP — Coherent accelerator port
🐞 Debug & Tracing
ETM — Execution tracing hardware
CTI — Cross Trigger Interface (synchronizes debug triggers between cores)
GIC — Interrupt controller
Timer — Hardware clock
🎯 Why This Guide Works
✔ Visual, clean, and readable
✔ Blog‑ready & portfolio‑ready
✔ Interview‑grade CPU knowledge
✔ Designed for long‑term reference
