Memory Layout in C++
Prerequisites
1. Memory Layout
Memory layout defines how data is physically arranged in memory.
👉 Performance depends heavily on:
- Cache line usage
- Data locality
- Access pattern
2. Struct
```cpp
struct A
{
    char c;
    int i;
};
```
Memory (with padding, assuming a 4-byte int):
```
[c][pad][pad][pad][i][i][i][i]
```
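- The CPU accesses data most efficiently when it sits on its natural alignment boundary
- The compiler inserts padding automatically to satisfy that alignment
Why is this important?
- Padding increases the struct's size (here 8 bytes instead of 5)
- Larger objects mean fewer of them fit in each cache line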
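As a rough check of the padding described above, the following sketch asks the compiler for the offsets and size of a struct with the same members. `PaddedExample` and `padding_before_i` are illustrative names, not part of the original. The concrete numbers (3 bytes of padding, 8 bytes in total) assume a 4-byte, 4-byte-aligned `int`, which is typical but not guaranteed by the standard.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Same members as struct A above; renamed so it does not clash with it.
struct PaddedExample
{
    char c;
    int  i;
};

// Number of padding bytes the compiler placed between c and i.
constexpr std::size_t padding_before_i()
{
    return offsetof(PaddedExample, i) - sizeof(char);
}

// These properties hold on any conforming compiler:
static_assert(offsetof(PaddedExample, c) == 0,
              "the first member sits at offset 0");
static_assert(offsetof(PaddedExample, i) % alignof(int) == 0,
              "i starts on an int-aligned offset");
static_assert(sizeof(PaddedExample) % alignof(PaddedExample) == 0,
              "the size is a multiple of the struct's alignment");
```

On a platform with a 4-byte `int`, `padding_before_i()` evaluates to 3 and `sizeof(PaddedExample)` to 8, matching the diagram.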
✔ Optimization by Reordering
```cpp
// ❌ Inefficient: padding after c and after d (typically 12 bytes)
struct Unordered
{
    char c;
    int i;
    char d;
};

// ✔ Optimized: the two chars share one padded slot (typically 8 bytes)
struct Reordered
{
    int i;
    char c;
    char d;
};
```
Less padding → better cache usage
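The effect of reordering can be measured directly. The sketch below repeats the two layouts under the illustrative names `Unordered` and `Reordered` and adds a hypothetical helper, `padding_bytes`, that subtracts the members' own sizes from `sizeof`. The expected values in the comments assume a 4-byte, 4-byte-aligned `int`.

```cpp
#include <cstddef>   // std::size_t

struct Unordered     // char, int, char
{
    char c;
    int  i;
    char d;
};

struct Reordered     // int first, then both chars together
{
    int  i;
    char c;
    char d;
};

// Bytes of sizeof(T) not occupied by the one int and two chars.
// Only meaningful for the two structs above (illustrative helper).
template <typename T>
constexpr std::size_t padding_bytes()
{
    return sizeof(T) - (sizeof(int) + 2 * sizeof(char));
}

// Grouping the small members should never make the struct larger.
static_assert(sizeof(Reordered) <= sizeof(Unordered),
              "reordering is expected not to increase the size");

// With a 4-byte int: sizeof(Unordered) == 12, padding_bytes == 6
//                    sizeof(Reordered) ==  8, padding_bytes == 2
```

A common rule of thumb that follows from this is to declare members from largest alignment to smallest, although the best order also depends on which fields are accessed together.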
3. Class
✔ Same as struct (mostly)
```cpp
class A
{
    int x;
    float y;
};
```
Same layout as the equivalent struct; only the default member access differs (private instead of public).
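One way to convince yourself of this is to let the compiler compare the two. The following is a minimal sketch; `PairStruct` and `PairClass` are illustrative names, and the constructor and accessors exist only so that the private members are usable. The size and alignment equality is what mainstream compilers produce rather than a literal guarantee of the standard.

```cpp
#include <type_traits>   // std::is_standard_layout

struct PairStruct        // members are public by default
{
    int x;
    float y;
};

class PairClass          // members are private by default
{
public:
    PairClass(int x_, float y_) : x(x_), y(y_) {}
    int get_x() const { return x; }
    float get_y() const { return y; }

private:
    int x;
    float y;
};

// Non-virtual member functions do not occupy space in the object.
static_assert(std::is_standard_layout<PairStruct>::value,
              "plain struct is standard-layout");
static_assert(std::is_standard_layout<PairClass>::value,
              "class with one access level is standard-layout too");
static_assert(sizeof(PairClass) == sizeof(PairStruct),
              "expected: same size");
static_assert(alignof(PairClass) == alignof(PairStruct),
              "expected: same alignment");
```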
✔ With Virtual Function
```cpp
class Base
{
public:
    virtual void foo() {}
    int x;
};
```
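Memory:
- vptr = hidden pointer to the class's virtual table (vtable), stored in every object
- Adds memory (typically one pointer per object) + runtime overhead (an indirect call through the vtable)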
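The memory cost can be observed by comparing two otherwise identical types, one with a virtual function and one without. `Plain` and `WithVirtual` are illustrative names. The standard does not prescribe a vptr, so the size comparison below reflects how mainstream implementations behave rather than a language guarantee.

```cpp
#include <type_traits>   // std::is_polymorphic

struct Plain
{
    void foo() {}                      // non-virtual: no per-object cost
    int x;
};

struct WithVirtual
{
    virtual void foo() {}              // virtual: object carries a hidden vptr
    virtual ~WithVirtual() = default;  // usual companion of virtual functions
    int x;
};

static_assert(!std::is_polymorphic<Plain>::value,
              "no virtual functions, so not polymorphic");
static_assert(std::is_polymorphic<WithVirtual>::value,
              "has virtual functions, so polymorphic");

// Expected on mainstream ABIs: the vptr makes the object larger.
// Typical 64-bit values: sizeof(Plain) == 4, sizeof(WithVirtual) == 16
// (8-byte vptr + 4-byte int + 4 bytes of tail padding).
static_assert(sizeof(WithVirtual) > sizeof(Plain),
              "vptr is expected to increase the object size");
```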
4. Optimization
AoS (Array of Structures)
```cpp
struct Particle
{
    float x, y, z;
    float vx, vy, vz;
};

Particle particles[N];  // N must be a compile-time constant
```
Memory:
```
[x y z vx vy vz][x y z vx vy vz][x y z vx vy vz]...
```
✔ Problem
```cpp
for (int i = 0; i < N; i++)
    particles[i].x += 1.0f;
```
- Every cache line fetched contains whole structs (all six fields)
- The loop uses only x: 4 of every 24 bytes
❌ Wasted memory bandwidth
❌ Poor cache efficiency
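To put a number on the waste, the sketch below computes the distance between consecutive `x` values in the AoS array and the share of each struct that the loop actually uses. `Particle` is repeated so the block compiles on its own; `x_stride_bytes` and `useful_fraction` are hypothetical helper names, and the values in the comments assume a 4-byte `float`.

```cpp
#include <cstddef>   // std::size_t

struct Particle
{
    float x, y, z;
    float vx, vy, vz;
};

// Bytes between particles[i].x and particles[i + 1].x in an array.
constexpr std::size_t x_stride_bytes()
{
    return sizeof(Particle);
}

// Share of each struct's bytes that an x-only loop reads and writes.
constexpr double useful_fraction()
{
    return static_cast<double>(sizeof(float)) / sizeof(Particle);
}

// With 4-byte floats: stride is 24 bytes, useful fraction is 4/24 (about 17%).
```

In other words, roughly five sixths of the data pulled through the cache by this loop is never touched.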
5. SoA (Structure of Arrays)
✔ Definition
```cpp
struct Particles
{
    float x[N], y[N], z[N];
    float vx[N], vy[N], vz[N];
};
```
Memory:
```
[x x x x x ...]
[y y y y y ...]
[z z z z z ...]
```
✔ Advantage
```cpp
for (int i = 0; i < N; i++)
    particles.x[i] += 1.0f;
```
- Loads only the data the loop needs
- Contiguous, unit-stride access
✔ Cache-friendly
✔ SIMD-friendly
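When the particle count is only known at run time, the fixed-size arrays can be swapped for `std::vector`. This is a sketch of one possible variant, not the layout used above: `ParticlesSoA`, `shift_x` and `integrate_x` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Heap-backed SoA: one contiguous array per field.
struct ParticlesSoA
{
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;

    // All six arrays get n zero-initialized elements.
    explicit ParticlesSoA(std::size_t n)
        : x(n), y(n), z(n), vx(n), vy(n), vz(n) {}

    std::size_t size() const { return x.size(); }
};

// Same update as the loop above: touches only the x array.
inline void shift_x(ParticlesSoA& p, float dx)
{
    for (std::size_t i = 0; i < p.size(); ++i)
        p.x[i] += dx;
}

// x += vx * dt streams through two of the six arrays; the rest stay untouched.
inline void integrate_x(ParticlesSoA& p, float dt)
{
    assert(p.x.size() == p.vx.size());
    for (std::size_t i = 0; i < p.size(); ++i)
        p.x[i] += p.vx[i] * dt;
}
```

Simple loops over contiguous `float` arrays like these are the kind compilers can often auto-vectorize, though whether that happens depends on the compiler and flags.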
Cache Perspective
The CPU reads memory in whole cache lines (typically 64 bytes).
AoS
```
[x y z vx vy vz] → unnecessary data loaded
```
SoA
```
[x x x x x] → only needed data loaded
```
- SoA → efficient bandwidth
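The bandwidth difference can be estimated with simple arithmetic. The sketch below counts how many cache lines an x-only pass touches under each layout. It assumes a 64-byte line (`kCacheLineBytes`), 4-byte floats, and arrays that start on a line boundary; all names are illustrative.

```cpp
#include <cstddef>   // std::size_t

constexpr std::size_t kCacheLineBytes = 64;   // assumed line size

// Cache lines covered by a contiguous range that starts on a line boundary.
constexpr std::size_t lines_spanned(std::size_t bytes)
{
    return (bytes + kCacheLineBytes - 1) / kCacheLineBytes;
}

// AoS: the x values are spread across the whole array of 6-float structs,
// so the pass walks every line of it.
constexpr std::size_t aos_lines(std::size_t n)
{
    return lines_spanned(n * 6 * sizeof(float));
}

// SoA: the pass walks only the x array.
constexpr std::size_t soa_lines(std::size_t n)
{
    return lines_spanned(n * sizeof(float));
}

// For n = 1024: AoS touches 384 lines, SoA touches 64, a 6x difference.
```

The ratio equals the number of fields per particle, so the wider the struct, the more an AoS pass over a single field costs.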