Branch Optimization
Prerequisites
1. Branch Optimization
Modern CPUs use branch prediction to guess the next execution path.
π If prediction is correct β fast
π If prediction fails β pipeline flush β huge performance penalty
A mispredicted branch can cost 10~20+ cycles
- Can this branch be removed?
- If not, can it be made predictable?
- Can it be converted to branchless logic?
2. Methods of Branch Optimization
2-1. Branch Minimization
β Branch-heavy code
1
2
3
4
| if (x > y)
result = a;
else
result = b;
|
- Branch may be unpredictable
- Causes pipeline stalls
β Branchless version
(x > 0) β 0 or 1- Eliminates branch
1
2
| int flag = (x > y); // 0 or 1
result = flag * a + (1 - flag) * b;
|
- No branch misprediction
- Better SIMD/vectorization
- Stable performance
| Condition | Recommendation |
|---|
| Unpredictable branch | Use branchless |
| Tight loop | Prefer branchless |
| SIMD/vectorization | Required |
2-2. Make it predictable on Branch
1
2
3
4
| if (likely(x > 0))
{
// hot path
}
|
Compilers may optimize based on probability
1
2
3
4
5
6
7
8
| // β Better
if (x == 0)
{
// common case
}
else {
// rare case
}
|
β Bad ordering
1
2
3
4
5
6
7
8
| if (x != 0)
{
// rare case
}
else
{
// common case
}
|