Post

Branch Optimization

Branch Optimization

Branch Optimization


Prerequisites


1. Branch Optimization

Modern CPUs use branch prediction to guess the next execution path.

πŸ‘‰ If prediction is correct β†’ fast
πŸ‘‰ If prediction fails β†’ pipeline flush β†’ huge performance penalty

A mispredicted branch can cost 10~20+ cycles

  1. Can this branch be removed?
  2. If not, can it be made predictable?
  3. Can it be converted to branchless logic?

2. Methods of Branch Optimization

2-1. Branch Minimization

❌ Branch-heavy code
1
2
if (x > 0) 
    sum += x;
1
2
3
4
if (x > y) 
    result = a;
else 
    result = b;
  • Branch may be unpredictable
  • Causes pipeline stalls
βœ” Branchless version
1
sum += (x > 0) * x;
  • (x > 0) β†’ 0 or 1
  • Eliminates branch
1
2
int flag = (x > y);  // 0 or 1
result = flag * a + (1 - flag) * b;
  • No branch misprediction
  • Better SIMD/vectorization
  • Stable performance
ConditionRecommendation
Unpredictable branchUse branchless
Tight loopPrefer branchless
SIMD/vectorizationRequired

2-2. Make it predictable on Branch

1
2
3
4
if (likely(x > 0))
{
    // hot path
}

Compilers may optimize based on probability

1
2
3
4
5
6
7
8
// βœ” Better
if (x == 0) 
{
    // common case
} 
else {
    // rare case
}
❌ Bad ordering
1
2
3
4
5
6
7
8
if (x != 0) 
{
    // rare case
} 
else 
{
    // common case
}
This post is licensed under CC BY 4.0 by the author.