Lbdiv32:
Local branch divergence is a measure of how branchy a region of code is, i.e. how
much a region of code suffers from branch divergence.
Each branch operation evaluates to either being taken or not.
A branch operation that consistently makes the same branch decision is said to have
low divergence while one that frequently switched between taking and not taking the branch has high divergence.
Lbdiv32 tracks these decisions in branch decision windows of size 32 and this
distinction (low divergence vs high divergence) is made based on these windows.
Classifying Lbdiv32:
Lbdiv32 can be classified into low, medium or high as follows:
Bucket | Condition |
---|---|
Low |
The region of code contains no branch conditions.
OR
The region of code contains highly repeated branch operations that
make the same branch decision in almost all iterations.
|
Medium | The region of code contains a mix of branch operations that change decisions frequently (within every 32 decisions) and others that make the same decision in almost all iterations. |
High | The region of code contains a highly repeated branch operation that changes its decision frequently, i.e. within every 32 decisions. |
Example 1:
Consider the following code:
for (i=0; i<1000; i++){ if(i%2 == 0){ result[i] = arrayOne[i] + arrayTwo[i]; } if(i == 999) print(“This is the last loop iteration\n”); }This region contains two branch conditions: (i%2 == 0) & (i == 999). It’s quite obvious in this case that the first condition (i%2 == 0) changes its decision (taking the branch or not) frequently (every second decision is different in a window of 32 decisions) while the second condition (i == 999) makes the same decision (not branching) for almost all iterations of this loop. Therefore, a good estimate of Lbdiv32 is mid.
Example 2:
Let's look at another example:
for (i=0; i<1000; i++){ if(i%2 == 0){ for(j=0; j<10; j++){ if(visited[i][j] == true){ result[i] = arrayOne[i] + arrayTwo[i]; visited[i][j] = true; } } } }This region of code contains a second nested loop and branch expression. The innermost branch condition (visited[i][j] == true) is executed 10 times more than the outer branch (i%2 == 0). The inner branch is therefore highly repeated and estimating its divergence should be enough.