Note on Stride:
Memory stride is the distance between memory accesses. It is measured in two ways:
Local stride: the memory stride between two consecutive memory accesses made by the same memory reference.
Global stride: the memory stride between memory accesses made by consecutive memory references.
Consider the following example:
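
    for (i = 0; i < 1000; i++) {
        for (j = 0; j < 10; j++) {
            sum += arrayOne[i] + arrayTwo[j];
        }
        result[i] = sum;
    }

The memory stride between consecutive memory accesses made by the arrayOne memory reference is its local stride. For example, (starting address of arrayOne[50] - ending address of arrayOne[49]) is the local stride of arrayOne.
The global stride, by contrast, is measured across references: it is the stride between an access made by one reference and the access made by the next reference, such as arrayOne[i] followed by arrayTwo[j].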
Note: This also means that if a region of code contains only 1 memory reference, global stride will be the same as local stride.
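The local stride definition above can be sketched as a small C helper. This is only an illustration, not the tool's implementation: the function name `local_stride` is hypothetical, and it assumes that "ending address" means the address of the last byte of the previously accessed element.

```c
#include <stddef.h>

/* Local stride, in bytes, between two consecutive accesses made by the
 * same memory reference, following the definition in the text:
 * (starting address of the next access) - (ending address of the previous).
 * ASSUMPTION: "ending address" is the address of the last byte accessed.
 * Both pointers must point into the same array. */
ptrdiff_t local_stride(const void *prev, const void *next, size_t size)
{
    /* Address of the last byte of the previously accessed element. */
    const char *prev_end = (const char *)prev + size - 1;
    return (const char *)next - prev_end;
}
```

Under that assumption, two neighbouring elements of a double array, such as arrayOne[49] and arrayOne[50], give a local stride of 1, while skipping one element gives 1 + sizeof(double).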
locStride1:
locStride1 or local stride 1 looks at the number of memory references that have a local stride of 1.
It effectively measures the number of references that access successive memory locations.
The final value of locStride1 is the aggregate of the locStride1 value for all memory references.
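One way to picture the aggregation is the sketch below. It assumes that each memory reference contributes either 0 or 1 (1 when its local stride is 1) and that the aggregate is a plain sum. The function name `count_loc_stride1` and the representation of strides as an array of long values are hypothetical, and the actual tool may weight references differently.

```c
#include <stddef.h>

/* Counts how many memory references in a region have a local stride of 1.
 * strides[k] holds the local stride of reference k.
 * ASSUMPTION: the aggregate is a simple sum of per-reference 0/1 values. */
size_t count_loc_stride1(const long *strides, size_t n_refs)
{
    size_t count = 0;
    for (size_t k = 0; k < n_refs; k++) {
        if (strides[k] == 1) {
            count++;  /* this reference accesses successive locations */
        }
    }
    return count;
}
```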
Classifying locStride1:
locStride1 can be classified as low, medium or high as follows:
Bucket | Condition |
---|---|
Low | The region of code contains no memory references, OR it contains a number of memory references and fewer than 1 out of every 3 has a local stride of 1. |
Medium | The region of code contains a number of memory references and about 1 out of every 3 references has a local stride of 1. |
High | The region of code contains a number of memory references and 2 or more references out of every 3 have a local stride of 1. |
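
The buckets can be sketched as a small classifier. The exact cut-offs are an assumption, since the text only says "about 1 out of every 3" for Medium: here a ratio below 1/3 is treated as Low, a ratio of 2/3 or more as High, and anything in between as Medium. The type and function names are hypothetical.

```c
#include <stddef.h>

typedef enum {
    LOC_STRIDE1_LOW,
    LOC_STRIDE1_MEDIUM,
    LOC_STRIDE1_HIGH
} loc_stride1_bucket;

/* Classifies a region from the number of references with a local stride
 * of 1 (stride1_refs) and the total number of references (total_refs).
 * ASSUMPTION: thresholds are exactly 1/3 and 2/3; integer arithmetic is
 * used to avoid floating-point rounding at the boundaries. */
loc_stride1_bucket classify_loc_stride1(size_t stride1_refs, size_t total_refs)
{
    if (total_refs == 0) {
        return LOC_STRIDE1_LOW;            /* no memory references at all */
    }
    if (3 * stride1_refs >= 2 * total_refs) {
        return LOC_STRIDE1_HIGH;           /* 2 or more out of every 3 */
    }
    if (3 * stride1_refs < total_refs) {
        return LOC_STRIDE1_LOW;            /* fewer than 1 out of every 3 */
    }
    return LOC_STRIDE1_MEDIUM;             /* roughly 1 out of every 3 */
}
```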
Example 1:
Consider the following code:

    for (i = 0; i < 1000; i++) {
        for (j = 0; j < 10; j++) {
            sum += arrayOne[i] + arrayTwo[j];
        }
        result[i] = sum;
    }

Assume that all the elements in this region are of type double. This region contains 3 memory references (arrayOne, arrayTwo and result), but arrayOne and arrayTwo are accessed far more often: each is read on every one of the 10,000 inner-loop iterations, while result is written only once per outer-loop iteration (1,000 times). Therefore, we can ignore the accesses made to result.
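The access counts behind that reasoning can be checked with an instrumented copy of the loop nest. This is a sketch for illustration only: the struct and function names are hypothetical, and the array accesses are replaced by counters so that the block is self-contained.

```c
#include <stddef.h>

typedef struct {
    size_t array_one;  /* accesses made by the arrayOne[i] reference */
    size_t array_two;  /* accesses made by the arrayTwo[j] reference */
    size_t result;     /* accesses made by the result[i] reference */
} access_counts;

/* Runs the same loop structure as Example 1 and counts how often each
 * memory reference would be executed. */
access_counts count_example1_accesses(void)
{
    access_counts c = {0, 0, 0};
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 10; j++) {
            c.array_one++;  /* stands in for the read of arrayOne[i] */
            c.array_two++;  /* stands in for the read of arrayTwo[j] */
        }
        c.result++;         /* stands in for the write to result[i] */
    }
    return c;
}
```

The counters come out as 10,000 for arrayOne, 10,000 for arrayTwo and 1,000 for result, which is why the accesses to result carry comparatively little weight.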
Example 2:
Let's look at another example.

    for (i = 0; i < num; i++) {
        if (i > 0 && i < num - 1) {
            part1 = one[i-1] * two[i-1] * two[i-1];
            part2 = one[i] * two[i];
            part3 = one[i+1];
        } else if (i == 0) {
            part1 = one[i] * two[i] * two[i];
            part2 = one[i+1] * two[i+1];
            part3 = one[i+2];
        } else if (i == num - 1) {
            part1 = one[i] * two[i] * two[i];
            part2 = one[i-1] * two[i-1];
            part3 = one[i-2];
        }
        result[i] = part1 + part2 + part3;
    }

This code contains a number of memory references, a few of which use additional offsets in their index, such as one[i-1].
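As a hedged aside, the sketch below looks at one property of such offset references: a constant offset changes which element is touched, but not how far the reference moves from one iteration to the next. The function name is hypothetical, and the text does not state how the tool treats references that are only executed in some branches, so this covers the offset arithmetic alone.

```c
#include <stddef.h>

/* Distance, in elements, between the element that base[i + offset]
 * touches in iteration i and the one it touches in iteration i + 1.
 * The caller must keep both indices inside the array. */
ptrdiff_t step_between_iterations(const double *base, ptrdiff_t i,
                                  ptrdiff_t offset)
{
    const double *current = base + (i + offset);        /* iteration i */
    const double *next = base + ((i + 1) + offset);     /* iteration i + 1 */
    return next - current;
}
```

For offsets of -1, 0 and +1, as in one[i-1], one[i] and one[i+1], the step is one element in every case.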