Implement Fixed-Point Square Root Using Lookup Table

This example shows how to implement fixed-point square root using a lookup table. Lookup tables generate efficient code for embedded devices.

Setup

To assure that this example does not change your preferences or settings, this code stores the original state, and you will restore it at the end.

originalFormat = get(0, 'format'); format long g
originalWarningState = warning('off','fixed:fi:underflow');
originalFiprefState = get(fipref); reset(fipref)

Square Root Implementation

The square root algorithm is summarized here.

  1. Declare the number of bits in a byte, B, as a constant. In this example, B=8.

  2. Use function fi_normalize_unsigned_8_bit_byte() described in example Normalize Data for Lookup Tables to normalize the input u>0 such that u = x * 2^n, 0.5 <= x < 2, and n is even.

  3. Extract the upper B-bits of x. Let x_B denote the upper B-bits of x.

  4. Generate lookup table, SQRTLUT, such that the integer i = x_B- 2^(B-2) + 1 is used as an index to SQRTLUT so that sqrt(x_B) can be evaluated by looking up the index sqrt(x_B) = SQRTLUT(i).

  5. Use the remainder, r = x - x_B, interpreted as a fraction, to linearly interpolate between SQRTLUT(i) and the next value in the table SQRTLUT(i+1). The remainder, r, is created by extracting the lower w - B bits of x, where w denotes the word-length of x. It is interpreted as a fraction by using function reinterpretcast().

  6. Finally, compute the output using the lookup table and linear interpolation:

sqrt( u ) = sqrt( x * 2^n )
          = sqrt(x) * 2^(n/2)
          = ( SQRTLUT( i ) + r * ( SQRTLUT( i+1 ) - SQRTLUT( i ) ) ) * 2^(n/2)
function y = fi_sqrtlookup_8_bit_byte(u)  %#codegen
    % Load the lookup table
    SQRTLUT = sqrt_lookup_table();
    % Remove fimath from the input to insulate this function from math
    % settings declared outside this function.
    u = removefimath(u);
    % Declare the output
    y = coder.nullcopy(fi(zeros(size(u)), numerictype(SQRTLUT), fimath(SQRTLUT)));
    B = 8; % Number of bits in a byte
    w = u.WordLength;
    for k = 1:numel(u)
        assert(u(k)>=0,'Input must be non-negative.');
        if u(k)==0
            y(k)=0;
        else
            % Normalize the input such that u = x * 2^n and 0.5 <= x < 2
            [x,n] = fi_normalize_unsigned_8_bit_byte(u(k));
            isodd = storedInteger(bitand(fi(1,1,8,0),fi(n)));
            x = bitsra(x,isodd);
            n = n + isodd;
            % Extract the high byte of x
            high_byte = storedInteger(bitsliceget(x, w, w - B + 1));
            % Convert the high byte into an index for SQRTLUT
            i =  high_byte - 2^(B-2) + 1;
            % The upper byte was used for the index into SQRTLUT.
            % The remainder, r, interpreted as a fraction, is used to
            % linearly interpolate between points.
            T_unsigned_fraction = numerictype(0, w-B, w-B);
            r = reinterpretcast(bitsliceget(x,w-B,1), T_unsigned_fraction);
            y(k) = bitshift((SQRTLUT(i) + r*(SQRTLUT(i+1) - SQRTLUT(i))),...
                            bitsra(n,1));
        end
    end
    % Remove fimath from the output to insulate the caller from math settings
    % declared inside this function.
    y = removefimath(y);
end

Square Root Lookup Table

Function sqrt_lookup_table loads the lookup table of square-root values. You can create the table by running:

sqrt_table = sqrt( (2^(B-2):2^(B))/2^(B-1) );
function SQRTLUT = sqrt_lookup_table()
    B = 8;  % Number of bits in a byte
    % sqrt_table = sqrt( (2^(B-2):2^(B))/2^(B-1) )
    sqrt_table = [0.707106781186548   0.712609640686961   0.718070330817254   0.723489806424389 ...
                  0.728868986855663   0.734208757779421   0.739509972887452   0.744773455488312 ...
                  0.750000000000000   0.755190373349661   0.760345316287277   0.765465544619743 ...
                  0.770551750371122   0.775604602874429   0.780624749799800   0.785612818123533 ...
                  0.790569415042095   0.795495128834866   0.800390529679106   0.805256170420320 ...
                  0.810092587300983   0.814900300650331   0.819679815537750   0.824431622392057 ...
                  0.829156197588850   0.833854004007896   0.838525491562421   0.843171097702003 ...
                  0.847791247890659   0.852386356061616   0.856956825050130   0.861503047005639 ...
                  0.866025403784439   0.870524267324007   0.875000000000000   0.879452954966893 ...
                  0.883883476483184   0.888291900221993   0.892678553567856   0.897043755900458 ...
                  0.901387818865997   0.905711046636840   0.910013736160065   0.914296177395487 ...
                  0.918558653543692   0.922801441264588   0.927024810886958   0.931229026609459 ...
                  0.935414346693485   0.939581023648307   0.943729304408844   0.947859430506444 ...
                  0.951971638232989   0.956066158798647   0.960143218483576   0.964203038783845 ...
                  0.968245836551854   0.972271824131503   0.976281209488332   0.980274196334883 ...
                  0.984250984251476   0.988211768802619   0.992156741649222   0.996086090656827 ...
                  1.000000000000000   1.003898650263063   1.007782218537319   1.011650878514915 ...
                  1.015504800579495   1.019344151893756   1.023169096484056   1.026979795322186 ...
                  1.030776406404415   1.034559084827928   1.038327982864759   1.042083250033317 ...
                  1.045825033167594   1.049553476484167   1.053268721647045   1.056970907830485 ...
                  1.060660171779821   1.064336647870400   1.068000468164691   1.071651762467640 ...
                  1.075290658380328   1.078917281352004   1.082531754730548   1.086134199811423 ...
                  1.089724735885168   1.093303480283494   1.096870548424015   1.100426053853688 ...
                  1.103970108290981   1.107502821666834   1.111024302164449   1.114534656257938 ...
                  1.118033988749895   1.121522402807898   1.125000000000000   1.128466880329237 ...
                  1.131923142267177   1.135368882786559   1.138804197393037   1.142229180156067 ...
                  1.145643923738960   1.149048519428140   1.152443057161611   1.155827625556683 ...
                  1.159202311936963   1.162567202358642   1.165922381636102   1.169267933366857 ...
                  1.172603939955857   1.175930482639174   1.179247641507075   1.182555495526531 ...
                  1.185854122563142   1.189143599402528   1.192424001771182   1.195695404356812 ...
                  1.198957880828180   1.202211503854459   1.205456345124119   1.208692475363357 ...
                  1.211919964354082   1.215138880951474   1.218349293101120   1.221551267855754 ...
                  1.224744871391589   1.227930169024281   1.231107225224513   1.234276103633219 ...
                  1.237436867076458   1.240589577579950   1.243734296383275   1.246871083953750 ...
                  1.250000000000000   1.253121103485214   1.256234452640111   1.259340104975618 ...
                  1.262438117295260   1.265528545707287   1.268611445636527   1.271686871835988 ...
                  1.274754878398196   1.277815518766305   1.280868845744950   1.283914911510884 ...
                  1.286953767623375   1.289985465034393   1.293010054098575   1.296027584582983 ...
                  1.299038105676658   1.302041665999979   1.305038313613819   1.308028096028522 ...
                  1.311011060212689   1.313987252601790   1.316956719106592   1.319919505121430 ...
                  1.322875655532295   1.325825214724777   1.328768226591831   1.331704734541407 ...
                  1.334634781503914   1.337558409939543   1.340475661845451   1.343386578762792 ...
                  1.346291201783626   1.349189571557681   1.352081728298996   1.354967711792425 ...
                  1.357847561400027   1.360721316067327   1.363589014329464   1.366450694317215 ...
                  1.369306393762915   1.372156150006259   1.375000000000000   1.377837980315538 ...
                  1.380670127148408   1.383496476323666   1.386317063301177   1.389131923180804 ...
                  1.391941090707505   1.394744600276337   1.397542485937369   1.400334781400505 ...
                  1.403121520040228   1.405902734900249   1.408678458698081   1.411448723829527 ...
                  1.414213562373095];
    % Cast to fixed point with the most accurate rounding method
    WL = 4*B;  % Word length
    FL = 2*B;  % Fraction length
    SQRTLUT = fi(sqrt_table, 1, WL, FL, 'RoundingMethod','Nearest');
    % Set fimath for the most efficient math operations
    F = fimath('OverflowAction','Wrap',...
               'RoundingMethod','Floor',...
               'SumMode','KeepLSB',...
               'SumWordLength',WL,...
               'ProductMode','KeepLSB',...
               'ProductWordLength',WL);
    SQRTLUT = setfimath(SQRTLUT, F);
end

Example

u = fi(linspace(0,128,1000),0,16,12);

y = fi_sqrtlookup_8_bit_byte(u);

y_expected = sqrt(double(u));
clf
subplot(211)
plot(u,y,u,y_expected)
legend('Output','Expected output','Location','Best')

subplot(212)
plot(u,double(y)-y_expected,'r')
legend('Error')
figure(gcf)

Cleanup

Restore original state.

set(0, 'format', originalFormat);
warning(originalWarningState);
fipref(originalFiprefState);