Implement Hardware-Efficient Burst QR Decomposition

Open Script

This example demonstrates how to compute the QR decomposition of matrices using hardware-efficient MATLAB® code in Simulink®. This model shares computational resources across steps of the QR decomposition algorithm. It thus uses fewer on-chip resources than a fully pipelined approach, while sacrificing some total throughput.

Solving Matrix Equations with QR Decomposition

When solving matrix equations, it is seldom, if ever, necessary to compute the inverse of a matrix [1][3]. The Real Burst QR Decomposition block provides a hardware efficient way of solving the equation

$Ax = B$

where $A$ is an m x n matrix, $x$ is an n x p matrix, and $B$ is an m x p matrix. This equation can be converted to the form

$Rx = Q^TB$

through a series of orthogonal transformations. Here, $R$ is an m x n upper triangular matrix such that $QR = A$ . The Real Burst QR Decomposition block retains only the non-zero rows of $R$ , along with the corresponding rows of $Q^TB$ .

The QR decomposition is well suited to fixed-point architectures because it can be entirely performed with Givens rotations. These have an efficient fixed-point implementation in terms of the CORDIC algorithm. For more details on the algorithm for fixed-point QR decomposition, see Perform QR Factorization Using CORDIC.

To begin, open the Simulink model fxpdemo_real_burst_QR_model.

open_system('fxpdemo_real_burst_QR_model');

This model works with fixed-point, double, single, and scaled double data types. A fully pipelined version of this algorithm is implemented in Simulink. See Implement Hardware-Efficient QR Decomposition Using CORDIC in a Systolic Array.

Input Parameters for Real Burst QR Decomposition

The Real Burst QR Decomposition block takes four input parameters. The mask for this block is shown below:

You must assign input data to variables in the MATLAB workspace. To customize this algorithm, change the data according to the procedures used in the following sections.

Define Matrix Dimensions

The model's architecture allocates the minimal memory necessary to store the data needed to perform the QR decomposition. Therefore, the size of the input matrices must be known at compile time. Here, m is the number of rows in matrices A and B, n is the number of columns in A, and p is the number of columns in B.

Additionally, since the block can handle decomposition of an indefinite number of matrices in series, a number of samples is specified. The model terminates when all of the samples have been used.

m = 4;
n = 4;
p = 4;
num_samples = 100;

Define Input Datatypes

Before generating input data, it is important to specify the data type of the matrix data as shown below. This example uses signed fixed-point types with 16-bit word lengths. Because the data can grow by a factor of $sqrt(m)$ when computing the matrix R, we use this constraint to make sure that the integer part of the data type is large enough. See Perform QR Factorization Using CORDIC for further details.

word_length = 16;
integer_length = nextpow2(sqrt(m)) + 1;
fraction_length = word_length - 1 - integer_length;
nt = numerictype(1,word_length,fraction_length);

Define the Number of CORDIC Iterations

The CORDIC approximation for Givens rotations is an iterative algorithm that gives an additional bit of accuracy per iteration. For signed fixed-point datatypes, the greatest possible accuracy is achieved when the number of iterations performed is one fewer than the wordlength.

NumberOfCORDICIterations = word_length - 1;

Initialize Random Data

The data handler in this example takes matrices A and B as inputs. In this example, A and B are defined as random matrices elements drawn from a uniform distribution from -1 to 1. Note that A and B are each defined as three-dimensional arrays, where each sample matrix is stored in the first two dimensions.

A = fi(2*(rand(m,n,num_samples) - 0.5),nt);
B = fi(2*(rand(m,p,num_samples) - 0.5),nt);

Transferring Data to Real Burst QR Decomposition

The ready port triggers the Data Handler subsystem. When ready is high, the block asserts validIn and sends the next row of A and B to aIn and bIn. The protocol for transferring data allows data to be sent whenever ready is high, ensuring all data is processed. If data is sent when ready is not high, it will not be processed.

Handling Outputs

The Real Burst QR Decomposition block outputs data one row at a time. Whenever the block outputs a result row, it asserts validOut. Note that rOut outputs the rows of R, while cOut outputs the rows of $Q^TB$ . The rows are output in reverse order, as this is the natural order to consume them for back substitution. In this example, they are saved as m * num_samples x n matrices, with the rows ordered as they were received.

Simulate

sim fxpdemo_real_burst_QR_model

Reconstruct the Solutions from the Output Data

Because the data are output in reverse order, you must reconstruct the data to interpret the results. The following code puts the data in the correct order.

C = cell(1,num_samples);
R = cell(1,num_samples);
for ii = 1:num_samples
    C{ii} = flipud(C_Out((ii-1)*m + 1:ii*m,:));
    R{ii} = flipud(R_Out((ii-1)*m + 1:ii*m,:));
end

Evaluate the Accuracy of the Result

To evaluate the accuracy of the Real Burst QR Decomposition block, examine the magnitude of the difference between the solution to the matrix equation using the Real Burst QR Decomposition block and that obtained using MATLAB's built-in backsolve for doubles. Plot the absolute error and condition number for each sample. The data show that the accuracy of the solution tracks the condition number, as expected [2].

xAct = cell(1,num_samples);
xExp = cell(1,num_samples);
xErr = zeros(1,num_samples);
condNumber = zeros(1,num_samples);
for ii = 1:num_samples
    xAct{ii} = double(R{ii})\double(C{ii});
    xExp{ii} = double(A(:,:,ii))\double(B(:,:,ii));
    xErr(ii) = norm(xAct{ii} - xExp{ii});
    condNumber(ii) = cond(double(A(:,:,ii)));
end
figure(1)
clf
h1 = subplot(2,1,1);
hold on;
h1.YScale = 'log';
plot(xErr)
grid on
title('Error of ''Real Burst QR Decomposition''');
ylabel('norm(R\\C - A\\B)');
h2 = subplot(2,1,2);
hold on;
h2.YScale = 'log';
plot(condNumber);
grid on
title('Condition Number of Samples');
ylabel('cond(A)');
xlabel('Test Point');
linkaxes([h1,h2],'x');

Close the Model

close_system fxpdemo_real_burst_QR_model

References

[1] George E. Forsythe, M.A. Malcom, and Cleve B. Moler. Computer Methods for Mathematical Computations. Englewood Cliffs, N.J.: Prentice-Hall, 1977.

[2] Cleve B. Moler. What is the Condition Number of a Matrix?, The MathWorks, Inc. 2017.

[3] Cleve B. Moler. Numerical Computing with MATLAB. SIAM, 2004. ISBN: 978-0-898716-60-3.

Documentation