Linear Least Square Problem (linear least squares), or ordinary least squares (OLS), is an important computational problem, that arises primarily in applications when it is desired to fit a linear mathematical model to measurements obtained from experiments. The goals of linear least squares are to extract predictions from the measurements and to reduce the effect of measurement errors. Mathematically, it can be stated as the problem of finding an approximate solution to an overdetermined system of linear equations. In statistics, it corresponds to the maximum likelihood estimate for a linear regression with normally distributed error.
Linear least square problems admit a closed-form solution, in contrast to non-linear least squares problems, which often have to be solved by an iterative procedure.
Motivational example
As a result of an experiment, four data points were obtained, , , , and (shown in red in the picture on the right). It is desired to find a line that fits “best” these four points. In other words, we would like to find the numbers and that approximately solve the overdetermined linear system
.
of four equations in two unknowns in some “best” sense.
The least squares approach to solving this problem is to try to make as small as possible the sum of squares of “errors” between the right- and left-hand sides of these equations, that is, to find the minimum of the function
.
The minimum is determined by calculating the partial derivatives of in respect to and and setting them to zero. This results in a system of two equations in two unknowns, called the normal equations, which, when solved, gives the solution
and the equation of the line of best fit. The residuals, that is, the discrepancies between the values from the experiment and the values calculated using the line of best fit are then found to be , , , and (see the picture on the right). The minimum value of the sum of squares is
.
The general problem
Consider an overdetermined system (there are more equations than unknowns)
,
of linear equations in n unknown coefficients, ,,…,, with , written in matrix form as ,where
.
Such a system usually has no solution, and the goal is then to find the coefficients which fit the equations “best”, in the sense of solving the quadratic minimization problem.
.
A justification for choosing this criterion is given in properties below. This minimization problem has a unique solution, provided that the columns of the matrix are linearly independent, given by solving the normal equations
.
Singular value decomposition
In linear algebra, the singular value decomposition (SVD) is an important factorization of a rectangular real or complex matrix, with many applications in signal processing and statistics. Applications which employ the SVD include computing the pseudo-inverse, least squares fitting of data, matrix approximation, and determining the rank, range and null space of a matrix.
Suppose is an -by- matrix whose entries come from the field , which is either the field of real numbers or the field of complex numbers. Then there exists a factorization of the form
,
where is an -by- unitary matrix over , the matrix is -by- diagonal matrix with nonnegative real numbers on the diagonal, and denotes the conjugate transpose of , an -by- unitary matrix over . Such a factorization is called a singular-value decomposition of . A common convention is to order the diagonal entries in non-increasing fashion. In this case, the diagonal matrix is uniquely determined by (though the matrices and are not). The diagonal entries of are known as the singular values of .
In practice, matrix consists of eigenvectors of matrix while matrix consists of eigenvectors of matrix . Finally, matrix is the square root of a diagonal matrix forming by eigenvalues of either matrix or matrix .
Example
Let us consider the case when
.
Having yields
and
.
By simple calculation, one has
being eigenvalues of matrix (in the descending form), which helps us to write down
To construct matrix , one has to find eigenvectors corresponding to eigenvalues and . For the first one, that is, , one has the following linear system
which implies, for example,
.
Similar to , one has its corresponding eigenvector as follows
.
Thus
.
The numerical solution of is
which gives us
.
Having we can compute by the following formula or you can perform the same routine as above. That is
.
It is easy to see that
.
Having a SVD result, that is
one can see that the solution of can be computed by
.
Note that matrices and are not unique. Using Maple software, you can compute SVD as shown below.
Useful links