Ngô Quốc Anh

June 30, 2009

A quick introduction to Simplex Method via an example

Filed under: Giải tích 9 (MA5265), Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 14:36

Below is the
original problem:
objective function
is in green.
Red variables below
are called
Below is the
Compare RED symbols
with Z = x1 + 2x2 x3.
Blue numbers below
are the “ISM“.
is below. Note
missing z-column
(correction by Steve Hulet)
See steps 3,4,5 of
as you handle INDICATORS,
Named below are the 4
row operations needed to
pivot on the number
“5” encircled in red
Below are the
results of the
row operations
named above
are the
new ISM
(namely -1/5) remains
negative, we must
repeat steps 3-9
Below is the result
of changing our pivot
to “1”
Named below are
4 row operations
needed to pivot
on the number(16/5)
encircled in red
Above there was a tie for least non-negative ratio:
either row 1 or row 2 could have become the pivot row, and either choice leads to the final tableau after one additional pivoting. At the right is the result of the final 3 row operations.
All indicators {0, 0, 49
, 0, 1
and 3
} are now zero or bigger (“13” is NOT an indicator).
Thus, as in step 8 of the SIMPLEX METHOD, the last tableau is a FINAL TABLEAU.
Row operations of SIMPLEX METHOD are done.
Thus, the basic solution for the tableau above is the solution to our original problem.
[1st] set equal to 0 all variables NOT associated with the blue ISM, as at the right. Each column of the final tableau has a label naming it’s variable.
[2nd] convert each row of the final tableau (except the bottom row) back into equation form (as at the right) to find the values of the remaining variables. The value of the objective function is in the lower right corner of the final tableau.



June 29, 2009

Karush–Kuhn–Tucker conditions, the 1st order optimality condition

In mathematics, the Karush–Kuhn–Tucker conditions (also known as the Kuhn-Tucker or the KKT conditions) are necessary for a solution in nonlinear programming to be optimal, provided some regularity conditions are satisfied. It is a generalization of the method of Lagrange multipliers to inequality constraints. The conditions are named for William Karush, Harold W. Kuhn, and Albert W. Tucker.

Let us consider the following nonlinear optimization problem:



subject to

g_i(x) \le 0, \quad h_j(x) = 0

where f(x) is the function to be minimized, g_i (x) (i = 1,...,m) are the inequality constraints and h_j (x) (j = 1,...,l) are the equality constraints, and m and l are the number of inequality and equality constraints, respectively.

The necessary conditions for this general equality-inequality constrained problem were first published in the Masters thesis of William Karush, although they only became renowned after a seminal conference paper by Harold W. Kuhn and Albert W. Tucker.

Necessary conditions

Suppose that the objective function, i.e., the function to be minimized, is f : \mathbb{R}^n \rightarrow \mathbb{R} and the constraint functions are g_i : \mathbb{R}^n \rightarrow \mathbb{R} and h_j : \mathbb{R}^n \rightarrow \mathbb{R}. Further, suppose they are continuously differentiable at a point x^\star. If x^\star is a local minimum that satisfies some regularity conditions, then there exist constants \mu_i\ (i = 1,...,m) and \lambda_j (j = 1,...,l) such that


\displaystyle\nabla f(x^\star) + \sum_{i=1}^m \mu_i \nabla g_i(x^\star) + \sum_{j=1}^l \lambda_j \nabla h_j(x^\star) = 0,

Primal feasibility:

\displaystyle g_i(x^\star) \le 0, \qquad\forall i = 1,..., m
\displaystyle h_j(x^\star) = 0, \qquad \forall j = 1,..., l.

Dual feasibility:

\displaystyle \mu_i \ge 0 (i = 1,...,m) .

Complementary slackness:

\displaystyle \mu_i g_i (x^\star) = 0 \qquad \forall i = 1,...,m.

Regularity conditions (or constraint qualifications)

In order for a minimum point x^\star be KKT, it should satisfy some regularity condition, the most used ones are listed below

  • Linear independence constraint qualification (LICQ): the gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent at x^\star.
  • Mangasarian-Fromowitz constraint qualification (MFCQ): the gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly independent at x^\star.
  • Constant rank constraint qualification (CRCQ): for each subset of the gradients of the active inequality constraints and the gradients of the equality constraints the rank at a vicinity of x^\star is constant.
  • Constant positive linear dependence constraint qualification (CPLD): for each subset of the gradients of the active inequality constraints and the gradients of the equality constraints, if it is positive-linear dependent at x^\star then it is positive-linear dependent at a vicinity of x^\star (v_1,...,v_n) is positive-linear dependent if there exists a_1\geq 0,…,a_n\geq 0 not all zero such that a_1v_1+...+a_nv_n=0).
  • Quasi-normality constraint qualification (QNCQ): if the gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly independent at x^\star with associated multipliers \lambda_i for equalities and \mu_j for inequalities than it doesn’t exist a sequence x_k\to x^\star such that \lambda_i \ne 0 therefore \lambda_i h_i(x_k)>0 and \mu_j \ne 0 thus \mu_j g_j(x_k)>0.
  • Slater condition: for a convex problem, there exists a point x such that h(x) = 0 and g_i(x) < 0 for all i active in x^\star.
  • Linearity constraints: If f and g are affine functions, then no other condition is needed to assure that the minimum point is KKT.

It can be shown that



(and the converses are not true), although MFCQ is not equivalent to CRCQ. In practice weaker constraint qualifications are preferred since they provide stronger optimality conditions.

Sufficient conditions

In some cases, the necessary conditions are also sufficient for optimality. This is the case when the objective function f and the inequality constraints g_j are continuously differentiable convex functions and the equality constraints hi are affine functions. It was shown by Martin in 1985 that the broader class of functions in which KKT conditions guarantees global optimality are the so called invex functions. So if equality constraints are affine functions, inequality constraints and the objective function are continuously differentiable invex functions then KKT conditions are sufficient for global optimality.

Value function

If we reconsider the optimization problem as a maximization problem with constant inequality constraints



subject to

g_i(x) \le a_i , \quad h_j(x) = 0

The value function is defined as

\displaystyle V(a_1, ..., a_n) = \sup\limits_{x} f(x)

subject to

\displaystyle g_i(x) \le a_i , \quad h_j(x) = 0 j \in \{1,...,l\}, i\in{1,...,m}.

(So the domain of V is

\displaystyle\{a \in \mathbb{R}^m | \mbox{for some }x\in X, g_i(x) \leq a_i, i \in \{1,...,m\}.)

Given this definition, each coefficient, \lambda_i, is the rate at which the value function increases as a_i increases. Thus if each a_i is interpreted as a resource constraint, the coefficients tell you how much increasing a resource will increase the optimum value of our function f. This interpretation is especially important in economics and is used, for instance, in utility maximization problems.


June 20, 2009

How to apply the Runge–Kutta 4 method to the system of 1st order ODEs?

In numerical analysis, the Runge–Kutta methods are an important family of implicit and explicit iterative methods for the approximation of solutions of ordinary differential equations.

The common fourth-order Runge–Kutta method

One member of the family of Runge–Kutta methods is so commonly used that it is often referred to as “RK4” or simply as “the Runge–Kutta method”. Let an initial value problem be specified as follows.

y' = f(t, y), a \leq t \leq b with y(a) = y_0.

We equally split the interval [a,b] into N subintervals by the following interior points

a = t_0 <t_1<...<t_{N-1}<t_N = b.

Of course, t_i = a + i\frac{b-a}{N}. The number \frac{b-a}{N}, which will be denoted by h, is called the step size. The value of y at step i is usually denoted by y_i, that is, y_i = y(t_i). Therefore, the initial value problem can be rewritten as y' = f(t, y), where a \leq t \leq b and with y(t_0) = y_0. The Runge–Kutta 4 method tells us that, having the value of y at t_i, we can compute the value of y at t_{i+1} by using the following scheme

\begin{gathered}   {K^1} = hf\left( {{t_i},{y_i}} \right) \hfill \\   {K^2} = hf\left( {{t_i} + \frac{h} {2},{y_i} + \frac{1}...

Thus, for each step, we need to compute all four K^i, i =\overline{1,4}, constants.

The Runge–Kutta 4 method for a system of ODEs

Now we investigate the question in the title of this entry. An mth-order system of 1st initial value problems has the form

\begin{gathered}   {{y'}_1}\left( t \right) = {f_1}\left( {t,{y_1},...,{y_m}} \right), \hfill \\   {{y'}_2}\left( t \right) =...

for a \leq t \leq b, with the initial conditions

y_1(a)=\alpha_1, y_2(a)=\alpha_2,…,y_m(a)=\alpha_m.

Let t_i be a+i\frac{b-a}{N}. We denote by y_{i,j} the following number

{y_{i,j}} = {y_i}\left( {{t_j}} \right), \quad i = \overline {1,m} , \quad j = \overline {0,N} .

At each step j+1, having all y_{i,j} we can compute y_{i,j+1} for all i=\overline{1,m}. To do this, for each i, we need to compute all K^k, k=\overline{1,4}. More precisely, we need to compute

\begin{gathered}   K_{i,j}^1 = h{f_i}\left( {{t_j},{y_{1,j}},{y_{2,j}},...,{y_{m,j}}} \right), \hfill \\   K_{i,j}^2 = h{f_i}...

As can be seen, at step j and for each k=\overline{1,3}, all the values K_{i,j}^k, with i=\overline{1,m} must be computed before any of the expressions K_{i,j}^{k+1}.


We now consider the following example

\begin{gathered}   u'\left( t \right) = f\left( {t,u,v} \right), \hfill \\   v'\left( t \right) = g\left( {t,u,v} \right). \h...

Therefore, at the step j+1, we can compute u(t_{j+1}) and v(t_{j+1}) by using the following scheme

\begin{gathered}   {K^1} = hf\left( {{t_j},{u_j},{v_j}} \right), \hfill \\   {\overline K ^1} = hg\left( {{t_j},{u_j},{v_j}} ...

June 17, 2009

Linear Least Square Problem and QR Decomposition

In linear algebra, a QR decomposition (also called a QR factorization) of a matrix is a decomposition of the matrix into an orthogonal and a right triangular matrix. QR decomposition is often used to solve the linear least squares problem, and is the basis for a particular eigenvalue algorithm, the QR algorithm.

Square matrix

A QR decomposition of a real square matrix A is a decomposition of A as A = QR, where Q is an orthogonal matrix (meaning that Q^TQ = I ) and R is an upper triangular matrix (also called right triangular matrix). This generalizes to a complex square matrix A and a unitary matrix Q. If A is nonsingular, then this factorization is unique if we require that the diagonal elements of R are positive.

Rectangular matrix

More generally, we can factor a complex m \times n matrix A, with m > n, as the product of an m \times m unitary matrix Q and an m\times n upper triangular matrix R. As the bottom m-n rows of an m \times n upper triangular matrix consist entirely of zeroes, it is often useful to partition R, or both R and Q:

A = QR = Q \begin{pmatrix} R_1 \\ 0 \end{pmatrix} = \begin{pmatrix} Q_1, Q_2 \end{pmatrix} \begin{pmatrix} R_1 \\ 0 \end{pmat...

where R_1 is an n\times n upper triangular matrix, Q_1 is m\times n, Q_2 is m\times (m-n), and Q_1 and Q_2 both have orthogonal columns.

Golub & Van Loan call Q_1R_1 the thin QR factorization of A. If A is of full rank n and we require that the diagonal elements of R_1 are positive then R_1 and Q_1 are unique, but in general Q_2 is not. R_1 is then equal to the upper triangular factor of the Cholesky decomposition of A^\star A (= A^TA if A is real).

QL, RQ and LQ decompositions

Analogously, we can define QL, RQ, and LQ decompositions, with L being a lower triangular matrix.

How to solve the Linear Least Square problems

Having the QR decomposition, that is, A=QR, one can solve the Least Square Problem Ax=b where A a m \times n matrix, x \in \mathbb R^n and b \in \mathbb R^m with m >n. To be precise, by mean of A=QR, we see that Q is an m \times m matrix and R is an m \times n matrix. Therefore, we can assume both Q and R to be

Q = \left( {\begin{array}{*{20}{c}}    {{{\left[ {{Q_1}} \right]}_{m \times n}}} & {{{\left[ {{Q_2}} \right]}_{m \times \...

Now equation Ax=b is equivalent to Q_1R_1x =b. Thus x=R_1^{-1}Q_1^Tb.

Computing the QR decomposition

There are several methods for actually computing the QR decomposition, such as by means of the Gram–Schmidt process, Householder transformations, or Givens rotations. Each has a number of advantages and disadvantages. In this section, we mainly consider the Gram–Schmidt process. More precisely, consider the Gram–Schmidt process, with the vectors to be considered in the process as the columns of the matrix A=(\mathbf{a}_1| \cdots|\mathbf{a}_n). Assume that after finishing the Gram–Schmidt process, we obtain (\mathbf{e}_1| \cdots|\mathbf{e}_n). Therefore, we put

\begin{matrix}Q\end{matrix} = (\mathbf{e}_1|\cdots |\mathbf{e}_n)

and thus,



Let us consider the case when

A = \left( {\begin{array}{*{20}{c}}    2 & 4 \\    1 & 3  \\    0 & 0 \\    0 & 0  \\   \end{array} } \right)...


\mathbf a_1 = \left( {\begin{array}{*{20}{c}}    2  \\    1  \\    0  \\    0  \\   \end{array} } \right), \quad \mathbf a_2 ...

From \mathbf a_1, we obtain \mathbf e_1 from \mathbf e_1 : = \frac{\mathbf a_1}{\|\mathbf a_1\|}, that is,

\mathbf e_1 = \left( {\begin{array}{*{20}{c}}    {\frac{2} {{\sqrt 5 }}}  \\    {\frac{1} {{\sqrt 5 }}}  \\    0  \\    0  \\...

To obtain \mathbf e_2, we firstly project \mathbf a_2 onto \mathbf e_1, then we subtract this result from \mathbf a_2. More precisely,

{\mathbf e_2} = \frac{\mathbf a_2 - {\rm proj}_{\mathbf e_1}{\mathbf a_2}} {{\left\| {{\mathbf a_2} - {\rm proj}_{\mathbf e_1...


{\rm proj}_{\mathbf e_1}{\mathbf a_2} = \frac{{\left\langle {{\mathbf e_1},{\mathbf a_2}} \right\rangle }} {{\left\langle {{\...

which yields

{\mathbf e_2} = \left( {\begin{array}{*{20}{c}}    {\frac{{ - 1}} {{\sqrt 5 }}}  \\    {\frac{2} {{\sqrt 5 }}}  \\    0  \\  ...


Q = \left( {\begin{array}{*{20}{c}}    {\frac{2} {{\sqrt 5 }}} & { - \frac{1} {{\sqrt 5 }}} & 0 & 0  \\    {\frac...


R = {Q^T}A = \left( {\begin{array}{*{20}{c}}    {\frac{2} {{\sqrt 5 }}} & {\frac{1} {{\sqrt 5 }}} & 0 & 0  \\    ...


\underbrace {\left( {\begin{array}{*{20}{c}}    2 & 4  \\    1 & 3  \\    0 & 0  \\    0 & 0  \\   \end{array...

In Maple, you can practice as shown below.


June 10, 2009

Linear Least Square Problem and Singular Value Decomposition (SVD)

Linear Least Square Problem (linear least squares), or ordinary least squares (OLS), is an important computational problem, that arises primarily in applications when it is desired to fit a linear mathematical model to measurements obtained from experiments. The goals of linear least squares are to extract predictions from the measurements and to reduce the effect of measurement errors. Mathematically, it can be stated as the problem of finding an approximate solution to an overdetermined system of linear equations. In statistics, it corresponds to the maximum likelihood estimate for a linear regression with normally distributed error.

Linear least square problems admit a closed-form solution, in contrast to non-linear least squares problems, which often have to be solved by an iterative procedure.

Motivational example

As a result of an experiment, four (x,y) data points were obtained, (1,6), (2,5), (3,7), and (4,10) (shown in red in the picture on the right). It is desired to find a line y = \beta_1 + \beta_2x that fits “best” these four points. In other words, we would like to find the numbers \beta_1 and \beta_2 that approximately solve the overdetermined linear system

\displaystyle {\beta _1} + 1{\beta _2} = 6, \quad {\beta _1} + 2{\beta _2} = 5, \quad {\beta _1} + 3{\beta _2} = 7, \quad {\beta _1} + 4{\beta _2} = 10.

of four equations in two unknowns in some “best” sense.

File:Linear least squares example2.png

The least squares approach to solving this problem is to try to make as small as possible the sum of squares of “errors” between the right- and left-hand sides of these equations, that is, to find the minimum of the function

\displaystyle S(\beta_1, \beta_2)= \left[6-(\beta_1+1\beta_2)\right]^2 +\left[5-(\beta_1+2\beta_2) \right]^2 +\left[7-(\beta_1 + 3\beta_2)\right]^2 +\left[10-(\beta_1 + 4\beta_2)\right]^2.

The minimum is determined by calculating the partial derivatives of S(\beta_1,\beta_2) in respect to \beta_1 and \beta_2 and setting them to zero. This results in a system of two equations in two unknowns, called the normal equations, which, when solved, gives the solution

\displaystyle \beta_1 = 3.5, \beta_2 = 1.4

and the equation y = 3.5 + 1.4x of the line of best fit. The residuals, that is, the discrepancies between the y values from the experiment and the y values calculated using the line of best fit are then found to be 1.1, -1.3, -0.7, and 0.9 (see the picture on the right). The minimum value of the sum of squares is

\displaystyle S(3.5,1.4) = 1.12 + (-1.3)2 + (-0.7)2 + 0.92 = 4.2.

The general problem

Consider an overdetermined system (there are more equations than unknowns)

\displaystyle\sum_{j=1}^{n} X_{ij}\beta_j = y_i,\ (i=1, 2, \dots, m),

of m linear equations in n unknown coefficients, \beta_1,\beta_2,…,\beta_n, with m > n, written in matrix form as \mathbf {X} \boldsymbol {\beta} = \mathbf {y},where

\displaystyle\mathbf {X}=\begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ X_{m1} & X_{m2} & \cdots & X_{mn} \end{pmatrix} , \qquad \boldsymbol \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix} , \qquad \mathbf y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}.

Such a system usually has no solution, and the goal is then to find the coefficients \beta which fit the equations “best”, in the sense of solving the quadratic minimization problem.

\displaystyle\underset{\boldsymbol \beta} {\rm argmin} \sum_{i=1}^{m}\left|y_i - \sum_{j=1}^{n} X_{ij}\beta_j\right|^2 = \underset{\boldsymbol \beta} {\rm argmin}\big\|\mathbf y - X \boldsymbol \beta \big\|^2.

A justification for choosing this criterion is given in properties below. This minimization problem has a unique solution, provided that the n columns of the matrix X are linearly independent, given by solving the normal equations

\displaystyle (\mathbf{X}^\top \mathbf{X}) \widehat{\boldsymbol \beta} =\mathbf{X}^ \top \mathbf{y}.

Singular value decomposition

In linear algebra, the singular value decomposition (SVD) is an important factorization of a rectangular real or complex matrix, with many applications in signal processing and statistics. Applications which employ the SVD include computing the pseudo-inverse, least squares fitting of data, matrix approximation, and determining the rank, range and null space of a matrix.

Suppose M is an m-by-n matrix whose entries come from the field K, which is either the field of real numbers or the field of complex numbers. Then there exists a factorization of the form

\displaystyle M = U\Sigma V^*,

where U is an m-by-m unitary matrix over K, the matrix \Sigma is m-by-n diagonal matrix with nonnegative real numbers on the diagonal, and V^* denotes the conjugate transpose of V, an n-by-n unitary matrix over K. Such a factorization is called a singular-value decomposition of M. A common convention is to order the diagonal entries \Sigma_{i,i} in non-increasing fashion. In this case, the diagonal matrix \Sigma is uniquely determined by M (though the matrices U and V are not). The diagonal entries of \Sigma are known as the singular values of M.

In practice, matrix U consists of eigenvectors of matrix MM^T while matrix V consists of eigenvectors of matrix M^TM. Finally, matrix \Sigma is the square root of a diagonal matrix forming by eigenvalues of either matrix MM^T or matrix M^TM.


Let us consider the case when

\displaystyle A = \left( {\begin{array}{*{20}{c}} 2&4 \\ 1&3 \\ 0&0 \\ 0&0 \end{array}} \right).

Having A yields

\displaystyle A{A^T} = \left( {\begin{array}{*{20}{c}} {20}&{14}&0&0 \\ {14}&{10}&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{array}} \right)


\displaystyle {A^T}A = \left( {\begin{array}{*{20}{c}} 5&{11} \\ {11}&{25} \end{array}} \right).

By simple calculation, one has

\displaystyle\lambda_1=15+\sqrt{221}, \quad \lambda_2 =15-\sqrt{221}

being eigenvalues of matrix A^TA (in the descending form), which helps us to write down

\displaystyle\Sigma = \left( {\begin{array}{*{20}{c}} {\sqrt {15 + \sqrt {221} } }&0 \\ 0&{\sqrt {15 - \sqrt {221} } } \\ 0&0 \\ 0 & 0 \end{array}} \right)

To construct matrix V, one has to find eigenvectors corresponding to eigenvalues \lambda_1 and \lambda_2. For the first one, that is, \lambda_1, one has the following linear system

\displaystyle\left( {\begin{array}{*{20}{c}} { - 10 - \sqrt {221} }&{11} \\ {11}&{10 - \sqrt {221} } \end{array}} \right)

which implies, for example,

\displaystyle {x_1} = \left( {\begin{array}{*{20}{c}} {\frac{{11}}{{10 + \sqrt {221} }}} \\ 1 \end{array}} \right).

Similar to \lambda_2, one has its corresponding eigenvector as follows

\displaystyle {x_2} \left( {\begin{array}{*{20}{c}} {\frac{{11}}{{10 - \sqrt {221} }}} \\ 1 \end{array}} \right).


\displaystyle V = \left( {\begin{array}{*{20}{c}} {\frac{{11}}{{10 + \sqrt {221} }}}&{\frac{{11}}{{10 - \sqrt {221} }}} \\ 1&1 \end{array}} \right).

The numerical solution of V is

\displaystyle V = \left( {\begin{array}{*{20}{c}} {0.4423698861}&{ - 2.260551703} \\ 1&1 \end{array}} \right)

which gives us

\displaystyle {V^T}V = \left( {\begin{array}{*{20}{c}} {\frac{{2(221 + 10\sqrt {221} )}}{{{{(10 + \sqrt {221} )}^2}}}}&0 \\ 0&{ - \frac{{2( - 221 + 10\sqrt {221} )}}{{{{( - 10 + \sqrt {221} )}^2}}}} \end{array}} \right).

Having V we can compute U by the following formula U=AV^{-1}\Sigma^{-1} or you can perform the same routine as above. That is

\displaystyle U = \left( {\begin{array}{*{20}{c}} { - \frac{{11\sqrt {442} (\sqrt {17} + \sqrt {13} )}}{{442(15 + \sqrt {221} )}}}&{\frac{{\sqrt {442} ( - \sqrt {17} + \sqrt {13} )( - 10 + 3\sqrt {221} )}}{{442( - 15 + \sqrt {221} )}}}&0&0 \\ { - \frac{{11\sqrt {442} (\sqrt {17} + \sqrt {13} )}}{{442(15 + \sqrt {221} )}}}&{\frac{{\sqrt {442} ( - \sqrt {17} + \sqrt {13} )( - 10 + 3\sqrt {221} )}}{{442( - 15 + \sqrt {221} )}}}&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{array}} \right).

It is easy to see that

\displaystyle U\Sigma V^* = \left( {\begin{array}{*{20}{c}} {2.000000051}&{4.000000052} \\ {1.000000029}&{3.000000030} \\ 0&0 \\ 0&0 \end{array}} \right).

Having a SVD result, that is

\displaystyle A = U\left( {\begin{array}{*{20}{c}} {{\Sigma _1}} \\ 0 \\ \end{array} } \right){V^T}

one can see that the solution of Ax=b can be computed by

\displaystyle x = V\left( {\begin{array}{*{20}{c}} {\Sigma _1^{ - 1}} & 0 \\ \end{array} } \right){U^T}b.

Note that matrices U and V are not unique. Using Maple software, you can compute SVD as shown below.


Useful links

June 7, 2009

Schwarz Reflection Principle and several applications

Suppose that f is an analytic function which is defined in the upper half-disk \{|z|^2<1,I[z]>0\}. Further suppose that f extends to a continuous function on the real axis, and takes on real values on the real axis. Then f can be extended to an analytic function on the whole disk by the formula

\displaystyle f(\overline z)=\overline{f(z)}

and the values for z reflected across the real axis are the reflections of f(z) across the real axis. It is easy to check that the above function is complex differentiable in the interior of the lower half-disk. What is remarkable is that the resulting function must be analytic along the real axis as well, despite no assumptions of differentiability.


This is called the Schwarz reflection principle, and is sometimes also known as Schwarz’s symmetric principle (Needham 2000, p. 257). The diagram above shows the reflection principle applied to a function f defined for the upper half-disk (left figure; red) and its image (right figure; red). The function is real on the real axis, so it is possible to extend the function to the reflected domain (left and right figures; pink).

For the reflected function to be continuous, it is necessary for the values at the boundary to be continuous and to fall on the line being reflected. The reflection principle also applies in the generality of reflecting along any line, not just the real axis, in which case the function f has to take values along a line in the range. In fact, any arc which has a neighborhood biholomorphic to a straight line can be reflected across. The basic example is the boundary of the unit circle which is mapped to the real axis by z \mapsto \frac{z+1}{z+i}.

The reflection principle can also be used to reflect a harmonic function which extends continuously to the zero function on its boundary. In this case, for negative y, defining

\displaystyle v(x,y)=-v(x,-y)

extends v to a harmonic function on the reflected domain. Again note that it is necessary for  v(x,0)=0. This result provides a way of extending a harmonic function from a given open set to a larger open set (Krantz 1999, p. 95).


Perturbation in linear systems

In this small topic, I will show several traditional results regarding to the perturbation in matrix algebra. This comes from the study of the stability of solving system of linear equations in Numerical Analysis.

Let us consider the following matrix-form linear system Ax=b. We split into three cases

  • Perturbation on b

If \delta b is a small perturbation on b, then \delta x is a small perturbation on x in the sense that the following system A(x + \delta x) = b + \delta b holds true. This together with the fact that Ax =b gives A \delta x = \delta b. By some simple calculation, one has

\displaystyle\begin{gathered}\frac{{\left\| {\delta x} \right\|}}{{\left\| x \right\|}} = \frac{{\left\| {{A^{ - 1}}\left( {\delta b} \right)} \right\|}}{{\left\| x \right\|}}\leq\frac{{\left\| {{A^{ - 1}}} \right\|\left\| {\delta b} \right\|}}{{\left\| x \right\|}} \hfill \\ \qquad= \frac{{\left\| {{A^{ - 1}}} \right\|\left\| A \right\|\left\| {\delta b} \right\|}}{{\left\| A \right\|\left\| x \right\|}}\leq\frac{{\left\| {{A^{ - 1}}} \right\|\left\| A \right\|\left\| {\delta b} \right\|}}{{\left\| {Ax} \right\|}} \hfill \\ \qquad= \frac{{\left\| {{A^{ - 1}}} \right\|\left\| A \right\|\left\| {\delta b} \right\|}}{{\left\| b \right\|}}. \hfill \\ \end{gathered}


\displaystyle\frac{{\left\| {\delta x} \right\|}}{{\left\| x \right\|}}\leq\left\| {{A^{ - 1}}} \right\|\left\| A \right\|\frac{{\left\| {\delta b} \right\|}}{{\left\| b \right\|}}.

  • Perturbation on A.

If \delta A is a small perturbation on A, then \delta x is a small perturbation on x in the sense that the following system (A + \delta A)(x+\delta x) = b holds true. This together with the fact that Ax =b gives \left( {\delta A} \right)x + A\left( {\delta x} \right) + \left( {\delta A} \right)\left( {\delta x} \right) = 0. By some simple calculation, one has

\displaystyle\begin{gathered}\left\| {\delta x} \right\| = \left\| {{A^{ - 1}}\left( {\left( {\delta A} \right)\left( x \right) + \left( {\delta A} \right)\left( {\delta x} \right)} \right)} \right\| \hfill \\ \qquad\leq\left\| {{A^{ - 1}}} \right\|\left\| {\left( {\delta A} \right)\left( x \right) + \left( {\delta A} \right)\left( {\delta x} \right)} \right\| \hfill \\ \qquad\leq \left\| {{A^{ - 1}}} \right\|\left( {\left\| {\delta A} \right\|\left\| x \right\| + \left\| {\delta A} \right\|\left\| {\delta x} \right\|} \right). \hfill \\ \end{gathered}


\displaystyle\left\| {\delta x} \right\|\left( {1 - \left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|} \right) \leqslant \left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|\left\| x \right\|

which implies that

\displaystyle\frac{{\left\| {\delta x} \right\|}}{{\left\| x \right\|}} \leqslant \frac{{\left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|}}{{1 - \left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|}} = \frac{{\left\| A \right\|\left\| {{A^{ - 1}}} \right\|\frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}}}{{1 - \left\| A \right\|\left\| {{A^{ - 1}}} \right\|\frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}}}.

  • Perturbation on A and b.

If \delta A is a small perturbation on A and \delta b is a small perturbation on b, then \delta x is a small perturbation on x in the sense that the following system (A + \delta A)(x+\delta x) = b + \delta b holds true. This together with the fact that Ax =b gives A\left( {\delta x} \right) + \left( {\delta A} \right)x + \left( {\delta A} \right)\left( {\delta x} \right) = \delta b. By some simple calculation, one has

\displaystyle\begin{gathered}\left\| {\delta x} \right\| = \left\| {{A^{ - 1}}\left( {\delta b - \left( {\delta A} \right)\left( x \right) - \left( {\delta A} \right)\left( {\delta x} \right)} \right)} \right\| \hfill \\ \qquad\leqslant \left\| {{A^{ - 1}}} \right\|\left\| {\delta b - \left( {\delta A} \right)\left( x \right) - \left( {\delta A} \right)\left( {\delta x} \right)} \right\| \hfill \\ \qquad\leqslant \left\| {{A^{ - 1}}} \right\|\left( {\left\| {\delta b} \right\| + \left\| {\delta A} \right\|\left\| x \right\| + \left\| {\delta A} \right\|\left\| {\delta x} \right\|} \right) \hfill \\ \end{gathered}

which implies that

\displaystyle\left\|{\delta x}\right\|\left({1-\left\|{{A^{-1}}}\right\|\left\|{\delta A}\right\|}\right)\leqslant\left\|{{A^{-1}}}\right\|\left({\left\|{\delta b}\right\|+\left\|{\delta A}\right\|\left\| x\right\|}\right).


\displaystyle\begin{gathered}\frac{{\left\| {\delta x} \right\|}}{{\left\| x \right\|}} \leqslant \frac{{\left\| {{A^{ - 1}}} \right\|\left( {\frac{{\left\| {\delta b} \right\|}}{{\left\| x \right\|}} + \left\| {\delta A} \right\|} \right)}}{{1 - \left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|}} \hfill \\ \qquad= \frac{{\left\| A \right\|\left\| {{A^{ - 1}}} \right\|\left( {\frac{{\left\| {\delta b} \right\|}}{{\left\| A \right\|\left\| x \right\|}} + \frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}} \right)}}{{1 - \left\| A \right\|\left\| {{A^{ - 1}}} \right\|\frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}}} \hfill \\ \qquad\leqslant \frac{{\left\| A \right\|\left\| {{A^{ - 1}}} \right\|\left( {\frac{{\left\| {\delta b} \right\|}}{{\left\| {Ax} \right\|}} + \frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}} \right)}}{{1 - \left\| A \right\|\left\| {{A^{ - 1}}} \right\|\frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}}} \hfill \\ \qquad\leqslant \frac{{\left\| A \right\|\left\| {{A^{ - 1}}} \right\|\left( {\frac{{\left\| {\delta b} \right\|}}{{\left\| b \right\|}} + \frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}} \right)}}{{1 - \left\| A \right\|\left\| {{A^{ - 1}}} \right\|\frac{{\left\| {\delta A} \right\|}}{{\left\| A \right\|}}}}. \hfill \\ \end{gathered}

In the literature, the expression \|A^{-1}\|\|A\| is called the condition number of the matrix A, denoted by K(A). It is easy to see that K(A) \geq 1. If K(A) is small then the system Ax=b is called to be well-posed. Alternatively, in the case when 1 \ll K(A), the system Ax=b is ill-posed.

June 2, 2009

Finite difference method: Local truncation error and consistency

Truncation error or local truncation error is error made by numerical algorithms that arises from taking finite number of steps in computation. It is present even with infinite-precision arithmetic, because it is caused by truncation of the infinite Taylor series to form the algorithm.

Use of arbitrarily small steps in numerical computation is prevented by round-off error, which are the consequence of using finite precision floating point numbers on computers.

Let F_{i,j}(u)=0 represent the difference equation approximating the PDE at the (i.j)-th mesh point, with exact solution u. For example,

\displaystyle {F_{i,j}}\left( u \right) = \frac{{{u_{i,j + 1}} - {u_{i,j}}}}{k} - {\text{ }}\frac{{{u_{i - 1,j}} - 2{u_{i,j}} + {u_{i + 1,j}}}}{{{h^2}}}

represent the difference equation approximating the following PDE

\displaystyle\frac{{\partial u}} {{\partial t}} = \frac{{{\partial ^2}u}} {{\partial {t^2}}}

at the (i,j)-th mesh point. If u is replaced by U at the mesh points of the \Delta E, where U is the exact solution of the PDEs, the value of F_{i,j}(U) is called the local truncation error T_{i,j} at the (i,j)-th mesh point. F_{i,j}(U) clearly measures the amount by which the exact solution values of the PDE at the mesh points of the \Delta E do not satisfy the \Delta E at the point (ih, jk).

Continuing the above example gives us

\displaystyle {T_{i,j}} = {F_{i,j}}\left( U \right) = \frac{{{U_{i,j + 1}} - {U_{i,j}}}}{k} - \frac{{{U_{i - 1,j}} - 2{U_{i,j}} + {U_{i + 1,j}}}}{{{h^2}}}.

Using Taylor expansions, it is easy to express T_{i,j} in terms of power of h and k and partial derivatives of U at (ih, jk). This leads us to the computation of the local truncation error.

It is sometimes possible to approximate a parabolic or hyperbolic equation by a finite-difference scheme that is stable (i.e. limits the amplification of all the components of the initial conditions), but which has a solution that converges to the solution of a different differential equation as the mesh lengths tend to zero. Such a difference scheme is said to be inconsistent or incompatible with the PDE.

The real importance of the concept of consistency lies in a theorem by Lax which states that if a linear finite-difference equation is consistent with a properly posed linear initial-value problem then stability guarantees convergence of u to U as the mesh lengths tend to zero. Consistency can be defined in either of two equivalent but slightly different ways.

The more general definition is as follows. Let L(U)=0 represent the PDE in the independent variables x and t, with exact solution U. Let F(u)=0 represent the approximating finite-difference equation with exact solution u. Let v be a continuous function of x and t with a sufficient number of continuous derivatives to enable L(v) to be evaluated at the point (ih,jk).

Then the truncation error T_{i,j}(v) at the point (ih,jk) is defined by

\displaystyle T_{i,j}(v) = F_{i,j}(v) - L(v_{i,j}).

If T_{i,j}(v) \to 0 as h \to 0 and k \to 0, the difference equation is said to be consistent or compatible with the PDE. Most authors put v = U because L(U)=0.

For example, let us consider the following parabolic equation

\displaystyle u_t = \nu u_{xx}(x) - \kappa u(x), \quad a<x<b, 0<t<T


\displaystyle u(x,0)=f(x), \quad a \leq x \leq b


\displaystyle u(a,t)=g_1(t), \quad u(b,t)=g_2(t), \quad 0 \leq t \leq T,

where \nu >0 and \kappa \geq 0.

Backward Euler + second order central finite difference discretization.

We split equally the domain a \leq x \leq b into N parts and the domain 0 \leq t \leq T into M parts, i.e. (i,j)-th mesh point is of the following form

\displaystyle \left( {a + i\frac{{b - a}} {N},j\frac{T} {M}} \right).

We are now in a position to discretize the problem as follows

\displaystyle\frac{{{u_{i,j + 1}} - {u_{i,j}}}}{k} = \nu  \frac{{{u_{i - 1,j + 1}} - 2{u_{i,j + 1}} + {u_{i + 1,j + 1}}}}{{{h^2}}}  - \kappa {u_{i,j}}

where 1 \leq i \leq N - 1. At i=0, that is x=a, one has

\displaystyle\frac{{{u_{1,j + 1}} - {u_{ - 1,j + 1}}}} {{2h}} = g_1\left( t \right)

and at i=N, that is x=b, one obtains

\displaystyle\frac{{{u_{N + 1,j + 1}} - {u_{N - 1,j + 1}}}} {{2h}} = g_2\left( t \right).

The local truncation error and consistency

The above discretization can be rewritten as following

\frac{{{u_{i,j + 1}} - {u_{i,j}}}} {k} = \nu \frac{{{u_{i - 1,j + 1}} - 2{u_{i,j + 1}} + {u_{i + 1,j + 1}}}} {{{h^2}}} - \kap...

where 1 \leq i \leq N - 1 with the following boundary conditions

\frac{{{u_{0,j + 1}} - {u_{0,j}}}} {k} = \nu \frac{{2{u_{1,j + 1}} - 2{u_{0,j + 1}} - 2hf\left( a \right)}} {{{h^2}}} - \kapp...


\frac{{{u_{N,j + 1}} - {u_{N,j}}}} {k} = \nu \frac{{2{u_{N - 1,j + 1}} - 2{u_{N,j + 1}} + 2hf\left( b \right)}} {{{h^2}}} - \...

Now we have

{F_{i,j}}\left( U \right) = \frac{{{U_{i,j + 1}} - {U_{i,j}}}} {k} - \nu \frac{{{U_{i - 1,j + 1}} - 2{U_{i,j + 1}} + {U_{i + ...

By Taylor’s expansion

{U_{i + 1,j + 1}} = {U_{i,j}} + \left[ {h{{\left( {\frac{{\partial U}} {{\partial x}}} \right)}_{i,j}} + k{{\left( {\frac{{\p...


{U_{i - 1,j + 1}} = {U_{i,j}} + \left[ { - h{{\left( {\frac{{\partial U}} {{\partial x}}} \right)}_{i,j}} + k{{\left( {\frac{...


{U_{i,j + 1}} = {U_{i,j}} + k{\left( {\frac{{\partial U}} {{\partial t}}} \right)_{i,j}} + \frac{{{k^2}}} {2}{\left( {\frac{{...


{F_{i,j}}\left( U \right) = {\left( {\frac{{\partial U}} {{\partial t}}} \right)_{i,j}} + \frac{k} {2}{\left( {\frac{{{\parti...

which yields

{F_{i,j}}\left( U \right) = \left[ {{{\left( {\frac{{\partial U}} {{\partial t}}} \right)}_{i,j}} - \nu {{\left( {\frac{{{\pa...

Thus, the principle part of the local truncation error is

\frac{k} {2}{\left( {\frac{{{\partial ^2}U}} {{\partial {t^2}}}} \right)_{i,j}} - \nu k{\left( {\frac{{{\partial ^3}U}} {{\pa...


\displaystyle {T_{i,j}} = O\left( k \right) + O\left( {{h^2}} \right).


In view of the local truncation error, one can easily see that the local truncation error is a polynomial of two variables h and k which implies that the method is consistent.

Blog at