Ngô Quốc Anh

Tháng Sáu 30, 2009

A quick introduction to Simplex Method via an example

Chuyên mục: Giải tích 9 (MA5265), Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 14:36

Below is the
original problem:
objective function
is in green.
See
step
1
Red variables below
are called
SLACK VARIABLES
See
step
2
Below is the
SIMPLEX TABLEAU.
Compare RED symbols
with Z = x1 + 2x2 - x3.
Blue numbers below
are the “ISM“.

The 1st SIMPLEX TABLEAU
is below. Note
missing z-column
(correction by Steve Hulet)
See steps 3,4,5 of
SIMPLEX METHOD
as you handle INDICATORS,
RATIOS, and PIVOTS.
Named below are the 4
row operations needed to
pivot on the number
“5″ encircled in red

Below are the
results of the
row operations
named above
blue
numbers
are the
new ISM
Since one INDICATOR
(namely -1/5) remains
negative, we must
repeat steps 3-9
Below is the result
of changing our pivot
to “1″
Named below are
4 row operations
needed to pivot
on the number(16/5)
encircled in red
Above there was a tie for least non-negative ratio:
either row 1 or row 2 could have become the pivot row, and either choice leads to the final tableau after one additional pivoting. At the right is the result of the final 3 row operations.
All indicators {0, 0, 49
16
, 0, 1
16
and 3
8
} are now zero or bigger (”13″ is NOT an indicator).
Thus, as in step 8 of the SIMPLEX METHOD, the last tableau is a FINAL TABLEAU.
Row operations of SIMPLEX METHOD are done.
Thus, the basic solution for the tableau above is the solution to our original problem.
[1st] set equal to 0 all variables NOT associated with the blue ISM, as at the right. Each column of the final tableau has a label naming it’s variable.
[2nd] convert each row of the final tableau (except the bottom row) back into equation form (as at the right) to find the values of the remaining variables. The value of the objective function is in the lower right corner of the final tableau.

Source: http://math.uww.edu/~mcfarlat/s-prob.htm

More: http://people.hofstra.edu/stefan_Waner/realworld/tutorialsf4/unit4_3.html

Tháng Sáu 29, 2009

Karush–Kuhn–Tucker conditions, the 1st order optimality condition

Chuyên mục: Giải tích 9 (MA5265), Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 16:28

In mathematics, the Karush–Kuhn–Tucker conditions (also known as the Kuhn-Tucker or the KKT conditions) are necessary for a solution in nonlinear programming to be optimal, provided some regularity conditions are satisfied. It is a generalization of the method of Lagrange multipliers to inequality constraints. The conditions are named for William Karush, Harold W. Kuhn, and Albert W. Tucker.

Let us consider the following nonlinear optimization problem:

\mbox{Minimize } f(x)

\mbox{subject to: } g_i(x) \le 0 , h_j(x) = 0

where f(x) is the function to be minimized, g_i (x) (i = 1,...,m) are the inequality constraints and h_j (x) (j = 1,...,l) are the equality constraints, and m and l are the number of inequality and equality constraints, respectively.

The necessary conditions for this general equality-inequality constrained problem were first published in the Masters thesis of William Karush, although they only became renowned after a seminal conference paper by Harold W. Kuhn and Albert W. Tucker.

Necessary conditions

Suppose that the objective function, i.e., the function to be minimized, is f : \mathbb{R}^n \rightarrow \mathbb{R} and the constraint functions are g_i : \mathbb{R}^n \rightarrow \mathbb{R} and h_j :  \mathbb{R}^n \rightarrow \mathbb{R}. Further, suppose they are continuously differentiable at a point x^\star. If x^\star is a local minimum that satisfies some regularity conditions, then there exist constants \mu_i\ (i = 1,...,m) and \lambda_j (j = 1,...,l) such that

Stationarity:

\nabla f(x^\star) + \sum_{i=1}^m \mu_i \nabla g_i(x^\star) + \sum_{j=1}^l \lambda_j \nabla h_j(x^\star) = 0,

Primal feasibility:

g_i(x^\star) \le 0, \mbox{ for all } i = 1,..., m
h_j(x^\star) = 0, \mbox{ for all } j = 1,..., l .

Dual feasibility:

\mu_i \ge 0 (i = 1,...,m) .

Complementary slackness:

\mu_i g_i (x^\star) = 0 for all i = 1,...,m.

Regularity conditions (or constraint qualifications)

In order for a minimum point x^\star be KKT, it should satisfy some regularity condition, the most used ones are listed below:

  • Linear independence constraint qualification (LICQ): the gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent at x^\star.
  • Mangasarian-Fromowitz constraint qualification (MFCQ): the gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly independent at x^\star.
  • Constant rank constraint qualification (CRCQ): for each subset of the gradients of the active inequality constraints and the gradients of the equality constraints the rank at a vicinity of x^\star is constant.
  • Constant positive linear dependence constraint qualification (CPLD): for each subset of the gradients of the active inequality constraints and the gradients of the equality constraints, if it is positive-linear dependent at x^\star then it is positive-linear dependent at a vicinity of x^\star. (v_1,...,v_n) is positive-linear dependent if there exists a_1\geq 0,...,a_n\geq 0 not all zero such that a_1v_1+...+a_nv_n=0)
  • Quasi-normality constraint qualification (QNCQ): if the gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly independent at x^\star with associated multipliers \lambda_i for equalities and \mu_j for inequalities than it doesn’t exist a sequence x_k\to x^\star such that: \lambda_i \ne 0 therefore \lambda_i h_i(x_k)>0 and \mu_j \ne 0 thus \mu_j g_j(x_k)>0.
  • Slater condition: for a convex problem, there exists a point x such that h(x) = 0 and g_i(x) < 0 for all i active in x^\star.
  • Linearity constraints: If f and g are affine functions, then no other condition is needed to assure that the minimum point is KKT.

It can be shown that

LICQ⇒MFCQ⇒CPLD⇒QNCQ,

LICQ⇒CRCQ⇒CPLD⇒QNCQ

(and the converses are not true), although MFCQ is not equivalent to CRCQ. In practice weaker constraint qualifications are preferred since they provide stronger optimality conditions.

Sufficient conditions

In some cases, the necessary conditions are also sufficient for optimality. This is the case when the objective function f and the inequality constraints g_j are continuously differentiable convex functions and the equality constraints hi are affine functions. It was shown by Martin in 1985 that the broader class of functions in which KKT conditions guarantees global optimality are the so called invex functions. So if equality constraints are affine functions, inequality constraints and the objective function are continuously differentiable invex functions then KKT conditions are sufficient for global optimality.

Value function

If we reconsider the optimization problem as a maximization problem with constant inequality constraints,

\mbox{Maximize }\; f(x)

\mbox{subject to: } g_i(x) \le a_i , h_j(x) = 0.

The value function is defined as:

V(a_1, ..., a_n) = \sup\limits_{x}\;\; f(x)

\mbox{subject to: } g_i(x) \le a_i , h_j(x) = 0 j\in\{1,...,l\}, i\in{1,...,m}.

(So the domain of V is \{a \in \mathbb{R}^m | \mbox{for some }x\in X, g_i(x) \leq a_i, i \in \{1,...,m\}.) Given this definition, each coefficient, \lambda_i, is the rate at which the value function increases as ai increases. Thus if each ai is interpreted as a resource constraint, the coefficients tell you how much increasing a resource will increase the optimum value of our function f. This interpretation is especially important in economics and is used, for instance, in utility maximization problems.

Source

Tháng Sáu 20, 2009

How to apply the Runge–Kutta 4 method to the system of 1st order ODEs?

In numerical analysis, the Runge–Kutta methods are an important family of implicit and explicit iterative methods for the approximation of solutions of ordinary differential equations.

The common fourth-order Runge–Kutta method

One member of the family of Runge–Kutta methods is so commonly used that it is often referred to as “RK4″ or simply as “the Runge–Kutta method”. Let an initial value problem be specified as follows.

y' = f(t, y), a \leq t \leq b with y(a) = y_0.

We equally split the interval [a,b] into N subintervals by the following interior points

a = t_0 <t_1<...<t_{N-1}<t_N = b.

Of course, t_i = a + i\frac{b-a}{N}. The number \frac{b-a}{N}, which will be denoted by h, is called the step size. The value of y at step i is usually denoted by y_i, that is, y_i = y(t_i). Therefore, the initial value problem can be rewritten as y' = f(t, y), where a \leq t \leq b and with y(t_0) = y_0. The Runge–Kutta 4 method tells us that, having the value of y at t_i, we can compute the value of y at t_{i+1} by using the following scheme

\begin{gathered}   {K^1} = hf\left( {{t_i},{y_i}} \right) \hfill \\   {K^2} = hf\left( {{t_i} + \frac{h} {2},{y_i} + \frac{1}...

Thus, for each step, we need to compute all four K^i, i =\overline{1,4}, constants.

The Runge–Kutta 4 method for a system of ODEs

Now we investigate the question in the title of this entry. An mth-order system of 1st initial value problems has the form

\begin{gathered}   {{y'}_1}\left( t \right) = {f_1}\left( {t,{y_1},...,{y_m}} \right), \hfill \\   {{y'}_2}\left( t \right) =...

for a \leq t \leq b, with the initial conditions

y_1(a)=\alpha_1, y_2(a)=\alpha_2,…,y_m(a)=\alpha_m.

Let t_i be a+i\frac{b-a}{N}. We denote by y_{i,j} the following number

{y_{i,j}} = {y_i}\left( {{t_j}} \right), \quad i = \overline {1,m} , \quad j = \overline {0,N} .

At each step j+1, having all y_{i,j} we can compute y_{i,j+1} for all i=\overline{1,m}. To do this, for each i, we need to compute all K^k, k=\overline{1,4}. More precisely, we need to compute

\begin{gathered}   K_{i,j}^1 = h{f_i}\left( {{t_j},{y_{1,j}},{y_{2,j}},...,{y_{m,j}}} \right), \hfill \\   K_{i,j}^2 = h{f_i}...

As can be seen, at step j and for each k=\overline{1,3}, all the values K_{i,j}^k, with i=\overline{1,m} must be computed before any of the expressions K_{i,j}^{k+1}.

Example

We now consider the following example

\begin{gathered}   u'\left( t \right) = f\left( {t,u,v} \right), \hfill \\   v'\left( t \right) = g\left( {t,u,v} \right). \h...

Therefore, at the step j+1, we can compute u(t_{j+1}) and v(t_{j+1}) by using the following scheme

\begin{gathered}   {K^1} = hf\left( {{t_j},{u_j},{v_j}} \right), \hfill \\   {\overline K ^1} = hg\left( {{t_j},{u_j},{v_j}} ...

Tháng Sáu 17, 2009

Linear Least Square Problem and QR Decomposition

In linear algebra, a QR decomposition (also called a QR factorization) of a matrix is a decomposition of the matrix into an orthogonal and a right triangular matrix. QR decomposition is often used to solve the linear least squares problem, and is the basis for a particular eigenvalue algorithm, the QR algorithm.

Square matrix

A QR decomposition of a real square matrix A is a decomposition of A as A = QR, where Q is an orthogonal matrix (meaning that Q^TQ = I ) and R is an upper triangular matrix (also called right triangular matrix). This generalizes to a complex square matrix A and a unitary matrix Q. If A is nonsingular, then this factorization is unique if we require that the diagonal elements of R are positive.

Rectangular matrix

More generally, we can factor a complex m \times n matrix A, with m > n, as the product of an m \times m unitary matrix Q and an m\times n upper triangular matrix R. As the bottom m-n rows of an m \times n upper triangular matrix consist entirely of zeroes, it is often useful to partition R, or both R and Q:

A = QR = Q \begin{pmatrix} R_1 \\ 0 \end{pmatrix} = \begin{pmatrix} Q_1, Q_2 \end{pmatrix} \begin{pmatrix} R_1 \\ 0 \end{pmat...

where R_1 is an n\times n upper triangular matrix, Q_1 is m\times n, Q_2 is m\times (m-n), and Q_1 and Q_2 both have orthogonal columns.

Golub & Van Loan call Q_1R_1 the thin QR factorization of A. If A is of full rank n and we require that the diagonal elements of R_1 are positive then R_1 and Q_1 are unique, but in general Q_2 is not. R_1 is then equal to the upper triangular factor of the Cholesky decomposition of A^\star A (= A^TA if A is real).

QL, RQ and LQ decompositions

Analogously, we can define QL, RQ, and LQ decompositions, with L being a lower triangular matrix.

How to solve the Linear Least Square problems

Having the QR decomposition, that is, A=QR, one can solve the Least Square Problem Ax=b where A a m \times n matrix, x \in \mathbb R^n and b \in \mathbb R^m with m >n. To be precise, by mean of A=QR, we see that Q is an m \times m matrix and R is an m \times n matrix. Therefore, we can assume both Q and R to be

Q = \left( {\begin{array}{*{20}{c}}    {{{\left[ {{Q_1}} \right]}_{m \times n}}} & {{{\left[ {{Q_2}} \right]}_{m \times \...

Now equation Ax=b is equivalent to Q_1R_1x =b. Thus x=R_1^{-1}Q_1^Tb.

Computing the QR decomposition

There are several methods for actually computing the QR decomposition, such as by means of the Gram–Schmidt process, Householder transformations, or Givens rotations. Each has a number of advantages and disadvantages. In this section, we mainly consider the Gram–Schmidt process. More precisely, consider the Gram–Schmidt process, with the vectors to be considered in the process as the columns of the matrix A=(\mathbf{a}_1| \cdots|\mathbf{a}_n). Assume that after finishing the Gram–Schmidt process, we obtain (\mathbf{e}_1| \cdots|\mathbf{e}_n). Therefore, we put

\begin{matrix}Q\end{matrix} = (\mathbf{e}_1|\cdots |\mathbf{e}_n)

and thus,

R=Q^TA.

Example

Let us consider the case when

A = \left( {\begin{array}{*{20}{c}}    2 & 4 \\    1 & 3  \\    0 & 0 \\    0 & 0  \\   \end{array} } \right)...

Then

\mathbf a_1 = \left( {\begin{array}{*{20}{c}}    2  \\    1  \\    0  \\    0  \\   \end{array} } \right), \quad \mathbf a_2 ...

From \mathbf a_1, we obtain \mathbf e_1 from \mathbf e_1 : = \frac{\mathbf a_1}{\|\mathbf a_1\|}, that is,

\mathbf e_1 = \left( {\begin{array}{*{20}{c}}    {\frac{2} {{\sqrt 5 }}}  \\    {\frac{1} {{\sqrt 5 }}}  \\    0  \\    0  \\...

To obtain \mathbf e_2, we firstly project \mathbf a_2 onto \mathbf e_1, then we subtract this result from \mathbf a_2. More precisely,

{\mathbf e_2} = \frac{\mathbf a_2 - {\rm proj}_{\mathbf e_1}{\mathbf a_2}} {{\left\| {{\mathbf a_2} - {\rm proj}_{\mathbf e_1...

where

{\rm proj}_{\mathbf e_1}{\mathbf a_2} = \frac{{\left\langle {{\mathbf e_1},{\mathbf a_2}} \right\rangle }} {{\left\langle {{\...

which yields

{\mathbf e_2} = \left( {\begin{array}{*{20}{c}}    {\frac{{ - 1}} {{\sqrt 5 }}}  \\    {\frac{2} {{\sqrt 5 }}}  \\    0  \\  ...

Thus

Q = \left( {\begin{array}{*{20}{c}}    {\frac{2} {{\sqrt 5 }}} & { - \frac{1} {{\sqrt 5 }}} & 0 & 0  \\    {\frac...

and

R = {Q^T}A = \left( {\begin{array}{*{20}{c}}    {\frac{2} {{\sqrt 5 }}} & {\frac{1} {{\sqrt 5 }}} & 0 & 0  \\    ...

Finally,

\underbrace {\left( {\begin{array}{*{20}{c}}    2 & 4  \\    1 & 3  \\    0 & 0  \\    0 & 0  \\   \end{array...

In Maple, you can practice as shown below.

QR-Maple

Tháng Sáu 10, 2009

Linear Least Square Problem and Singular Value Decomposition (SVD)

Linear Least Square Problem (linear least squares), or ordinary least squares (OLS), is an important computational problem, that arises primarily in applications when it is desired to fit a linear mathematical model to measurements obtained from experiments. The goals of linear least squares are to extract predictions from the measurements and to reduce the effect of measurement errors. Mathematically, it can be stated as the problem of finding an approximate solution to an overdetermined system of linear equations. In statistics, it corresponds to the maximum likelihood estimate for a linear regression with normally distributed error.

Linear least square problems admit a closed-form solution, in contrast to non-linear least squares problems, which often have to be solved by an iterative procedure.

Motivational example

As a result of an experiment, four (x,y) data points were obtained, (1,6), (2,5), (3,7), and (4,10) (shown in red in the picture on the right). It is desired to find a line y = \beta_1 + \beta_2x that fits “best” these four points. In other words, we would like to find the numbers \beta_1 and \beta_2 that approximately solve the overdetermined linear system

\begin{gathered}   {\beta _1} + 1{\beta _2} = 6, \hfill \\   {\beta _1} + 2{\beta _2} = 5, \hfill \\   {\beta _1} + 3{\beta _...

of four equations in two unknowns in some “best” sense.

File:Linear least squares example2.png

The least squares approach to solving this problem is to try to make as small as possible the sum of squares of “errors” between the right- and left-hand sides of these equations, that is, to find the minimum of the function

S(\beta_1, \beta_2)= \left[6-(\beta_1+1\beta_2)\right]^2 +\left[5-(\beta_1+2\beta_2) \right]^2 +\left[7-(\beta_1 + 3\beta_2)\...

The minimum is determined by calculating the partial derivatives of S(\beta_1,\beta_2) in respect to \beta_1 and \beta_2 and setting them to zero. This results in a system of two equations in two unknowns, called the normal equations, which, when solved, gives the solution

\beta_1 = 3.5, \beta_2 = 1.4

and the equation y = 3.5 + 1.4x of the line of best fit. The residuals, that is, the discrepancies between the y values from the experiment and the y values calculated using the line of best fit are then found to be 1.1, -1.3, -0.7, and 0.9 (see the picture on the right). The minimum value of the sum of squares is

S(3.5,1.4) = 1.12 + (-1.3)2 + (-0.7)2 + 0.92 = 4.2.

The general problem

Consider an overdetermined system (there are more equations than unknowns)

\sum_{j=1}^{n} X_{ij}\beta_j = y_i,\ (i=1, 2, \dots, m),

of m linear equations in n unknown coefficients, \beta_1,\beta_2,...,\beta_n, with m > n, written in matrix form as \mathbf {X} \boldsymbol {\beta} = \mathbf {y},where


\mathbf {X}=\begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\...

Such a system usually has no solution, and the goal is then to find the coefficients \beta which fit the equations “best”, in the sense of solving the quadratic minimization problem

\mathop {\rm argmin}\limits_\beta  \sum\limits_{i = 1}^m {{{\left| {{y_i} - \sum\limits_{j = 1}^n {{X_{ij}}} {\beta _j}} \rig...

A justification for choosing this criterion is given in properties below. This minimization problem has a unique solution, provided that the n columns of the matrix X are linearly independent, given by solving the normal equations

(\mathbf X^\top \mathbf X )\hat \boldsymbol \beta= \mathbf X^\top \mathbf y.

Singular value decomposition

In linear algebra, the singular value decomposition (SVD) is an important factorization of a rectangular real or complex matrix, with many applications in signal processing and statistics. Applications which employ the SVD include computing the pseudoinverse, least squares fitting of data, matrix approximation, and determining the rank, range and null space of a matrix.

Suppose M is an m-by-n matrix whose entries come from the field K, which is either the field of real numbers or the field of complex numbers. Then there exists a factorization of the form

M = U\Sigma V^*,

where U is an m-by-m unitary matrix over K, the matrix \Sigma is m-by-n diagonal matrix with nonnegative real numbers on the diagonal, and V^* denotes the conjugate transpose of V, an n-by-n unitary matrix over K. Such a factorization is called a singular-value decomposition of M. A common convention is to order the diagonal entries \Sigma_{i,i} in non-increasing fashion. In this case, the diagonal matrix \Sigma is uniquely determined by M (though the matrices U and V are not). The diagonal entries of \Sigma are known as the singular values of M.

In practice, matrix U consists of eigenvectors of matrix AA^T while matrix V consists of eigenvectors of matrix A^TA. Finally, matrix \Sigma is the square root of a diagonal matrix forming by eigenvalues of either matrix AA^T or matrix A^TA.

Example

Let us consider the case when

A = \left( {\begin{array}{*{20}{c}}    2 & 4    \\    1 & 3    \\    0 & 0    \\    0 & 0   \\   \end{array} ...

Having A yields

A{A^T} = \left( {\begin{array}{*{20}{c}}    {20} & {14} & 0 & 0  \\    {14} & {10} & 0 & 0  \\    0 &...

By simple calculation, one has

\lambda_1=15+\sqrt{221}, \quad \lambda_2 =15-\sqrt{221}

being eigenvalues of matrix A^TA (in the descending form), which helps us to write down

\Sigma  = \left( {\begin{array}{*{20}{c}}    {15 + \sqrt {221} } & 0  \\    0 & {15 - \sqrt {221} }  \\    0 & 0 ...

To construct matrix V, one has to find eigenvectors corresponding to eignvalues \lambda_1 and \lambda_2. For the first one, that is, \lambda_1, one has the following linear system

\left( {\begin{array}{*{20}{c}}    { - 10 - \sqrt {221} } & {11}  \\    {11} & {10 - \sqrt {221} }  \\   \end{array} ...

which implies, for example,

{x_1} = \frac{{11}} {{\sqrt {442 + 20\sqrt {221} } }}, \quad {x_2} = \frac{{10 + 221}} {{\sqrt {442 + 20\sqrt {221} } }}.

Similar to \lambda_2, one has its corresponding eigenvector as follows

{x_3} = \frac{{11}} {{\sqrt {442 - 20\sqrt {221} } }}, \quad {x_4} = \frac{{10 - 221}} {{\sqrt {442 - 20\sqrt {221} } }}.

Thus

V = \left( {\begin{array}{*{20}{c}}    {\frac{{11}} {{\sqrt {442 + 20\sqrt {221} } }}} & {\frac{{11}} {{\sqrt {442 - 20\s...

The numerical solution of V is

V = \left( {\begin{array}{*{20}{c}}    {{\text{0}}{\text{.4045535848}}} & {{\text{0}}{\text{.9145142955}}}  \\    {{\text...

which gives us

{V^T}V = \left( {\begin{array}{*{20}{c}}    {{\text{1}}{\text{.000000000}}} & {{\text{ - }}{\text{.4e - 9}}}  \\    {{\te...

Having V we can compute U by the following formula U=AV\Sigma^{-1} or you can perform the same routine as above. That is

U = \left( {\begin{array}{*{20}{c}}    {{\text{0}}{\text{.1495732296}}} & {{\text{1}}{\text{.574048236}}} & 0 & 0....

It is easy to see that

U\Sigma {V^T} = \left( {\begin{array}{*{20}{c}}    {{\text{1}}{\text{.999999783}}} & {{\text{4}}{\text{.000000099}}}  \\ ...

Having a SVD result, that is

A = U\left( {\begin{array}{*{20}{c}}    {{\Sigma _1}}  \\    0  \\   \end{array} } \right){V^T}

one can see that the solution of Ax=b can be computed by x = V\left( {\begin{array}{*{20}{c}}    {\Sigma _1^{ - 1}} & 0  \\   \end{array} } \right){U^T}b. Note that matrices U and V are not unique. Using Maple software, you can compute SVD as shown below.

SVD-Maple

Useful links

  • http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
  • http://en.wikipedia.org/wiki/Overdetermined_system
  • http://en.wikipedia.org/wiki/Singular_Value_Decomposition
  • http://en.wikipedia.org/wiki/Linear_least_squares
  • http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-3-full-svd.html

Tháng Sáu 7, 2009

Schwarz Reflection Principle and several applications

Suppose that f is an analytic function which is defined in the upper half-disk \{|z|^2<1,I[z]>0\}. Further suppose that f extends to a continuous function on the real axis, and takes on real values on the real axis. Then f can be extended to an analytic function on the whole disk by the formula

f(\overline z)=\overline{f(z)}

and the values for z reflected across the real axis are the reflections of f(z) across the real axis. It is easy to check that the above function is complex differentiable in the interior of the lower half-disk. What is remarkable is that the resulting function must be analytic along the real axis as well, despite no assumptions of differentiability.

SchwarzReflectionPrinciple

This is called the Schwarz reflection principle, and is sometimes also known as Schwarz’s symmetric principle (Needham 2000, p. 257). The diagram above shows the reflection principle applied to a function f defined for the upper half-disk (left figure; red) and its image (right figure; red). The function is real on the real axis, so it is possible to extend the function to the reflected domain (left and right figures; pink).

For the reflected function to be continuous, it is necessary for the values at the boundary to be continuous and to fall on the line being reflected. The reflection principle also applies in the generality of reflecting along any line, not just the real axis, in which case the function f has to take values along a line in the range. In fact, any arc which has a neighborhood biholomorphic to a straight line can be reflected across. The basic example is the boundary of the unit circle which is mapped to the real axis by z \mapsto \frac{z+1}{z+i}.

The reflection principle can also be used to reflect a harmonic function which extends continuously to the zero function on its boundary. In this case, for negative y, defining

v(x,y)=-v(x,-y)

extends v to a harmonic function on the reflected domain. Again note that it is necessary for v(x,0)=0. This result provides a way of extending a harmonic function from a given open set to a larger open set (Krantz 1999, p. 95).

Source:

  • http://mathworld.wolfram.com/SchwarzReflectionPrinciple.html
  • http://en.wikipedia.org/wiki/Schwarz_reflection_principle

Several applications

  • Let f be a function meromorphic in the closed unit disk \{ |z| \leq 1\}. Suppose that |f| is constant on the unit circle |z|=1.  Prove that f must be a rational function.

Proof. If |f| = 0 on \partial\mathbb{D}, then f = 0 and the statement is true. W.L.O.G suppose |f| = 1 on \partial\mathbb{D}, i.e. f(\partial\mathbb{D}) \subset \partial\mathbb{D} and suppose that f \neq \textrm{const.} f is holomorphic as a function on \mathbb{D} to the Riemann sphere \hat{\mathbb{C}}, and f: \overline{\mathbb{D}}\to\hat{\mathbb{C}} is continuous. Since \overline{\mathbb{D}} is compact, so is f(\overline{\mathbb{D}}) and thus closed in \hat{\mathbb{C}}. In particular f(\overline{\mathbb{D}}) = \overline{f(\mathbb{D})}. Furthermore, since non-constant holomorphic mappings are open, f(\mathbb{D}) is open. In conclusion

\partial(f(\mathbb{D})) = \overline{f(\mathbb{D})}\,\backslash\,f(\mathbb{D}) = f(\partial\mathbb{D}) \subset \partial\mathbb....

Therefore either f(\mathbb{D}) = \mathbb{D} or f(\mathbb{D}) = \hat{\mathbb{C}}\,\backslash\,\overline{\mathbb{D}} and, in both cases, f(\partial\mathbb{D}) = \partial\mathbb{D}.

Suppose f(\mathbb{D}) = \mathbb{D} (the other case is treated similarly). Now by

f(z) : = \overline{\frac {1}{f\left(\frac {1}{\overline{z}}\right)}},\,\,\,\,|z| > 1,

i.e. reflection along \partial\mathbb{D}, we have a, in fact, the holomorphic continuation of f to \hat{\mathbb{C}} by the Schwarz reflection principle (if you’re not familiar with this: The idea is that this continuation is clearly holomorphic in \hat{\mathbb{C}}\,\backslash\,\overline{\mathbb{D}}; on \partial\mathbb{D} use Morera’s theorem). We know not that f: \hat{\mathbb{C}}\to\hat{\mathbb{C}} is holomorphic. Thus f has to be a rational function.

Comments: In the above solution, we can construct

f(z) : = \overline{\frac {1}{f\left(\frac {1}{\overline{z}}\right)}},\,\,\,\,|z| > 1,

to be a holomorphic function by mean of the Schwarz Reflection Principle. Roughtly speaking, we first go back to the domain \{|z| <1\} by the mapping \frac{1}{\overline z}. Then we compute its image under the function f. Then we go forward by the mapping

\frac{1}{\overline{f(\frac{1}{\overline z})}}

which equals to

\overline{\frac {1}{f\left(\frac {1}{\overline{z}}\right)}}.

The above complex conjugate operation reflects the word “reflection ” in this principle. Similar to the above question, we have the following

  • Let f be holomorphic in the unit disk \{ |z| < 1\}, continuous in \{ |z| \leq 1\} and |f(z)|=1 whenever |z|=1. Prove that f is a rational function.

Proof. If f has infinite many zeros, by the isolatedness of the zeros of holomorphis functions, the zeros must have limit points on the boundary of the unit disk, otherwise, f \equiv 0. But it will violate the fact that f is continuous in \{|z| \leq 1 \} and |f(z)|=1 whenever |z|=1. Hence f has only finite zeros in the unit disk. Denote all these zeros by z_1, ..., z_n, multiple zeros being repeated, and define,

F\left( z \right) = \frac{{f\left( z \right)}} {{\prod\limits_{k = 1}^n {\left( {\frac{{z - {z_k}}} {{1 - \overline {{z_k}} z...

Then F is holomorphic in \{|z|<1\}, continuous in \{ |z| \leq 1\} and |F(z)|=1 whenever |z|=1. By Maximum Modulus Principle, |F(z)| \leq 1 in \{|z| <1\}. Since F(z) has no zero in \{|z|<1\}, \frac{1}{F(z)} is also holomorphic in \{|z|<1\}, continuous in \{ |z| \leq 1\} and \left|\frac{1}{F(z)}\right|=1 whenever |z|=1. Application of the Maximum Modulus Principle to \frac{1}{F(z)} yields |F(z)| \geq 1 in \{|z|\leq 1\}. Hence |F(z)|=1 holds in \{|z| \leq 1\}, which implies F(z)=e^{i\alpha} with \alpha a real number. So we obtain

f\left( z \right) = {e^{i\alpha }}\prod\limits_{k = 1}^n {\left( {\frac{{z - {z_k}}} {{1 - \overline {{z_k}} z}}} \right)}

which completes the proof.

Comments: It doesn’t have to just sound like one. Basically, if f has a pole in b you should multiply by the conformal disc automorphism \frac {z - b}{1 - \overline{b}z} and divide by it if f has a zero in b. The motivation is basically just that I knew it Wink. The standard way to construct an analytic function in the disc with zeros in (b_n) is to multiply together infinitely many factors

\frac {z - b_n}{1 - \overline{b_n}z}.

This is called a Blaschke product. There are factorization theorems saying that analytic functions belonging to certain growth classes (the Smirnov class f.ex.) can be factored into products of a Blaschke product (describing the zeros), a singular inner function (describing the angle of f(z)), and an outer function (describing the growth). With the very nice function described in the problem I knew that it would have no growth and no inner factor either (they can’t be continuous on the circle), so essentially I just knew the solution to the problem. Although I guess the solution can be easily guessed if products of Möbius transforms of the type above are familiar to you as exactly the rational functions that have constant modulus on the unit circle.

Perturbation in linear systems

Chuyên mục: Các Bài Tập Nhỏ, Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 1:53

In this small topic, I will show several traditional results regarding to the perturbation in matrix algebra. This comes from the study of the stability of solving system of linear equations in Numerical Analysis.

Let us consider the following matrix-form linear system Ax=b. We split into three cases

  • Perturbation on b

If \delta b is a small perturbation on b, then \delta x is a small perturbation on x in the sense that the following system A(x + \delta x) = b + \delta b holds true. This together with the fact that Ax  =b gives A \delta x = \delta b. By some simple calculation, one has

\frac{{\left\| {\delta x} \right\|}} {{\left\| x \right\|}} = \frac{{\left\| {{A^{ - 1}}\left( {\delta b} \right)} \right\|}}...

Thus

\frac{{\left\| {\delta x} \right\|}} {{\left\| x \right\|}} \leqslant \left\| {{A^{ - 1}}} \right\|\left\| A \right\|\frac{{\...

  • Perturbation on A.

If \delta A is a small perturbation on A, then \delta x is a small perturbation on x in the sense that the following system (A + \delta A)(x+\delta x) = b holds true. This together with the fact that Ax  =b gives \left( {\delta A} \right)x + A\left( {\delta x} \right) + \left( {\delta A} \right)\left( {\delta x} \right) = 0. By some simple calculation, one has

\left\| {\delta x} \right\| = \left\| {{A^{ - 1}}\left( {\left( {\delta A} \right)\left( x \right) + \left( {\delta A} \right...

Thus

\left\| {\delta x} \right\|\left( {1 - \left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|} \right) \leqslant \left\| {{...

which implies that

\frac{{\left\| {\delta x} \right\|}} {{\left\| x \right\|}} \leqslant \frac{{\left\| {{A^{ - 1}}} \right\|\left\| {\delta A} ...

  • Perturbation on A and b.

If \delta A is a small perturbation on A and \delta b is a small perturbation on b, then \delta x is a small perturbation on x in the sense that the following system (A + \delta A)(x+\delta x) = b + \delta b holds true. This together with the fact that Ax  =b gives A\left( {\delta x} \right) + \left( {\delta A} \right)x + \left( {\delta A} \right)\left( {\delta x} \right) = \delta b. By some simple calculation, one has

\left\| {\delta x} \right\| = \left\| {{A^{ - 1}}\left( {\delta b - \left( {\delta A} \right)\left( x \right) - \left( {\delt...

which implies that

\left\| {\delta x} \right\|\left( {1 - \left\| {{A^{ - 1}}} \right\|\left\| {\delta A} \right\|} \right) \leqslant \left\| {{...

Thus,

\frac{{\left\| {\delta x} \right\|}} {{\left\| x \right\|}} \leqslant \frac{{\left\| {{A^{ - 1}}} \right\|\left( {\frac{{\lef...

In the literature, the expression \|A^{-1}\|\|A\| is called the condition number of the matrix A, denoted by K(A). It is easy to see that K(A) \geq 1. If K(A) is small then the system Ax=b is called to be well-posed. Alternatively, in the case when 1 \ll K(A), the system Ax=b is ill-posed.

Tháng Sáu 2, 2009

Finite difference method: Local truncation error and consistency

Chuyên mục: Các Bài Tập Nhỏ, Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 15:06

Truncation error or local truncation error is error made by numerical algorithms that arises from taking finite number of steps in computation. It is present even with infinite-precision arithmetic, because it is caused by truncation of the infinite Taylor series to form the algorithm.

Use of arbitrarily small steps in numerical computation is prevented by round-off error, which are the consequence of using finite precision floating point numbers on computers.

Let F_{i,j}(u)=0 represent the difference equation approximating the PDE at the (i.j)-th mesh point, with exact solution u. For example,

{F_{i,j}}\left( u \right) = \frac{{{u_{i,j + 1}} - {u_{i,j}}}} {k} - \frac{{{u_{i - 1,j}} - 2{u_{i,j}} + {u_{i + 1,j}}}} {{{h...

represent the difference equation approximating the following PDE \frac{{\partial u}} {{\partial t}} = \frac{{{\partial ^2}u}} {{\partial {t^2}}} at the (i.j)-th mesh point. If u is replaced by U at the mesh points of the \DeltaE, where U is the exact solution of the PDES, the value of F_{i,j}(U) is called the local truncation error T_{i,j} at the (i.j)-th mesh point. F_{i,j}(U) clearly measures the amount by which the exact solution values of the PDE at the mesh points of the \DeltaE do not satisfy the \DeltaE at the point (ih, jk).

Continuing the above example gives us

{T_{i,j}} = {F_{i,j}}\left( U \right) = \frac{{{U_{i,j + 1}} - {U_{i,j}}}} {k} - \frac{{{U_{i - 1,j}} - 2{U_{i,j}} + {U_{i + ...

Using Taylor expansions, it is easy to express T_{i,j} in terms of power of h and k and partial derivatives of U at (ih, jk). This leads us to the computation of the local truncation error.

It is sometimes possible ot approximate a parabolic or hyperbolic equation by a finite-difference scheme that is stable (i.e. limits the amplification of all the compoonents of the initial conditions), but which has a solution that converges to the solution of a different differential equation as the mesh lengths tend to zero. Such a difference scheme is said to be inconsistent or imcompatible with the PDE.

The real importance of the concept of consistency lies in a theorem by Lax which states that if a linear finite-difference equation is consistent with a properly posed linear initial-value problem then stability guarantees convergnce of u to U as the mesh lengths tned to zero. Consistency can be defined in either of two equivalent but slightly different ways.

The more general definition is as follows. Let L(U)=0 represent the PDE in the independent variables x and t, with exact solution U. Let F(u)=0 represent the approximating finite-difference equation with exact solution u. Let v be a continuous function of x and t with a sufficient number of continuous derivatives to enable L(v) to be evaluated at the point (ih,jk).

Then the truncation error T_{i,j}(v) at the point (ih,jk) is defined by

T_{i,j}(v) = F_{i,j}(v) - L(v_{i,j}).

If T_{i,j}(v) \to 0 as h \to 0 and k \to 0, the difference equation is said to be consistent or compatible with the PDE. Most authors put v = U because L(U)=0.

For example, let us consider the following parabolic equation

u_t = \nu u_{xx}(x) - \kappa u(x), \quad a<x<b, 0<t<T

with

u(x,0)=f(x), \quad a \leq x \leq b

and

u(a,t)=g_1(t), \quad u(b,t)=g_2(t), \quad 0 \leq t \leq T,

where \nu >0 and \kappa \geq 0.

Backward Euler + second order central finite difference discretization.

We split equally the domain a \leq x \leq b into N parts and the domain 0 \leq t \leq T into M parts, i.e. (i,j)-th mesh point is of the following form \left( {a + i\frac{{b - a}} {N},j\frac{T} {M}} \right). We are now in a position to discretize the problem as follows

\frac{{{u_{i,j + 1}} - {u_{i,j}}}} {k} = \nu \frac{{{u_{i - 1,j + 1}} - 2{u_{i,j + 1}} + {u_{i + 1,j + 1}}}} {{{h^2}}} - \kap...

where 1 \leq i \leq N - 1. At i=0, that is x=a, one has

\frac{{{u_{1,j + 1}} - {u_{ - 1,j + 1}}}} {{2h}} = g_1\left( t \right)

and at i=N, that is x=b, one obtains

\frac{{{u_{N + 1,j + 1}} - {u_{N - 1,j + 1}}}} {{2h}} = g_2\left( t \right).

The local truncation error and consistency

The above discretization can be rewritten as following

\frac{{{u_{i,j + 1}} - {u_{i,j}}}} {k} = \nu \frac{{{u_{i - 1,j + 1}} - 2{u_{i,j + 1}} + {u_{i + 1,j + 1}}}} {{{h^2}}} - \kap...

where 1 \leq i \leq N - 1 with the following boundary conditions

\frac{{{u_{0,j + 1}} - {u_{0,j}}}} {k} = \nu \frac{{2{u_{1,j + 1}} - 2{u_{0,j + 1}} - 2hf\left( a \right)}} {{{h^2}}} - \kapp...

and

\frac{{{u_{N,j + 1}} - {u_{N,j}}}} {k} = \nu \frac{{2{u_{N - 1,j + 1}} - 2{u_{N,j + 1}} + 2hf\left( b \right)}} {{{h^2}}} - \...

Now we have

{F_{i,j}}\left( U \right) = \frac{{{U_{i,j + 1}} - {U_{i,j}}}} {k} - \nu \frac{{{U_{i - 1,j + 1}} - 2{U_{i,j + 1}} + {U_{i + ...

By Taylor’s expansion

{U_{i + 1,j + 1}} = {U_{i,j}} + \left[ {h{{\left( {\frac{{\partial U}} {{\partial x}}} \right)}_{i,j}} + k{{\left( {\frac{{\p...

and

{U_{i - 1,j + 1}} = {U_{i,j}} + \left[ { - h{{\left( {\frac{{\partial U}} {{\partial x}}} \right)}_{i,j}} + k{{\left( {\frac{...

and

{U_{i,j + 1}} = {U_{i,j}} + k{\left( {\frac{{\partial U}} {{\partial t}}} \right)_{i,j}} + \frac{{{k^2}}} {2}{\left( {\frac{{...

Therefore,

{F_{i,j}}\left( U \right) = {\left( {\frac{{\partial U}} {{\partial t}}} \right)_{i,j}} + \frac{k} {2}{\left( {\frac{{{\parti...

which yields

{F_{i,j}}\left( U \right) = \left[ {{{\left( {\frac{{\partial U}} {{\partial t}}} \right)}_{i,j}} - \nu {{\left( {\frac{{{\pa...

Thus, the principle part of the local truncation error is

\frac{k} {2}{\left( {\frac{{{\partial ^2}U}} {{\partial {t^2}}}} \right)_{i,j}} - \nu k{\left( {\frac{{{\partial ^3}U}} {{\pa...

Hence {T_{i,j}} = O\left( k \right) + O\left( {{h^2}} \right).

Consistency.

In view of the local truncation error, one can easily see that the local truncation error is a polynomial of two variables h and k which implies that the method is consistent.

Tháng Năm 28, 2009

Finite difference method: A brief introduction

Chuyên mục: Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 20:08

In mathematics, finite-difference methods are numerical methods for approximating the solutions to differential equations using finite difference equations to approximate derivatives.

Intuitive derivation

Finite-difference methods approximate the solutions to differential equations by replacing derivative expressions with approximately equivalent difference quotients. That is, because the first derivative of a function f is, by definition,

f'(a)=\lim_{h\to 0}{f(a+h)-f(a)\over h},

then a reasonable approximation for that derivative would be to take

f'(a)\approx {f(a+h)-f(a)\over h}

for some small value of h. In fact, this is the forward difference equation for the first derivative. Using this and similar formulae to replace derivative expressions in differential equations, one can approximate their solutions without the need for calculus.

Derivation from Taylor’s polynomial

Assuming the function whose derivatives are to be approximated is properly-behaved, by Taylor’s theorem,

f(x_0 + h) = f(x_0) + \frac{f'(x_0)}{1!}h + \frac{f^{(2)}(x_0)}{2!}h^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}h^n + R_n(x),
where n! denotes the factorial of n, and R_n(x) is a remainder term, denoting the difference between the Taylor polynomial of degree n and the original function. Again using the first derivative of the function f as an example, by Taylor’s theorem,

f(x_0 + h) = f(x_0) + f'(x_0)h + R_1(x),

which, with some minor algebraic manipulation, is equivalent to

f'(a) = {f(a+h)-f(a)\over h} + R_1(x)
so that for R_1(x) sufficiently small,

f'(a)\approx {f(a+h)-f(a)\over h}.

Accuracy and order

The error in a method’s solution is defined as the difference between its approximation and the exact analytical solution. The two sources of error in finite difference methods are round-off error, the loss of precision due to computer rounding of decimal quantities, and truncation error or discretization error, the difference between the exact solution of the finite difference equation and the exact quantity assuming perfect arithmetic (that is, assuming no round-off).

To use a finite difference method to attempt to solve (or, more generally, approximate the solution to) a problem, one must first discretize the problem’s domain. This is usually done by dividing the domain into a uniform grid (see image to the right). Note that this means that finite-difference methods produce sets of discrete numerical approximations to the derivative, often in a “time-stepping” manner.

File:Finite Differences.png

An expression of general interest is the local truncation error of a method. Typically expressed using Big-O notation, local truncation error refers to the error from a single application of a method. That is, it is the quantity f'(x_i)-f'_i if f'(x_i) refers to the exact value and f'_i to the numerical approximation. The remainder term of a Taylor polynomial is convenient for analyzing the local truncation error. Using the Lagrange form of the remainder from the Taylor polynomial for f(x_0 + h), which is

R_n(x_0 + h) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (h)^{n+1},
where x_0 <\xi< x_0 + h, the dominant term of the local truncation error can be discovered. For example, again using the forward-difference formula for the first derivative, knowing that f(x_i) = f(x_0 + ih),

f(x_0 + i h) = f(x_0) + f'(x_0)i h + \frac{f''(\xi)}{2!} (i h)^{2},
and with some algebraic manipulation, this leads to

\frac{f(x_0 + i h) - f(x_0)}{i h} = f'(x_0) + \frac{f''(\xi)}{2!} i h,
and further noting that the quantity on the left is the approximation from the finite difference method and that the quantity on the right is the exact quantity of interest plus a remainder, clearly that remainder is the local truncation error. A final expression of this example and its order is:

\frac{f(x_0 + i h) - f(x_0)}{i h} = f'(x_0) + O(h).
This means that, in this case, the local truncation error is proportional to the step size.

Explicit method

Using a forward difference at time t_n and a second-order central difference for the space derivative at position x_j (”FTCS”) we get the recurrence equation:

\frac{u_j^{n+1} - u_j^{n}}{k} =\frac{u_{j+1}^n - 2u_j^n + u_{j-1}^n}{h^2}.
This is an explicit method for solving the one-dimensional heat equation. We can obtain u_j^{n+1} from the other values this way:

u_{j}^{n+1} = (1-2r)u_{j}^{n} + ru_{j-1}^{n} + ru_{j+1}^{n}
where r = \frac{k}{h^2}. So, knowing the values at time n you can obtain the corresponding ones at time n+1 using this recurrence relation. u_0^n and u_J^n must be replaced by the boundary conditions, in this example they are both 0.

File:Explicit method-stencil.svg

This explicit method is known to be numerically stable and convergent whenever r\le \frac{1}{2}. The numerical errors are proportional to the time step and the square of the space step:

\Delta u = O(k)+O(h^2).

Implicit method

If we use the backward difference at time t_{i + 1} and a second-order central difference for the space derivative at position x_j (”BTCS”) we get the recurrence equation:

\frac{u_{j}^{i+1} - u_{j}^{i}}{k} =\frac{u_{j+1}^{i+1} - 2u_{j}^{i+1} + u_{j-1}^{i+1}}{h^2}.
This is an implicit method for solving the one-dimensional heat equation.

File:Implicit method-stencil.svg

We can obtain u_j^{i+1} from solving a system of linear equations:

(1+2r)u_j^{i+1} - ru_{j-1}^{i+1} - ru_{j+1}^{i+1}= u_{j}^{i}.

The scheme is always numerically stable and convergent but usually more numerically intensive than the explicit method as it requires solving a system of numerical equations on each time step. The errors are linear over the time step and quadratic over the space step.

Crank-Nicolson method

Finally if we use the central difference at time t_{n + \frac{1}{2}} and a second-order central difference for the space derivative at position x_j (”CTCS”) we get the recurrence equation:

\frac{u_j^{n+1} - u_j^{n}}{k} = \frac{1}{2} \left(\frac{u_{j+1}^{n+1} - 2u_j^{n+1} + u_{j-1}^{n+1}}{h^2}+\frac{u_{j+1}^{n} - ...
This formula is known as the Crank-Nicolson method. We can obtain u_j^{n+1} from solving a system of linear equations:

(2+2r)u_j^{n+1} - ru_{j-1}^{n+1} - ru_{j+1}^{n+1}= (2-2r)u_j^n + ru_{j-1}^n + ru_{j+1}^n .

The scheme is always numerically stable and convergent but usually more numerically intensive as it requires solving a system of numerical equations on each time step.

File:Crank-Nicolson-stencil.svg

The errors are quadratic over the time step and formally are of the fourth degree regarding the space step:

\Delta u = O(k^2)+O(h^4).
However, near the boundaries, the error is often O(h^2) instead of O(h^4). Usually the Crank-Nicolson scheme is the most accurate scheme for small time steps. The explicit scheme is the least accurate and can be unstable, but is also the easiest to implement and the least numerically intensive. The implicit scheme works the best for large time steps.

Source

Tháng Năm 19, 2009

Casorati–Weierstrass theorem, the behavior of meromorphic functions near essential singularities

Chuyên mục: Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 14:42

In complex analysis, a branch of mathematics, the Casorati–Weierstrass theorem describes the remarkable behavior of meromorphic functions near essential singularities. It is named for Karl Theodor Wilhelm Weierstrass and Felice Casorati.

Formal statement of the theorem

Start with some open subset U in the complex plane containing the number z_0, and a function f that is holomorphic on U\backslash \left\{ {{z_0}} \right\}, but has an essential singularity at z_0. The Casorati–Weierstrass theorem then states that

if V is any neighborhood of z_0 contained in U, then f(V \backslash  \{z_0\}) is dense in \mathbb C.

This can also be stated as follows:

for any \varepsilon > 0 and any complex number w, there exists a complex number z in U with |z-z_0| < \varepsilon and |f(z)-w| < \varepsilon.

Or in still more descriptive terms:

f comes arbitrarily close to any complex value in every neighbourhood of z_0.

This form of the theorem also applies if f is only meromorphic. The theorem is considerably strengthened by Picard’s great theorem, which states, in the notation above, that f assumes every complex value, with one possible exception, infinitely often on V.

Examples

The function f(z) = e^\frac{1}{z} has an essential singularity at z_0 = 0, but the function g(z) = \frac{1}{z^3} does not (it has a pole at 0). Consider the function

f(z)=e^{1/z}.

This function has the following Laurent series about the essential singular point at z_0:

f(z)=\displaystyle\sum_{n=0}^{\infty}\frac{1}{n!z^{n}}.

Because f'(z) =\frac{-e^{\frac{1}{z}}}{z^{2}} exists for all points z\ne 0 we know that f(z) is analytic in the neighborhood of z = 0. Hence it is an isolated singularity, as well as being and essential singularity. Using a change of variable to polar coordinates z = re^{i\theta} our function, f(z) = e^\frac{1}{z} becomes:

f(z)=e^{\frac{1}{r}e^{-i\theta}}=e^{\frac{1}{r}\cos(\theta)}e^{-\frac{1}{r}i \sin(\theta)}.

Taking the absolute value of both sides:

\left| f(z) \right| = \left| e^{\frac{1}{r}cos \theta} \right| \left| e^{-\frac{1}{r}i \sin(\theta)} \right | =e^{\frac{1}{r}...

Thus, for values of \theta such that \cos \theta > 0, we have f(z)\rightarrow\infty as r \rightarrow 0, and for \cos \theta < 0, f(z) \rightarrow 0 as r \rightarrow 0.

Consider what happens, for example when z takes values on a circle of diameter \frac{1}{R} tangent to the imaginary axis. This circle is given by r = \frac{1}{R} \cos \theta. Then,

f(z) = e^{R} \left[ \cos \left( R\tan \theta \right) - i \sin \left( R\tan \theta \right) \right]

and

\left| f(z) \right| = e^R.

Thus,\left| f(z) \right| may take any positive value other than zero by the appropriate choice of R. As z \rightarrow 0 on the circle, \theta \rightarrow \frac{\pi}{2} with R fixed. So this part of the equation:

\left[ \cos \left( R \tan \theta \right) - i \sin \left( R \tan \theta \right) \right]

takes on all values on the unit circle infinitely often. Hence f(z) takes on all the value of every number in the complex plane except for zero infinitely often.

Proof of the theorem

A short proof of the theorem is as follows: Take as given that function f is meromorphic on some punctured neighborhood V\backslash \left\{ {{z_0}} \right\}, and that z_0 is an essential singularity. Assume by way of contradiction that some value b exists that the function can never get close to; that is: assume that there is some complex value b and some \varepsilon > 0 such that |f(z)-b| \geq \varepsilon for all z in V at which f is defined.

Then the new function:

g(z) = \frac{1}{f(z) - b}

must be holomorphic on V\backslash \left\{ {{z_0}} \right\}, with zeroes at the poles of f, and bounded by \frac{1}{\varepsilon}. It can therefore be analytically continued (or continuously extended, or holomorphically extended) to all of V by Riemann’s analytic continuation theorem. So the original function can be expressed in terms of g:

f(z) = \frac{1}{g(z)} + b

for all arguments z in V\backslash \left\{ {{z_0}} \right\}. Consider the two possible cases for \lim_{z \rarr z_0} g(z). If the limit is 0, then f has a pole at z_0 . If the limit is not 0, then z_0 is a removable singularity of f. Both possibilities contradict the assumption that the the point z_0 is an essential singularity of the function f. Hence the assumption is false and the theorem holds.

Bài viết cũ hơn »

Blog at WordPress.com.