Ngô Quốc Anh

June 11, 2021

Second-order differentiability in terms of partial derivatives

Filed under: Uncategorized — Ngô Quốc Anh @ 0:28

Let U \subset \mathbf R^n be open, let a \in U be an arbitrary point, and let f : U \to \mathbf R be a function. Recall that f is called differentiable at a if there is a linear map A : \mathbf R^n \to \mathbf R such that

\displaystyle \lim_{\|h\| \searrow 0} \frac{ \| f(a+h)-f(a) - A (h) \| }{\|h\|} = 0.

The linear map A is called the derivative of f at a and is denoted by f'(a). Notice that in the numerator of the above quotient, the symbol \| \cdot \| is simply the absolute value. This is no longer the case for the higher-order derivatives that we are going to define.

The following theorem is well-known.

Theorem 1 (1st order differentiability). The function f : U \to \mathbf R is differentiable at a \in U if all partial derivatives \partial_i f exist in a neighborhood of a and are continuous at a.

When f'(a) exists, we must have

\displaystyle f'(a) (h) = \sum_{i=1}^n \partial_i f(a) h_i.

In this note, we want to extend the above theorem to higher-order derivatives. To be more precise, we prove the following

Theorem 2 (2nd order differentiability). The function f : U \to \mathbf R is twice differentiable at a \in U if all partial derivatives \partial_{ij} f exist in a neighborhood of a and are continuous at a.

When f''(a) exists, we must have

\displaystyle f''(a) = \begin{pmatrix} \partial_{11} f(a) & \partial_{12} f(a) & \cdots & \partial_{1n} f(a) \\ \partial_{21} f(a) & \partial_{22} f(a) & \cdots & \partial_{2n} f(a) \\ \vdots & \vdots & \ddots & \vdots \\ \partial_{n1} f(a) & \partial_{n2} f(a) & \cdots & \partial_{nn} f(a) \end{pmatrix} = \begin{pmatrix} (\partial_1 f)'(a) \\ (\partial_2 f)'(a) \\ \vdots \\ (\partial_n f)'(a) \end{pmatrix},

which is the Hessian matrix of f at a. The preceding formula suggests the relation between f''(a) and (\partial_i f)'(a) for all 1\leq i \leq n.
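
The row structure of the matrix above can be checked numerically. The following is a minimal sketch and not part of the original argument: the test function f(x,y) = x^2 y + \sin y, the helper names grad_f and hessian_by_rows, and the step size are all assumptions made for illustration. Row i is obtained by differencing the single function \partial_i f, mirroring the identification of row i with (\partial_i f)'(a).

```python
import math

# Illustrative test function (an assumption, not from the post):
# f(x, y) = x^2 * y + sin(y)
def grad_f(p):
    """Analytic partial derivatives (d_1 f, d_2 f) at p = (x, y)."""
    x, y = p
    return [2.0 * x * y, x * x + math.cos(y)]

def hessian_by_rows(grad, a, step=1e-6):
    """Approximate the matrix whose row i is (d_i f)'(a).

    Entry (i, j) is a central difference of the i-th partial
    derivative in the j-th coordinate direction.
    """
    n = len(a)
    rows = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(a), list(a)
        plus[j] += step
        minus[j] -= step
        g_plus, g_minus = grad(plus), grad(minus)
        for i in range(n):
            rows[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * step)
    return rows

a = [1.0, 2.0]
H = hessian_by_rows(grad_f, a)
for row in H:
    print(row)
```

For this particular f the analytic second partial derivatives at a = (1, 2) are \partial_{11} f = 4, \partial_{12} f = \partial_{21} f = 2 and \partial_{22} f = -\sin 2, so the printed rows should agree with these values up to the finite-difference error.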

Recursively, higher-order derivatives of f, denoted by f^{(k)}, can be defined similarly. For simplicity, let us treat the case k=2. We say that the function f is twice differentiable at a if there is a bilinear map Q : \mathbf R^n \times \mathbf R^n \to \mathbf R such that

\displaystyle \lim_{\|h\| \searrow 0} \frac{\| f'(a+h)-f'(a) - Q (h)\| }{\|h\|} = 0.

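Such a bilinear map Q is called the second derivative of f at a and is denoted by f''(a). Notice that if Q is a bilinear map, then Q(h) = Q(h, \cdot) is a linear map, so the norm in the numerator above is the norm of a linear map. To prove Theorem 2, we need the following lemma.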
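
Before turning to the lemma, a concrete instance of the definition may help. The following worked example is only an illustration: the function f(x,y) = x^2 y on \mathbf R^2 is an assumption chosen for convenience and does not appear elsewhere in this note, and \| \cdot \| is taken to be the Euclidean norm.

```latex
% Illustrative choice (assumption): f(x, y) = x^2 y, a = (x, y), h = (h_1, h_2), u = (u_1, u_2).
f'(a)(u) = 2xy \, u_1 + x^2 \, u_2 ,
\\
\big( f'(a+h) - f'(a) \big)(u) = (2y h_1 + 2x h_2 + 2 h_1 h_2) \, u_1 + (2x h_1 + h_1^2) \, u_2 ,
\\
Q(h)(u) = Q(h, u) = (2y h_1 + 2x h_2) \, u_1 + 2x h_1 \, u_2 ,
\\
\big( f'(a+h) - f'(a) - Q(h) \big)(u) = 2 h_1 h_2 \, u_1 + h_1^2 \, u_2 .
```

For \|u\| = 1 the last expression is bounded in absolute value by 2|h_1 h_2| + h_1^2 \leq 2\|h\|^2, so the quotient in the definition tends to 0 and this Q is the second derivative of f at a in this example.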

I. A CALCULUS LEMMA

The key ingredient of the proof of Theorem 2 is the following:

Lemma. f''(a) exists if and only if all derivatives (\partial_i f)' (a) exist.

To prove the lemma, we first need some preparation. Obviously, the question is how to estimate \| f'(a+h)-f'(a) - f '' (a) (h)\| . Recall that if A \in \mathcal L(X,Y) is a linear map, then its norm is

\displaystyle \| A\|_{\mathcal L}=\sup_{\|x\|_X=1} \|A(x)\|_Y.

Hence, since f'(a+h)-f'(a) - f''(a)(h) is a linear map, we know that

\displaystyle \| f' (a+h)-f' (a) - f '' (a) (h)\|= \sup_{ \|u\|=1} \| \big( f'(a+h)-f ' (a) - f ''(a)(h) \big) (u) \|.
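
For a linear functional on \mathbf R^n this norm can be made explicit. The following worked equation is a sketch under the assumption that \| \cdot \| on \mathbf R^n is the Euclidean norm, which the note does not state explicitly; the coefficients c_i are placeholder names.

```latex
% Assumption: Euclidean norm on R^n; c = (c_1, ..., c_n) are the coefficients of A.
A(u) = \sum_{i=1}^n c_i u_i
\quad \Longrightarrow \quad
\| A \|_{\mathcal L} = \sup_{\|u\| = 1} \Big| \sum_{i=1}^n c_i u_i \Big| = \Big( \sum_{i=1}^n c_i^2 \Big)^{1/2},
% by the Cauchy-Schwarz inequality, with equality at u = c / \|c\| when c \neq 0.
% Consequently, for every index i,
|c_i| \leq \| A \|_{\mathcal L} \leq \sum_{j=1}^n |c_j| .
```

With c_i = \partial_i f(a+h) - \partial_i f(a) - f''(a)(h)(e_i), the left inequality is the estimate behind the direction \Rightarrow of the lemma, and the right inequality is the estimate behind the direction \Leftarrow.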

II. PROOF OF THE LEMMA: THE DIRECTION \Rightarrow

Suppose that f''(a) exists. By definition, this is a bilinear map. For clarity, let us denote f''(a) by Q. In terms of Q we obtain

\displaystyle \lim_{\|h\| \searrow 0} \frac{\| f'(a+h)-f'(a) - Q(h)\| }{\|h\|} = 0.

Since f'(a+h)-f'(a) - Q(h) is a linear map, for every u we have

\displaystyle \| \big( f'(a+h)-f ' (a) - Q(h) \big) (u) \| = \Big| \sum_{i=1}^n \big[ \partial_i f(a+h) -\partial_i f (a) -Q(h, e_i) \big] u_i \Big|,

with u = (u_1,...,u_n) \in \mathbf R^n, thanks to Q(h,u)=Q(h, \sum_i u_i e_i) = \sum_i Q(h, e_i) u_i. We now show that every (\partial_i f)'(a) exists. The idea is to show that

\displaystyle  \big| \partial_i f(a+h) -\partial_i f (a) -Q(h, e_i) \big| = o(\|h\|).

Indeed, take u = (u_1,...,u_n) with

u_j=\begin{cases} 1 &\text{ if }j=i \\ 0 & \text{ if } j \ne i\end{cases}

to get

\displaystyle  \sum_{j=1}^n \big[ \partial_j f(a+h) - \partial_j f (a) -   Q(h, e_j)   \big] u_j  =  \partial_i f(a+h) - \partial_i f (a) -   Q(h, e_i)  .

Hence

\displaystyle  \sup_{ \|u\|=1} \| \big( f'(a+h)-f ' (a) - Q(h) \big) (u) \| \geq  | \partial_i f(a+h) - \partial_i f (a) - Q(h, e_i)   |,

which implies

\displaystyle \| f' (a+h)-f' (a) - Q(h)\| \geq  | \partial_i f(a+h) -\partial_i f (a) -  Q(h, e_i)  |.

Thus we must have

\displaystyle 0\leq  \frac{ | \partial_i f(a+h) -\partial_i f (a) -   Q(h, e_i)   |}{\|h\|} \leq \frac{\| f' (a+h)-f' (a) - Q(h)\|}{\|h\|} ,

which immediately implies that (\partial_i f)'(a) exists for any 1 \leq i \leq n, with (\partial_i f)'(a)(h) = Q(h, e_i), because the map h \mapsto Q(h, e_i) is linear.

III. PROOF OF THE LEMMA: THE DIRECTION \Leftarrow

Conversely, suppose that all derivatives (\partial_i f)'(a) exist. By definition, we must have

\displaystyle \lim_{\|h\| \searrow 0} \frac{ \| \partial_i f(a+h) - \partial_i f(a) - (\partial_i f)'(a) (h) \| }{\|h\|} = 0

for each 1 \leq i \leq n. Keep in mind that each (\partial_i f)'(a) is a vector, namely

\displaystyle  (\partial_i f)'(a) = \big( \partial_{i1} f (a),...,\partial_{in} f(a) \big)  .

Let us construct the following bilinear map

\displaystyle Q : (u,e_i) \mapsto \big\langle (\partial_i f)'(a) , u \big\rangle,

namely, with v=(v_1,...,v_n), we have

\displaystyle Q : (u,v) \mapsto   \sum_{i=1}^n \big\langle (\partial_i f)'(a) , u \big\rangle v_i  .

Making use of Q and with \|u\| = 1, we can verify

\displaystyle | \big( f'(a+h)-f ' (a) - Q(h) \big) (u) | = \Big| \sum_{i=1}^n \big[ \partial_i f(a+h) -\partial_i f (a)-  \big\langle (\partial_i f)'(a) , h \big\rangle  \big] u_i \Big| \leq \sum_{i=1}^n \Big| \partial_i f(a+h) -\partial_i f (a)-  \big\langle (\partial_i f)'(a) , h \big\rangle  \Big| = o(\|h\|),

thanks to |u_i| \leq \|u\|=1 and the existence of (\partial_i f)'(a) for all 1 \leq i \leq n. Since this bound does not depend on u, we have just shown that

\displaystyle  \| f'(a+h)-f ' (a) - Q(h) \|  =\sup_{\|u\|=1} | \big( f'(a+h)-f ' (a) - Q(h) \big) (u) |  =o(\|h\|),

proving that f is twice differentiable at a, with f''(a) = Q.
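
The limit in the definition of twice differentiability can also be observed numerically. The following is a minimal sketch under stated assumptions, not a proof: the test function f(x,y) = x^2 y + \sin y, the helper names, the base point and the direction of h are all chosen here for illustration, and the norm of a linear functional is computed as the Euclidean norm of its coefficient vector.

```python
import math

# Illustrative test function (an assumption, not from the post):
# f(x, y) = x^2 * y + sin(y), with analytic first and second partials.
def grad_f(p):
    x, y = p
    return [2.0 * x * y, x * x + math.cos(y)]

def second_partials(p):
    """Row i is (d_i f)'(p) = (d_{i1} f(p), d_{i2} f(p))."""
    x, y = p
    return [[2.0 * y, 2.0 * x], [2.0 * x, -math.sin(y)]]

def euclid(v):
    return math.sqrt(sum(c * c for c in v))

def remainder_quotient(a, h):
    """Return || f'(a+h) - f'(a) - Q(h) || / ||h||.

    Q(h)(u) = sum_i <(d_i f)'(a), h> u_i, so the linear functional in the
    numerator has coefficients d_i f(a+h) - d_i f(a) - <(d_i f)'(a), h>.
    """
    g0 = grad_f(a)
    g1 = grad_f([ai + hi for ai, hi in zip(a, h)])
    rows = second_partials(a)
    coeffs = [
        g1[i] - g0[i] - sum(r * hj for r, hj in zip(rows[i], h))
        for i in range(len(a))
    ]
    return euclid(coeffs) / euclid(h)

a = [1.0, 2.0]
direction = [0.6, -0.8]  # unit vector; h = t * direction
quotients = [
    remainder_quotient(a, [t * d for d in direction])
    for t in (1e-1, 1e-2, 1e-3)
]
print(quotients)
```

For this smooth test function the quotients should shrink roughly in proportion to \|h\|, which is consistent with the o(\|h\|) estimate above. A single direction of h is sampled here, so the run illustrates the limit rather than verifying it.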

IV. APPLICATION OF THE LEMMA: PROOF OF THEOREM 2

We are now in a position to prove our main result. Fix an arbitrary 1 \leq i \leq n. Because all second-order partial derivatives \partial_{ij}f with 1 \leq j \leq n exist in a neighborhood of a and are continuous at a, Theorem 1 applied to \partial_i f shows that \partial_i f is differentiable at a. Since i is arbitrary, the lemma above allows us to conclude that f is twice differentiable at a.
