Ngô Quốc Anh

August 10, 2009

Convex function

Filed under: Các Bài Tập Nhỏ, Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 19:56

In mathematics, a real-valued function f defined on an interval (or on any convex subset of some vector space) is called convex, concave upwards, concave up or convex cup, if for any two points \bf x and \bf y in its domain C and any t \in [0,1], we have

\displaystyle f(t{\bf x}+(1-t){\bf y})\leq t f({\bf x})+(1-t)f({\bf y}).

In other words, a function is convex if and only if its epigraph (the set of points lying on or above the graph) is a convex set.

Pictorially, a function is called ‘convex’ if the function lies below the straight line segment connecting two points, for any two points in the interval.

A function is called strictly convex if

\displaystyle f(t{\bf x}+(1-t){\bf y}) < t f({\bf x})+(1-t)f({\bf y}),

for any \displaystyle t \in (0,1) and \displaystyle {\bf x} \neq {\bf y}.

A function \displaystyle f is said to be concave if \displaystyle -f is convex.

First-order condition

f is differentiable if domain of f is open and the gradient

\displaystyle \nabla f\left({\bf x}\right) =\left({\frac{{\partial f\left({\bf x}\right)}}{{\partial{x_{1}}}},\frac{{\partial f\left({\bf x}\right)}}{{\partial{x_{2}}}},...,\frac{{\partial f\left({\bf x}\right)}}{{\partial{x_{n}}}}}\right)

exists at each {\bf x}=(x_1,x_2,...,x_n).

1st-order condition: differentiable f with convex domain is convex iff

\displaystyle f\left( {\bf y} \right) \geq f\left( {\bf x} \right) + \nabla f{\left( {\bf x} \right)^T}\left( {{\bf y} - {\bf x}} \right)

for all \bf x, y.

Second-order conditions

f is twice differentiable if domain of f is open and the Hessian

\displaystyle {\nabla ^2}f({\mathbf{x}}) = {\left( {\frac{{{\partial ^2}f\left( {\mathbf{x}} \right)}}{{\partial {x_i}\partial {x_j}}}} \right)_{i,j = \overline {1,n} }}

exists at each \bf x.

2nd-order conditions: for twice differentiable f with convex domain f is convex if and only if

\displaystyle {\nabla ^2}f({\mathbf{x}}) \geq 0

for all \bf x. If

\displaystyle {\nabla ^2}f({\mathbf{x}})> 0

for all \bf x, then f is strictly convex.


* Quadratic function


\displaystyle f(\mathbf{x})=\frac{1}{2}\mathbf{x}^TP\mathbf{x}+\mathbf{q}^T\mathbf{x}+\mathbf{r}

we have

\displaystyle \nabla f(\mathbf{x}) = P\mathbf{x}+q, \quad \nabla^2f(\mathbf{x})=P.

Therefore f(\mathbf{x}) is convex if P \geq 0.

* Least-squares objective


\displaystyle f(\mathbf{x})=\|A\mathbf{x}-b\|_2^2

we have

\displaystyle \nabla f(\mathbf{x})=2A^T(A\mathbf{x}-b), \quad \nabla^2f(\mathbf{x})=2A^TA.

Therefore, least-squares objective is convex for any A.

* Quadratic-over-linear

With f(x,y)=\frac{x^2}{y} we have

\displaystyle \nabla f\left({x,y}\right) =\left({\begin{array}{*{20}{c}}{\frac{{2x}}{y}}&{-\frac{{{x^{2}}}}{{{y^{2}}}}}\\ \end{array}}\right)

and thus

\displaystyle {\nabla^{2}}f\left({x,y}\right) =\left({\begin{array}{*{20}{c}}{\frac{2}{y}}&{-\frac{{2x}}{{{y^{2}}}}}\\ {-\frac{{2x}}{{{y^{2}}}}}&{\frac{{2{x^{2}}}}{{{y^{3}}}}}\\ \end{array}}\right) =\frac{2}{{{y^{3}}}}\left({\begin{array}{*{20}{c}}y\\ {-x}\\ \end{array}}\right)\left({\begin{array}{*{20}{c}}y &{-x}\\ \end{array}}\right)\geq 0

convex for y>0.

* Log-sum-exp

\displaystyle f\left( \mathbf{x} \right) = \log \sum\limits_{k = 1}^n {\exp {x_k}}

is convex. To this end, note that

\displaystyle {\nabla^{2}}f\left(\mathbf{x}\right) =\frac{\mathbf{1}}{{{1^{T}}\mathbf{z}}}{\rm diag}\left(\mathbf{z}\right)-\frac{1}{{{{\left({{\mathbf{1}^{T}}\mathbf{z}}\right)}^{2}}}}\mathbf{z}\mathbf{z}^{T},\quad{z_{k}}=\exp{x_{k}}

To show \nabla^2 f(\mathbf{x}) \geq 0, we must verify that \mathbf{v}^T \nabla^2f(\mathbf{x})\mathbf{v} \geq 0 for all \mathbf{v}. Indeed,

\displaystyle {{\mathbf{v}}^{T}}{\nabla^{2}}f({\mathbf{x}}){\mathbf{v}}=\frac{{\left({\sum\limits_{k = 1}^{n}{{z_{k}}v_{k}^{2}}}\right)\left({\sum\limits_{k = 1}^{n}{{z_{k}}}}\right)-{{\left({\sum\limits_{k = 1}^{n}{{v_{k}}{z_{k}}}}\right)}^{2}}}}{{{{\left({\sum\limits_{k = 1}^{n}{{z_{k}}}}\right)}^{2}}}}\geqslant 0.

* Geometric mean

\displaystyle f\left( \mathbf{x} \right) = {\left( {\prod\limits_{k = 1}^n {{x_k}} } \right)^{\frac{1} {n}}}

on \mathbb R^n_{++} is concave (similar proof as for log-sum-exp).


least-squares objective

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Blog at

%d bloggers like this: