# Ngô Quốc Anh

## August 10, 2009

### Convex function

Filed under: Các Bài Tập Nhỏ, Linh Tinh, Nghiên Cứu Khoa Học — Ngô Quốc Anh @ 19:56

In mathematics, a real-valued function $f$ defined on an interval (or on any convex subset of some vector space) is called convex, concave upwards, concave up or convex cup, if for any two points $\bf x$ and $\bf y$ in its domain $C$ and any $t \in [0,1]$, we have

$\displaystyle f(t{\bf x}+(1-t){\bf y})\leq t f({\bf x})+(1-t)f({\bf y})$.

In other words, a function is convex if and only if its epigraph (the set of points lying on or above the graph) is a convex set.

Pictorially, a function is called ‘convex’ if the function lies below the straight line segment connecting two points, for any two points in the interval.

A function is called strictly convex if

$\displaystyle f(t{\bf x}+(1-t){\bf y}) < t f({\bf x})+(1-t)f({\bf y})$,

for any $\displaystyle t \in (0,1)$ and $\displaystyle {\bf x} \neq {\bf y}$.

A function $\displaystyle f$ is said to be concave if $\displaystyle -f$ is convex.

First-order condition

$f$ is differentiable if domain of $f$ is open and the gradient

$\displaystyle \nabla f\left({\bf x}\right) =\left({\frac{{\partial f\left({\bf x}\right)}}{{\partial{x_{1}}}},\frac{{\partial f\left({\bf x}\right)}}{{\partial{x_{2}}}},...,\frac{{\partial f\left({\bf x}\right)}}{{\partial{x_{n}}}}}\right)$

exists at each ${\bf x}=(x_1,x_2,...,x_n)$.

1st-order condition: differentiable $f$ with convex domain is convex iff

$\displaystyle f\left( {\bf y} \right) \geq f\left( {\bf x} \right) + \nabla f{\left( {\bf x} \right)^T}\left( {{\bf y} - {\bf x}} \right)$

for all $\bf x, y$.

Second-order conditions

$f$ is twice differentiable if domain of $f$ is open and the Hessian

$\displaystyle {\nabla ^2}f({\mathbf{x}}) = {\left( {\frac{{{\partial ^2}f\left( {\mathbf{x}} \right)}}{{\partial {x_i}\partial {x_j}}}} \right)_{i,j = \overline {1,n} }}$

exists at each $\bf x$.

2nd-order conditions: for twice differentiable $f$ with convex domain $f$ is convex if and only if

$\displaystyle {\nabla ^2}f({\mathbf{x}}) \geq 0$

for all $\bf x$. If

$\displaystyle {\nabla ^2}f({\mathbf{x}})> 0$

for all $\bf x$, then $f$ is strictly convex.

Examples

With

$\displaystyle f(\mathbf{x})=\frac{1}{2}\mathbf{x}^TP\mathbf{x}+\mathbf{q}^T\mathbf{x}+\mathbf{r}$

we have

$\displaystyle \nabla f(\mathbf{x}) = P\mathbf{x}+q, \quad \nabla^2f(\mathbf{x})=P$.

Therefore $f(\mathbf{x})$ is convex if $P \geq 0$.

* Least-squares objective

With

$\displaystyle f(\mathbf{x})=\|A\mathbf{x}-b\|_2^2$

we have

$\displaystyle \nabla f(\mathbf{x})=2A^T(A\mathbf{x}-b), \quad \nabla^2f(\mathbf{x})=2A^TA$.

Therefore, least-squares objective is convex for any $A$.

With $f(x,y)=\frac{x^2}{y}$ we have

$\displaystyle \nabla f\left({x,y}\right) =\left({\begin{array}{*{20}{c}}{\frac{{2x}}{y}}&{-\frac{{{x^{2}}}}{{{y^{2}}}}}\\ \end{array}}\right)$

and thus

$\displaystyle {\nabla^{2}}f\left({x,y}\right) =\left({\begin{array}{*{20}{c}}{\frac{2}{y}}&{-\frac{{2x}}{{{y^{2}}}}}\\ {-\frac{{2x}}{{{y^{2}}}}}&{\frac{{2{x^{2}}}}{{{y^{3}}}}}\\ \end{array}}\right) =\frac{2}{{{y^{3}}}}\left({\begin{array}{*{20}{c}}y\\ {-x}\\ \end{array}}\right)\left({\begin{array}{*{20}{c}}y &{-x}\\ \end{array}}\right)\geq 0$

convex for $y>0$.

* Log-sum-exp

$\displaystyle f\left( \mathbf{x} \right) = \log \sum\limits_{k = 1}^n {\exp {x_k}}$

is convex. To this end, note that

$\displaystyle {\nabla^{2}}f\left(\mathbf{x}\right) =\frac{\mathbf{1}}{{{1^{T}}\mathbf{z}}}{\rm diag}\left(\mathbf{z}\right)-\frac{1}{{{{\left({{\mathbf{1}^{T}}\mathbf{z}}\right)}^{2}}}}\mathbf{z}\mathbf{z}^{T},\quad{z_{k}}=\exp{x_{k}}$

To show $\nabla^2 f(\mathbf{x}) \geq 0$, we must verify that $\mathbf{v}^T \nabla^2f(\mathbf{x})\mathbf{v} \geq 0$ for all $\mathbf{v}$. Indeed,

$\displaystyle {{\mathbf{v}}^{T}}{\nabla^{2}}f({\mathbf{x}}){\mathbf{v}}=\frac{{\left({\sum\limits_{k = 1}^{n}{{z_{k}}v_{k}^{2}}}\right)\left({\sum\limits_{k = 1}^{n}{{z_{k}}}}\right)-{{\left({\sum\limits_{k = 1}^{n}{{v_{k}}{z_{k}}}}\right)}^{2}}}}{{{{\left({\sum\limits_{k = 1}^{n}{{z_{k}}}}\right)}^{2}}}}\geqslant 0$.

* Geometric mean

$\displaystyle f\left( \mathbf{x} \right) = {\left( {\prod\limits_{k = 1}^n {{x_k}} } \right)^{\frac{1} {n}}}$

on $\mathbb R^n_{++}$ is concave (similar proof as for log-sum-exp).

least-squares objective