# Graph Optimization 3 - Optimization Update

From the last article, we obtained the following negative log-likelihood function as our optimization objective:

$F(x)=\sum_{ij}{e_{ij}(x)^T\Omega_{ij}e_{ij}(x)}$

The optimization problem then becomes:

$x^*=\arg\min_xF(x)$
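
To make the objective concrete, here is a minimal sketch of evaluating $$F(x)$$ over a set of edges; `error_fn` and the edge layout are hypothetical placeholders, not any particular library's API:

```python
def total_cost(x, edges, error_fn):
    """Evaluate F(x) = sum_ij e_ij(x)^T * Omega_ij * e_ij(x).

    edges    -- iterable of (i, j, Omega_ij), with Omega_ij the information
                matrix of the constraint between nodes i and j (assumed layout)
    error_fn -- hypothetical callable returning the error vector e_ij(x)
    """
    F = 0.0
    for i, j, omega in edges:
        e = error_fn(x, i, j)  # residual e_ij(x)
        F += e @ omega @ e     # scalar term e^T * Omega * e
    return F
```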

This article explains how this optimization problem can be solved with the Gauss-Newton method.

Given an initial rough estimate $$x_0$$ of $$x$$, the next step is to look for a $$\Delta x^*$$ that achieves:

$\Delta x^* = \arg \min_{\Delta x}F(x_0+\Delta x)$

$$F(x_0+\Delta x)$$ can be expanded as:

$F(x_0+\Delta x)=\sum_{ij}{e_{ij}(x_0+\Delta x)}^T\Omega_{ij}e_{ij}(x_0+\Delta x)$

Applying a first-order Taylor expansion to $$e_{ij}$$ around $$x_0$$, we get:

$e_{ij}(x_0+\Delta x)\approx e_{ij}(x_0)+\frac{\partial e_{ij}}{\partial x}\Big|_{x_0}\Delta x,$

where the Jacobian matrix is defined as:

$J_{ij}(x_0)=\frac{\partial e_{ij}}{\partial x}\Big|_{x_0}$
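
When an analytic Jacobian is inconvenient to derive, $$J_{ij}$$ can be approximated numerically. The central-difference sketch below is a generic illustration; the step size `eps` is an arbitrary assumption:

```python
import numpy as np

def numerical_jacobian(error_fn, x0, i, j, eps=1e-6):
    """Central-difference approximation of J_ij = d e_ij / dx at x0."""
    e0 = error_fn(x0, i, j)
    J = np.zeros((e0.size, x0.size))
    for k in range(x0.size):
        dx = np.zeros_like(x0, dtype=float)
        dx[k] = eps
        # perturb the k-th coordinate of x in both directions
        J[:, k] = (error_fn(x0 + dx, i, j) - error_fn(x0 - dx, i, j)) / (2 * eps)
    return J
```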

Then we get:

$F(x_0+\Delta x)\approx \sum_{ij}(e_{ij}(x_0)+J_{ij}(x_0)\Delta x)^T\Omega_{ij}(e_{ij}(x_0)+J_{ij}(x_0)\Delta x)$

$=\sum_{ij}\{e_{ij}(x_0)^T\Omega_{ij}e_{ij}(x_0)+2e_{ij}(x_0)^T\Omega_{ij}J_{ij}(x_0)\Delta x+\Delta x^TJ_{ij}(x_0)^T\Omega_{ij}J_{ij}(x_0)\Delta x\}$

now we define:

$c_{ij}(x_0)=e_{ij}(x_0)^T\Omega_{ij}e_{ij}(x_0)$

$b_{ij}(x_0)=J_{ij}(x_0)^T\Omega_{ij}e_{ij}(x_0)$

$H_{ij}(x_0)=J_{ij}(x_0)^T\Omega_{ij}J_{ij}(x_0)$

The objective then simplifies to:

$F(x_0+\Delta x)\approx\sum_{ij}\left[c_{ij}(x_0)+2b_{ij}(x_0)^T\Delta x+\Delta x^TH_{ij}(x_0)\Delta x\right]$
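
These three terms map directly to code. The sketch below computes them for a single edge, reusing the hypothetical `error_fn` and the `numerical_jacobian` helper from above:

```python
def edge_terms(error_fn, x0, i, j, omega):
    """Return (c_ij, b_ij, H_ij) of the quadratic approximation at x0."""
    e = error_fn(x0, i, j)                      # e_ij(x0)
    J = numerical_jacobian(error_fn, x0, i, j)  # J_ij(x0)
    c = e @ omega @ e                           # c_ij = e^T * Omega * e
    b = J.T @ omega @ e                         # b_ij = J^T * Omega * e
    H = J.T @ omega @ J                         # H_ij = J^T * Omega * J
    return c, b, H
```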

Moreover, if $$F$$ is convex, or at least locally convex in a region around $$x_0$$, $$F(x_0+\Delta x)$$ is minimized where its derivative with respect to $$\Delta x$$ equals zero:

$\frac{dF(x_0+\Delta x)}{d\Delta x}=0,$

$\sum_{ij}[2b_{ij}(x_0)+2H_{ij}(x_0)\Delta x^*]=0,$

then we can solve for $$\Delta x^*$$:

$\sum_{ij}H_{ij}(x_0)\Delta x^*=-\sum_{ij}b_{ij}(x_0)$

The above equation can be arranged into a sparse linear system of the form

$H(x_0)\Delta x^*=-b(x_0)$
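
Because each edge touches only a few entries of the state, $$H$$ is assembled in sparse form. A minimal SciPy sketch follows; the `idx` layout (which global state indices an edge touches) is a hypothetical convention, and `spsolve` uses a sparse LU here, which a true sparse Cholesky (e.g. CHOLMOD via scikit-sparse) could replace since $$H$$ is symmetric positive definite:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def solve_increment(edge_blocks, n):
    """Assemble the global sparse system H * dx = -b and solve for dx.

    edge_blocks -- iterable of (idx, b_ij, H_ij); idx lists the global state
                   indices this edge touches (a hypothetical state layout),
                   b_ij and H_ij are the local gradient and Hessian blocks
    n           -- dimension of the full state vector x
    """
    rows, cols, vals = [], [], []
    b = np.zeros(n)
    for idx, b_ij, H_ij in edge_blocks:
        idx = np.asarray(idx)
        r, c = np.meshgrid(idx, idx, indexing="ij")
        rows.append(r.ravel())
        cols.append(c.ravel())
        vals.append(np.asarray(H_ij).ravel())
        b[idx] += b_ij
    # duplicate (row, col) entries are summed when COO is converted to CSC
    H = sparse.coo_matrix(
        (np.concatenate(vals), (np.concatenate(rows), np.concatenate(cols))),
        shape=(n, n),
    ).tocsc()
    return spsolve(H, -b)  # sparse LU here; a sparse Cholesky would also work
```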

Now $$\Delta x^*$$ can be solved by sparse Cholesky factorization. Once $$\Delta x^*$$ is found, we can compute:

$x_1=x_0+\Delta x^*$

The final $$x$$ can then be solved iteratively:

$x_n=x_{n-1}+\Delta x^*_{n-1}$
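
Putting the pieces together, a sketch of the full Gauss-Newton loop might look as follows; for readability it sums dense per-edge terms (via the `edge_terms` sketch above) and uses a dense solve, while a large problem would use the sparse assembly shown earlier. The iteration cap and convergence threshold are arbitrary assumptions:

```python
import numpy as np

def gauss_newton(x0, edges, error_fn, max_iters=20, tol=1e-8):
    """Iterate x_n = x_{n-1} + dx*_{n-1} until the increment is negligible."""
    x = x0.astype(float)
    for _ in range(max_iters):
        H = np.zeros((x.size, x.size))
        b = np.zeros(x.size)
        for i, j, omega in edges:
            # quadratic terms of this edge at the current linearization point
            _, b_ij, H_ij = edge_terms(error_fn, x, i, j, omega)
            H += H_ij
            b += b_ij
        dx = np.linalg.solve(H, -b)   # H(x_{n-1}) * dx* = -b(x_{n-1})
        x = x + dx                    # x_n = x_{n-1} + dx*
        if np.linalg.norm(dx) < tol:  # stop once the update is negligible
            break
    return x
```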

