Model Convergence Analysis

A specific derivation concerning the convergence of federated learning.

Optimization objective

\[x^{*}=\arg\min_{x} f(x), \quad \text{s.t. a constraint on } \Vert x\Vert\]

General convergence analysis

Because the constraint is not convex, it is hard to address this problem directly, so we start from the unconstrained convex case.

Convex and smooth function without constraints

Assume the loss function $f$ is $L$-smooth:

\[\Vert\nabla f(x)-\nabla f(y)\Vert\leq L\Vert x-y\Vert\]
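
For example, for the quadratic $f(x)=\frac{1}{2}x^{\top}Ax$ with symmetric positive semi-definite $A$ (an illustrative case, not one used later), the gradient is $\nabla f(x)=Ax$, so

\[\Vert\nabla f(x)-\nabla f(y)\Vert=\Vert A(x-y)\Vert\leq\lambda_{\max}(A)\Vert x-y\Vert,\]

and $f$ is $L$-smooth with $L=\lambda_{\max}(A)$, the largest eigenvalue of $A$.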

Then, together with convexity, we can infer the following property.

Property 1 (inferred from $L$-smoothness and convexity):

\[f(x)-f(y) \leq \nabla f(x)^{\top}(x-y)-\frac{1}{2 L}\|\nabla f(x)-\nabla f(y)\|^2\]
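
As a quick sanity check, take the one-dimensional quadratic $f(x)=\frac{L}{2}x^2$ (my own toy example), so that $\nabla f(x)=Lx$:

\[\nabla f(x)(x-y)-\frac{1}{2L}\vert Lx-Ly\vert^{2}=Lx(x-y)-\frac{L}{2}(x-y)^{2}=\frac{L}{2}\left(x^{2}-y^{2}\right)=f(x)-f(y),\]

so Property 1 holds with equality in this case.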

Convergence analysis of the L-smooth convex optimization problem

The main goal of the convergence analysis is to prove that $x^{t+1}$ is closer to the optimum $x^*$ than $x^t$. Under the gradient descent update $x^{t+1}=x^t-\eta\nabla f(x^t)$ with step size $\eta$, we can measure this proximity with the Euclidean distance:

\[\begin{aligned} \left\|x^{t+1}-x^*\right\|^2&=\left\|x^t-\eta \nabla f\left(x^t\right)-x^*\right\|^2 \\ &=\left\|x^t-x^*\right\|^2-2 \eta \nabla f\left(x^t\right)^{\top}\left(x^t-x^*\right)+\eta^2\left\|\nabla f\left(x^t\right)\right\|^2 \end{aligned}\]

Proof of convergence

Applying Property 1 with $x=x^t$ and $y=x^*$, and noting that $f(x^t)\geq f(x^*)$, we can infer that

\[0\leq f(x^t)-f(x^*)\leq \nabla f(x^t)^{\top}(x^t-x^*)-\frac{1}{2 L}\|\nabla f(x^t)-\nabla f(x^*)\|^2\]

Since $x^*$ is an unconstrained minimizer, $\nabla f(x^*)=0$, and therefore

\[\nabla f(x^t)^{\top}(x^t-x^*)\geq\frac{1}{2 L}\|\nabla f(x^t)\|^2\]

Substituting this bound into the expansion of $\Vert x^{t+1}-x^*\Vert^2$ above, we obtain

\[\begin{aligned} \left\|x^{t+1}-x^*\right\|^2 &=\left\|x^t-x^*\right\|^2-2 \eta \nabla f\left(x^t\right)^{\top}\left(x^t-x^*\right)+\eta^2\left\|\nabla f\left(x^t\right)\right\|^2 \\ &\leq \Vert x^t-x^*\Vert^2 -\frac{\eta}{L}\Vert\nabla f(x^t)\Vert^2+\eta^2\left\|\nabla f\left(x^t\right)\right\|^2\\ & = \Vert x^t-x^*\Vert^2 -\eta\left(\frac{1}{L}-\eta\right)\left\|\nabla f\left(x^t\right)\right\|^2 \end{aligned}\]
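
Incidentally, the per-step decrease $\eta\left(\frac{1}{L}-\eta\right)\left\|\nabla f\left(x^t\right)\right\|^2$ is maximized by the step size $\eta=\frac{1}{2L}$, which gives

\[\left\|x^{t+1}-x^{*}\right\|^{2} \leq\left\|x^{t}-x^{*}\right\|^{2}-\frac{1}{4 L^{2}}\left\|\nabla f\left(x^{t}\right)\right\|^{2}\]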

Therefore, if $\eta<\frac{1}{L}$, we can safely assert that \(\left\|x^{t+1}-x^*\right\|^2 \leq\left\|x^t-x^*\right\|^2\),

which proves convergence: the distance to the optimum never increases, and it strictly decreases whenever $\nabla f(x^t)\neq 0$.
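
This behavior is easy to check numerically. Below is a minimal sketch (my own illustration, using a randomly generated convex quadratic; names such as `x_star` and `eta` are chosen only for this example) that runs gradient descent with $\eta<\frac{1}{L}$ and verifies that the distance to the optimum never increases:

```python
import numpy as np

# Convex quadratic f(x) = 0.5 * x^T A x - b^T x with symmetric positive definite A.
# Gradient: A x - b.  Smoothness constant: L = largest eigenvalue of A.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)            # symmetric positive definite
b = rng.standard_normal(5)

L = np.linalg.eigvalsh(A).max()    # smoothness constant L
x_star = np.linalg.solve(A, b)     # unconstrained optimum, where the gradient vanishes

eta = 0.9 / L                      # step size strictly below 1/L
x = rng.standard_normal(5)

dist = np.linalg.norm(x - x_star)
for t in range(200):
    x = x - eta * (A @ x - b)      # gradient descent update x^{t+1} = x^t - eta * grad f(x^t)
    new_dist = np.linalg.norm(x - x_star)
    assert new_dist <= dist + 1e-12, "distance to x* should never increase"
    dist = new_dist

print(f"final distance to x*: {dist:.2e}")
```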

This covers the unconstrained convex case. More detail about the other cases is available at Convergence Analysis - 知乎 (zhihu.com).

Convergence-rate analysis (big-$O$, big-$\Omega$, and big-$\Theta$ bounds) is also not included and is out of scope for this post; it should be covered in future posts.