2. Variational Inference Intro (2)

a) EM: reminder

We try to maximize the (marginal) log-likelihood. To do this, we derive a variational lower bound L(theta, q) and maximize this lower bound instead. We do this in an iterative way, alternating an E-step and an M-step.
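For reference, the bound comes from the standard decomposition of the log-likelihood (X: observed data, Z: latent variables, q: any distribution over Z):

```latex
\log p(X \mid \theta)
  = \underbrace{\mathbb{E}_{q(Z)}\!\left[\log \frac{p(X, Z \mid \theta)}{q(Z)}\right]}_{\mathcal{L}(\theta,\, q)}
  + \mathrm{KL}\!\left(q(Z) \,\|\, p(Z \mid X, \theta)\right)
```

Since KL >= 0, L(theta, q) is always a lower bound on log p(X | theta), and the bound is tight exactly when q(Z) equals the posterior p(Z | X, theta).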


[ E-step ]

  • maximize the lower bound with respect to q
  • maximizing the lower bound is equivalent to minimizing the KL divergence between q and the posterior p(Z | X, theta)

[ M-step ]

  • maximize the expected value of the logarithm of the joint distribution with respect to theta, i.e. theta(t+1) = argmax over theta of E_q[ log p(X, Z | theta) ] (a runnable sketch of both steps follows)
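Putting the two steps together: below is a minimal NumPy sketch (my own illustrative example, not from the original notes) of EM for a two-component 1-D Gaussian mixture. The E-step computes the exact posterior over assignments (the responsibilities), and the M-step applies the closed-form maximizers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two 1-D Gaussian clusters.
x = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 1.0, 150)])

# Initialize theta = (mixture weights pi, means mu, stds sigma).
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])

def log_gauss(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

for _ in range(50):
    # E-step: q(t_i = c) = p(t_i = c | x_i, theta) -- the exact posterior
    # over the latent cluster assignments ("responsibilities").
    log_r = np.log(pi) + log_gauss(x[:, None], mu, sigma)   # shape (N, 2)
    log_r -= log_r.max(axis=1, keepdims=True)               # numerical stability
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: theta = argmax E_q[log p(X, T | theta)], closed form for a GMM.
    n_c = r.sum(axis=0)
    pi = n_c / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_c
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_c)

print(pi, mu, sigma)   # recovers weights near 0.5/0.5 and means near -2 and 3
```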



b) E-step

We can get q(t) like below, using the full posterior:

  q(t)(Z) = p(Z | X, theta(t))


But in many cases we cannot compute the posterior exactly. If we instead use variational inference in this E-step, restricting q to a tractable family (e.g., a factorized mean-field q) and maximizing the lower bound within that family, the computation becomes much easier (see the sketch below).
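To make this concrete, here is a minimal sketch (my own illustrative example, with made-up numbers) of the classic mean-field setup: the "posterior" is a correlated 2-D Gaussian, q is restricted to a factorized form q(z1)q(z2), and coordinate ascent on the lower bound (equivalently, minimizing KL(q || posterior)) gives simple fixed-point updates for the factor means:

```python
import numpy as np

# "Posterior" to approximate: p(z) = N(mu, inv(Lam)), with correlated components.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 1.2],   # precision (inverse covariance) matrix
                [1.2, 2.0]])

# With a factorized q(z1) q(z2), each optimal factor is Gaussian with
# variance 1 / Lam_ii and mean m_i coupled to the other factor's mean:
#   m1 = mu1 - (Lam12 / Lam11) * (m2 - mu2), and symmetrically for m2.
m = np.zeros(2)               # initialize the factor means
for _ in range(20):           # coordinate ascent: update one factor at a time
    m[0] = mu[0] - Lam[0, 1] / Lam[0, 0] * (m[1] - mu[1])
    m[1] = mu[1] - Lam[1, 0] / Lam[1, 1] * (m[0] - mu[0])

print(m)                            # factor means converge to the true mean [1, -1]
print(1 / np.diag(Lam))             # factor variances (0.5) underestimate the true
print(np.diag(np.linalg.inv(Lam)))  # marginal variances (about 0.78)
```

Note how the factorized q matches the means but is too confident (smaller variances): this is the typical behavior of mean-field approximations that minimize KL(q || p).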



[ Algorithm of Variational EM ]

  • Initialize theta(0).
  • E-step: q(t+1) = argmax over q in Q of L(theta(t), q), i.e. minimize KL(q || p(Z | X, theta(t))) within the tractable family Q.
  • M-step: theta(t+1) = argmax over theta of E_q(t+1)[ log p(X, Z | theta) ].
  • Iterate the two steps until the lower bound stops improving.

(figure source: https://www.researchgate.net/publication/257870092)
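As a structural sketch of that loop (the function and its three callables are hypothetical placeholders, not from the original notes), it looks like this in Python:

```python
import numpy as np

def variational_em(x, theta0, update_q, m_step, elbo,
                   max_iters=100, tol=1e-6):
    """Structural sketch of variational EM; update_q, m_step, and elbo are
    hypothetical model-specific callables you would plug in."""
    theta, q, prev = theta0, None, -np.inf
    for _ in range(max_iters):
        # Variational E-step: improve q within the tractable family Q
        # (raises L(theta, q); q need not reach the exact posterior).
        q = update_q(x, theta, q)
        # M-step: theta = argmax of E_q[log p(x, z | theta)].
        theta = m_step(x, q)
        # Neither step decreases L, so the bound is a safe stopping criterion.
        cur = elbo(x, theta, q)
        if cur - prev < tol:
            break
        prev = cur
    return theta, q
```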

c) Summary

Let's compare the accuracy and the speed of the different methods we have seen.

Accuracy (from most to least accurate)

  • Full Inference > Mean Field > EM algorithm > Variational EM

Speed (from slowest to fastest)

  • Full Inference < Mean Field < EM algorithm < Variational EM

In other words, the faster the method, the cruder the approximation: the usual accuracy-speed trade-off.