Variational Inference for Nonparametric Bayesian Quantile Regression (2015)


Abstract

Presents a non-parametric method for inferring quantiles & derives a novel VB (variational Bayes) approximation for it


1. Introduction

Quantile regression

  • introduced as a method of modelling variation in functions


2 main approaches used for inferring quantiles :

  • 1) CDF

  • 2) a loss function ( the pinball loss ) that penalizes predicted quantiles at the wrong locations

    • observations \(\mathbf{y_i}\) & inferred quantile \(\mathbf{f_i}\)
    • \(\mathcal{L}\left(\xi_{i}, \alpha\right)=\left\{\begin{array}{ll} \alpha \xi_{i} & \text { if } \xi_{i} \geq 0 \\ (\alpha-1) \xi_{i} & \text { if } \xi_{i}<0 . \end{array}\right.\).

    • with regularization )

      \(\mathcal{L}(\alpha, \mathbf{y}, \mathbf{f})+\lambda\|\mathbf{f}\|\) , where \(\mathcal{L}(\alpha, \mathbf{y}, \mathbf{f})=\sum_{i=1}^{N} \mathcal{L}\left(\mathbf{y}_{i}-\mathbf{f}_{i}, \alpha\right)\) ( see the numeric sketch below )
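
A minimal numeric sketch of the pinball loss & the regularized objective ( the function name, data values, and \(\lambda = 0.1\) are my own, not from the paper ) :

```python
import numpy as np

def pinball_loss(y, f, alpha):
    """Pinball loss: alpha * xi if xi >= 0, (alpha - 1) * xi if xi < 0, with xi = y - f."""
    xi = y - f
    return np.where(xi >= 0, alpha * xi, (alpha - 1) * xi)

# Asymmetry: with alpha = 0.9, points above the predicted quantile (xi > 0) are
# penalized 9x more than points below it, which pushes f toward the 0.9-quantile.
y = np.array([1.0, 2.5, 4.0])
f = np.array([2.0, 2.0, 2.0])
print(pinball_loss(y, f, alpha=0.9))        # [0.1  0.45 1.8 ]
print(pinball_loss(y, f, alpha=0.9).sum()   # regularized objective with an
      + 0.1 * np.linalg.norm(f))            # (assumed) lambda = 0.1 and Euclidean norm
```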

this paper adopts the second approach

\(\rightarrow\) the loss function is minimized, but within a Bayesian framework

and derives a non-parametric approach to modelling the quantile function


2. Bayesian Quantile Regression

goal : derive the posterior \(p\left(\mathbf{f}_{\star} \mid \mathbf{y}, \mathbf{x}_{\star}, \mathbf{x}\right)\)

  • \(\mathbf{f}_{\star}\) : prediction for some input \(\mathbf{x}_{\star}\)
  • done by marginalizing out all latent variables
  • priors )
    • on function : Gaussian Process prior
    • on \(\sigma\) : Inverse Gamma prior ( a prior-sampling sketch follows below )
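
A small sketch of the two priors ( the RBF kernel and its length-scale are placeholders I'm assuming, since the notes don't fix a kernel; the IG hyperparameters are the ones given in the next block ) :

```python
import numpy as np
from scipy.stats import invgamma

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)

# GP prior on the quantile function f: zero mean with an assumed RBF covariance
# (the notes don't fix a kernel; length-scale 0.2 is a placeholder).
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2 ** 2) + 1e-8 * np.eye(len(x))
f_samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)

# Vague Inverse-Gamma prior on sigma, IG(1e-6, 1e-6) as given in the next block;
# with such tiny hyperparameters it is nearly non-informative.
sigma_prior = invgamma(a=1e-6, scale=1e-6)
print(f_samples.shape, sigma_prior.logpdf(1.0))
```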


Data Likelihood

( = exponentiated negative Pinball loss \(\mathcal{L}\left(\xi_{i}, \alpha\right)=\left\{\begin{array}{ll} \alpha \xi_{i} & \text { if } \xi_{i} \geq 0 \\ (\alpha-1) \xi_{i} & \text { if } \xi_{i}<0 \end{array}\right.\), scaled by \(\sigma\), i.e. the asymmetric Laplace density; a numeric sketch follows below )

\(p\left(\mathbf{y}_{i} \mid \mathbf{f}_{i}, \alpha, \sigma, \mathbf{x}_{i}\right) =\frac{\alpha(1-\alpha)}{\sigma} \exp \left(-\frac{\xi_{i}\left(\alpha-I\left(\xi_{i}<0\right)\right)}{\sigma}\right)\).

  • where \(\xi_{i}=\mathbf{y}_{i}-\mathbf{f}_{i}\).

  • \(p(\mathbf{f} \mid \mathbf{x}) =\mathcal{N}(\mathbf{m}(\mathbf{x}), \mathbf{K}(\mathbf{x}))\).
  • \(p(\sigma) =\operatorname{IG}\left(10^{-6}, 10^{-6}\right)\).
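
A hedged sketch of this likelihood, i.e. the asymmetric-Laplace density; the normalization check is my own illustration :

```python
import numpy as np
from scipy.integrate import quad

def ald_loglik(y, f, alpha, sigma):
    """log p(y | f, alpha, sigma) = log[alpha(1-alpha)/sigma] - xi*(alpha - I(xi<0))/sigma,
    with xi = y - f; the exponent is exactly -pinball_loss(xi, alpha) / sigma."""
    xi = y - f
    return np.log(alpha * (1 - alpha) / sigma) - xi * (alpha - (xi < 0)) / sigma

# Sanity check: for fixed f, alpha, sigma the density integrates to ~1 over y.
area, _ = quad(lambda y: np.exp(ald_loglik(y, 0.0, 0.3, 0.5)), -60.0, 60.0)
print(area)   # ~1.0
```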


Important property of likelihood :

  • \(p\left(\mathbf{y}_{i}<\mathbf{f}_{i}\right)=\alpha\).

  • can be written as mixture of Gaussians :

    \(p\left(\mathbf{y}_{i} \mid \mathbf{f}_{i}, \mathbf{x}_{i}, \sigma, \alpha\right)=\int \mathcal{N}\left(\mathbf{y}_{i} \mid \mu_{\mathbf{y}_{i}}, \sigma_{\mathbf{y}_{i}}\right) \exp \left(-\mathbf{w}_{i}\right) d \mathbf{w}_{i}\).

    • \(\mu_{\mathbf{y}_{i}}=\mathbf{f}_{i}\left(\mathbf{x}_{i}\right)+\frac{1-2 \alpha}{\alpha(1-\alpha)} \sigma \mathbf{w}_{i}\).
    • \(\sigma_{\mathbf{y}_{i}}=\frac{2}{\alpha(1-\alpha)} \sigma^{2} \mathbf{w}_{i}\).
  • the likelihood can be represented as a joint distribution with \(\mathbf{w}\) ( which will be marginalized out ); see the Monte Carlo check below
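
A quick Monte Carlo check of the scale-mixture representation ( my own illustration; \(\mathbf{w}_i \sim \operatorname{Exp}(1)\) matches the \(\exp(-\mathbf{w}_i)\) mixing density, and I read \(\sigma_{\mathbf{y}_i}\) as the conditional variance ) :

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, sigma, f_i = 0.3, 0.5, 1.0
n = 200_000

w = rng.exponential(1.0, size=n)                       # matches the exp(-w_i) mixing density
mu_y = f_i + (1 - 2 * alpha) / (alpha * (1 - alpha)) * sigma * w
var_y = 2.0 / (alpha * (1 - alpha)) * sigma ** 2 * w   # sigma_{y_i}, read as a variance
y = rng.normal(mu_y, np.sqrt(var_y))                   # y_i | w_i ~ N(mu_{y_i}, sigma_{y_i})

print(np.mean(y < f_i))   # ~0.3: f_i sits at the alpha-quantile of the marginal of y
```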


3. Variational Bayesian Inference

Log marginal likelihood :

  • \(\log p(\mathbf{y} \mid \mathbf{x}, \alpha, \theta)\).

  • can also be expressed as

    \(\mathcal{L}(q(\mathbf{f}, \mathbf{w}, \sigma), \theta \mid \alpha)+ \mathrm{KL}(q(\mathbf{f}, \mathbf{w}, \sigma) \,\|\, p(\mathbf{f}, \mathbf{w}, \sigma \mid \mathbf{y}, \theta, \alpha))\).

    where \(\mathcal{L} =\iiint q(\mathbf{f}, \mathbf{w}, \sigma) \log \frac{p(\mathbf{f}, \mathbf{w}, \sigma, \mathbf{y} \mid \theta, \alpha)}{q(\mathbf{f}, \mathbf{w}, \sigma)} \, d \mathbf{f} \, d \mathbf{w} \, d \sigma\).


Closed form solution :

  • \(q\left(z_{i}\right)=\exp \left(\left\langle\log p(\mathbf{z}, \mathbf{y})\right\rangle_{q\left(\mathbf{z}_{\setminus i}\right)}\right) / Z\) , where the expectation is taken over all factors except \(z_{i}\) and \(Z\) is the normalizing constant


Approximate posterior on the function space : \(\mathcal{N}(\mu, \Sigma)\) ( a one-step update sketch follows these formulas )

  • \(\Sigma=\left(\left\langle\mathrm{D}^{-1}\right\rangle+\mathrm{K}^{-1}\right)^{-1}\).

  • \(\mu =\Sigma\left(\left\langle\mathrm{D}^{-1}\right\rangle \mathrm{y}-\frac{1-2 \alpha}{2}\left\langle\frac{1}{\sigma}\right\rangle 1\right)\).

    • \(\mathbf{D}=\frac{2}{\alpha(1-\alpha)} \sigma^{2} \operatorname{diag}(\mathbf{w})\).
    • \(\langle\mathbf{f}\rangle=\boldsymbol{\mu}\).
    • \(\left\langle\mathrm{ff}^{T}\right\rangle=\Sigma+\mu \mu^{T}\).
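
A minimal sketch of a single \(q(\mathbf{f})\) update implementing the formulas above ( the kernel, toy data, and the plugged-in expectations \(\langle\mathbf{D}^{-1}\rangle\), \(\langle 1/\sigma\rangle\) are placeholders of mine standing in for the other mean-field factors ) :

```python
import numpy as np

def update_q_f(K, y, alpha, E_D_inv, E_inv_sigma):
    """One mean-field update of q(f) = N(mu, Sigma).

    K           : (N, N) GP prior covariance
    E_D_inv     : (N,)   diagonal of <D^{-1}>, where D = 2/(alpha(1-alpha)) sigma^2 diag(w)
    E_inv_sigma : scalar <1/sigma> under q(sigma)
    """
    N = len(y)
    Sigma = np.linalg.inv(np.diag(E_D_inv) + np.linalg.inv(K))
    mu = Sigma @ (E_D_inv * y - 0.5 * (1 - 2 * alpha) * E_inv_sigma * np.ones(N))
    return mu, Sigma

# Toy usage: assumed RBF kernel, synthetic data, and unit placeholder expectations.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2) + 1e-6 * np.eye(20)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=20)
mu, Sigma = update_q_f(K, y, alpha=0.9, E_D_inv=np.ones(20), E_inv_sigma=1.0)
```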


4. Hyper-parameter Optimization

To be added later.
