[paper reading][Proceedings of the IEEE 2016] Taking the Human Out of the Loop: A Review of Bayesian

Contents

1 Introduction

  • design, choice, high-dim, hyperparam
    • IBM ILOG CPLEX
  • \(x^* = \arg\max_{x\in \mathcal X} f(x)\)
    • compact subset of \(\mathbb R^d\), or ...
    • stochastic output \(\mathbb E[y|f(x)]=f(x)\)
    • unbiased noisy point-wise observations
  • data efficient, evaluations are costly
  • prior, refine
  • best choice? acquisition function \(\alpha_n: \mathcal X\to \mathbb R\)
    • mean, confidence interval
  • myopic heuristics
    • uncertainty is large (exploration), or prediction is high (exploitation)
    • acquisition function: easy to find the optimum, analytic?
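
The exploration/exploitation trade-off noted above can be illustrated with a minimal sketch. It assumes an upper-confidence-bound style acquisition, \(\alpha_n(x)=\mu_n(x)+\kappa\,\sigma_n(x)\), evaluated on a finite candidate set; this particular form, the function name `ucb_select`, and the weight `kappa` are illustrative assumptions, not something these notes fix.

```python
def ucb_select(means, stds, kappa=2.0):
    """Pick the candidate index maximizing mean + kappa * std.

    means, stds: posterior mean and standard deviation of the surrogate
    model at each candidate point (assumed to be given).
    kappa: hypothetical trade-off weight; larger values favor exploration.
    """
    scores = [m + kappa * s for m, s in zip(means, stds)]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores


# Candidate 1 has the highest predicted value, candidate 2 the largest uncertainty.
means = [0.5, 0.9, 0.2]
stds = [0.1, 0.05, 0.6]
explore_idx, _ = ucb_select(means, stds, kappa=2.0)  # uncertainty dominates
exploit_idx, _ = ucb_select(means, stds, kappa=0.0)  # pure exploitation
```

With `kappa=0` the rule reduces to picking the highest predicted mean; increasing `kappa` shifts the choice toward points where the model is uncertain.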

2 Bayesian Optimization with Parametric Models

  • parametrized by \(w\)
  • \(\mathcal D\): data
  • Bayes' rule: \(p(w|\mathcal D)=p(\mathcal D|w)p(w)/p(\mathcal D)\)
    • beliefs about \(w\) after observing data \(\mathcal D\)
    • \(p(\mathcal D)\) is usually intractable, but it is only a normalizing constant
  • prior: conjugacy, posterior available analytically
  • \(K\) drugs, independent
    • to optimize \(f\), on \(K\) indices, fully parametrized
    • beta, conjugacy
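  • Thompson sampling (TS), simplest strategy, posterior prob. of optimality, estimated by Monte Carlo (MC)
    • \(a_{n+1}=\arg\max_a f_{\bar w}(a)\), with \(\bar w\) sampled from the posterior \(p(w|\mathcal D)\)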
    • no more param other than the prior
  • linear model, feature, vector, \(f_w(a)=x_a^T w\)
  • \(X\): input vectors, \(y\): outputs
  • nonlinear basis functions
    • radial
    • Fourier
    • learned from data
    • regardless of the feature map, the posterior over weights can be computed analytically
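
The \(K\)-drug setting with Beta conjugacy and Thompson sampling noted above can be sketched as follows. This is a minimal illustration under stated assumptions: binary (success/failure) outcomes, a uniform Beta(1, 1) prior on each drug, and simulated success probabilities `true_probs`; the function name and all parameter names are hypothetical.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over K independent arms (drugs)."""
    rng = random.Random(seed)
    k = len(true_probs)
    alpha = [1.0] * k  # assumed Beta(1, 1) prior: success pseudo-counts
    beta = [1.0] * k   # failure pseudo-counts
    pulls = [0] * k
    for _ in range(n_rounds):
        # Draw one sample of each arm's success rate from its posterior,
        # then act greedily with respect to that sample.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        # Simulated Bernoulli outcome (stands in for a costly real evaluation).
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update: the posterior stays a Beta distribution.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return alpha, beta, pulls


alpha, beta, pulls = thompson_sampling([0.2, 0.5, 0.8], n_rounds=500)
```

Apart from the prior pseudo-counts, the procedure has no tuning parameter, which matches the note that TS needs nothing beyond the prior.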

3 Nonparametric Models

  • start, observation variance \(\sigma^2\), zero-mean Gaussian prior \(V_0\), preserve Gaussianity
  • basis functions, linear regression, symmetric positive-semidefinite, kernel
    • intuitive similarity between pairs of points, rather than a feature map \(\Phi\)
    • tractable, linear algebra, unnecessary to explicitly define \(\Phi\)
  • GP, nonparametric model, prior mean, covariance
  • \(f|X\sim \mathcal N(m,K)\)
  • \(y|f, \sigma^2\sim \mathcal N(f,\sigma^2 I)\)
  • posterior: use \(x\) and previous data (not "abstracted by parameters")
  • kernel, structure, periodic, stationary
    • Matérn, diagonal, parametrized
  • kernel, smoothness and amplitude
  • prior, possible offset, constant, expert knowledge
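
For reference, combining the prior \(f|X\sim \mathcal N(m,K)\) with the Gaussian likelihood \(y|f,\sigma^2\sim \mathcal N(f,\sigma^2 I)\) above gives the standard closed-form GP posterior at a test point \(x\). The notation here is assumed for illustration: \(m(x)\) is the prior mean function, \(k(x,x')\) the kernel, \(\mathbf k(x)\) the vector of covariances between \(x\) and the observed inputs \(X\), and \(\mu_n\), \(\sigma_n^2\) the posterior mean and variance after \(n\) observations.

\[
\mu_n(x) = m(x) + \mathbf k(x)^T (K+\sigma^2 I)^{-1} (y - m)
\]

\[
\sigma_n^2(x) = k(x,x) - \mathbf k(x)^T (K+\sigma^2 I)^{-1} \mathbf k(x)
\]

Both expressions depend on \(x\) and on the stored data \((X, y)\) directly, which is the sense in which the posterior is not "abstracted by parameters".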