Questions tagged [bayesian-optimization]

For questions related to Bayesian optimization (BO), a technique for optimizing an unknown function that is expensive to evaluate. BO combines a surrogate model (usually a Gaussian process, which approximates the unknown function), Bayesian inference (to update the surrogate as new evaluations arrive), and an acquisition function (which guides the choice of the next point to evaluate). BO is commonly used for hyper-parameter optimization.
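The loop sketched in the description — fit a surrogate, maximize an acquisition function, evaluate the expensive function at the chosen point, repeat — can be illustrated with a minimal, hypothetical 1-D example. Everything below (the stand-in objective `expensive_f`, the grid, the budget) is an illustrative assumption, not taken from any particular question:

```python
# Minimal Bayesian-optimization sketch: a Gaussian-process surrogate models
# the expensive function, and an expected-improvement (EI) acquisition
# function picks the next point to evaluate.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_f(x):
    # Stand-in for the costly black box; maximized at x = 2.
    return -(x - 2.0) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 4.0, size=(3, 1))   # a few initial evaluations
y = expensive_f(X).ravel()

grid = np.linspace(0.0, 4.0, 200).reshape(-1, 1)  # candidate points
for _ in range(10):
    gp = GaussianProcessRegressor().fit(X, y)     # update the surrogate
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.max()
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
    x_next = grid[np.argmax(ei)]                  # acquisition picks next point
    X = np.vstack([X, [x_next]])
    y = np.append(y, expensive_f(x_next[0]))

print(float(X[np.argmax(y)]))  # best observed x, expected to lie near 2
```

Note that the acquisition function is what trades off exploration (large `sigma`) against exploitation (large `mu`); real libraries (e.g. those asked about in the questions below) use more careful acquisition maximization than a fixed grid.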

15 questions
6
votes
3 answers

What is a "surrogate model"?

In the following paragraph from the book Automated Machine Learning: Methods, Systems, Challenges (by Frank Hutter et al.): In this section we first give a brief introduction to Bayesian optimization, present alternative surrogate models used in it,…
4
votes
1 answer

Bayesian hyperparameter optimization, is it worth it?

In the Deep Learning book by Goodfellow et al., section 11.4.5 (p. 438), the following claims can be found: Currently, we cannot unambiguously recommend Bayesian hyperparameter optimization as an established tool for achieving better deep learning…
4
votes
0 answers

How can I draw a Bayesian network for this problem with birds?

I am working on the following problem to gain an understanding of Bayesian networks and I need help drawing it: Birds frequently appear in the tree outside of your window in the morning and evening; these include finches, cardinals and robins…
3
votes
0 answers

Can we use a Gaussian process to approximate the belief distribution at every instant in a POMDP?

Suppose $x_{t+1} \sim \mathbb{P}(\cdot \mid x_t, a_t)$ denotes the state transition dynamics in a reinforcement learning (RL) problem. Let $y_{t+1} \sim \mathbb{P}(\cdot \mid x_{t+1})$ denote the noisy observation, or the imperfect state information. Let…
2
votes
1 answer

Which community does using a Bayesian regression model as a reward function, with exploration-vs.-exploitation challenges, fall under?

I am trying to find research papers addressing a problem that, in my opinion, deserves significant attention. However, I am having difficulty locating relevant information. To illustrate the problem at hand, consider a multivariate Bayesian…
2
votes
0 answers

Minimum sampling for maximising the prediction accuracy

Suppose that I'm training a machine learning model to predict people's age from a picture of their face. Let's say that I have a dataset of people from 1-year-olds to 100-year-olds, but I want to choose just 9 (arbitrary) ages out of this 100 age…
noone
2
votes
0 answers

Is it normal to see oscillations in tested hyperparameters during Bayesian optimisation?

I've been trying out Bayesian hyperparameter optimisation (with TPE) on a simple CNN applied to the MNIST handwritten-digit dataset. I noticed that, over iterations of the optimisation loop, the tested parameters appear to oscillate slowly. Here's…
1
vote
0 answers

Variational inference, but with a weighted log-likelihood

I would like to know if it's correct to substitute into the ELBO formula a weighted sum of the log-likelihood $$\sum_i \mathbb{E}_{q_{\theta}(w)}[w_i \ln{p(y_i \mid f^{w}(x_i))}]$$ in place of the traditional sum. My problem is that my dataset comes with the…
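For comparison, the usual (unweighted) ELBO that the excerpt modifies reads, in the same notation (a sketch; $p(w)$ denotes the prior over the weights, an assumption not stated in the excerpt):

$$\mathcal{L}(\theta) = \sum_i \mathbb{E}_{q_{\theta}(w)}\left[\ln p(y_i \mid f^{w}(x_i))\right] - \mathrm{KL}\left(q_{\theta}(w) \,\|\, p(w)\right),$$

so the proposal amounts to replacing the first term with $\sum_i \mathbb{E}_{q_{\theta}(w)}[w_i \ln p(y_i \mid f^{w}(x_i))]$ while leaving the KL term unchanged.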
1
vote
0 answers

Why does importance sampling work with latent variable models?

Caveat: importance sampling doesn't actually work for variational auto-encoders, but the question makes sense regardless. In "L4 Latent Variable Models (VAE) -- CS294-158-SP20 Deep Unsupervised Learning", we see the following. We want to optimize the…
1
vote
0 answers

Alternatives to Bayesian optimization

I am given a dataset $\mathcal{D} = \{\mathbf{x}_i\}_{i=1}^n$ and I need to find the point (in my case, a material) $\mathbf{x}^*$ that maximizes a property $y$ (which can be obtained from a black-box function $f(\mathbf{x})$), performing the least…
1
vote
0 answers

Bayesian optimization with confidence bound not working

I have a simple MLP for which I want to optimize some hyperparameters. I have fixed the number of hidden layers (for unrelated reasons) to be 3, so the hyperparameters being optimized through Bayesian optimization are just the number of neurons per…
NeuroEng
1
vote
1 answer

How can I interpret the value returned by score(X) method of sklearn.neighbors.KernelDensity?

According to the sklearn KDE documentation, the score(X) method of sklearn.neighbors.KernelDensity is supposed to: Compute the log-likelihood of each sample under the model For the 'gaussian' kernel, I have implemented hyper-parameter tuning for the…
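A side note on the excerpt above: in my understanding of the sklearn API, `score(X)` returns the *total* log-likelihood (a single float, the sum over samples), while `score_samples(X)` returns the per-sample log-densities the quoted docstring describes. A small sketch (the data below is an arbitrary assumption):

```python
# Check that KernelDensity.score(X) equals the sum of score_samples(X).
import numpy as np
from sklearn.neighbors import KernelDensity

X = np.array([[0.0], [0.5], [1.0], [1.5]])
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

per_sample = kde.score_samples(X)   # log-density at each sample point
total = kde.score(X)                # single float: sum of the above

print(np.isclose(total, per_sample.sum()))  # True
```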
1
vote
1 answer

Understanding Bayesian Optimisation graph

I came across the concept of the Bayesian Occam's razor in the book Machine Learning: A Probabilistic Perspective. According to the book: Another way to understand the Bayesian Occam's razor effect is to note that probabilities must sum to one. Hence…
user9947
1
vote
1 answer

Are there Python packages for recent Bayesian optimization methods?

I want to try and compare different optimization methods on some datasets. I know that scikit-learn has corresponding functions for grid search and random search. However, I also need a package (or multiple ones) for…
0
votes
0 answers

Very high dimensional optimization with large budget, requiring high quality solutions

What would theoretically be the best-performing optimization algorithm(s) in this case?
- Very high dimensional problem: 250–500 parameters
- Goal is to obtain very high quality solutions, not just "good" solutions
- Parameters form multiple…