For questions related to Bayesian optimization (BO), a technique for optimizing an unknown function that is expensive to evaluate. BO is built on a surrogate model (usually a Gaussian process) that models the unknown function, Bayesian inference to update the surrogate as new evaluations arrive, and an acquisition function that guides the choice of the next point to evaluate. BO is commonly used for hyper-parameter optimization.
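As a rough illustration of how these pieces fit together, here is a minimal sketch of a BO loop: a Gaussian-process surrogate from scikit-learn, refit after every evaluation, with a hand-rolled expected-improvement acquisition choosing the next point. The toy objective `f`, the search grid, and all settings are illustrative assumptions, not part of the tag description.

```python
# Minimal Bayesian-optimization sketch (illustrative only): GP surrogate + expected improvement.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):
    # toy stand-in for an expensive black-box objective (to be maximized)
    return -np.sin(3 * x) - x**2 + 0.7 * x

grid = np.linspace(-2.0, 2.0, 400).reshape(-1, 1)   # candidate points
X = np.array([[-1.5], [0.0], [1.5]])                # initial design
y = f(X).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
for _ in range(10):
    gp.fit(X, y)                                     # Bayesian update of the surrogate
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.max()
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
    x_next = grid[np.argmax(ei)].reshape(1, -1)      # acquisition picks the next point
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

print("best x:", X[np.argmax(y)][0], "best f(x):", y.max())
```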
Questions tagged [bayesian-optimization]
15 questions
6
votes
3 answers
What is a "surrogate model"?
In the following paragraph from the book Automated Machine Learning: Methods, Systems, Challenges (by Frank Hutter et al.)
In this section we first give a brief introduction to Bayesian optimization, present alternative surrogate models used in it,…

yousef yegane
- 163
- 1
- 6
4
votes
1 answer
Bayesian hyperparameter optimization, is it worth it?
In the Deep Learning book by Goodfellow et al., section 11.4.5 (p. 438), the following claims can be found:
Currently, we cannot unambiguously recommend Bayesian hyperparameter optimization as an established tool for achieving better deep learning…

Stefano Barone
- 183
- 4
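For the question above on whether Bayesian hyperparameter optimization is worth it, the easiest way to judge is usually to run it alongside grid or random search. Below is a hedged sketch (assuming scikit-optimize is installed and compatible with your scikit-learn version) using its BayesSearchCV; the dataset, estimator, and search space are placeholder assumptions, not the setup discussed in the book.

```python
# Hedged sketch: Bayesian hyper-parameter search via scikit-optimize's BayesSearchCV.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from skopt import BayesSearchCV
from skopt.space import Real

X, y = load_digits(return_X_y=True)
search = BayesSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": Real(1e-4, 1e2, prior="log-uniform")},   # regularization strength
    n_iter=20,                                     # surrogate-guided evaluations
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```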
4
votes
0 answers
How can I draw a Bayesian network for this problem with birds?
I am working on the following problem to gain an understanding of Bayesian networks and I need help drawing it:
Birds frequently appear in the tree outside of your window in the morning and evening; these include finches, cardinals and robins.…

Jake
- 41
- 1
3
votes
0 answers
Can we use a Gaussian process to approximate the belief distribution at every instant in a POMDP?
Suppose $x_{t+1} \sim \mathbb{P}(\cdot | x_t, a_t)$ denotes the state transition dynamics in a reinforcement learning (RL) problem. Let $y_{t+1} \sim \mathbb{P}(\cdot | x_{t+1})$ denote the noisy observation or the imperfect state information. Let…

math_phile
- 56
- 2
2
votes
1 answer
Which community does using a Bayesian regression model as a reward function, with exploration vs. exploitation challenges, fall under?
I am trying to find research papers addressing a problem that, in my opinion, deserves significant attention. However, I am having difficulty locating relevant information.
To illustrate the problem at hand, consider a multivariate Bayesian…

paul
- 33
- 5
2
votes
0 answers
Minimum sampling for maximising prediction accuracy
Suppose that I'm training a machine learning model to predict people's age from a picture of their face. Let's say that I have a dataset of people from 1-year-olds to 100-year-olds. But I want to choose just 9 (arbitrary) ages out of this 100 age…

noone
- 123
- 4
2
votes
0 answers
Is it normal to see oscillations in tested hyperparameters during Bayesian optimisation?
I've been trying out Bayesian hyperparameter optimisation (with TPE) on a simple CNN applied to the MNIST handwritten digit dataset. I noticed that over iterations of the optimisation loop, the tested parameters appear to oscillate slowly.
Here's…

Alexander Soare
- 1,319
- 2
- 11
- 26
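One way to study this behaviour apart from the full CNN setup is to log every tested value from the TPE run and plot it in evaluation order. Below is a hedged, toy-objective sketch using hyperopt (one common TPE implementation); the objective and search space are placeholders, not the asker's code.

```python
# Hedged sketch: run TPE with hyperopt on a toy objective and recover the sequence
# of tested hyperparameter values so drifts/oscillations can be inspected.
import numpy as np
from hyperopt import fmin, tpe, hp, Trials

def objective(params):
    # stand-in for the CNN's validation loss
    return (params["lr"] - 0.01) ** 2 + 1e-3 * np.random.randn()

space = {"lr": hp.loguniform("lr", np.log(1e-4), np.log(1e-1))}
trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=60, trials=trials)

tested_lrs = [t["misc"]["vals"]["lr"][0] for t in trials.trials]
print(best)
print(tested_lrs[:10])   # plot these in order to see whether they oscillate
```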
1
vote
0 answers
Variational inference but with a weighted log-likelihood
I would like to know if it's correct to substitute, in the ELBO formula,
a weighted sum of the log-likelihood
$$\sum_i E_{q_{\theta}(w)}[w_i \ln{p(y_i|f^{w}(x_i))}]$$
in place of the traditional sum.
My problem is that my dataset comes with the…

Alucard
- 111
- 1
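For reference, in the notation of the question above, the standard ELBO and the weighted variant being proposed would read as follows (this is just a restatement of the substitution, not a claim about whether it is justified):
$$\mathcal{L}_{\text{ELBO}} = \sum_i \mathbb{E}_{q_{\theta}(w)}\big[\ln p(y_i \mid f^{w}(x_i))\big] - \mathrm{KL}\big(q_{\theta}(w)\,\|\,p(w)\big)$$
$$\mathcal{L}_{\text{weighted}} = \sum_i w_i\,\mathbb{E}_{q_{\theta}(w)}\big[\ln p(y_i \mid f^{w}(x_i))\big] - \mathrm{KL}\big(q_{\theta}(w)\,\|\,p(w)\big)$$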
1
vote
0 answers
Why does importance sampling work with latent variable models?
Caveat: importance sampling doesn't actually work for variational auto-encoders, but the question makes sense regardless
In "L4 Latent Variable Models (VAE) -- CS294-158-SP20 Deep Unsupervised Learning", we see the following.
We want to optimize the…

Foobar
- 151
- 5
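The identity this kind of question usually turns on is the importance-sampling rewrite of the marginal likelihood, which holds whenever the proposal $q(z \mid x)$ covers the support of $p_\theta(x, z)$:
$$p_\theta(x) = \int p_\theta(x, z)\,dz = \mathbb{E}_{z \sim q(z \mid x)}\!\left[\frac{p_\theta(x, z)}{q(z \mid x)}\right] \approx \frac{1}{K}\sum_{k=1}^{K} \frac{p_\theta(x, z_k)}{q(z_k \mid x)}, \quad z_k \sim q(z \mid x).$$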
1
vote
0 answers
Alternatives to Bayesian optimization
I am given a dataset $\mathcal{D} = \{\mathbf{x}_i\}_{i=1}^n$ and I need to find the point (in my case a material) $\mathbf{x}^*$ that maximizes a property $y$ (which can be obtained from a black-box function $f(\mathbf{x})$), performing the least…

ado sar
- 150
- 4
1
vote
0 answers
Bayesian optimization with confidence bound not working
I have a simple MLP for which I want to optimize some hyperparameters. I have fixed the number of hidden layers (for unrelated reasons) to be 3. So the hyperparameters being optimized through Bayesian Optimization are just number of neurons per…

NeuroEng
- 121
- 4
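As a quick reference when debugging such a setup: for maximization, the confidence-bound acquisition is usually written as
$$\alpha_{\mathrm{UCB}}(x) = \mu(x) + \kappa\,\sigma(x),$$
where $\mu$ and $\sigma$ are the surrogate's posterior mean and standard deviation and $\kappa \ge 0$ trades off exploitation against exploration; for minimization the sign flips to $\mu(x) - \kappa\,\sigma(x)$, and a sign or direction mix-up is one common reason such runs appear not to work.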
1
vote
1 answer
How can I interpret the value returned by score(X) method of sklearn.neighbors.KernelDensity?
For sklearn.neighbors.KernelDensity, the sklearn KDE documentation says of its score(X) method:
Compute the log-likelihood of each sample under the model
For 'gaussian' kernel, I have implemented hyper-parameter tuning for the…

Arun
- 225
- 1
- 8
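A short, hedged sketch on toy data (not the asker's tuning code) that clarifies the relationship in practice: score_samples(X) gives the per-sample log-density, while score(X) returns their sum, i.e. the total log-likelihood of X under the fitted model.

```python
# score(X) is the total log-likelihood, i.e. the sum of score_samples(X).
import numpy as np
from sklearn.neighbors import KernelDensity

X = np.random.RandomState(0).normal(size=(100, 1))
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

per_sample = kde.score_samples(X)            # log-density of each sample
total = kde.score(X)                         # a single float
print(np.allclose(total, per_sample.sum()))  # True
```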
1
vote
1 answer
Understanding Bayesian Optimisation graph
I came across the concept of the Bayesian Occam’s razor in the book Machine Learning: a Probabilistic Perspective. According to the book:
Another way to understand the Bayesian Occam’s razor effect is to note that probabilities must sum to one. Hence…
user9947
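The "probabilities must sum to one" argument refers to the marginal likelihood being normalized over all possible datasets: for a fixed model $m$,
$$\sum_{\mathcal{D}'} p(\mathcal{D}' \mid m) = 1,$$
so a flexible model that spreads probability over many possible datasets can only assign a small amount to any particular one, while a more constrained model concentrates its mass and can give the observed data a higher marginal likelihood.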
1
vote
1 answer
Are there Python packages for recent Bayesian optimization methods?
I want to try out and compare different optimization methods on some datasets. I know that scikit-learn has corresponding functions for grid and random search optimization. However, I also need a package (or multiple ones) for…

Enes Altuncu
- 153
- 2
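As one hedged example of such a package (there are several, and this is not an endorsement from the question itself): Optuna implements TPE-based search behind a small API. The quadratic objective below is a placeholder for a real model evaluation.

```python
# Hedged sketch: a minimal Optuna study. In practice the objective would train
# and evaluate a model with the suggested hyperparameter values.
import optuna

def objective(trial):
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2.0) ** 2   # value to minimize

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```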
0
votes
0 answers
Very high dimensional optimization with large budget, requiring high quality solutions
What would be theoretically the best performing optimization algorithm(s) in this case?
- Very high-dimensional problem: 250-500 parameters
- Goal is to obtain very high quality solutions, not just "good" solutions
- Parameters form multiple…

Charly Empereur-mot
- 101
- 3