Questions tagged [uncertainty-quantification]

For questions about uncertainty quantification (aka uncertainty estimation) in the context of artificial intelligence, in particular, in the context of Bayesian machine learning.

17 questions
37
votes
6 answers

Why do CNNs sometimes make highly confident mistakes, and how can one combat this problem?

I trained a simple CNN on the MNIST database of handwritten digits to 99% accuracy. I'm feeding in a bunch of handwritten digits, and non-digits from a document. I want the CNN to report errors, so I set a threshold of 90% certainty below which my…
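The rejection scheme described in the excerpt (only accept a prediction when the top softmax probability clears a confidence threshold) can be sketched as follows; the 90% threshold and the toy probability rows are illustrative assumptions, not the asker's actual model:

```python
import numpy as np

def predict_with_rejection(probs, threshold=0.9):
    """Return the predicted class per row, or -1 (reject) when the
    top softmax probability falls below the confidence threshold."""
    probs = np.asarray(probs)
    top = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    return np.where(top >= threshold, labels, -1)

# Example: three inputs, 10 classes; the second row is too uncertain.
p = np.array([
    [0.01] * 9 + [0.91],     # confident "9"
    [0.1] * 10,              # uniform -> reject
    [0.95] + [0.05 / 9] * 9, # confident "0"
])
print(predict_with_rejection(p))  # → [ 9 -1  0]
```

Note that this only filters by the network's own (often overconfident) softmax score; the answers to the question discuss why that score is not a calibrated probability.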
7
votes
1 answer

How does the Dempster-Shafer theory differ from Bayesian reasoning?

How does the Dempster-Shafer theory differ from Bayesian reasoning? How do these two methods handle uncertainty and compute posterior distributions?
5
votes
5 answers

How would AI be able to self-examine?

Looking at cases of machine-learning-based artificial intelligence, I often see them make critical mistakes when they face unfamiliar situations. In our case, when we encounter totally new problems, we acknowledge that we are not skilled…
3
votes
0 answers

Do we need as much information to know if we can answer a question as we need to actually answer the question?

I am reading The Book of Why: The New Science of Cause and Effect by Judea Pearl, and on page 12 I see the following diagram. The box on the right side of box 5, "Can the query be answered?", is located before box 6 and box 9, which are the processes…
3
votes
0 answers

Why does this formula $\sigma^2 + \frac{1}{T}\sum_{t=1}^T f^{\hat{W}_t}(x)^T f^{\hat{W}_t}(x) - E(y)^T E(y)$ approximate the variance?

How does: $$\text{Var}(y) \approx \sigma^2 + \frac{1}{T}\sum_{t=1}^T f^{\hat{W}_t}(x)^T f^{\hat{W}_t}(x) - E(y)^T E(y)$$ approximate the variance? I'm currently reading What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision, and the…
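A minimal numerical sketch of this estimator (from Kendall & Gal), assuming the $T$ stochastic outputs $f^{\hat{W}_t}(x)$ are stacked into a (T, D) array; the names and sample values are illustrative:

```python
import numpy as np

def predictive_variance(f_outputs, sigma2):
    """Kendall & Gal's approximation:
    Var(y) ≈ σ² + (1/T) Σ_t f_t(x)ᵀ f_t(x) − E(y)ᵀ E(y),
    where E(y) is the mean of the T stochastic forward passes."""
    f = np.asarray(f_outputs)                         # shape (T, D)
    mean = f.mean(axis=0)                             # E(y), shape (D,)
    second_moment = np.einsum('td,td->', f, f) / f.shape[0]
    return sigma2 + second_moment - mean @ mean

# For scalar outputs this reduces to σ² plus the sample variance:
samples = np.array([[1.0], [2.0], [3.0]])
print(predictive_variance(samples, sigma2=0.5))       # 0.5 + 2/3
```

The second and third terms together are exactly the (biased) sample covariance of the stochastic outputs, which is why the whole expression approximates the predictive variance.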
3
votes
0 answers

How can I use Monte Carlo Dropout in a pre-trained CNN model?

In Monte Carlo Dropout (MCD), I know that I should enable dropout during training and testing, then get multiple predictions for the same input $x$ by performing multiple forward passes with $x$, then, for example, average these predictions. Let's…
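The procedure described in the excerpt can be sketched in plain NumPy with a hypothetical two-layer network; the weights, dropout rate, and shapes are illustrative assumptions, not a specific pre-trained CNN:

```python
import numpy as np

def mc_dropout_predict(x, W1, W2, p=0.5, n_passes=100, seed=0):
    """Monte Carlo Dropout sketch: apply a fresh dropout mask to the
    hidden layer on every forward pass, then summarise the passes by
    their mean (prediction) and standard deviation (uncertainty)."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_passes):
        h = np.maximum(x @ W1, 0)          # ReLU hidden layer
        mask = rng.random(h.shape) >= p    # Bernoulli keep-mask
        h = h * mask / (1 - p)             # inverted-dropout scaling
        preds.append(h @ W2)
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)

rng = np.random.default_rng(1)
x = rng.normal(size=(1, 4))
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 2))
mean, std = mc_dropout_predict(x, W1, W2)
print(mean.shape, std.shape)  # (1, 2) (1, 2)
```

In a real framework the same effect is achieved by keeping the dropout layers stochastic at inference time rather than re-sampling masks by hand.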
3
votes
1 answer

Is there any research on models that provide uncertainty estimation?

Is there any research on machine learning models that provide uncertainty estimation? If I train a denoising autoencoder on words and put through a noised word, I'd like it to return a certainty that it is correct given the distribution of data it…
1
vote
2 answers

How do language models know what they don't know - and report it?

Again and again I ask myself what goes on in a pre-trained transformer-based language model (like ChatGPT) when it comes to "knowing" that it cannot give an appropriate answer and either states it ("I have not enough information to answer this…
1
vote
0 answers

Active Learning regression with Random Forest

I have a dataset of about 8k points and I am trying to employ active learning with the random forest regressor. I have split the dataset to train and test with train being around 20 points. The test serves as the unlabelled pool (although I have the…
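One common acquisition rule for this setup scores each pool point by the disagreement among the forest's individual trees and queries the most uncertain ones; the per-tree predictions below are made-up toy values standing in for the output of each tree:

```python
import numpy as np

def query_most_uncertain(tree_preds, n_queries=1):
    """Active-learning acquisition sketch: given per-tree predictions
    for the unlabelled pool (shape: n_trees × n_pool), rank the pool
    points by variance across trees and return the indices of the
    most uncertain ones to label next."""
    variance = np.var(tree_preds, axis=0)
    return np.argsort(variance)[::-1][:n_queries]

# Three trees, four pool points; the trees disagree most on point 2.
preds = np.array([
    [1.0, 2.0,  0.0, 3.0],
    [1.1, 2.1,  5.0, 3.0],
    [0.9, 1.9, 10.0, 3.1],
])
print(query_most_uncertain(preds))  # → [2]
```

With scikit-learn, the per-tree predictions can be collected from the fitted forest's individual estimators before applying the same ranking.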
1
vote
0 answers

How to calculate uncertainty in Deep Ensembles for Reinforcement Learning?

Let's take the following example: I must predict the return (Q-values) of x state-action pairs using an ensemble of m models. Using NumPy I could have the following for x = 5 and m = 3: >>> predictions = np.random.rand(3, 1, 5) [[[0.22668968…
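Given a predictions array shaped (m, 1, x) as in the excerpt, a common summary treats the ensemble mean as the Q-estimate and the standard deviation across ensemble members as the (epistemic) uncertainty; this is a sketch with synthetic values, not the asker's models:

```python
import numpy as np

# Ensemble of m=3 models, each predicting Q-values for x=5
# state-action pairs, shaped (m, 1, x) as in the excerpt.
rng = np.random.default_rng(0)
predictions = rng.random((3, 1, 5))

# Mean over the ensemble axis as the Q-estimate; standard deviation
# over the same axis as the per-pair uncertainty.
q_mean = predictions.mean(axis=0)  # shape (1, 5)
q_std = predictions.std(axis=0)    # shape (1, 5)
print(q_mean.shape, q_std.shape)
```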
1
vote
0 answers

Does MobileNet SSD v2 only capture aleatoric uncertainty (and so not the epistemic one)?

Regarding the MobileNet SSD v2 model, I was wondering to what extent it captures uncertainty of the predictions. There are 2 types of uncertainty: data uncertainty (aleatoric) and model uncertainty (epistemic). The model outputs bounding boxes with…
1
vote
2 answers

Why is my Keras prediction always close to 100% for one image class?

I am using Keras (on top of TF 2.3) to train an image classifier. In some cases I have more than two classes, but often there are just two classes (either "good" or "bad"). I am using the tensorflow.keras.applications.VGG16 class as base model with…
0
votes
0 answers

Do deep ensembles and regular ensembles coincide for classification tasks?

The deep ensemble paper https://arxiv.org/pdf/1612.01474.pdf introduces proper scoring rules for ensembles of NNs. It turns out that the likelihood is always a proper scoring rule. For regression tasks, we can then use the Gaussian NLL, which includes…
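The Gaussian NLL mentioned in the excerpt can be written out directly; this is the generic per-example form, not code from the paper:

```python
import math

def gaussian_nll(y, mu, var):
    """Per-example Gaussian negative log-likelihood, the proper
    scoring rule used for regression ensembles:
    ½ log(2πσ²) + (y − μ)² / (2σ²)."""
    return 0.5 * math.log(2 * math.pi * var) + (y - mu) ** 2 / (2 * var)

# At y = μ with unit variance only the constant term remains.
print(gaussian_nll(1.0, 1.0, 1.0))  # ½ log(2π) ≈ 0.9189
```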
0
votes
0 answers

What is the relationship between entropy in thermodynamics and entropy in information theory?

BACKGROUND: In thermodynamics, entropy $S$ is a measure of disorder and is given by $${\displaystyle S=k_B\log(W)},$$ where $k_B$ is Boltzmann's constant and $W$ is the number of microstates. In information theory, (Shannon) entropy $H$ is a measure…
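For the uniform case the two notions coincide up to the constant $k_B$: with $W$ equally likely microstates, Shannon entropy (in nats) equals $\log W$, matching Boltzmann's formula with $k_B = 1$. A quick numerical check:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H = -Σ p_i log p_i, in nats (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

W = 16
uniform = [1 / W] * W
# For a uniform distribution over W microstates, H = log W,
# i.e. Boltzmann's S = k_B log(W) with k_B set to 1.
print(shannon_entropy(uniform), math.log(W))
```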
0
votes
1 answer

What does the AUSE metric mean in uncertainty estimation?

I am reading the paper "Evaluating Scalable Bayesian Deep Learning Methods for Robust Computer Vision", I do not understand the definition of AUSE metric in this sentence "but only in terms of the AUSE metric which is a relative measure of the…