Questions tagged [statistics]

For questions related to statistics in the context of artificial intelligence and, in particular, machine learning. Note that there is a Stack Exchange website completely dedicated to statistics, namely, Cross Validated Stack Exchange.

29 questions
4
votes
0 answers

When do two identical neural networks have uncorrelated errors?

In Chapter 9, section 9.1.6, Raul Rojas describes how committees of networks can reduce the prediction error by training $N$ identical neural networks and averaging the results. If $f_i$ are the functions approximated by the $N$ neural nets,…
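The averaging idea in this excerpt can be illustrated with a small simulation. This is only a sketch under an idealised assumption that is not taken from the book: each network's error is modelled as independent zero-mean Gaussian noise, so the errors are uncorrelated by construction. The function name `simulate_committee` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_committee(n_networks=10, n_samples=5000, noise_std=1.0, seed=0):
    """Compare the MSE of single predictors with the MSE of their average.

    Assumption: every "network" equals the true target plus independent
    zero-mean Gaussian noise (the idealised uncorrelated-error case).
    """
    rng = random.Random(seed)
    single_sq_errors = []
    committee_sq_errors = []
    for _ in range(n_samples):
        # Errors of the individual predictors on one input
        # (the true target cancels out, so only the errors are simulated).
        errors = [rng.gauss(0.0, noise_std) for _ in range(n_networks)]
        single_sq_errors.extend(e * e for e in errors)
        # The committee's error is the average of the individual errors.
        committee_error = sum(errors) / n_networks
        committee_sq_errors.append(committee_error ** 2)
    return (statistics.fmean(single_sq_errors),
            statistics.fmean(committee_sq_errors))


single_mse, committee_mse = simulate_committee()
print(f"average single-network MSE: {single_mse:.3f}")
print(f"committee MSE:              {committee_mse:.3f}")
```

Under this assumption the committee's mean squared error is roughly the single-network error divided by the number of networks. When the errors are correlated, as they may be for identically trained networks, the reduction is smaller.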
3
votes
1 answer

Why has statistics-based AI become more popular than other forms of AI?

What is the fundamental reason that statistics-based AI (e.g., ML and Neural Net) has become more popular than other forms of AI, e.g., Fuzzy Logic and rules-based AI (e.g., Prolog)?
user366312
3
votes
1 answer

If I can repeat ML experiments, how can I bound my results?

It has been asked here if we should repeat lengthy experiments. Let's say I can repeat them, how should I present them? For instance, if I am measuring the accuracy of a model on test data during some training epochs, and I repeat various times this…
biofa.esp
3
votes
1 answer

What is the most statistically acceptable method for tuning neural network hyperparameters on very small datasets?

Neural networks are usually evaluated by dividing a dataset into three splits: training, validation, and test. The idea is that critical hyperparameters of the network such as the number of epochs and the learning rate can be tuned by testing the…
3
votes
0 answers

What is meant by subspace clustering in MFA?

The basic idea of MFA is to perform subspace clustering by assuming the covariance structure for each component of the form, $\Sigma_i = \Lambda_i \Lambda_i^T + \Psi_i$, where $\Lambda_i \in \mathbb{R}^{D\times d}$, is the factor loadings matrix…
stoic-santiago
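As a rough illustration of the covariance structure $\Sigma_i = \Lambda_i \Lambda_i^T + \Psi_i$ quoted in this excerpt, the sketch below builds one component's covariance from a factor loadings matrix. It assumes that $\Psi_i$ is diagonal, as is usual in factor analysis but not stated in the excerpt. The helper `factor_covariance` and the numbers ($D = 3$, $d = 1$) are made up for the example.

```python
def factor_covariance(loadings, noise_variances):
    """Return Sigma = Lambda Lambda^T + Psi for a single mixture component.

    loadings:        D x d list of lists (the factor loadings matrix Lambda).
    noise_variances: length-D list, taken here as the diagonal of Psi
                     (assumption: Psi is diagonal).
    """
    dim = len(loadings)
    # Low-rank part: entry (r, c) is the dot product of rows r and c of Lambda.
    sigma = [
        [sum(a * b for a, b in zip(loadings[r], loadings[c])) for c in range(dim)]
        for r in range(dim)
    ]
    # Diagonal part: add the per-dimension noise variances.
    for r in range(dim):
        sigma[r][r] += noise_variances[r]
    return sigma


# Hypothetical component with D = 3 observed dimensions and d = 1 latent factor.
Lambda_i = [[1.0], [2.0], [0.5]]
Psi_diag = [0.1, 0.2, 0.3]
Sigma_i = factor_covariance(Lambda_i, Psi_diag)
for row in Sigma_i:
    print(row)
```

With $d < D$, the off-diagonal structure of each $\Sigma_i$ is determined by only $D \times d$ loadings rather than by a full $D \times D$ matrix.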
3
votes
1 answer

Research paths/areas for improving the performance of CNNs when faced with limited data

I've been reading through the research literature for image processing, computer vision, and convolutional neural networks. For image classification and object recognition, I know that convolutional neural networks deliver state-of-the-art…
3
votes
1 answer

How does $\mathbb{E}$ suddenly change to $\mathbb{E}_{\pi'}$ in this equation?

In Sutton-Barto's book on page 63 (81 of the pdf): $$\mathbb{E}[R_{t+1} + \gamma v_\pi(S_{t+1}) \mid S_t=s,A_t=\pi'(s)] = \mathbb{E}_{\pi'}[R_{t+1} + \gamma v_\pi(S_{t+1}) \mid S_{t} = s]$$ How does $\mathbb{E}$ suddenly change to…
3
votes
1 answer

What is the difference between model and data distributions?

Is there any difference between the model distribution and data distribution, or are they the same?
2
votes
1 answer

What does it mean when a model "statistically outperforms" another?

I was reading this paper where they are stating the following: We also use the T-Test to test the significance of GMAN in 1 hour ahead prediction compared to Graph WaveNet. The p-value is less than 0.01, which demonstrates that GMAN statistically…
razvanc92
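To make the kind of test mentioned in this excerpt concrete, here is a minimal sketch of a paired t-test computed by hand. The excerpt does not say which variant of the t-test the paper used, so the paired form is an assumption, and the per-run error values are invented purely for illustration. They are not results from GMAN or Graph WaveNet.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Return the paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd_diff = statistics.stdev(diffs)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical prediction errors of two models over the same 10 runs.
errors_model_a = [2.10, 2.05, 2.20, 2.15, 2.08, 2.12, 2.18, 2.07, 2.11, 2.14]
errors_model_b = [2.00, 1.98, 2.05, 2.10, 2.01, 2.02, 2.09, 2.00, 2.03, 2.04]

t_stat, dof = paired_t_statistic(errors_model_a, errors_model_b)

# Two-sided critical value of Student's t for 9 degrees of freedom at alpha = 0.01.
T_CRITICAL_9_DOF_001 = 3.250
significant = abs(t_stat) > T_CRITICAL_9_DOF_001
print(f"t = {t_stat:.2f}, dof = {dof}, significant at 0.01: {significant}")
```

If the statistic exceeds the critical value, the observed difference in mean error would be unlikely under the hypothesis that the two models perform equally, which is roughly what "statistically outperforms" is taken to mean.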
2
votes
0 answers

What is the relationship between PAC learning and classic parameter estimation theorems?

What are the differences and similarities between PAC learning and classic parameter estimation theorems (e.g. consistency results when estimating parameters, e.g. with MLE)?
1
vote
0 answers

Does (English) ChatGPT-generated content have statistically significantly different character frequency than human-generated content?

Detecting ChatGPT-generated content is a contentious topic at the moment. E.g., I've been on Reddit r/ChatGPT, and there's a constant stream of users claiming they've been unfairly accused of plagiarism. One thing I'm curious about is character…
1
vote
1 answer

Distribution of a log-normal random variable with fixed dimension

Consider a random variable $x_{ijk}$, where $ijk$ is a subscript that indicates that the variable varies with 3 dimensions (e.g., firm, product, and country). $x_{ijk}$ is known to be i.i.d. log-normally distributed. How to find the distribution of…
1
vote
1 answer

Markov's Decision Process - calculate value in each iteration

I have the following decision tree: I calculated the value of the plan using the following parameters (given): {0 → 1, 1 → 3, 2 → 4}, discount factor ($\gamma$) = 0.2. I used this formula to calculate the linear equations to find the value of the plan: …
1
vote
0 answers

What does "statistical strength" mean in this context?

Consider the following excerpt from a paragraph taken from chapter 10: Sequence Modeling: Recurrent and Recursive Nets of the textbook Deep Learning by Ian Goodfellow et al. regarding the advantages of RNNs over fully connected traditional MLPs. To go from…
hanugm
1
vote
1 answer

How does a VGG-based Style-Loss incorporate color information?

I've recently been reading a lot about style transfer, its applications and implications. I understand what the Gram matrix is and does. I can program it. But one thing that has been boggling me is: how does the VGG style loss incorporate color…