What are the differences and similarities between PAC learning and classic parameter-estimation theorems (e.g., consistency results for estimators such as the MLE)?

- The concept of "consistency" also exists in learning theory, and it's been defined differently in different contexts. For example, see the paper [Learning and Consistency](https://www-alg.ist.hokudai.ac.jp/~thomas/publications/gosbook95wz.pdf) (1995), by Wiehagen and Zeugmann, or chapter 7 of [Understanding Machine Learning: From Theory to Algorithms](https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf) (2014) by Shalev-Shwartz and Ben-David. In statistics, consistency is related to the convergence guarantees of the estimator. – nbro Apr 26 '20 at 19:33
- The relationship is more fundamental: both seek to modify parameters to fit a model, and both provide asymptotic bounds on how many observations are needed. The question is, why isn't PAC learning a strictly stronger version of classical parameter estimation? For example, you can view the estimation of a real number with AWGN in the MLE framework and in the PAC framework, right? Both provide bounds on how many observations are needed to predict the probabilistic accuracy of the estimator. But it appears statistics and PAC learning are very separate fields. – FourierFlux Apr 26 '20 at 21:04
- I didn't talk about other things because your question is too broad (and could be closed as too broad). I only talked about consistency because that's the only specific "classic result" you mentioned. I suggest you narrow your question. – nbro Apr 26 '20 at 21:06
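
To make the AWGN example from the comments above concrete, here is a minimal sketch (assuming a Gaussian tail bound on the sample mean, which is the MLE in this setting; the specific values of `theta`, `sigma`, `eps`, and `delta` are illustrative and not from the thread). It phrases the estimation of a real number under AWGN as a PAC-style (ε, δ) sample-complexity statement and checks it empirically:

```python
import numpy as np

# Illustrative setup: observations y_i = theta + n_i with n_i ~ N(0, sigma^2) (AWGN).
# The MLE of theta is the sample mean. A PAC-style question: how many samples m
# guarantee |theta_hat - theta| <= eps with probability >= 1 - delta?
rng = np.random.default_rng(0)

theta = 2.5          # "unknown" parameter (fixed here so we can simulate)
sigma = 1.0          # noise standard deviation
eps, delta = 0.1, 0.05

# Gaussian tail bound: P(|mean - theta| > eps) <= 2 * exp(-m * eps^2 / (2 * sigma^2)),
# so m >= (2 * sigma^2 / eps^2) * log(2 / delta) samples suffice.
m = int(np.ceil(2 * sigma**2 / eps**2 * np.log(2 / delta)))
print(f"PAC-style sample size for (eps={eps}, delta={delta}): m = {m}")

# Empirical check: fraction of trials where the MLE misses by more than eps.
trials = 2000
samples = theta + sigma * rng.standard_normal((trials, m))
mle = samples.mean(axis=1)                       # MLE of theta under AWGN
failure_rate = np.mean(np.abs(mle - theta) > eps)
print(f"Empirical failure rate: {failure_rate:.3f} (should be <= {delta})")
```

Read one way, this is a finite-sample (ε, δ) guarantee in the PAC style; read the other way, letting m grow shows the failure probability vanishing for every ε, which is exactly the consistency statement for the MLE in this toy setting.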