I'd like to learn about generalization theory for machine learning algorithms. I'm looking for books (or other references, in case no suitable books exist) that provide a gentle introduction to the field for a relative beginner like me.
My background is mostly undergraduate mathematics, but I have enough mathematical maturity to pick up graduate-level material as well.
To be more specific, I'm looking to understand the mathematical abstraction of ML concepts (e.g., learning algorithm, hypothesis space, complexity of an algorithm or hypothesis class), the framing of an ML algorithm as an expected-risk-minimization exercise, techniques used to derive generalization bounds, and so on; see the sketch below for the kind of formalism I mean.
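For concreteness, here is my own sketch of the standard statistical-learning setup (the notation is my assumption, not taken from any particular reference). Given a hypothesis $h$ from a hypothesis space $\mathcal{H}$, a loss function $\ell$, and a sample $S = \{(x_i, y_i)\}_{i=1}^n$ drawn i.i.d. from a distribution $\mathcal{D}$, the expected and empirical risks are

$$R(h) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(h(x), y)\big], \qquad \hat{R}_S(h) = \frac{1}{n}\sum_{i=1}^{n}\ell(h(x_i), y_i),$$

and a typical generalization bound states that, with probability at least $1-\delta$ over the draw of $S$,

$$R(h) \le \hat{R}_S(h) + \varepsilon(n, \mathcal{H}, \delta) \quad \text{for all } h \in \mathcal{H},$$

where $\varepsilon$ shrinks with the sample size $n$ and grows with some complexity measure of $\mathcal{H}$ (VC dimension, Rademacher complexity, etc.). I'd like references that build up exactly this kind of machinery.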
To be even more specific, I want to familiarize myself with the concepts, theory, and techniques needed to understand (at least at a basic level) papers like:
- Generalization in Deep Learning (2019)
- Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness
- nbro's answer to this question
and the references therein.