For questions related to the sample complexity of a machine learning algorithm, that is, the number of training samples it needs in order to successfully learn a target function. More precisely, the sample complexity is the number of training samples that we need to supply to the algorithm so that the function it returns is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1.
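As a rough illustration of the definition above, here is a minimal sketch of one textbook special case: a finite hypothesis class in the realizable setting, where a learner that returns any hypothesis consistent with the training set needs at most (ln|H| + ln(1/δ)) / ε samples to reach error at most ε with probability at least 1 - δ. The function name and its parameters are hypothetical, chosen only for this example; the bound is a sufficient condition under those assumptions, not a general rule for neural networks.

```python
import math


def finite_class_sample_bound(num_hypotheses, epsilon, delta):
    """Sufficient number of samples for a consistent learner.

    Assumes a finite hypothesis class of size `num_hypotheses` and the
    realizable setting (some hypothesis in the class has zero true error).
    Returns the smallest integer m with
        m >= (ln|H| + ln(1/delta)) / epsilon.
    """
    if num_hypotheses < 1:
        raise ValueError("num_hypotheses must be at least 1")
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)


if __name__ == "__main__":
    # 1000 hypotheses, error at most 0.1, confidence at least 0.95.
    print(finite_class_sample_bound(1000, 0.1, 0.05))
```

The bound grows only logarithmically in the size of the class and in 1/δ, but linearly in 1/ε, so tightening the error target is what drives the sample count up fastest.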
Questions tagged [sample-complexity]
14 questions
7
votes
1 answer
How much training data is required for a GAN?
I'm beginning to study and implement GANs to generate more datasets. I'll just try to experiment with state-of-the-art GAN models as described here: https://paperswithcode.com/sota/image-generation-on-cifar-10.
The problem is I don't have a big…

gameon67
- 215
- 3
- 12
5
votes
1 answer
How can I estimate how many photos I need to train ResNet-50 for image classification?
I am working on a project where I have to classify around 1000 unique objects. I'm trying to plan how much training data I will need to collect. I was planning on using ResNet-50. Is there any way I can estimate the number of photos I should plan to…

Tyler Hilbert
- 145
- 5
5
votes
4 answers
How does size of the dataset depend on VC dimension of the hypothesis class?
This might be a somewhat broad question, but I have been watching the Caltech YouTube videos on Machine Learning, and in this video the professor is trying to explain how we should interpret the VC dimension in layman's terms, and why do…

Stefan Radonjic
- 187
- 5
3
votes
2 answers
Is there a way to define the boundaries of the optimal size of a training set?
In a related question on Computer Science SE, a user said:
Neural networks typically require a large training set.
Is there a way to define the boundaries of the "optimal" size of a training set in the general case?
When I was learning about fuzzy…

Zoltán Schmidt
- 623
- 7
- 14
2
votes
2 answers
2
votes
1 answer
Do the terms 'sample complexity' and 'sample efficiency' mean the same thing in the RL context?
For example, in the paper Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, both terms are mentioned but without explanation. I have seen them in other places as well but in different contexts. So I am…

Sam
- 175
- 5
2
votes
0 answers
Can you find another reason for sample inefficiency of model-free on-policy Deep Reinforcement Learning?
The following mindmap gives an overview of multiple reasons for sample inefficiency. The list is definitely not complete. Can you see another reason not mentioned so far?
Some related links:
interactive mindmap
Reddit post asking the same…

Ray Walker
- 451
- 3
- 8
2
votes
1 answer
Is there any practical application of knowing whether a concept class is PAC-learnable?
A concept class $C$ is PAC-learnable if there exists an algorithm that can output a hypothesis with probability at least $(1-\delta)$ (the "probably" part), and an error that is less than $\epsilon$ (the "approximately" part), in time that is…

calveeen
- 1,251
- 7
- 17
1
vote
0 answers
Is the VC dimension of an MLP regressor a valid upper bound on how many points it can exactly fit?
I want to calculate an upper bound on how many training points an MLP regressor can fit with ~0 error. I don't care about the test error; I want to overfit the (few) training points as much as possible.
For example, for linear regression, it's…

Daniele
- 11
- 2
1
vote
0 answers
What is the sample complexity of Monte Carlo Exploring Starts in RL?
We can use a model-free Monte Carlo approach to solving an MDP $(S,A,R,P,\gamma)$ with transition dynamics $P$ unknown by estimating Q-values by rolling out trajectories starting from random states $s_0 \in S$ and improving the policy $\pi$…

Snowball
- 213
- 1
- 6
1
vote
0 answers
Is it better to split sequences into overlapping or non-overlapping training samples?
I have $N$ (time) sequences of data with length $2048$. Each of these sequences corresponds to a different target output. However, I know that only a small part of the sequence is needed to actually predict this target output, say a sub-sequence of…

Thomas Wagenaar
- 1,187
- 8
- 11
1
vote
0 answers
How much data do we need to make a successful denoising autoencoder?
Is there a guide to how much data you need to build a successful denoising model using autoencoders?
Or is the rule simply that the more data, the better?
I tried with a small dataset of 350 samples to see what I would get as output, and I failed. :D

Vesko Vujovic
- 161
- 2
- 8
0
votes
1 answer
What is the relation between any suitable measure of model complexity, number of training examples and network size in deep learning?
What is the relation between any suitable measure of model complexity, number of training examples and network size in deep learning?

Justaperson
- 153
- 3
0
votes
1 answer
For given units of a measure of model complexity, how many examples do we need to train a network to get the model right and generalize?
For given units of a measure of model complexity, how many examples do we need to train a network to get the model right and generalize?

Justaperson
- 153
- 3