Highest Voted 'training-datasets' Questions - Artificial Intelligence Stack Exchange

8

votes

1 answer

What causes ChatGPT to generate responses that refer to itself as a bot or LM?

ChatGPT occasionally generates responses to prompts that refer to itself as a "bot" or "language model." For instance, when given a certain input (the first paragraph of this question) ChatGPT produces (in part) the output: It is not appropriate…

asked Dec 16 '22 at 08:58

Obie 2.0

183
6

7

votes

1 answer

How much training data is required for a GAN?

I'm beginning to study and implement GANs to generate more datasets. I'll just try to experiment with state-of-the-art GAN models as described here: https://paperswithcode.com/sota/image-generation-on-cifar-10. The problem is I don't have a big…

datasets generative-adversarial-networks image-generation training-datasets sample-complexity

asked Nov 28 '19 at 08:10

gameon67

215
3
12

6

votes

1 answer

How was ChatGPT trained?

I know that large language models like GPT-3 are trained simply to continue pieces of text that have been scraped from the web. But how was ChatGPT trained, which, while also having a good understanding of language, is not directly a language model,…

natural-language-processing chat-bots training-datasets language-model chatgpt

asked Dec 29 '22 at 01:02

HelloGoodbye

313
1
11

6

votes

1 answer

During neural network training, can gradients leak sensitive information if the training data is homomorphically encrypted?

Some algorithms in the literature allow recovering the input data used to train a neural network. This is done using the gradients (updates) of weights, such as in Deep Leakage from Gradients (2019) by Ligeng Zhu et al. In case the neural network is…

neural-networks training gradient-descent ai-security training-datasets

asked Dec 19 '20 at 22:03

witdev

73
4

5

votes

1 answer

How can I estimate how many photos I need to train ResNet-50 for image classification?

I am working on a project where I have to classify around 1000 unique objects. I'm trying to plan how much training data I will need to collect. I was planning on using ResNet-50. Is there any way I can estimate the number of photos I should plan to…

computer-vision computational-learning-theory training-datasets sample-complexity

asked Nov 16 '21 at 14:56

Tyler Hilbert

145
5

5

votes

2 answers

Do we need automatic hyper-parameter tuning when we have a large enough dataset?

Hyperparameter tuning is the process of selecting the optimal hyperparameters for an ANN. Now, my guess is that, if we have sufficient data (say, 1.4 million for, say, 6 features), the model can be optimally trained and we don't need a…

neural-networks hyperparameter-optimization training-datasets

asked Oct 17 '21 at 18:10

user366312

351
1
12

4

votes

1 answer

What happens to the training data after your machine learning model has been trained?

What happens after you have used machine learning to train your model? What happens to the training data? Let's pretend it predicted correctly 99.99999% of the time and you were happy with it and wanted to share it with the world. If you put in 10GB…

neural-networks machine-learning training datasets training-datasets

asked Aug 28 '18 at 04:55

icYou520

159
6

4

votes

1 answer

What are "development test sets" used for?

This is a theoretical question. I am a newbie to artificial intelligence and machine learning, and the more I read the more I like this. So far, I have been reading about the evaluation of language models (I am focused on ASR), but I still don't get…

terminology cross-validation training-datasets test-datasets validation-datasets

asked Mar 13 '18 at 10:04

little_mice

143
2

3

votes

1 answer

How do I select the (number of) negative cases, if I'm given a set of positive cases?

We were given a list of labeled data (around 100) of known positive cases, i.e. people that have a certain disease, i.e. all these people are labeled with the same class (disease). We also have a much larger amount of data that we can label as…

neural-networks imbalanced-datasets selection-bias training-datasets test-datasets

asked Dec 20 '20 at 14:14

Otto

33
5

3

votes

0 answers

How does one continue the pre-training in BERT?

I need some help with continuing pre-training on BERT. I have a very specific vocabulary and lots of specific abbreviations at hand. I want to do an STS task. Let me specify my task: I have domain-specific sentences and want to pair them in terms of…

training python bert fine-tuning training-datasets

asked Mar 05 '20 at 15:02

Adrian_G

31
1

2

votes

2 answers

What is the effect of training a neural network with randomly generated fake data that satisfies certain constraints?

I have a neural network with 2 inputs and one output, like so:

input       | output
____________________
a    | b    | c
5.15 | 3.17 | 0.0607
4.61 | 2.91 | 0.1551
etc.

I have 75 samples and I am using 50 for training and 25 for…

neural-networks reference-request data-augmentation training-datasets

asked Apr 01 '18 at 11:48

Mohammad

123
5

2

votes

2 answers

Is there a measure of model complexity?

deep-learning models training-datasets sample-complexity vc-theory

asked Jun 14 '23 at 20:44

Justaperson

153
3

2

votes

3 answers

Why does MNIST provide only a training and a test set and not a validation set as well?

I was taught that, usually, a dataset has to be divided into three parts:
Training set - for learning purposes
Validation set - for picking the model which minimizes the loss on this set
Test set - for testing the performance of the model picked…

datasets training-datasets mnist validation-datasets

asked Oct 22 '22 at 08:45

tail

147
6
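
The three-way split described in the MNIST question above is often obtained by carving a validation set out of the provided training set. The following is only a minimal sketch of that idea, not the approach taken in the question's answers: the helper name `hold_out_validation` and the choice of 10,000 validation images are assumptions made for illustration, and only index lists are produced (no images are loaded).

```python
import random

def hold_out_validation(n_train, val_size, seed=0):
    """Split indices 0..n_train-1 into (train, validation) without overlap.

    Hypothetical helper for illustration; not part of any dataset library.
    """
    if not 0 < val_size < n_train:
        raise ValueError("val_size must be between 1 and n_train - 1")
    indices = list(range(n_train))
    random.Random(seed).shuffle(indices)  # seeded shuffle keeps the split reproducible
    return indices[val_size:], indices[:val_size]

# MNIST ships 60,000 training and 10,000 test images. Holding out 10,000
# of the training images for validation is an illustrative choice only.
train_idx, val_idx = hold_out_validation(n_train=60_000, val_size=10_000)
print(len(train_idx), len(val_idx))
```

The provided test set stays untouched until the final evaluation, so it still gives an unbiased estimate for the model picked on the validation indices.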

2

votes

2 answers

How many unique angles of an object do you need in your image training set in order to correctly classify it?

I'm interested in using ResNet-50 to classify images of objects for around 1000 unique classes. I'm wondering if there is any way to estimate how many unique angles I need in my training set to classify images that can be taken from any angle. For…

deep-learning computer-vision residual-networks training-datasets

asked Nov 17 '21 at 01:06

Tyler Hilbert

145
5

2

votes

2 answers

Why not make the training set and validation set one if their roles are similar?

If the validation set is used to tune the hyperparameters and the training set is used to adjust the weights, why aren't they combined into one set, given that both play a similar role in improving the model?

neural-networks deep-learning comparison training-datasets validation-datasets

asked Jul 28 '21 at 10:08

Omar Zayed

43
4
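
One way to see why the two sets in the question above are kept apart is to select a hyperparameter on the training data itself. The sketch below is a toy stand-in, not taken from the question or its answers: it uses 1-D k-nearest-neighbour regression on synthetic data (so there are no weights, only the hyperparameter k), and every name, size, and candidate value in it is an assumption for illustration. With k = 1, each training point is its own nearest neighbour, so the training error is zero and selection on the training set favours the most overfit setting.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mean_squared_error(train, data, k):
    """Error of a k-NN model built on `train`, measured on `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

def noisy_sample(n):
    """Draw n points from y = 0.5 * x plus Gaussian noise (synthetic data)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 0.5 * x + rng.gauss(0.0, 1.0)) for x in xs]

train_set = noisy_sample(60)
val_set = noisy_sample(30)
candidate_ks = [1, 3, 5, 9, 15]

# Selecting k on the data the model already memorised picks k = 1.
k_by_train = min(candidate_ks, key=lambda k: mean_squared_error(train_set, train_set, k))
# Selecting k on held-out data measures generalisation instead.
k_by_val = min(candidate_ks, key=lambda k: mean_squared_error(train_set, val_set, k))
print("chosen on training set:", k_by_train)
print("chosen on validation set:", k_by_val)
```

The value chosen on the validation set depends on the sampled data, which is why no expected output is shown for it.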
