Questions tagged [data-augmentation]

For questions about data augmentation, i.e. techniques that increase the number and diversity of the samples in a dataset, which can help reduce overfitting, especially when the available dataset is relatively small.

39 questions
7
votes
2 answers

How does rotating an image and adding new 'rotated classes' prevent overfitting?

From Meta-Learning with Memory-Augmented Neural Networks in section 4.1: To reduce the risk of overfitting, we performed data augmentation by randomly translating and rotating character images. We also created new classes through 90°, 180° and 270°…
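As a rough illustration of the "rotated classes" idea quoted above, here is a minimal NumPy sketch. It is not the paper's code: the function name `add_rotated_classes`, the `(N, H, W)` array layout, the square-image assumption, and the label offset scheme are all assumptions made for this example.

```python
import numpy as np

def add_rotated_classes(images, labels, num_classes):
    """Extend a dataset with 90/180/270-degree rotated copies as new classes.

    images: array of shape (N, H, W) with H == W (assumed square here).
    labels: integer array of shape (N,) with values in [0, num_classes).
    A copy rotated by k * 90 degrees gets the label c + k * num_classes,
    so each rotation of a character counts as a distinct class.
    """
    all_images = [images]
    all_labels = [labels]
    for k in (1, 2, 3):
        # np.rot90 rotates counter-clockwise by k * 90 degrees in the (H, W) plane.
        all_images.append(np.rot90(images, k=k, axes=(1, 2)))
        all_labels.append(labels + k * num_classes)
    return np.concatenate(all_images), np.concatenate(all_labels)
```

With this scheme the dataset ends up with four times as many classes, each with the same number of examples as before; the random translations and small rotations mentioned in the excerpt would be applied separately and keep the original label.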
5
votes
1 answer

Does the term "data augmentation" imply increasing the training dataset?

I have a manuscript that has been reviewed and one of the reviewers commented on my use of the term "data augmentation", saying that it might not be the appropriate term in my case (explained below). I collected a large dataset of short audio files…
4
votes
3 answers

Would this relatively small dataset be enough to train a CNN?

Scenario: I am trying to create a dataset with images of my choice for different animal classes. I am going to train a CNN to classify those images. Problem: Let's assume I somehow don't have the privilege to collect too many images and was…
3
votes
0 answers

If random rotations are included in the data augmentation process, how are the new bounding boxes calculated?

When studying bounding box-based detectors, it's not clear to me if data augmentation includes adding random rotations. If random rotations are added, how is the new bounding box calculated?
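One common way to handle this, sketched below under stated assumptions, is to rotate the four corners of the original box with the same rotation applied to the image and then take the tightest axis-aligned box around them. The function name `rotate_box` and the `(x_min, y_min, x_max, y_max)` box format are assumptions for this example, not taken from any particular detector.

```python
import numpy as np

def rotate_box(box, angle_deg, center):
    """Rotate an axis-aligned box about `center` and re-enclose it.

    box: (x_min, y_min, x_max, y_max).
    angle_deg: rotation angle; positive values rotate from the x axis
        towards the y axis (this looks clockwise when y points down,
        as in image coordinates).
    Returns the tightest axis-aligned box containing the rotated corners.
    """
    x_min, y_min, x_max, y_max = box
    corners = np.array(
        [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]],
        dtype=float,
    )
    center = np.asarray(center, dtype=float)
    theta = np.deg2rad(angle_deg)
    rot = np.array(
        [[np.cos(theta), -np.sin(theta)],
         [np.sin(theta), np.cos(theta)]]
    )
    # Rotate each corner about the center.
    rotated = (corners - center) @ rot.T + center
    new_min = rotated.min(axis=0)
    new_max = rotated.max(axis=0)
    return (float(new_min[0]), float(new_min[1]),
            float(new_max[0]), float(new_max[1]))
```

The angle convention has to match the one used to rotate the image itself. The resulting box is generally looser than the object, because it encloses the rotated box rather than the rotated object, which is one reason rotations are often kept small for detection data.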
2
votes
2 answers

What is the effect of training a neural network with randomly generated fake data that satisfies certain constraints?

I have a neural network with 2 inputs and one output, like so:

    input         | output
    a      | b    | c
    5.15   | 3.17 | 0.0607
    4.61   | 2.91 | 0.1551
    etc.

I have 75 samples and I am using 50 for training and 25 for…
2
votes
1 answer

How to label edited images after data augmentation?

I am new to neural networks: I only started studying the subject a year ago, and I have just started building my first neural network. The project is a little bit ambitious: a browser extension for children's safety that checks for…
2
votes
1 answer

Is using separate channels of an RGB image a valid data augmentation technique?

Suppose there is an ML network that takes grayscale images as the input. The images that I have are RGB images. So, instead of converting these RGB images to grayscale, I treat each individual colour band as a distinct input to the network. That is,…
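To make the two options in this question concrete, here is a minimal NumPy sketch that contrasts splitting the colour bands into separate single-channel images with a standard weighted grayscale conversion. The function names and the `(N, H, W, 3)` layout are assumptions for illustration; the grayscale weights are the usual ITU-R BT.601 luma coefficients.

```python
import numpy as np

def split_channels(rgb_images):
    """Turn (N, H, W, 3) RGB images into (3N, H, W) single-channel images.

    The output order is image 0 red, image 0 green, image 0 blue,
    image 1 red, and so on.
    """
    n, h, w, c = rgb_images.shape
    # Move the channel axis next to the batch axis, then merge the two.
    return np.moveaxis(rgb_images, -1, 1).reshape(n * c, h, w)

def to_grayscale(rgb_images):
    """Convert (N, H, W, 3) RGB images to (N, H, W) grayscale images."""
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights
    return rgb_images @ weights
```

If the bands are split this way, the labels would need to be repeated to match, for example with `np.repeat(labels, 3)`. Note that the three bands of one image are strongly correlated, so the tripled dataset carries less new information than three independent samples would.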
2
votes
0 answers

What is the sensible amount of augmentation?

I am playing with the transforms from Torchvision. There are plenty of different kinds of these, like Resize, RandomCrop, ColorJitter, Blurring, ... These are some cases of Resize for a given image: [image] ColorJitter: [image] RandomAffine: [image] The main purpose of the…
2
votes
1 answer

Data augmentation for very small image datasets

I am looking for techniques for augmenting very small image datasets. I have a classification problem with 3 classes. Each class consists of 20 different shapes. The shapes are similar between the classes, but the task is to identify which class the…
2
votes
1 answer

Should one rescale (normalize) image before or after data augmentation?

In an image preprocessing pipeline, should one first rescale each pixel value to [0, 1] by dividing by 255 and then apply transformations such as color distortion and Gaussian blur, or vice versa? I believe for correctness, it may depend on the…
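One way to see why the order can matter is to look at a transformation whose parameters are expressed on the pixel scale. The sketch below uses additive Gaussian noise as an assumed example (clipping is left out for simplicity): the two orders agree only if the noise is rescaled together with the image.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4)).astype(np.float64)
# Noise with a standard deviation of 10 on the 0-255 pixel scale.
noise = rng.normal(0.0, 10.0, size=image.shape)

# Option A: augment on the 0-255 scale, then rescale to [0, 1].
augment_then_rescale = (image + noise) / 255.0

# Option B: rescale first, then augment with the noise rescaled as well.
rescale_then_augment = image / 255.0 + noise / 255.0

# Option C: rescale first, but forget to rescale the noise parameters.
rescale_then_unscaled_noise = image / 255.0 + noise

same = np.allclose(augment_then_rescale, rescale_then_augment)
mismatch = not np.allclose(augment_then_rescale, rescale_then_unscaled_noise)
print(same, mismatch)
```

For a linear operation like this, either order works as long as the parameters are defined for the scale the transformation actually sees. Library transforms may additionally assume a particular input range or dtype, which is worth checking in their documentation.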
2
votes
6 answers

How do I increase the size of an (almost) balanced dataset?

I am trying to add more data points in my (almost) balanced dataset for training my neural network. I have come across techniques such as SMOTE or Random Over Sampling, but they work best for imbalanced data (as they balance the dataset). How can I…
2
votes
1 answer

What is the difference between feature extraction with or without data augmentation?

Here's an extract from Chollet's book "Deep Learning with Python" about using a pre-trained CNN to predict classes for a set of photos (p. 146): At this point, there are two ways you could proceed: Running the convolutional base over your dataset,…
2
votes
1 answer

Should I remove the text overlaying some images in the dataset before training the CNN?

If I am training a CNN on some image data to perform image classification, but some of the images have pieces of text overlaying them (added as descriptions for human readers), is it better to remove the text before training? And if so,…
2
votes
0 answers

How much should we augment our training data?

I am wondering how much I should extend my training set with data augmentation. Is there a pre-defined number I can go with? Suppose I have 10000 images: can I go as far as 10x or 20x, to get 100000 or 200000 images, respectively?…
2
votes
1 answer

What could cause a big fluctuation of the loss in the last epochs of training an AlexNet?

I am training an AlexNet neural network with about 12000 images, of which 80% are for training, 10% for validation, and the remaining 10% for testing. I have a problem in my plots. There is a big fluctuation in epoch 47. How can I get a smooth plot?…