I have an object detection problem with an extremely imbalanced dataset. Let's say there is only one class to detect: apple or not apple. The detection network will be used in a real deployment on an IP camera stream, where the positive/negative sample ratio is extremely skewed, roughly 1:1,000,000.
My first idea is to train the model on a dataset with a 100:100k positive/negative sample ratio, but the model might still be biased toward negative samples, which could result in it failing to detect apples.
I have two questions:
- What are the best approaches to this imbalanced dataset problem?
- Since I am using a high-resolution camera, detecting apples in the stream is effectively a small object detection problem, because the apples are very small compared to the resolution of the frame. I know that YOLO applies some data augmentation to improve accuracy, but would that be enough to detect small objects? Should I use an image tiling technique? If anyone has experience with small object detection and knows how to handle this, any ideas would be much appreciated.
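To make the tiling idea concrete, this is roughly what I have in mind: a minimal sketch in plain Python, not something I have tested with Darknet. The function names, the tile size of 608, and the overlap of 96 are my own placeholder assumptions; the idea is to cut each frame into overlapping tiles, run the detector on each tile, and shift the resulting boxes back into full-frame coordinates.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles along one axis; the last tile sits flush with the edge."""
    if tile >= length:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # make sure the far edge is covered
    return origins


def make_tiles(width, height, tile=608, overlap=96):
    """Return (x0, y0, x1, y1) rectangles covering the whole frame.

    tile and overlap are assumed values, not tuned ones.
    """
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


def to_frame_coords(box, tile_rect):
    """Shift a (x, y, w, h) box from tile coordinates to full-frame coordinates."""
    x, y, w, h = box
    return (x + tile_rect[0], y + tile_rect[1], w, h)
```

With overlapping tiles, the same apple can be detected in two neighbouring tiles, so I assume the shifted boxes would need another non-maximum suppression pass over the whole frame.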
P.S.: I am using YOLOv4 with AlexeyAB's Darknet as the object detection model. Also, the background of the images is mostly the same because of the camera view.