
I am looking at training Scaled YOLOv4 on TensorFlow 2.x, as found at this link. I plan to collect the imagery, annotate the objects within the images in VOC format, and then use these images/annotations to train the large-scale model. If you look at the multi-scale training command, it is as follows:

python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset dataset/pothole_voc --num-classes 1 --class-names pothole.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 200 --batch-size 4 --multi-scale 320,352,384,416,448,480,512 --augment ssd_random_crop

Since Scaled YOLOv4 (and, for that matter, any YOLO algorithm) expects image dimensions divisible by 32, I plan to use larger images of 1024x1024. Is it possible to modify the --multi-scale argument to include larger dimensions such as 1024 and have the algorithm run successfully?

Here is what it would look like when modified:

--multi-scale 320,352,384,416,448,480,512,544,576,608,640,672,704,736,768,800,832,864,896,928,960,992,1024
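For reference, the list above is simply every multiple of 32 from 320 to 1024. A short, illustrative snippet (not part of the repository) that generates the same string:

```python
# Illustrative only: build the --multi-scale value as every multiple of 32
# from 320 to 1024, which reproduces the list shown above.
scales = list(range(320, 1024 + 1, 32))
assert all(s % 32 == 0 for s in scales)  # YOLO input sizes must be divisible by 32
print("--multi-scale " + ",".join(str(s) for s in scales))
```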
ihb

1 Answer


Yes, the functionality is there. But don't you think you are overdoing the scales? You have 23 scales listed there. Too much of anything is bad. There is a reason the network likes dimensions divisible by 32: only at that step in size does something meaningfully different show up in the image. Spamming sizes like this won't help you; it will mostly just waste your time.
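As a rough illustration of thinning the schedule, here is a minimal sketch of how multi-scale training typically picks one resolution per batch; the sparser scale list and the helper function are hypothetical, and the repository's actual resizing code may differ:

```python
import random

# Hypothetical sparser schedule covering the same 320..1024 range with far
# fewer scales than the 23-entry list in the question.
sparse_scales = [320, 512, 768, 1024]

def pick_input_size(scales):
    """Randomly choose one training resolution; every entry must be divisible by 32."""
    size = random.choice(scales)
    assert size % 32 == 0
    return size

# Typical multi-scale behaviour: one size is drawn per batch/step and the whole
# batch is resized to it before being fed to the network.
for step in range(5):
    size = pick_input_size(sparse_scales)
    print(f"step {step}: training at {size}x{size}")
```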

Abhishek Verma