
I am looking at training Scaled YOLOv4 on TensorFlow 2.x, as found at this link. I plan to collect the imagery, annotate the objects within the images in VOC format, and then use these images/annotations to train the large-scale model. If you look at the multi-scale training command, it is as follows:

python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset dataset/pothole_voc --num-classes 1 --class-names pothole.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 200 --batch-size 4 --multi-scale 320,352,384,416,448,480,512 --augment ssd_random_crop

Since Scaled YOLOv4 (and, for that matter, any YOLO algorithm) expects image dimensions divisible by 32, I plan to use larger images of 1024x1024. Is it possible to modify the --multi-scale argument to include larger dimensions such as 1024 and have the algorithm run successfully?

Here is what it would look like when modified:

--multi-scale 320,352,384,416,448,480,512,544,576,608,640,672,704,736,768,800,832,864,896,928,960,992,1024
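For reference, the list above is simply every multiple of 32 from 320 to 1024. A short, illustrative snippet (not part of the repository) that generates the same string:

```python
# Illustrative only: build the --multi-scale value as every multiple of 32
# from 320 to 1024, which reproduces the list shown above.
scales = list(range(320, 1024 + 1, 32))
assert all(s % 32 == 0 for s in scales)  # YOLO input sizes must be divisible by 32
print("--multi-scale " + ",".join(str(s) for s in scales))
```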
ihb

1 Answer


Yes, the functionality is there. But don't you think you are overdoing the scales? You have 23 scales listed there. Too much of anything is bad. There is a reason the network likes dimensions divisible by 32: only at that step in size does something meaningfully different show up in the image. Spamming sizes like this won't help you; it will mostly just waste your time.
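As a rough illustration of thinning the schedule, here is a minimal sketch of how multi-scale training typically picks one resolution per batch; the sparser scale list and the helper function are hypothetical, and the repository's actual resizing code may differ:

```python
import random

# Hypothetical sparser schedule covering the same 320..1024 range with far
# fewer scales than the 23-entry list in the question.
sparse_scales = [320, 512, 768, 1024]

def pick_input_size(scales):
    """Randomly choose one training resolution; every entry must be divisible by 32."""
    size = random.choice(scales)
    assert size % 32 == 0
    return size

# Typical multi-scale behaviour: one size is drawn per batch/step and the whole
# batch is resized to it before being fed to the network.
for step in range(5):
    size = pick_input_size(sparse_scales)
    print(f"step {step}: training at {size}x{size}")
```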

Abhishek Verma