I'm on a robotics team, and we've been tasked with writing a program to differentiate between live and dead fish. We've been given ~15 minutes of training footage, and it's absolutely terrible: it's low quality, hard to label (even for humans), and only about 20 frames per second.

I have tried everything I can think of: YOLO, 3D convolutions (to take movement over time into account), residual networks with anywhere from 1 to 10 layers, and more. I have narrowed the problem down to the data itself being terrible.

Is there anything I can do to fix this? I know about data augmentation and have used it, but it doesn't make the data more useful; it just creates more terrible data. I also suspect that using machine learning to clean the data wouldn't help. I recall studies (whose names I can't remember) showing that adding a single white pixel to an image can completely confuse an object classifier, so I assume that using a machine learning model to alter an image would confuse a network as well. Is this an accurate assumption?

Either way: is there any way to improve the data I've been given? Or is there another way to approach this problem?

nbro
  • The study mentioned in your question (halfway through answer): https://ai.stackexchange.com/a/14264/17742 – Rob May 31 '22 at 02:33

1 Answer

I suggest using a super-resolution model to enhance the quality of your dataset and then annotating it properly, since by that point the footage should at least be readable from a human perspective.

You can check out this blog post for more information.
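
If the annotations are drawn on upscaled frames but the model is trained on the original footage, the labels have to be mapped back to the original resolution. Here is a minimal sketch of that step, assuming bounding-box annotations in pixel coordinates and a uniform upscaling factor. The `Box` type, the `to_original_resolution` helper, and the example frame size are hypothetical names and values chosen for illustration, not part of any particular annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical annotation: an axis-aligned box in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str  # e.g. "live" or "dead"


def to_original_resolution(box: Box, scale: float, width: int, height: int) -> Box:
    """Map a box drawn on an upscaled frame back onto the original frame.

    scale  -- assumed uniform upscaling factor (e.g. 4 for 4x super resolution)
    width  -- original frame width in pixels, used to clamp the result
    height -- original frame height in pixels, used to clamp the result
    """
    if scale <= 0:
        raise ValueError("scale must be positive")

    def rescale(value: float, limit: int) -> float:
        # Divide by the upscaling factor, then clamp to the frame bounds.
        return max(0.0, min(value / scale, float(limit)))

    return Box(
        x_min=rescale(box.x_min, width),
        y_min=rescale(box.y_min, height),
        x_max=rescale(box.x_max, width),
        y_max=rescale(box.y_max, height),
        label=box.label,
    )


# A box annotated on a 4x upscaled frame, mapped back to an assumed 320x240 original.
upscaled = Box(400, 200, 800, 600, "live")
original = to_original_resolution(upscaled, scale=4, width=320, height=240)
```

The class label is carried over unchanged; only the coordinates are rescaled, so the training set still consists of the original low-quality frames.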

haddagart
  • These models fill in data that doesn't exist. Won't this hurt the accuracy of any network I train? Even if there isn't a noticeable difference to the human eye, wouldn't this be useless to a computer? –  May 30 '22 at 21:36
  • You can use super resolution purely to make annotation easier, and then reduce the quality of your dataset and the ground truth again to return to the original case you started with. That way you can train a model that takes low-quality input. – haddagart May 30 '22 at 21:51