-1

I would like an order-of-magnitude estimate of the resources required to build an image recognition system.

Let's say you want to build a startup whose main product has to distinguish 20 different kinds of objects (bottles, dogs, cars, flowers...). The images are already tagged.

  • How many images are needed as a training set? 1k, 10k, 100k, 1 million?
  • What kind of hardware is needed, and how long will training take?
  • How many developers, and how much time?
  • Does it change a lot if the number of target classes is reduced to two, or increased to one thousand?

A link to a real-life paper would be perfect. Thank you.

bokan
  • "Closed. This question needs to be more focused. It is not currently accepting answers." How stupid is that? I asked for an order of magnitude or a real-life example. Of course it is not focused! – bokan Dec 27 '21 at 09:54

1 Answer

3

One answer is an infinite amount of time, because the model can always be made better.

Another answer is:

  • About 10k images for the training set
  • A PC with a GPU (3~4k USD), Google Colab (10 USD per month), or another cloud service (probably more expensive than Colab)
  • One developer, one day, lol
  • Distinguishing two kinds is easier than distinguishing many
  • There is no paper that seeks to answer your question the way you put it. I wouldn't even recommend a paper. In fact, for you I'd recommend an AutoML tutorial. Check these. (No offence if I've misjudged your knowledge/skill level.)
  • Here's a paper anyway :) https://paperswithcode.com/lib/torchvision/alexnet
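To put the rough figures above side by side, here is a minimal back-of-envelope sketch. It only does arithmetic on the numbers quoted in this answer. Two things are assumptions of the sketch, not claims from the answer: that the 10k images are a total split evenly across classes, and that the Colab fee stays flat at 10 USD per month. The function names are made up for illustration.

```python
# Back-of-envelope arithmetic on the figures quoted in the answer.
# ASSUMPTIONS (not from the answer): the 10k images are a total that is
# split evenly across classes, and the Colab fee stays flat over time.

TRAINING_IMAGES = 10_000       # "10k for training set"
PC_COST_USD = (3_000, 4_000)   # "3~4k USD"
COLAB_USD_PER_MONTH = 10       # "10 USD per month"


def images_per_class(total_images: int, num_classes: int) -> int:
    """Images available per class if the set is perfectly balanced."""
    return total_images // num_classes


def chance_accuracy(num_classes: int) -> float:
    """Accuracy of uniform random guessing on balanced classes."""
    return 1.0 / num_classes


def breakeven_months(pc_cost_usd: float, monthly_fee_usd: float) -> float:
    """Months of subscription fees that add up to the one-off PC price."""
    return pc_cost_usd / monthly_fee_usd


if __name__ == "__main__":
    # The three class counts mentioned in the question: 2, 20 and 1000.
    for k in (2, 20, 1000):
        print(
            f"{k:>4} classes: {images_per_class(TRAINING_IMAGES, k):>5} "
            f"images/class, chance accuracy {chance_accuracy(k):.1%}"
        )
    low, high = (breakeven_months(c, COLAB_USD_PER_MONTH) for c in PC_COST_USD)
    print(f"PC vs Colab break-even: {low:.0f} to {high:.0f} months")
```

Under those assumptions, a fixed image budget thins out quickly as the class count grows, and the guessing baseline drops with it, which is one way to see why two kinds is easier than a thousand. The break-even figure ignores electricity, usage limits and hardware resale value, so treat it as a rough comparison only.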

To conclude, please be aware that your question is super open-ended. My answer is bad (though maybe good enough for now), and a good answer doesn't really exist: it's always going to be context-dependent. For instance, you never said whether you need 90% or 99% accuracy.

Alexander Soare
  • Your answer is perfect. The question was intentionally open, and you provided a real-life example. – bokan Dec 27 '21 at 09:56