What amount of resources is involved in building an image recognition system?

Question

I would like an order-of-magnitude estimate of the resources required to build an image recognition system.

Let's say you want to build a startup whose main product has to distinguish 20 different kinds of objects (bottles, dogs, cars, flowers...). The images are already tagged.

How many images are needed as a training set? 1k, 10k, 100k, 1 million?
What kind of hardware is needed, and how long will the training process take?
How many developers, and how much time?
Does it change a lot if the number of target outputs is reduced to two kinds, or increased to one thousand?

A link to a real-life paper would be perfect. Thank you.

"Closed. This question needs to be more focused. It is not currently accepting answers." How stupid is that ? I ask for an order of magnitude or a real life example. Of course it is not focused ! — bokan, Dec 27 '21 at 09:54

score 3 · Accepted Answer · answered Mar 23 '21 at 17:57

One answer is: an infinite amount of time, because it can always be better.

Another answer is:

10k images for the training set
A PC with a GPU (3k to 4k USD), Google Colab (10 USD per month), or another cloud service (probably more expensive than Colab)
One developer, one day lol
Two kinds is easier than many kinds
There is no paper that seeks to answer your question the way you put it. I wouldn't even recommend a paper. In fact, for you I'd recommend an AutoML tutorial. Check these. * no offence if I've misjudged your knowledge/skill level.
Here's a paper anyway :) https://paperswithcode.com/lib/torchvision/alexnet
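
To put the numbers above side by side, here is a rough back-of-envelope sketch. It only uses the figures quoted in this answer (10k images, 20 kinds, a 3k to 4k USD PC, 10 USD per month for Colab). The function names are made up for illustration, the dataset is assumed to be balanced, and the cost comparison ignores electricity, usage limits, and the fact that the two options don't give you the same amount of compute.

```python
# Back-of-envelope figures taken from the answer above; all are rough assumptions.
TRAIN_IMAGES = 10_000          # suggested training-set size
NUM_CLASSES = 20               # object kinds in the question
PC_COST_USD = (3_000, 4_000)   # one-off price range for a PC with a GPU
COLAB_USD_PER_MONTH = 10       # quoted Colab subscription price


def images_per_class(total_images: int, num_classes: int) -> int:
    """Average images per class, assuming a perfectly balanced dataset."""
    return total_images // num_classes


def break_even_months(hardware_cost_usd: float, monthly_fee_usd: float) -> float:
    """Months of subscription that add up to the one-off hardware price."""
    return hardware_cost_usd / monthly_fee_usd


if __name__ == "__main__":
    # How thin the same 10k images are spread for 2, 20, and 1000 kinds.
    for classes in (2, NUM_CLASSES, 1_000):
        print(classes, "kinds:", images_per_class(TRAIN_IMAGES, classes), "images each")

    # How long the subscription runs before it costs as much as the PC.
    for cost in PC_COST_USD:
        months = break_even_months(cost, COLAB_USD_PER_MONTH)
        print(cost, "USD PC =", months, "months of Colab")
```

With these inputs, 10k images work out to about 500 per kind for 20 kinds but only 10 per kind for one thousand kinds. That is one reason a fixed image budget probably won't stretch to a much larger number of outputs.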

To conclude, please be aware that your question is super open-ended. My answer is bad (though maybe good enough for now), but a good answer doesn't really exist. It's always going to be context-dependent. For instance, you never said whether you need 90% or 99% accuracy.

Your answer is perfect, the question was intentionally open, you provided a real life example. — bokan, Dec 27 '21 at 09:56
