Most companies working with deep learning (in the automotive field: Comma.ai, Mobileye, various automakers, etc.) collect large amounts of data and then use a lot of computational power to train a neural network (NN) on it. I guess this centralized model is used mainly because both the data and the training algorithms are meant to remain secret/proprietary.
If I understand it correctly, the problem with deep learning is that one needs:
- big data to learn from
- lots of hardware to train the neural network on this big data
I am trying to think about how crowdsourcing could be used in this scenario. Is it possible to distribute the training of the NN to the crowd? I mean not collecting the big data in a central place, but instead training on local data on each user's own hardware (in a distributed way). The result would be many separately trained NNs that are finally combined into one predictor in the manner of a committee of machines (CoM). Would such a model be possible?
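To make the idea concrete, here is a minimal sketch of what I mean by merging locally trained models into a committee. It is only an illustration under strong assumptions: each "user" fits a tiny linear model instead of an NN, the ground truth `y = 2x + 1` and the helper names (`train_local_model`, `committee_predict`, `make_local_data`) are made up for the example, and the committee simply averages the members' predictions.

```python
import random
from statistics import mean


def train_local_model(samples):
    """Fit y = w*x + b by ordinary least squares on one user's local data."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in samples)
    w = cov_xy / var_x
    b = y_bar - w * x_bar
    return w, b


def committee_predict(models, x):
    """Combine the members by averaging their individual predictions."""
    return mean(w * x + b for w, b in models)


rng = random.Random(0)  # fixed seed so the toy run is repeatable


def make_local_data(n=50):
    """Hypothetical local dataset: noisy samples of y = 2x + 1 (an assumption)."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)))
    return data


# Each of 20 simulated users trains only on their own data;
# only the fitted parameters (w, b) would leave the device.
models = [train_local_model(make_local_data()) for _ in range(20)]

# The central server never sees raw data, just the committee members.
prediction = committee_predict(models, 4.0)
print(f"committee prediction at x=4.0: {prediction:.3f}")
```

In this sketch only the parameters `(w, b)` are sent to the server, which is the part of the idea I care about. With real NNs each member would be a full network and the committee would have to evaluate all of them at prediction time.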
Of course, the model described above has a significant drawback: one has no control over the data used for learning (users could intentionally submit wrong/fake data that would lower the quality of the final CoM). However, this might be dealt with by sending random data samples to a central community server for review.
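The review step could look roughly like the sketch below. Again, this is just an assumption about how it might work: `trusted_label` stands in for a human reviewer on the community server, and the `fraction` and `max_error` values as well as all function names are hypothetical.

```python
import random


def sample_for_review(local_data, fraction, rng):
    """Pick a random subset of a user's labeled samples to send to the server."""
    k = max(1, int(len(local_data) * fraction))
    return rng.sample(local_data, k)


def audit_error_rate(samples, trusted_label):
    """Share of reviewed samples whose label disagrees with the reviewer."""
    wrong = sum(1 for x, label in samples if label != trusted_label(x))
    return wrong / len(samples)


def select_trusted_users(user_data, trusted_label,
                         fraction=0.1, max_error=0.2, seed=0):
    """Keep only users whose randomly audited samples look correct."""
    rng = random.Random(seed)
    accepted = []
    for user_id, data in user_data.items():
        reviewed = sample_for_review(data, fraction, rng)
        if audit_error_rate(reviewed, trusted_label) <= max_error:
            accepted.append(user_id)
    return accepted


def true_label(x):
    """Stand-in for the human reviewer (an assumption for this toy example)."""
    return x >= 0.5


# One honest user and one user who flips every label on purpose.
honest = [(i / 100, true_label(i / 100)) for i in range(100)]
malicious = [(x, not label) for x, label in honest]

accepted = select_trusted_users({"alice": honest, "mallory": malicious},
                                true_label)
print(accepted)
```

Only the models of accepted users would then be admitted to the committee. A user who poisons just a small share of their data could still slip through a random audit, so this reduces the risk rather than removing it.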
Example: Think of a powerful smartphone mounted on a vehicle's dashboard, using its camera to record the road and using the footage to train lane detection. Every user would do the training on their own device (possibly including any manual work, such as labeling the input images for supervised learning).
I wonder whether the model proposed above is viable. Or is there a better way to use crowdsourcing (a user community) for machine learning?