
I've heard somewhere that, because of their nature of capturing spatial relations, even untrained CNNs can be used as feature extractors. Is this true? Does anyone have any sources on this that I can look at?

Alex

2 Answers


Yes, it has been demonstrated that the main factor behind CNNs working well is their architecture, which exploits locality during feature extraction. A CNN with random weights will produce a random partition of the feature space, but one that still carries the spatial prior that works so well, so those random features are fine for classification (and sometimes even better than trained ones, since they don't introduce additional bias).
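As a minimal sketch of what this looks like in practice (assuming PyTorch; the architecture and layer sizes below are illustrative, not taken from any particular paper), the CNN keeps its random initialization and is frozen, and only a linear classifier is trained on top of its output:

```python
# Minimal sketch (PyTorch assumed): a randomly initialized, frozen CNN
# used as a fixed feature extractor, with only a linear classifier trained on top.
import torch
import torch.nn as nn

# Small untrained CNN; the layer sizes here are illustrative only.
random_cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
for p in random_cnn.parameters():
    p.requires_grad = False          # keep the random weights fixed

classifier = nn.Linear(64, 10)       # only this part gets trained

x = torch.randn(8, 3, 32, 32)        # dummy batch of images
with torch.no_grad():
    features = random_cnn(x)         # random-but-structured features
logits = classifier(features)
```

Because the convolutional weights never change, any quality in the resulting features comes purely from the architecture's spatial prior.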

You can read more in these papers:

David

I'm not sure it's possible. An untrained CNN means it has random kernel values. Let's say you have a 3x3 kernel like the one below:

0 0 0
0 0 0
0 0 1

I don't think it is possible for that kernel to provide good information about the image. On the contrary, the kernel eliminates a lot of information. We cannot rely on random values for feature extraction.

But if you use a CNN with an "assigned" kernel, then you don't need to train the convolutional layer. For example, you can start a CNN with a kernel designed to extract vertical lines:

-1 2 -1
-1 2 -1
-1 2 -1
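As a rough sketch of that idea (assuming PyTorch; the input image below is just dummy data), such an assigned kernel can be applied as a fixed convolution with no training at all:

```python
# Minimal sketch (PyTorch assumed): apply the hand-designed vertical-line
# kernel above as a fixed, untrained convolution.
import torch
import torch.nn.functional as F

kernel = torch.tensor([[-1., 2., -1.],
                       [-1., 2., -1.],
                       [-1., 2., -1.]]).reshape(1, 1, 3, 3)

image = torch.rand(1, 1, 28, 28)            # dummy grayscale image
response = F.conv2d(image, kernel, padding=1)
# 'response' is large where the image contains vertical edges;
# no training is needed because the kernel was assigned by hand.
```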
malioboro
  • there exists a decent amount of evidence showing you can achieve amazing feature representations using a randomly initialized CNN as a feature extractor. Think of dart throwing: you'll probably get a lot of useless ones, but some really good ones will be there. – mshlis Aug 06 '19 at 13:09
  • @mshlis I'm sorry, but what do you mean by "randomly initialized"? Is it "trained" or "untrained" after that? – malioboro Aug 06 '19 at 13:49
  • Actually, I first heard about this random and untrained layer from [Yann LeCun's talk about ELM](https://www.reddit.com/r/MachineLearning/comments/34u0go/yann_lecun_whats_so_great_about_extreme_learning/). I just read the papers referenced by @David; the first paper is specific to the image restoration case, and now I'm trying to understand the second paper. I hope I can learn something new about this. – malioboro Aug 06 '19 at 13:53
  • Check out Uber's supermask paper (just finding a good mask for a random initialization can achieve >80% accuracy on many tasks). – mshlis Aug 06 '19 at 14:00
  • @mshlis wow, thank you for the reference, I'll check it soon – malioboro Aug 06 '19 at 14:07