I'm reading the AlexNet paper. In Section 4, where the authors explain how they prevent overfitting, they write:

"Although the 1000 classes of ILSVRC make each training example impose 10 bits of constraint on the mapping from image to label".

What does this mean?

nbro
harupy

2 Answers

You need 10 bits to represent 1000 classes, because $2^9 = 512 < 1000 \le 2^{10} = 1024$. Equivalently, $\log_2 1000 \approx 9.97$, which rounds up to 10.
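As a quick check of the arithmetic, here is a minimal sketch. It assumes all 1000 classes are equally likely, so one label carries $\log_2 1000$ bits of information; the variable names are only illustrative.

```python
import math

NUM_CLASSES = 1000  # number of ILSVRC classes, as quoted in the question

# Information carried by one label, assuming a uniform distribution over classes.
bits_exact = math.log2(NUM_CLASSES)

# Whole number of bits needed for a fixed-length binary code over all classes.
bits_needed = math.ceil(bits_exact)

print(f"log2({NUM_CLASSES}) is about {bits_exact:.2f}, so {bits_needed} bits are needed")
```

The rounding up is why the paper's figure is 10 rather than a fractional value.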

nbro
Brian O'Donnell

It takes at least 10 bits to represent any number from 1 to 1000, because $2^9 = 512 < 1000 \le 2^{10} = 1024$. So to identify one of the 1000 classes, you need at least 10 bits. However, setting all 10 of these bits correctly for every input is really hard, and ensuring it would require overfitting.
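To make the "at least 10 bits" claim concrete, the sketch below encodes each class index as a fixed-width binary string and counts how many distinct codes result. The helper `encode_label` is hypothetical and written only for this illustration; it is not taken from the paper.

```python
NUM_CLASSES = 1000  # number of ILSVRC classes, as quoted in the question


def encode_label(class_index: int, width: int) -> str:
    """Return the lowest `width` bits of class_index as a binary string."""
    return format(class_index % (1 << width), f"0{width}b")


# With 10 bits every class gets its own code; with 9 bits some classes collide.
codes_10 = {encode_label(i, 10) for i in range(NUM_CLASSES)}
codes_9 = {encode_label(i, 9) for i in range(NUM_CLASSES)}

print(len(codes_10))  # number of distinct 10-bit codes
print(len(codes_9))   # number of distinct 9-bit codes
```

With 9 bits only 512 distinct codes exist, so some of the 1000 classes must share a code, while 10 bits gives every class a unique one.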

nbro
Jaden Travnik