For action recognition and similar tasks, one can either use a 3D CNN or combine a 2D CNN with optical flow. See this paper for details.

Can someone explain the pros and cons of each in terms of accuracy and cost (computation, memory requirements, etc.)? In other words, is the computational overhead of a 3D CNN justified by its accuracy improvement? In which scenarios would one prefer one over the other?
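To make the overhead in question concrete, here is a minimal sketch that counts weights and multiply-accumulates (MACs) for a single convolutional layer, once as a 2D convolution applied to every frame and once as a 3D convolution over the whole clip. The layer sizes (`C_IN`, `C_OUT`, `T`, `H`, `W`) and the helper `conv_cost` are hypothetical, chosen only for illustration; stride 1 and "same" padding are assumed, and the cost of computing optical flow (and of any second network stream that consumes it) is not included.

```python
def conv_cost(c_in, c_out, kernel, out_shape):
    """Return (weights, MACs) for one conv layer, ignoring bias terms."""
    kernel_volume = 1
    for size in kernel:
        kernel_volume *= size
    n_outputs = 1
    for size in out_shape:
        n_outputs *= size
    weights = c_in * c_out * kernel_volume
    # Each output position of each output channel needs c_in * kernel_volume MACs.
    macs = weights * n_outputs
    return weights, macs


# Hypothetical layer and clip sizes (assumptions, not taken from any paper).
C_IN, C_OUT = 64, 64
T, H, W = 16, 56, 56  # frames, height, width of the output feature map

# 2D CNN: a 3x3 kernel applied independently to each of the T frames.
w2d, m2d_frame = conv_cost(C_IN, C_OUT, (3, 3), (H, W))
m2d_clip = m2d_frame * T

# 3D CNN: a 3x3x3 kernel slid over time as well as space.
w3d, m3d_clip = conv_cost(C_IN, C_OUT, (3, 3, 3), (T, H, W))

print("weights 2D:", w2d, "3D:", w3d)
print("MACs per clip 2D:", m2d_clip, "3D:", m3d_clip)
```

Under these assumptions the 3D layer has three times the weights and three times the MACs of the per-frame 2D layer, simply because the kernel has an extra temporal dimension of size 3. This is only a per-layer count; it says nothing about accuracy or about the total cost of a two-stream pipeline.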

3D CNNs are also used for volumetric data, such as MRI scans. Can a 2D CNN + optical flow be used there as well?

I understand 2D CNNs and 3D CNNs, but I am not familiar with optical flow (my background is not in computer vision).

nbro
user984260

0 Answers