I have a video dataset as follows.
Dataset size: 1k videos
Frames per video: 4k (average) and 8k (maximum)
Labels: Each video has one label.
So the size of my input would be (N, 8000, 64, 64, 3), where 64 × 64 is the height and width of each frame. I use Keras. I am not really sure how to do end-to-end training with this kind of dataset. I was thinking of dividing each input into blocks of frames, i.e. a shape of (N, 80, 100, 64, 64, 3) for training, but even that does not solve the problem of training the network end-to-end.
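To make the block idea concrete, here is a rough sketch of what I have in mind (the block length of 100 frames, the zero-padding to the 8000-frame maximum, and the helper names `video_to_blocks` / `BlockSequence` are just my own choices for illustration, not anything fixed):

```python
import numpy as np
from tensorflow import keras  # or `import keras` for standalone Keras

BLOCK_LEN = 100    # frames per block -- my assumption, not a requirement
MAX_FRAMES = 8000  # pad every video up to the maximum length

def video_to_blocks(frames, block_len=BLOCK_LEN, max_frames=MAX_FRAMES):
    """Zero-pad a (T, 64, 64, 3) video to max_frames and split it into
    non-overlapping blocks of shape (max_frames // block_len, block_len, 64, 64, 3)."""
    t, h, w, c = frames.shape
    padded = np.zeros((max_frames, h, w, c), dtype=frames.dtype)
    n = min(t, max_frames)
    padded[:n] = frames[:n]
    return padded.reshape(max_frames // block_len, block_len, h, w, c)

class BlockSequence(keras.utils.Sequence):
    """Yields one video at a time as (1, n_blocks, block_len, 64, 64, 3)
    so the block representation can be fed to a Keras model."""
    def __init__(self, videos, labels):
        self.videos = videos  # list of (T, 64, 64, 3) arrays
        self.labels = labels  # one label per video

    def __len__(self):
        return len(self.videos)

    def __getitem__(self, idx):
        x = video_to_blocks(self.videos[idx])[np.newaxis, ...]  # add batch dim
        y = np.array([self.labels[idx]])
        return x, y

# quick check with dummy data
dummy_video = np.random.rand(4000, 64, 64, 3).astype("float32")
print(video_to_blocks(dummy_video).shape)  # (80, 100, 64, 64, 3)
```

With this, each video becomes an (80, 100, 64, 64, 3) tensor, matching the block shape above, but I still don't see how to backpropagate through the whole video at once.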
I would rather not drop frames; that would be my last resort.
Any help will be appreciated. Thanks in advance.