Efficient and Information-Preserving Future Frame Prediction and Beyond

Wei Yu; Yichao Lu; Steve Easterbrook; Sanja Fidler

Abstract: Applying resolution-preserving blocks is a common practice to maximize information preservation in video prediction, yet their high memory consumption greatly limits their application scenarios. We propose CrevNet, a Conditionally Reversible Network that uses reversible architectures to build a bijective two-way autoencoder and its complementary recurrent predictor. Our model is theoretically guaranteed to lose no information during feature extraction, while consuming much less memory and remaining computationally efficient. The lightweight nature of our model enables us to incorporate 3D convolutions without concern for memory bottlenecks, enhancing the model's ability to capture both short-term and long-term temporal dependencies. Our proposed approach achieves state-of-the-art results on the Moving MNIST, Traffic4cast, and KITTI datasets. We further demonstrate the transferability of our self-supervised learning method by exploiting its learnt features for object detection on KITTI. Our competitive results indicate the potential of using CrevNet as a generative pre-training strategy to guide downstream tasks.
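
To illustrate why a reversible architecture can extract features without losing information, here is a minimal sketch of a generic additive coupling block, the standard building block of reversible networks. It is not CrevNet's actual implementation: the functions `f` and `g` are hypothetical stand-ins for learned sub-networks, and plain Python lists stand in for feature tensors. The point is that the block can be inverted exactly (up to floating-point rounding) even though `f` and `g` themselves are not invertible.

```python
import math

def f(x):
    # Hypothetical stand-in for a learned sub-network; deliberately non-invertible.
    return [math.tanh(3.0 * v) ** 2 for v in x]

def g(x):
    # Second hypothetical stand-in; also non-invertible.
    return [math.sin(v) for v in x]

def forward(x1, x2):
    # Additive coupling: each half is updated using only the other half.
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def inverse(y1, y2):
    # Undo the two updates in reverse order; f and g are only ever evaluated,
    # never inverted, so the input is recovered from the output alone.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

x1, x2 = [0.5, -1.25, 2.0], [1.5, 0.0, -0.75]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)

# Largest reconstruction error across both halves (floating-point rounding only).
max_err = max(abs(a - b) for a, b in zip(x1 + x2, r1 + r2))
```

Because the input can be recomputed from the output, intermediate activations of such a block do not need to be stored for backpropagation, which is the usual source of the memory savings in reversible networks.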
