LiFR-Seg: Anytime High-Frame-Rate Segmentation via Event-Guided Propagation
Abstract
Dense semantic segmentation in dynamic environments is fundamentally limited by the low-frame-rate (LFR) nature of standard cameras, which creates critical perceptual gaps between frames. To address this, we introduce Anytime Interframe Semantic Segmentation: a new task that predicts segmentation at an arbitrary time between frames using only a single past RGB frame and a stream of asynchronous event data. The task poses a core challenge: robustly propagating dense semantic features with a motion field derived from sparse and often noisy event data, while mitigating feature degradation in highly dynamic scenes. We propose LiFR-Seg, a novel framework that addresses these challenges by propagating deep semantic features through time. The core of our method is an uncertainty-aware warping process guided by an event-driven motion field and its learned, explicit confidence. A temporal memory attention module further ensures coherence in dynamic scenarios. We validate our method on the DSEC dataset and on SHF-DSEC, a new high-frequency synthetic benchmark that we contribute. Remarkably, our LFR system achieves performance (73.82\% mIoU on DSEC) within 0.09\% of a high-frame-rate (HFR) upper bound that has full access to the target frame. We further demonstrate superior robustness in extreme scenarios: in highly dynamic (M3ED) tests our method closely matches the HFR baseline, and in low-light (DSEC-Night) evaluation it even surpasses it. This work presents a new, efficient paradigm for achieving robust, high-frame-rate perception with low-frame-rate hardware.
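To make the propagation idea concrete, the sketch below illustrates one plausible form of uncertainty-aware feature warping as described above: features from the last RGB frame are backward-warped along an event-derived motion field, and a learned per-pixel confidence gates how much of the warped result is trusted. This is a minimal PyTorch sketch under our own assumptions; the function names (`warp_features`, `uncertainty_aware_propagate`), tensor shapes, and the simple convex blend are illustrative and not the paper's actual implementation.

\begin{verbatim}
import torch
import torch.nn.functional as F

def warp_features(feat, flow):
    """Backward-warp a feature map with a dense motion field.
    feat: (B, C, H, W); flow: (B, 2, H, W), displacements in pixels."""
    B, _, H, W = feat.shape
    # Build a pixel-coordinate grid and displace it by the flow.
    ys, xs = torch.meshgrid(
        torch.arange(H, device=feat.device, dtype=feat.dtype),
        torch.arange(W, device=feat.device, dtype=feat.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow  # (B, 2, H, W)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    gx = 2.0 * grid[:, 0] / max(W - 1, 1) - 1.0
    gy = 2.0 * grid[:, 1] / max(H - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(feat, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)

def uncertainty_aware_propagate(feat_prev, flow_events, confidence):
    """Blend warped and un-warped features using a learned per-pixel
    confidence in the event-derived motion field (illustrative only).
    confidence: (B, 1, H, W) in [0, 1]; 1 = fully trust the warp."""
    warped = warp_features(feat_prev, flow_events)
    return confidence * warped + (1.0 - confidence) * feat_prev
\end{verbatim}

In this sketch, low-confidence regions (e.g., where events are sparse or noisy) fall back to the unwarped features, which is one simple way to limit the feature degradation the task statement highlights.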