Translation: BMN: Boundary-Matching Network for Temporal Action Proposal Generation


Figure 2. Illustration of the BM confidence map. Proposals in the same row have the same temporal duration, and proposals in the same column have the same starting time. The ending boundaries of proposals in the bottom-right corner exceed the range of the video, so these proposals are not considered during training and inference.
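The layout described in the caption can be illustrated with a small validity mask. This is only a minimal sketch, not the authors' implementation. The function name `bm_valid_mask` and the indexing convention are assumptions: row i is taken to hold proposals lasting i + 1 snippets, and column j proposals starting at snippet j.

```python
def bm_valid_mask(num_durations, num_locations):
    """Return mask[i][j] == True when the proposal at (i, j) fits in the video.

    Assumed convention (hypothetical, not given in the caption):
    row i    -> proposal duration of i + 1 snippets
    column j -> proposal starting at snippet j
    A proposal is valid when start + duration <= num_locations.
    """
    mask = []
    for i in range(num_durations):
        duration = i + 1
        # Entries whose ending boundary falls outside the video are False.
        row = [j + duration <= num_locations for j in range(num_locations)]
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Print the mask for a toy video of 4 snippets and up to 4 durations.
    for row in bm_valid_mask(4, 4):
        print("".join("1" if ok else "0" for ok in row))
```

In this toy case the invalid entries form a triangle in the bottom-right corner, which matches the region the caption says is excluded from training and inference.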


Temporal Action Proposal Generation


As mentioned above, the goal of the temporal action detection task is to detect action instances in untrimmed videos, together with their temporal boundaries and action categories. The task can be divided into two stages: temporal proposal generation and action classification.


Most detection methods [23, 25, 35] treat these two stages separately, while some [18, 2] combine them in a single model. For proposal generation, most previous works [3, 4, 8, 12, 23] adopt a top-down fashion, generating proposals with pre-defined durations and intervals; the main drawback is the lack of boundary precision and duration flexibility. Other methods [35, 17] adopt a bottom-up fashion. TAG [35] generates proposals using a temporal watershed algorithm, but lacks confidence scores for retrieval. More recently, BSN [17] generates proposals by locally locating temporal boundaries and globally evaluating confidence scores, and achieves a significant performance improvement over previous proposal generation methods. In this work, we propose the Boundary-Matching mechanism for proposal confidence evaluation, which largely simplifies the BSN pipeline and brings significant improvements in both efficiency and effectiveness.
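To make the drawback of the top-down fashion concrete, here is a rough sliding-window sketch. It is a generic illustration and not the method of any of the cited works. The function name `sliding_window_proposals`, the `stride_ratio` parameter, and the snippet-based units are all assumptions.

```python
def sliding_window_proposals(num_snippets, durations, stride_ratio=0.5):
    """Generate (start, end) proposals from pre-defined durations and strides.

    All names and the stride rule are hypothetical, chosen for illustration.
    """
    proposals = []
    for d in durations:
        # The stride is a fixed fraction of the window length, at least 1.
        stride = max(1, int(d * stride_ratio))
        for start in range(0, num_snippets - d + 1, stride):
            proposals.append((start, start + d))
    return proposals


if __name__ == "__main__":
    proposals = sliding_window_proposals(8, durations=[2, 4])
    print(proposals)
    # An action spanning snippets 3 to 6 has no exactly matching window,
    # because its duration (3) is not among the pre-defined durations.
    print((3, 6) in proposals)
```

Since only the listed durations and stride positions are ever produced, an action whose length or boundaries fall between them can only be approximated. This is the precision and flexibility limitation noted above.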
