[paper reading][CVPR 2020] Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

2 Related Work

General Video Classification

  • 3D conv
  • two-stream, optical flow
  • wider range
  • SlowFast, multiple time scales, two pathways
  • feature bank, long-term, correlated, short-term
  • raw pixels, in contrast, objects within scenes

3

  • two-branch, distill
  • scene features: 2D (ResNet) and 3D (I3D)
  • object features: \(N_T\) objects, each \(o_t^j\) has the same dimension

3.2 Spatio-Temporal Graph

  • decompose our graph into two components: the spatial graph and the temporal graph
  • Spatial: normalized Intersection over Union (IoU) value, explicitly
  • Temporal: object transformations, semantic similarities, \(\cos\)
  • imagine: # - % = $ x @ structure
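
To make the two graph components above concrete, here is a minimal sketch of how the adjacency matrices could be built. It is not the paper's code: the helper names (`iou`, `cosine`, `softmax_rows`, `spatial_graph`, `temporal_graph`), the `(x1, y1, x2, y2)` box format, and the choice of a row-wise softmax as the normalization are all assumptions made for illustration. The notes only say that the spatial edges use a normalized IoU and the temporal edges use a cosine similarity.

```python
import math

def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def softmax_rows(scores):
    """Normalize each row so it sums to 1 (assumed normalization)."""
    out = []
    for row in scores:
        m = max(row)  # subtract the row max for numerical stability
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def spatial_graph(boxes):
    """Edges between objects in the SAME frame, weighted by normalized IoU."""
    return softmax_rows([[iou(a, b) for b in boxes] for a in boxes])

def temporal_graph(feats_t, feats_next):
    """Edges from objects in frame t to objects in frame t+1,
    weighted by normalized cosine similarity of their features."""
    return softmax_rows([[cosine(u, v) for v in feats_next] for u in feats_t])

# Hypothetical toy inputs: three boxes in one frame, two objects per frame.
boxes = [(0, 0, 2, 2), (1, 1, 3, 3), (10, 10, 12, 12)]
g_space = spatial_graph(boxes)
g_time = temporal_graph([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy input the first two boxes overlap while the third is far away, so the first row of `g_space` puts more weight on the second object than on the third. Likewise each row of `g_time` favors the next-frame object whose feature points in the same direction.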