Abstract
Detection identifies objects as axis-aligned boxes in an image. Most successful object detectors enumerate a nearly exhaustive list of potential object locations and classify each. This is wasteful, inefficient, and requires additional post-processing. In this paper, we take a different approach. We model an object as a single point: the center point of its bounding box. Our center-point-based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than corresponding bounding-box-based detectors.
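Summary
Detection identifies objects in an image as axis-aligned boxes. Most successful object detectors enumerate a nearly exhaustive list of candidate object locations and classify each one, which is wasteful and inefficient and requires additional post-processing. This paper takes a different approach: it models an object as a single point, the center of its bounding box. The center-point-based method is end-to-end differentiable, and it is simpler, faster, and more accurate than the corresponding bounding-box-based detectors.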
Introduction
In this paper, we provide a much simpler and more efficient alternative. We represent objects by a single point at their bounding box center (see Figure 2). Other properties, such as object size, dimension, 3D extent, orientation, and pose are then regressed directly from image features at the center location. Object detection is then a standard keypoint estimation problem [3,39,60]. We simply feed the input image to a fully convolutional network [37, 40] that generates a heatmap. Peaks in this heatmap correspond to object centers. Image features at each peak predict the object's bounding box height and width. The model trains using standard dense supervised learning [39,60]. Inference is a single network forward-pass, without non-maximal suppression for post-processing.
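Introduction (summary)
The paper provides a much simpler and more efficient alternative. An object is represented by a single point at the center of its bounding box; other properties, such as object size, dimension, 3D extent, orientation, and pose, are regressed from the image features at the center location. Object detection then becomes a standard keypoint estimation problem. The input image is simply fed to a fully convolutional network that generates a heatmap, and the peaks of this heatmap correspond to object centers. The image features at each peak predict the height and width of the object's bounding box. The model is trained with standard dense supervised learning. Inference is a single forward pass of the network, with no NMS needed as post-processing.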
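To make the "peaks in the heatmap are object centers" idea concrete, here is a minimal pure-Python sketch of the decoding step. It is not the authors' implementation. The function names `find_peaks` and `decode_boxes`, the 3x3 local-maximum test, the score threshold of 0.3, the output stride of 4, and the convention that sizes are stored as (width, height) in input-image pixels are all assumptions made for illustration.

```python
def find_peaks(heatmap, threshold=0.3):
    """Return (row, col, score) for cells that are 3x3 local maxima.

    Assumption: a cell counts as a peak if its score is at least `threshold`
    and no neighbour in its 3x3 window has a higher score.
    """
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            score = heatmap[r][c]
            if score < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


def decode_boxes(heatmap, sizes, stride=4, threshold=0.3):
    """Turn heatmap peaks into (x1, y1, x2, y2, score) boxes.

    Assumptions: sizes[r][c] holds the (width, height) predicted at that
    cell in input-image pixels, and `stride` maps heatmap cells back to
    input-image coordinates.
    """
    boxes = []
    for r, c, score in find_peaks(heatmap, threshold):
        bw, bh = sizes[r][c]
        cx, cy = c * stride, r * stride  # center in input-image pixels
        boxes.append((cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2, score))
    return boxes


# Toy 4x4 heatmap with two clear peaks, at cells (1, 1) and (3, 3).
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.2],
    [0.0, 0.0, 0.2, 0.8],
]
# Hypothetical size predictions; only the values at the peaks matter.
sizes = [[(0.0, 0.0)] * 4 for _ in range(4)]
sizes[1][1] = (8.0, 4.0)
sizes[3][3] = (4.0, 6.0)

boxes = decode_boxes(heatmap, sizes)
for box in boxes:
    print(box)
```

Because each object produces a single local maximum, the boxes come straight from the peaks and no non-maximum suppression pass is needed. Note that a flat plateau of equal scores would yield several peaks under this simple test.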
Related work