《SLAM十四讲》阅读笔记-《2D Grid Mapping and Navigation with ORB SLAM》

Though this point cloud can be useful for obtaining the 3D structure of the environment, it is not as useful for path planning and navigation using algorithms that need a 2D occupancy grid map as input.The point cloud produced by ORB SLAM is somewhat sparse which makes it difficult to generate an occupancy map that contains most of the obstacles while also providing sufficient contiguity in the known free spaces for path planning to work reliably.

尽管RGB-D相机产生的点云对于获得环境的三维结构有用,但是对路径规划导航没用,因为这需要2D occupancy grid map。ORB-SLAM点云的稀疏使得 occupancy map缺少连续性。

3.1.2 Installation of ORB-SLAM
This part consisted of installing ORB-SLAM2 using the instructions provided on the project Github page [14]. The current version of ORB-SLAM2 has some conflicts with Eigen 3.3.3 and OpenCV 3 on ROS-Indigo. As a result, we used Eigen 3.2.10 and did not install OpenCV 3, instead using the OpenCV 2.4 that comes with ROS-Indigo.

ORB-SLAM2按照网址 https://github.com/raulmur/ORB SLAM2.进行安装

3.1.4 Reproduction of results on KITTI and TUM
For this part, we first downloaded sequences 00 and 05 of the KITTI dataset [16] and sequence fr3 walking halfsphere from the TUM dataset [17]. We then ran ORB-SLAM on all 3 sequences following the instructions on the Github page [14] to reproduce the results. Screen shots of the point clouds thus generated are shown in Fig. 1. Following commands were used for running ORB SLAM on the three sequences:

我们下载KITTI和TUM 的数据集,使用ORB SLAM2复现了三个场景,产生的点云如图

./ Examples / Monocular / mono_kitti Vocabulary / ORBvoc . txt Examples / Monocular / KITTI00 -02. yaml
./ KITTI /00
./ Examples / Monocular / mono_kitti Vocabulary / ORBvoc . txt Examples / Monocular / KITTI04 -12. yaml
./ KITTI /05
./ Examples / Monocular / mono_tum Vocabulary / ORBvoc . txt Examples / Monocular / TUM3 . yaml
./ TUM - RGBD / rgbd_dataset_freiburg3_walking_halfsphere

《SLAM十四讲》阅读笔记-《2D Grid Mapping and Navigation with ORB SLAM》

3.1.5 Production of 2D grid map using ORB-SLAM map points
For this step, we created a Python script to generate a 2D occupancy grid map off line by processing all the keyframes and map points produced by ORB SLAM after it has completed processing all the frames in the sequence. This script is included with this report as pointCloudToGridMap2D.py. In order to generate the input data needed by this script, we modified the ORB SLAM monocular application Examples/Monocular/mono tum to output the 3D poses of all keyframes along with all map points visible in each keyframe to a text file. The script parses this file and stores all keyframe/- camera poses and the associated map points into a dictionary structure after projecting them to the XZ plane. This projection is done by simply removing the y coordinate. Since the ORB SLAM coordinate locations are in units of meters, a finer grid resolution is obtained by multiplying all positions by a scaling factor which is chosen as the inverse of the desired resolution in m/cell. For instance, if a resolution of 10 cm/cell or 0.1 m/cell is needed, the scaling factor becomes 10. The keyframes are then processed one at a time and following steps are applied to all map points visible in each keyframe:

使用ORB-SLAM map points产生2D grid map
ORB SLAM完成处理所有的frames 之后,使用一个python脚本离线处理,生成一个 2D occupancy grid map。为了产生这个脚本所需要的数据,我们需要对Examples/Monocular/mono tum产生的数据做一些修正,将每一个关键帧中所有可视点的3D位姿写到一个text里面去。脚本分析这个文件,将相机位姿和所有可见地图点投影到XZ平面上。存入字典结构中。因为ORB-SLAM以米为单位,通过将所有位置乘以一个缩放因子,得到更精细的网格分辨率,该缩放因子被选择为m/单元中所需分辨率的反比。

  1. Cast a ray from the camera position to all the visible points using the Bressenham’s line drawing algorithm [18].
    2.Increment a visit counter for each point along the ray and an occupied counter for the end point that corresponds to the location of the map point

1,使用Bresenham的直线绘制算法 线从相机位置投射到所有可见点;
2,为射线上的每个点增加 一个访问计数器,并为对应于映射点位置的端点增加一个占用计数器;

The visit and occupied counters are stored as integral arrays of the same size as the range of x and z locations of the camera and map points after scaling. Note that ORB SLAM considers the XZ plane as th horizontal plane so that the y coordinate of a point is regarded as its height. Once all keyframes have been processed, the occupancy probability for each cell of the grid map is computed as:

《SLAM十四讲》阅读笔记-《2D Grid Mapping and Navigation with ORB SLAM》

where occupied(i, j) and visit(i, j) are the corresponding entries in the occupied and visit counters respectively and pfree is the probability that this cell is not occupied. This probability map is converted into a ternary cost map by using two thresholds free thresh and occupied thresh such that a particular cell is considered as free if its pfree is greater than free thresh, occupied if it is less than occupied thresh and unknown otherwise.

访问和占用计数器 存储与XZ映射平面同比例的二维矩阵,矩阵每一行列的数值代表单元格位置;计算完所有 关键帧后,栅格图中的每个单元格的未被占用概率计算公式: 其中,occupied(i,j)和visit(i,j)分别代表某一空闲单元格被占用和访问的次数, pfree(i,j)代表该单元格未被占用的概率;通过设定两个阈值将单元格分为三种情况:(1)如 果pfree(i,j)大于某一空闲阈值free_threshold ,则视为空闲单元;(2)如果pfree(i,j)小于某一 占用阈值occupied_threshold ,则视为占用单元;(3)否则,视为未知单元;

Since we do not have access to a real Turtlebot robot, we use simulation for this part instead. First, we create a virtual Turtlebot in an empty world using the Gazebo [19] simulator and then use adaptive Monte Carlo localization (AMCL) [20, 21] for navigation and Rviz [22] for visualization. Following commands are used for running these nodes:

使用Turtlebot 进行在gazebo模拟,使用AMCL定位,rviz可视化

roslaunch turtlebot_gazebo turtlebot_world . launch
world_file :=/ opt / ros / indigo / share / turtlebot_gazebo / worlds / empty . world
roslaunch turtlebot_gazebo amcl_demo . launch map_file :=./ grid_map . yaml
roslaunch turtlebot_rviz_launchers view_navigation . launch

《SLAM十四讲》阅读笔记-《2D Grid Mapping and Navigation with ORB SLAM》

The Python code was converted into C++ and adapted to work in an incremental manner since it is not possible to process in real time all keyframes and map points obtained up to any given point in the sequence. This involved the creation of two ROS nodes by modifying the monocular ROS node Examples/ROS/ORB SLAM2/Mono. The first node is called Monopub and publishes the pose of each keyframe whenever it is added to the map along with all map points visible in that keyframe. In addition, it detects whenever a loop closure is performed and then publishes all keyframes along with all map points added thus far. Monopub was also modified to accept input images from live cameras and ROS topics in addition to reading them from images on disk. Following four commands can be used to run this node on the KITTI 00 sequence, TUM frg3 walking halfsphere sequence, live camera, and a ROS node publishing images to /usb cam/image raw topic:

rosrun ORB_SLAM2 Monopub Vocabulary / ORBvoc . txt Examples / Monocular / KITTI00 -02. yaml ./ KITTI /00 0
rosrun ORB_SLAM2 Monopub Vocabulary / ORBvoc . txt Examples / Monocular / TUM3 . yaml
./ TUM - RGBD / rgbd_dataset_freiburg3_walking_halfsphere
rosrun ORB_SLAM2 Monopub Vocabulary / ORBvoc . txt Examples / Monocular / mono . yaml 0
rosrun ORB_SLAM2 Monopub Vocabulary / ORBvoc . txt Examples / Monocular / demo_cam . yaml -1
/ usb_cam / image_raw

py代码写成了C++的形式,并且适应了增量式的方式,因为无法实时的处理关键帧和地图点。这需要创建两个ROS node,修改Examples/ROS/ORB SLAM2/Mono。第一个叫做Monopub ,当添加该关键帧时发布每个关键帧的位姿,此外还有回环的功能,同时也接受实时相机的数据。

The second node is called Monosub and subscribes to the pose data published by Monopub. When it receives a single keyframe, it processes it using the same method as in Sec. 3.1.5 though with several additional tricks that are described in the subsequent sections. The visit and occupied counters corresponding to each such keyframe are added together to build the map incrementally. Since the counters are updated independently for each keyframe and co-visibility between different keyframes are not taken into account, the map thus produced is only an approximation to the true map. In practice, however, it turned out to be quite accurate and more than sufficient for navigation.

第二个node叫做Monosub ,订阅Monopub的位姿数据,每个这样的关键帧对应的访问和占用计数器被添加到一起,以逐步构建地图。 由于计数器是为每个关键帧独立更新的,并且不考虑不同关键帧之间的共同可见性,因此生成的映射只是对真实映射的近似。 然而,在实践中,事实证明它是相当准确和足够的导航。

When all keyframes are received after a loop closure, the counters are completely rest and recomputed using all the keyframes and map points received from the publisher simultaneously. This allows us to deal with the slight inaccuracies that may have accumulated over time. Monopub also provides the option to periodically publish all keyframes and points even if no loop closure is detected for scenarios where loop closures are too too few and far between. Map resetting is a time consuming process but needs to be performed rarely enough for it to not matter. Monosub publishes the generated map and its meta data along with the camera trajectories to be used as navigation goals in AMCL and Rviz. Due to the many parameters that can be adjusted in Monosub, it accepts a large number of command line arguments that are listed below:

rosrun ORB_SLAM2 Monosub < scale_factor > < resize_factor > < cloud_max_x > < cloud_min_x >
< cloud_max_z > < cloud_min_z > < free_thresh > < occupied_thresh > < use_local_counters >
< visit_thresh > < use_gaussian_counters > < use_boundary_detection >
< use_height_thresholding > < normal_thresh_deg > < canny_thresh > < enable_goal_publishing >
< show_camera_location > < gauss_kernel_size >

当所有关键帧在循环结束后接收时,计数器完全静止并使用同时从发布者接收到的所有关键帧和映射点重新计算。 这使我们能够处理随着时间的推移可能积累的微小不准确之处。 Monopub还提供了周期性发布所有关键帧和点的选项,即使在循环闭包太少且间隔太远的情况下没有检测到循环闭包。 地图重置是一个耗时的过程,但很少需要执行,因为它无关紧要。 Monosub发布生成的地图及其元数据以及在AMCL和Rviz中用作导航目标的相机轨迹。 由于在Monosub中可以调整许多参数,它接受大量命令行参数,这些参数如下:

以上就可以将地图基本构建完成,此外还需要

Local and global counters
Visit thresholding
Height thresholding
Gaussian smoothing of counters
Canny boundary detection
Slope thresholding

详细内容可以参考

上一篇:LSD-SLAM论文总结


下一篇:视觉 SLAM 十四讲 —— 第十三讲 建图