Commit c0c5007

[Docs] Refactor docs structure (#2429)
* refactor docs structures
* refactor zh_cn docs
* fix docs
* refactor datasets structures
* minor fix
1 parent 4ff1361 commit c0c5007

34 files changed: +212 -360 lines
Lines changed: 7 additions & 8 deletions

@@ -1,11 +1,10 @@
 .. toctree::
    :maxdepth: 3

-   kitti_det.md
-   nuscenes_det.md
-   lyft_det.md
-   waymo_det.md
-   sunrgbd_det.md
-   scannet_det.md
-   scannet_sem_seg.md
-   s3dis_sem_seg.md
+   kitti.md
+   nuscenes.md
+   lyft.md
+   waymo.md
+   sunrgbd.md
+   scannet.md
+   s3dis.md

docs/en/advanced_guides/datasets/kitti_det.md renamed to docs/en/advanced_guides/datasets/kitti.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-# KITTI Dataset for 3D Object Detection
+# KITTI Dataset

 This page provides specific tutorials about the usage of MMDetection3D for KITTI dataset.

docs/en/advanced_guides/datasets/lyft_det.md renamed to docs/en/advanced_guides/datasets/lyft.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-# Lyft Dataset for 3D Object Detection
+# Lyft Dataset

 This page provides specific tutorials about the usage of MMDetection3D for Lyft dataset.

docs/en/advanced_guides/datasets/nuscenes_det.md renamed to docs/en/advanced_guides/datasets/nuscenes.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-# NuScenes Dataset for 3D Object Detection
+# NuScenes Dataset

 This page provides specific tutorials about the usage of MMDetection3D for nuScenes dataset.

docs/en/advanced_guides/datasets/s3dis_sem_seg.md renamed to docs/en/advanced_guides/datasets/s3dis.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-# S3DIS for 3D Semantic Segmentation
+# S3DIS Dataset

 ## Dataset preparation

docs/en/advanced_guides/datasets/scannet_det.md renamed to docs/en/advanced_guides/datasets/scannet.md

Lines changed: 62 additions & 4 deletions

@@ -1,4 +1,6 @@
-# ScanNet for 3D Object Detection
+# ScanNet Dataset
+
+MMDetection3D supports LiDAR-based detection and segmentation on the ScanNet dataset. This page provides specific tutorials about their usage.

 ## Dataset preparation

@@ -38,7 +40,7 @@ Under folder `scans` there are overall 1201 train and 312 validation folders in
 - `scene0001_01.txt`: Meta file including axis-aligned matrix, etc.
 - `scene0001_01_vh_clean_2.labels.ply`: Annotation file containing the category of each vertex.

-Export ScanNet data by running `python batch_load_scannet_data.py`. The main steps include:
+Exporting ScanNet data by running `python batch_load_scannet_data.py` mainly involves the following three steps:

 - Export original files to point cloud, instance label, semantic label and bounding box files.
 - Downsample the raw point cloud and filter invalid classes.

@@ -224,6 +226,9 @@ scannet
 - `points/xxxxx.bin`: The axis-unaligned point cloud data after downsampling. Since the ScanNet 3D detection task takes axis-aligned point clouds as input while the ScanNet 3D semantic segmentation task takes unaligned points, we store the unaligned points together with their axis-align transform matrix. Note: the points are axis-aligned in the pre-processing pipeline [`GlobalAlignment`](https://github.com/open-mmlab/mmdetection3d/blob/9f0b01caf6aefed861ef4c3eb197c09362d26b32/mmdet3d/datasets/pipelines/transforms_3d.py#L423) of the 3D detection task.
 - `instance_mask/xxxxx.bin`: The instance label for each point, value range: \[0, NUM_INSTANCES\], 0: unannotated.
 - `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: \[1, 40\], i.e. the `nyu40id` standard. Note: the `nyu40id` IDs are mapped to train IDs in the train pipeline `PointSegClassMapping`.
+- `seg_info`: The generated infos to support semantic segmentation model training.
+  - `train_label_weight.npy`: Weighting factor for each semantic class. Since the number of points varies greatly across classes, it is common practice to re-weight labels for better performance.
+  - `train_resampled_scene_idxs.npy`: Re-sampling indices for each scene. Rooms are sampled multiple times according to their number of points to balance the training data.
 - `posed_images/scenexxxx_xx`: The set of `.jpg` images with `.txt` 4x4 poses and a single `.txt` file with the camera intrinsic matrix.
 - `scannet_infos_train.pkl`: The train data infos; the detailed info of each scan is as follows:
   - info\['lidar_points'\]: A dict containing all information related to the lidar points.
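The label re-weighting mentioned for `train_label_weight.npy` above can be sketched as inverse-frequency weighting. This is an illustrative sketch under that assumption, not the exact formula used by the data-generation script:

```python
import numpy as np

def label_weights(point_counts):
    """Illustrative per-class weights from point counts.

    Rare classes receive larger weights so the training loss is not
    dominated by frequent classes such as walls and floors.
    """
    counts = np.asarray(point_counts, dtype=np.float64)
    freq = counts / counts.sum()                   # relative frequency per class
    weights = 1.0 / (freq + 1e-6)                  # inverse frequency, guarded against zero
    return weights / weights.sum() * len(counts)   # normalize so the mean weight is 1

# Example: three classes with very different point counts
print(label_weights([900, 90, 10]))
```

The normalization to mean 1 keeps the overall loss scale comparable to unweighted training.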
@@ -285,8 +290,61 @@ train_pipeline = [
 - `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
 - `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of \[-5, 5\] (degrees) for ScanNet; then scale it, usually by 1.0 for ScanNet (i.e. no scaling); finally translate it, usually by 0 for ScanNet (i.e. no translation).

+A typical training pipeline of ScanNet for 3D semantic segmentation is as below:
+
+```python
+train_pipeline = [
+    dict(
+        type='LoadPointsFromFile',
+        coord_type='DEPTH',
+        shift_height=False,
+        use_color=True,
+        load_dim=6,
+        use_dim=[0, 1, 2, 3, 4, 5]),
+    dict(
+        type='LoadAnnotations3D',
+        with_bbox_3d=False,
+        with_label_3d=False,
+        with_mask_3d=False,
+        with_seg_3d=True),
+    dict(type='PointSegClassMapping'),
+    dict(
+        type='IndoorPatchPointSample',
+        num_points=num_points,
+        block_size=1.5,
+        ignore_index=len(class_names),
+        use_normalized_coord=False,
+        enlarge_size=0.2,
+        min_unique_num=None),
+    dict(type='NormalizePointsColor', color_mean=None),
+    dict(type='Pack3DDetInputs', keys=['points', 'pts_semantic_mask'])
+]
+```
+
+- `PointSegClassMapping`: Only the valid category ids are mapped to class label ids in \[0, 20) during training; other class ids are converted to `ignore_index`, which equals `20`.
+- `IndoorPatchPointSample`: Crop a patch containing a fixed number of points from the input point cloud. `block_size` indicates the size of the cropped block, typically `1.5` for ScanNet.
+- `NormalizePointsColor`: Normalize the RGB color values of the input point cloud by dividing them by `255`.
+
 ## Metrics

-Typically mean Average Precision (mAP) is used for evaluation on ScanNet, e.g. `mAP@0.25` and `mAP@0.5`. In detail, a generic function that computes precision and recall for multi-class 3D object detection is called. Please refer to [indoor_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/indoor_eval.py) for more details.
+- **Object Detection**: Typically mean Average Precision (mAP) is used for evaluation on ScanNet, e.g. `mAP@0.25` and `mAP@0.5`. In detail, a generic function that computes precision and recall for multi-class 3D object detection is called. Please refer to [indoor_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/indoor_eval.py) for more details.
+
+  **Note**: As introduced in the section `Export ScanNet data`, all ground truth 3D bounding boxes are axis-aligned, i.e. the yaw is zero. So the yaw target of the network-predicted 3D bounding boxes is also zero, and axis-aligned 3D Non-Maximum Suppression (NMS), which disregards rotation, is adopted during post-processing.
+
+- **Semantic Segmentation**: Typically mean Intersection over Union (mIoU) is used for evaluation on ScanNet. In detail, we first compute the IoU for each class and then average them to get the mIoU; please refer to [seg_eval](https://github.com/open-mmlab/mmdetection3d/blob/dev-1.x/mmdet3d/evaluation/functional/seg_eval.py).
+
+## Testing and Making a Submission
+
+By default, our codebase evaluates semantic segmentation results on the validation set.
+If you would like to test the model performance on the online benchmark, add the `--format-only` flag to the evaluation script and change `ann_file=data_root + 'scannet_infos_val.pkl'` to `ann_file=data_root + 'scannet_infos_test.pkl'` in the ScanNet dataset's [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/_base_/datasets/scannet_seg-3d-20class.py#L126). Remember to specify `txt_prefix` as the directory in which to save the test results.
+
+Taking PointNet++ (SSG) on ScanNet as an example, the following command runs inference on the test set:
+
+```
+./tools/dist_test.sh configs/pointnet2/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class.py \
+    work_dirs/pointnet2_ssg/latest.pth --format-only \
+    --eval-options txt_prefix=work_dirs/pointnet2_ssg/test_submission
+```

-As introduced in section `Export ScanNet data`, all ground truth 3D bounding box are axis-aligned, i.e. the yaw is zero. So the yaw target of network predicted 3D bounding box is also zero and axis-aligned 3D Non-Maximum Suppression (NMS), which is regardless of rotation, is adopted during post-processing.
+After generating the results, you can compress the folder and upload it to the [ScanNet evaluation server](http://kaldir.vc.in.tum.de/scannet_benchmark/semantic_label_3d).

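The mIoU metric used for ScanNet semantic segmentation can be sketched in a few lines: build a confusion matrix, take per-class IoU from its diagonal, then average. This is an illustrative sketch of the idea behind `seg_eval`, not its exact implementation, and the `ignore_index` parameter name here is an assumption:

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_index=None):
    """Illustrative mIoU: per-class IoU from a confusion matrix, averaged."""
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    if ignore_index is not None:
        keep = gt != ignore_index          # drop points labeled as ignored
        pred, gt = pred[keep], gt[keep]
    # Confusion matrix: rows = ground truth class, cols = predicted class
    cm = np.bincount(gt * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(cm).astype(np.float64)            # per-class intersection
    union = cm.sum(0) + cm.sum(1) - inter             # per-class union
    iou = inter / np.maximum(union, 1)                # avoid division by zero
    return iou.mean()

# Toy example with 2 classes: one point of class 1 is mispredicted as class 0
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Classes absent from both prediction and ground truth get an IoU of 0 in this sketch; a production metric would typically exclude them from the average instead.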
docs/en/advanced_guides/datasets/scannet_sem_seg.md

Lines changed: 0 additions & 128 deletions
This file was deleted.

docs/en/advanced_guides/datasets/sunrgbd_det.md renamed to docs/en/advanced_guides/datasets/sunrgbd.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-# SUN RGB-D for 3D Object Detection
+# SUN RGB-D Dataset

 ## Dataset preparation

File renamed without changes.

docs/en/advanced_guides/index.rst

Lines changed: 2 additions & 2 deletions

@@ -11,7 +11,7 @@ Supported Tasks
 **************

 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1

    supported_tasks/index.rst

@@ -20,7 +20,7 @@ Customization
 **************

 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1

    customize_dataset.md
    customize_models.md
