Depth Prediction¶
MXNet¶
Here is the model zoo for the task of depth prediction.
Hint
Training commands work with this script: Download train.py
The test script (Download test.py) can be used for evaluating the models on various datasets.
KITTI Dataset¶
The following table lists pre-trained models trained on KITTI.
Hint
The test script (Download test.py) can be used for evaluating the models (KITTI RAW results are evaluated using the official server). For example, to evaluate monodepth2_resnet18_kitti_stereo_640x192:
python test.py --model_zoo monodepth2_resnet18_kitti_stereo_640x192 --pretrained_type gluoncv --batch_size 1 --eval_stereo --png
Note
Our pre-trained models reproduce results from recent state-of-the-art approaches. Please check the reference paper for further information.
Modality is the supervision used during training: Stereo means stereo image pairs are used to compute the loss, Mono means monocular image sequences are used, and Mono + Stereo means both stereo pairs and monocular sequences are used.
Resolution is the input size of the model during training: 640x192 means the raw image (1242x375) is resized to 640x192, as illustrated in the sketch below.
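For illustration, here is a minimal sketch of that resize step with MXNet's image API; the file path is hypothetical, and only the target size comes from the description above.

```python
import mxnet as mx

# Load a raw KITTI frame as an HWC uint8 NDArray (the path is hypothetical).
img = mx.image.imread('kitti_raw/example_frame.png')   # shape roughly (375, 1242, 3)

# Resize to the 640x192 training resolution; imresize takes (width, height).
img = mx.image.imresize(img, 640, 192)                  # shape (192, 640, 3)
```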
| Name | Modality | Resolution | Abs. Rel. Error | delta < 1.25 | Hashtag | Train Command | Train Log |
|---|---|---|---|---|---|---|---|
| monodepth2_resnet18_kitti_stereo_640x192 1 | Stereo | 640x192 | 0.114 | 0.860 | 83eea4a9 | | |
| monodepth2_resnet18_kitti_mono_640x192 1 | Mono | 640x192 | 0.121 | 0.858 | c881771d | | |
| monodepth2_resnet18_kitti_mono_stereo_640x192 1 | Mono + Stereo | 640x192 | 0.109 | 0.872 | 9515c219 | | |
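The pre-trained weights listed above can also be loaded programmatically through the GluonCV model zoo. Below is a minimal sketch; it assumes the entries in the Name column are registered model-zoo names, and the dummy input only illustrates the expected tensor shape.

```python
import mxnet as mx
from gluoncv import model_zoo

# Load pretrained weights by the name from the "Name" column above.
net = model_zoo.get_model('monodepth2_resnet18_kitti_stereo_640x192',
                          pretrained_base=False, pretrained=True, ctx=mx.cpu())

# Run on a dummy batch at the training resolution (N, C, H, W) = (1, 3, 192, 640);
# in practice this would be a normalized KITTI frame.
x = mx.nd.random.uniform(shape=(1, 3, 192, 640))
outputs = net.predict(x)  # multi-scale disparity predictions
```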
PoseNet¶
Monodepth2 trains depth and pose models at the same time in a self-supervised manner, so we also report the reproduced pose results of our pre-trained models here.
Hint
The test script (Download test_pose.py) can be used for evaluating the models (KITTI Odometry results are evaluated using the official server). For example, to evaluate monodepth2_resnet18_posenet_kitti_mono_640x192:
python test_pose.py --model_zoo_pose monodepth2_resnet18_posenet_kitti_mono_640x192 --data_path ~/.mxnet/datasets/kitti/kitti_odom --eval_split odom_9 --pretrained_type gluoncv --batch_size 1 --png
Please check the full tutorial: Testing PoseNet from image sequences with pre-trained Monodepth2 Pose models.
Note
Our pre-trained models reproduce results from recent state-of-the-art approaches. Please check the reference paper for further information.
Sequence 09 and Sequence 10 mean the model is tested on sequence 9 and sequence 10 of the KITTI Odometry dataset, respectively.
Results show the average absolute trajectory error (ATE) and its standard deviation, in meters; a simplified sketch of this computation follows below.
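For reference, the sketch below shows one simplified way to compute an ATE value for a single scale-aligned trajectory snippet with NumPy; it illustrates the metric itself, not the exact protocol of test_pose.py.

```python
import numpy as np

def ate(gt_xyz, pred_xyz):
    """Absolute trajectory error (in meters) for one snippet of camera positions.

    gt_xyz, pred_xyz: (N, 3) arrays of positions sharing the same origin.
    A single scale factor aligns the prediction to the ground truth first,
    as is common for methods with monocular scale ambiguity.
    """
    scale = np.sum(gt_xyz * pred_xyz) / np.sum(pred_xyz ** 2)
    return np.sqrt(np.mean(np.sum((gt_xyz - scale * pred_xyz) ** 2, axis=1)))

# The table reports the mean and standard deviation of this error over all
# evaluated snippets of each test sequence.
```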
| Name | Modality | Resolution | Sequence 09 | Sequence 10 |
|---|---|---|---|---|
| monodepth2_resnet18_posenet_kitti_mono_640x192 1 | Mono | 640x192 | 0.021±0.012 | 0.018±0.011 |
| monodepth2_resnet18_posenet_kitti_mono_stereo_640x192 1 | Mono + Stereo | 640x192 | 0.021±0.010 | 0.016±0.010 |
PyTorch¶
Models implemented using PyTorch will be added later. Please check out our MXNet implementation instead.