03. Multiple object tracking with pre-trained SMOT models

In this tutorial, we present Single-Shot Multi Object Tracking (SMOT), a method for multi-object tracking. SMOT is a tracking framework that converts any single-shot detector (SSD) model into an online multiple object tracker, with an emphasis on detecting objects and tracking their paths simultaneously. In the example below, we use the SSD-MobileNet object detector pretrained on COCO from the GluonCV model zoo and perform multiple object tracking on an arbitrary video. SMOT is also efficient: its runtime is close to that of the chosen detector.
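To make the idea of turning per-frame detections into tracks concrete, here is a minimal sketch of tracking-by-detection using greedy IoU matching. It is not the SMOT algorithm: the class name GreedyIoUTracker, the iou helper, and the 0.3 threshold are hypothetical choices for illustration, and this toy version drops a track as soon as it misses one frame.

```python
from itertools import count


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Hypothetical toy tracker for illustration; NOT the SMOT implementation."""

    def __init__(self, iou_thresh=0.3):
        self.iou_thresh = iou_thresh  # assumed value, not taken from SMOT
        self.tracks = {}              # track id -> last known box
        self._next_id = count()

    def update(self, boxes):
        """Match this frame's boxes to existing tracks; return {track_id: box}."""
        # Score every (track, detection) pair and visit the best overlaps first.
        pairs = sorted(
            ((iou(tb, db), tid, j)
             for tid, tb in self.tracks.items()
             for j, db in enumerate(boxes)),
            reverse=True,
        )
        assigned, used_tracks, used_dets = {}, set(), set()
        for score, tid, j in pairs:
            if score < self.iou_thresh:
                break  # all remaining pairs overlap too little
            if tid in used_tracks or j in used_dets:
                continue
            assigned[tid] = boxes[j]
            used_tracks.add(tid)
            used_dets.add(j)
        # Detections that matched no track start new tracks.
        for j, box in enumerate(boxes):
            if j not in used_dets:
                assigned[next(self._next_id)] = box
        # Tracks that matched no detection are dropped in this simplified version.
        self.tracks = assigned
        return assigned


tracker = GreedyIoUTracker()
frame_0 = [(0, 0, 10, 10), (50, 50, 60, 60)]
frame_1 = [(51, 50, 61, 60), (1, 0, 11, 10)]  # same objects, shifted and reordered
ids_0 = tracker.update(frame_0)
ids_1 = tracker.update(frame_1)
```

Each object keeps its identifier across the two frames even though the detector returned the boxes in a different order. A real online tracker would additionally keep unmatched tracks alive for a while to handle occlusion.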

Predict with a SMOT model

First, we download a video from the MOT Challenge website:

from gluoncv import utils
video_path = 'https://motchallenge.net/sequenceVideos/MOT17-02-FRCNN-raw.webm'
im_video = utils.download(video_path)

Out:

Downloading MOT17-02-FRCNN-raw.webm from https://motchallenge.net/sequenceVideos/MOT17-02-FRCNN-raw.webm...

Then run the provided script at /scripts/tracking/smot/demo.py to obtain the multi-object tracking result:

python demo.py MOT17-02-FRCNN-raw.webm --network-name ssd_512_mobilenet1.0_coco --use-pretrained --custom-classes person --use-motion

The model can track multiple people even when they are partially occluded. To track several object categories at once, pass the extra class names to --custom-classes.
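If you launch the demo from Python, the command line can be assembled from a list of class names. The sketch below is only an illustration: build_smot_command is a hypothetical helper, not part of GluonCV, and it uses only flags that appear in the commands in this tutorial (--detect-thresh appears in the second command further down).

```python
import shlex


def build_smot_command(video, network, classes, detect_thresh=None, use_motion=True):
    """Assemble the demo.py argument list (hypothetical helper, not part of GluonCV)."""
    cmd = ["python", "demo.py", video,
           "--network-name", network,
           "--use-pretrained",
           "--custom-classes", *classes]  # one or more class names
    if detect_thresh is not None:
        cmd += ["--detect-thresh", str(detect_thresh)]
    if use_motion:
        cmd.append("--use-motion")
    return cmd


# Reproduces the single-class command shown above.
single = build_smot_command(
    "MOT17-02-FRCNN-raw.webm", "ssd_512_mobilenet1.0_coco", ["person"])
print(shlex.join(single))

# Tracking an extra category only changes the class list.
multi = build_smot_command(
    "MOT17-02-FRCNN-raw.webm", "ssd_512_mobilenet1.0_coco", ["person", "car"])
print(shlex.join(multi))
```

The returned list can be passed to subprocess.run from the directory that contains demo.py.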

For example, let's download another video from the MOT Challenge website:

from gluoncv import utils
video_path = 'https://motchallenge.net/sequenceVideos/MOT17-13-FRCNN-raw.webm'
im_video = utils.download(video_path)

Out:

Downloading MOT17-13-FRCNN-raw.webm from https://motchallenge.net/sequenceVideos/MOT17-13-FRCNN-raw.webm...

Then run the provided script at /scripts/tracking/smot/demo.py again, this time with a ResNet-50 SSD detector, two class names, and a detection threshold of 0.7:

python demo.py MOT17-13-FRCNN-raw.webm --network-name ssd_512_resnet50_v1_coco --use-pretrained --custom-classes person car --detect-thresh 0.7 --use-motion

Now we are tracking both people and cars.
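As a rough mental model of what the class list and the detection threshold do, the sketch below keeps only detections of the requested classes whose confidence reaches the threshold. This is an assumption about the general technique, not the code in demo.py: filter_detections, the dictionary layout, and the sample scores are all made up for illustration.

```python
def filter_detections(detections, keep_classes, score_thresh):
    """Keep detections whose class is wanted and whose score meets the threshold.

    Assumed behaviour for illustration; the real script may differ in details.
    """
    keep = set(keep_classes)
    return [d for d in detections
            if d["class"] in keep and d["score"] >= score_thresh]


# Made-up detections for a single frame.
detections = [
    {"class": "person", "score": 0.92, "box": (10, 20, 50, 120)},
    {"class": "car", "score": 0.65, "box": (200, 80, 320, 160)},
    {"class": "car", "score": 0.81, "box": (400, 90, 520, 170)},
    {"class": "dog", "score": 0.88, "box": (60, 100, 90, 130)},
]

kept = filter_detections(detections, ["person", "car"], 0.7)
for d in kept:
    print(d["class"], d["score"])
```

With these sample values, the low-confidence car and the dog are discarded, so only one person and one car would be handed to the tracker. Raising the threshold trades missed objects for fewer spurious tracks.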


Try SMOT on your own video and see the results!
