Prepare COCO datasets¶
COCO is a large-scale object detection, segmentation, and captioning dataset. This tutorial will walk through the steps of preparing this dataset for GluonCV.

Hint
You need 42.7 GB of disk space to download and extract this dataset. SSD is preferred over HDD because of its better performance.
The total time to prepare the dataset depends on your Internet speed and disk performance. For example, it often takes 20 min on AWS EC2 with EBS.
Prepare the dataset¶
We need the following four files from COCO:
Filename | Size | SHA-1
---|---|---
train2017.zip | 18 GB | 10ad623668ab00c62c096f0ed636d6aff41faca5
val2017.zip | 778 MB | 4950dc9d00dbe1c933ee0170f5797584351d2a41
annotations_trainval2017.zip | 241 MB | 8551ee4bb5860311e79dace7e79cb91e432e78b3
stuff_annotations_trainval2017.zip | 401 MB | e7aa0f7515c07e23873a9f71d9095b06bcea3e12
The easiest way to download and unpack these files is to download the helper script mscoco.py and run the following command:

python mscoco.py

which will automatically download and extract the data into ~/.mxnet/datasets/coco.
If you already have the above files sitting on your disk, you can set --download-dir to point to them. For example, assuming the files are saved in ~/coco/, you can run:

python mscoco.py --download-dir ~/coco
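If you want to confirm that previously downloaded archives are intact before pointing --download-dir at them, you can compare their SHA-1 digests against the table above. Below is a minimal sketch using Python's standard hashlib; the paths assume the ~/coco/ layout from the example above, so adjust them as needed:

import hashlib
import os

# Expected SHA-1 digests from the table above; the paths are assumptions
checksums = {
    '~/coco/train2017.zip': '10ad623668ab00c62c096f0ed636d6aff41faca5',
    '~/coco/val2017.zip': '4950dc9d00dbe1c933ee0170f5797584351d2a41',
    '~/coco/annotations_trainval2017.zip': '8551ee4bb5860311e79dace7e79cb91e432e78b3',
    '~/coco/stuff_annotations_trainval2017.zip': 'e7aa0f7515c07e23873a9f71d9095b06bcea3e12',
}

for path, expected in checksums.items():
    sha1 = hashlib.sha1()
    with open(os.path.expanduser(path), 'rb') as f:
        # Hash in 1 MB chunks so the 18 GB archive does not exhaust memory
        for chunk in iter(lambda: f.read(1 << 20), b''):
            sha1.update(chunk)
    print(path, 'OK' if sha1.hexdigest() == expected else 'MISMATCH')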
Read with GluonCV¶
Loading images and labels is straightforward with gluoncv.data.COCODetection.
from gluoncv import data, utils
from matplotlib import pyplot as plt
train_dataset = data.COCODetection(splits=['instances_train2017'])
val_dataset = data.COCODetection(splits=['instances_val2017'])
print('Num of training images:', len(train_dataset))
print('Num of validation images:', len(val_dataset))
Out:
loading annotations into memory...
Done (t=17.99s)
creating index...
index created!
loading annotations into memory...
Done (t=1.11s)
creating index...
index created!
Num of training images: 117266
Num of validation images: 4952
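By default, COCODetection looks for the data under ~/.mxnet/datasets/coco. If you extracted the files elsewhere, you can point it at that location with the root argument; a minimal sketch, assuming the data lives in ~/coco:

# Load the dataset from a custom root instead of the default location
val_dataset = data.COCODetection(root='~/coco', splits=['instances_val2017'])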
Now let’s visualize one example.
train_image, train_label = train_dataset[0]
bounding_boxes = train_label[:, :4]
class_ids = train_label[:, 4:5]
print('Image size (height, width, RGB):', train_image.shape)
print('Num of objects:', bounding_boxes.shape[0])
print('Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):\n',
      bounding_boxes)
print('Class IDs (num_boxes, ):\n', class_ids)
utils.viz.plot_bbox(train_image.asnumpy(), bounding_boxes, scores=None,
                    labels=class_ids, class_names=train_dataset.classes)
plt.show()

Out:
Image size (height, width, RGB): (480, 640, 3)
Num of objects: 8
Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):
[[  1.08 187.69 611.67 472.53]
 [311.73   4.31 630.01 231.99]
 [249.6  229.27 564.84 473.35]
 [  0.    13.51 433.48 387.63]
 [376.2   40.36 450.75  85.89]
 [465.78  38.97 522.85  84.64]
 [385.7   73.66 468.72 143.17]
 [364.05   2.49 457.81  72.56]]
Class IDs (num_boxes, ):
[[45.]
 [45.]
 [50.]
 [45.]
 [49.]
 [49.]
 [49.]
 [49.]]
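The class IDs are contiguous indices into train_dataset.classes, so they can be mapped back to readable names; a quick sketch using the variables defined above:

# Translate the numeric class IDs above into COCO class names
for cid in sorted(set(int(c) for c in class_ids[:, 0])):
    print(cid, '->', train_dataset.classes[cid])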
Finally, to use both train_dataset and val_dataset for training, we can pass them through data transformations and load them with mxnet.gluon.data.DataLoader; see the train_ssd.py tutorial for more information.
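As an illustration, here is a minimal sketch of such a pipeline built on the SSD preset transform and GluonCV's batchify helpers; the 512x512 resolution and batch size of 4 are arbitrary choices for this example:

from mxnet.gluon.data import DataLoader
from gluoncv.data.batchify import Tuple, Stack, Pad
from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform

# Augment and resize every image to a fixed 512x512 input
train_transform = SSDDefaultTrainTransform(512, 512)
# Stack images into a batch; pad labels since object counts vary per image
batchify_fn = Tuple(Stack(), Pad(pad_val=-1))
train_loader = DataLoader(train_dataset.transform(train_transform),
                          batch_size=4, shuffle=True,
                          batchify_fn=batchify_fn, last_batch='rollover')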
Total running time of the script: ( 4 minutes 26.396 seconds)