Prepare COCO datasets

COCO is a large-scale object detection, segmentation, and captioning dataset. This tutorial will walk you through the steps of preparing this dataset for GluonCV.

[COCO logo: http://cocodataset.org/images/coco-logo.png]

Hint

You need 42.7 GB of disk space to download and extract this dataset. An SSD is preferred over an HDD because of its faster read and write performance.

The total time to prepare the dataset depends on your Internet speed and disk performance. For example, it often takes 20 minutes on AWS EC2 with EBS.
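If you are unsure whether you have enough room, the check below is a minimal sketch using only the Python standard library; it assumes the dataset will live under your home directory, as in the default ~/.mxnet/datasets/coco layout.

import os
import shutil

# Free space on the filesystem holding the home directory, in gigabytes.
free_gb = shutil.disk_usage(os.path.expanduser('~')).free / 1e9
print('Free space: %.1f GB' % free_gb)  # want at least 42.7 GB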

Prepare the dataset

We need the following four files from COCO:

Filename                              Size     SHA-1
train2017.zip                         18 GB    10ad623668ab00c62c096f0ed636d6aff41faca5
val2017.zip                           778 MB   4950dc9d00dbe1c933ee0170f5797584351d2a41
annotations_trainval2017.zip          241 MB   8551ee4bb5860311e79dace7e79cb91e432e78b3
stuff_annotations_trainval2017.zip    401 MB   e7aa0f7515c07e23873a9f71d9095b06bcea3e12
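If you download the archives yourself, you can verify them against the SHA-1 checksums above. Below is a minimal sketch using Python's hashlib; the ~/coco/ path is an assumption, adjust it to wherever you saved the files.

import hashlib
import os

def sha1sum(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in 1 MB chunks."""
    sha1 = hashlib.sha1()
    with open(os.path.expanduser(path), 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            sha1.update(chunk)
    return sha1.hexdigest()

# An intact val2017.zip should print 4950dc9d00dbe1c933ee0170f5797584351d2a41.
print(sha1sum('~/coco/val2017.zip'))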

The easiest way to download and unpack these files is to download the helper script mscoco.py and run the following command:

python mscoco.py

which will automatically download and extract the data into ~/.mxnet/datasets/coco.

If you already have the above files sitting on your disk, you can set --download-dir to point to them. For example, assuming the files are saved in ~/coco/, you can run:

python mscoco.py --download-dir ~/coco
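If you would rather script the steps yourself, the following is a simplified sketch of what the helper does, using gluoncv.utils.download (which verifies the SHA-1 checksum for you) on one of the four archives; the official COCO URL below covers val2017.zip only.

import os
import zipfile
from gluoncv import utils

target_dir = os.path.expanduser('~/.mxnet/datasets/coco')
os.makedirs(target_dir, exist_ok=True)

# Download one archive, verifying its checksum, then unpack it in place.
url = 'http://images.cocodataset.org/zips/val2017.zip'
archive = utils.download(url, path=target_dir,
                         sha1_hash='4950dc9d00dbe1c933ee0170f5797584351d2a41')
with zipfile.ZipFile(archive) as zf:
    zf.extractall(target_dir)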

Read with GluonCV

Loading images and labels is straightforward with gluoncv.data.COCODetection.

from gluoncv import data, utils
from matplotlib import pyplot as plt

train_dataset = data.COCODetection(splits=['instances_train2017'])
val_dataset = data.COCODetection(splits=['instances_val2017'])
print('Num of training images:', len(train_dataset))
print('Num of validation images:', len(val_dataset))

Out:

loading annotations into memory...
Done (t=15.04s)
creating index...
index created!
loading annotations into memory...
Done (t=0.50s)
creating index...
index created!
Num of training images: 117266
Num of validation images: 4952

Note that these counts are slightly smaller than the raw number of images in COCO 2017, because images without valid object annotations are skipped by default.

Now let’s visualize one example.

train_image, train_label = train_dataset[0]
bounding_boxes = train_label[:, :4]
class_ids = train_label[:, 4:5]
print('Image size (height, width, RGB):', train_image.shape)
print('Num of objects:', bounding_boxes.shape[0])
print('Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):\n',
      bounding_boxes)
print('Class IDs (num_boxes, ):\n', class_ids)

utils.viz.plot_bbox(train_image.asnumpy(), bounding_boxes, scores=None,
                    labels=class_ids, class_names=train_dataset.classes)
plt.show()
[Figure: the sample training image with its ground-truth bounding boxes and class labels drawn]

Out:

Image size (height, width, RGB): (480, 640, 3)
Num of objects: 8
Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):
 [[  1.08 187.69 611.67 472.53]
 [311.73   4.31 630.01 231.99]
 [249.6  229.27 564.84 473.35]
 [  0.    13.51 433.48 387.63]
 [376.2   40.36 450.75  85.89]
 [465.78  38.97 522.85  84.64]
 [385.7   73.66 468.72 143.17]
 [364.05   2.49 457.81  72.56]]
Class IDs (num_boxes, ):
 [[45.]
 [45.]
 [50.]
 [45.]
 [49.]
 [49.]
 [49.]
 [49.]]
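The integer class IDs index into train_dataset.classes. As a quick sanity check, a minimal sketch (the names printed depend on the sample):

# Map each class ID in this sample to its human-readable category name.
for cid in sorted(set(class_ids.flatten().astype(int))):
    print(cid, '->', train_dataset.classes[cid])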

Finally, to use both train_dataset and val_dataset for training, we can pass them through data transformations and load them with mxnet.gluon.data.DataLoader; see train_ssd.py for more information.
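As a minimal sketch of that pipeline (the 512 x 512 input size and batch size of 4 are illustrative assumptions, not values from train_ssd.py; without anchors, SSDDefaultTrainTransform returns image/label pairs, so variable-length labels must be padded into a batch):

from mxnet.gluon.data import DataLoader
from gluoncv.data.batchify import Tuple, Stack, Pad
from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform

# Resize and augment images to 512x512; without anchors the transform
# returns (image, label) pairs rather than SSD training targets.
train_transform = SSDDefaultTrainTransform(512, 512)
# Stack images into a batch; pad labels to a common length with -1.
batchify_fn = Tuple(Stack(), Pad(pad_val=-1))
train_loader = DataLoader(train_dataset.transform(train_transform),
                          batch_size=4, shuffle=True,
                          batchify_fn=batchify_fn, last_batch='rollover')

for images, labels in train_loader:
    print(images.shape, labels.shape)  # e.g. (4, 3, 512, 512) and (4, max_boxes, 5)
    break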

Total running time of the script: (4 minutes 45.143 seconds)
