.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "build/examples_datasets/mscoco.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_build_examples_datasets_mscoco.py:

Prepare COCO datasets
==============================

`COCO `_ is a large-scale object detection, segmentation, and captioning dataset.
This tutorial walks through the steps of preparing this dataset for GluonCV.

.. image:: http://cocodataset.org/images/coco-logo.png

.. hint::

   You need 42.7 GB of disk space to download and extract this dataset. An SSD is
   preferred over an HDD because of its better performance.
   The total time to prepare the dataset depends on your Internet speed and disk
   performance. For example, it often takes 20 min on AWS EC2 with EBS.

Prepare the dataset
-------------------

We need the following four files from `COCO `_:

+----------------------------------------+--------+------------------------------------------+
| Filename                               | Size   | SHA-1                                    |
+========================================+========+==========================================+
| `train2017.zip `_                      | 18 GB  | 10ad623668ab00c62c096f0ed636d6aff41faca5 |
+----------------------------------------+--------+------------------------------------------+
| `val2017.zip `_                        | 778 MB | 4950dc9d00dbe1c933ee0170f5797584351d2a41 |
+----------------------------------------+--------+------------------------------------------+
| `annotations_trainval2017.zip `_       | 241 MB | 8551ee4bb5860311e79dace7e79cb91e432e78b3 |
+----------------------------------------+--------+------------------------------------------+
| `stuff_annotations_trainval2017.zip `_ | 401 MB | e7aa0f7515c07e23873a9f71d9095b06bcea3e12 |
+----------------------------------------+--------+------------------------------------------+

The easiest way to download and unpack these files is to download the helper script
:download:`mscoco.py<../../../scripts/datasets/mscoco.py>` and run the following commands:

.. code-block:: bash

    pip install cython
    pip install pycocotools
    python mscoco.py

This automatically downloads and extracts the data into ``~/.mxnet/datasets/coco``.

If you already have the above files on disk, you can set ``--download-dir`` to point to them.
For example, assuming the files are saved in ``~/coco/``, you can run:

.. code-block:: bash

    python mscoco.py --download-dir ~/coco

.. GENERATED FROM PYTHON SOURCE LINES 56-61

Read with GluonCV
-----------------

Loading images and labels is straightforward with
:py:class:`gluoncv.data.COCODetection`.

.. GENERATED FROM PYTHON SOURCE LINES 61-72

.. code-block:: default


    from gluoncv import data, utils
    from matplotlib import pyplot as plt

    # Each split name selects one of the extracted annotation files.
    train_dataset = data.COCODetection(splits=['instances_train2017'])
    val_dataset = data.COCODetection(splits=['instances_val2017'])
    print('Num of training images:', len(train_dataset))
    print('Num of validation images:', len(val_dataset))



.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    loading annotations into memory...
    Done (t=20.24s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.68s)
    creating index...
    index created!
    Num of training images: 117266
    Num of validation images: 4952



.. GENERATED FROM PYTHON SOURCE LINES 73-74

Now let's visualize one example.

.. GENERATED FROM PYTHON SOURCE LINES 74-89

.. code-block:: default


    # Each sample is an (image, label) pair; every label row describes one object.
    train_image, train_label = train_dataset[0]
    # The first four columns are the box corners, the fifth is the class ID.
    bounding_boxes = train_label[:, :4]
    class_ids = train_label[:, 4:5]
    print('Image size (height, width, RGB):', train_image.shape)
    print('Num of objects:', bounding_boxes.shape[0])
    print('Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):\n',
          bounding_boxes)
    print('Class IDs (num_boxes, ):\n', class_ids)

    # Draw the boxes and class names on top of the image.
    utils.viz.plot_bbox(train_image.asnumpy(), bounding_boxes, scores=None,
                        labels=class_ids, class_names=train_dataset.classes)
    plt.show()



.. image:: /build/examples_datasets/images/sphx_glr_mscoco_001.png
    :alt: mscoco
    :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Image size (height, width, RGB): (480, 640, 3)
    Num of objects: 8
    Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):
     [[  1.08 187.69 611.67 472.53]
     [311.73   4.31 630.01 231.99]
     [249.6  229.27 564.84 473.35]
     [  0.    13.51 433.48 387.63]
     [376.2   40.36 450.75  85.89]
     [465.78  38.97 522.85  84.64]
     [385.7   73.66 468.72 143.17]
     [364.05   2.49 457.81  72.56]]
    Class IDs (num_boxes, ):
     [[45.]
     [45.]
     [50.]
     [45.]
     [49.]
     [49.]
     [49.]
     [49.]]



.. GENERATED FROM PYTHON SOURCE LINES 90-94

Finally, to use both ``train_dataset`` and ``val_dataset`` for training, we can pass
them through data transformations and load them with
:py:class:`mxnet.gluon.data.DataLoader`. See :download:`train_ssd.py
<../../../scripts/detection/ssd/train_ssd.py>` for more information.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 6 minutes 7.555 seconds)


.. _sphx_glr_download_build_examples_datasets_mscoco.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example



  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: mscoco.py `



  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: mscoco.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
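
The SHA-1 column in the table of required files can be used to check that a manually downloaded archive is intact before pointing ``--download-dir`` at it. The following is a minimal sketch that uses only the Python standard library. The helper names ``sha1_of_file`` and ``verify_archive`` and the ``~/coco`` location are illustrative assumptions; they are not part of GluonCV or of ``mscoco.py``.

```python
import hashlib
import os

# Expected SHA-1 digests, copied from the table of required files.
EXPECTED_SHA1 = {
    "train2017.zip": "10ad623668ab00c62c096f0ed636d6aff41faca5",
    "val2017.zip": "4950dc9d00dbe1c933ee0170f5797584351d2a41",
    "annotations_trainval2017.zip": "8551ee4bb5860311e79dace7e79cb91e432e78b3",
    "stuff_annotations_trainval2017.zip": "e7aa0f7515c07e23873a9f71d9095b06bcea3e12",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Chunked reads keep memory use flat even for the 18 GB archive.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_sha1):
    """Return True if the file exists and its SHA-1 matches the expected digest."""
    return os.path.isfile(path) and sha1_of_file(path) == expected_sha1.lower()


if __name__ == "__main__":
    # Assumed download location, matching the ``~/coco`` example in the text.
    download_dir = os.path.expanduser("~/coco")
    for name, expected in EXPECTED_SHA1.items():
        ok = verify_archive(os.path.join(download_dir, name), expected)
        print(name, "OK" if ok else "missing or corrupted")
```

A file reported as missing or corrupted would need to be downloaded again before running ``python mscoco.py --download-dir ~/coco``.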