Prepare the ImageNet dataset
The ImageNet project contains millions of images and thousands of object categories for image classification. It is widely used in the research community for benchmarking state-of-the-art models.
The dataset has multiple versions. The one commonly used for image classification is ILSVRC 2012. This tutorial will go through the steps of preparing this dataset for GluonCV.
You need at least 300 GB of disk space to download and extract the dataset. An SSD (solid-state drive) is preferred over an HDD because it is faster.
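Before downloading, it can help to confirm that the target drive has enough free space. The following is a minimal sketch using only the Python standard library. The helper name `has_enough_space` is hypothetical, and reading the 300 GB figure as decimal gigabytes is an assumption; neither is part of GluonCV.

```python
import shutil

REQUIRED_GB = 300  # figure quoted above; decimal gigabytes are assumed


def has_enough_space(path, required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    # "." is a placeholder; point this at the drive that will hold the dataset.
    print("Enough space:", has_enough_space("."))
```

The check only looks at free space at the moment it runs, so it is a sanity check rather than a guarantee.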
First, go to the download page (you may need to register an account), and find the page for ILSVRC2012. Next, find and download the following two files:
Assuming the tar files are saved in the folder
~/ILSVRC2012, we can use the
following command to prepare the dataset automatically.
python imagenet.py --download-dir ~/ILSVRC2012
Extracting the images may take a while. For example, it takes about 30 minutes on an AWS EC2 instance with EBS.
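To give a rough idea of why extraction takes time, here is a sketch of unpacking a nested archive with the standard `tarfile` module. It assumes the outer archive contains one inner tar per class, which is how the ILSVRC 2012 training archive is commonly described but is not stated in this tutorial. The function name `extract_nested_tar` is hypothetical, and this is not the code that imagenet.py actually runs.

```python
import io
import os
import tarfile


def extract_nested_tar(outer_tar_path, target_dir):
    """Extract a tar of per-class tars into one sub-folder per class.

    Returns the sorted list of class folder names that were created.
    """
    class_names = []
    with tarfile.open(outer_tar_path) as outer:
        for member in outer.getmembers():
            # Skip anything that is not an inner .tar file.
            if not (member.isfile() and member.name.endswith(".tar")):
                continue
            class_name = os.path.splitext(os.path.basename(member.name))[0]
            class_dir = os.path.join(target_dir, class_name)
            os.makedirs(class_dir, exist_ok=True)
            # Read the inner archive into memory, then unpack it.
            inner_bytes = outer.extractfile(member).read()
            with tarfile.open(fileobj=io.BytesIO(inner_bytes)) as inner:
                inner.extractall(class_dir)
            class_names.append(class_name)
    return sorted(class_names)
```

Each inner archive is held in memory while it is unpacked, which keeps the sketch short. Only run this kind of extraction on archives from a trusted source.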
imagenet.py will extract the images into
can specify a different target folder by setting
Read with GluonCV
The prepared dataset can be loaded directly with the utility class
gluoncv.data.ImageNet. Here is an example that reads a random batch of 128 images each time and
performs randomized resizing and cropping.
from gluoncv.data import ImageNet
from mxnet.gluon.data import DataLoader
from mxnet.gluon.data.vision import transforms

train_trans = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor()
])

# You need to specify ``root`` for ImageNet if you extracted the images into
# a different folder
train_data = DataLoader(
    ImageNet(train=True).transform_first(train_trans),
    batch_size=128, shuffle=True)
Reading one batch from train_data gives data and label arrays with the following shapes:
(128, 3, 224, 224) (128,)
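To illustrate what "randomized resizing and cropping" means, the sketch below samples a random crop box in plain Python; the crop would then be resized to 224 x 224. The helper `sample_crop_box` is hypothetical. The area range (8% to 100% of the image) and aspect-ratio range (3/4 to 4/3) are assumed values commonly used for this augmentation. This is not MXNet's implementation of `RandomResizedCrop`, which may differ in its details.

```python
import math
import random


def sample_crop_box(width, height, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                    rng=random, max_tries=10):
    """Sample (x0, y0, w, h) for a random crop inside a width x height image."""
    area = width * height
    for _ in range(max_tries):
        # Pick a target area and an aspect ratio (log-uniform, an assumption).
        target_area = rng.uniform(*scale) * area
        log_ratio = rng.uniform(math.log(ratio[0]), math.log(ratio[1]))
        aspect = math.exp(log_ratio)
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            x0 = rng.randint(0, width - w)
            y0 = rng.randint(0, height - h)
            return x0, y0, w, h
    # Fallback: centered crop of the largest square that fits.
    side = min(width, height)
    return (width - side) // 2, (height - side) // 2, side, side
```

Passing a seeded `random.Random` instance as `rng` makes the sampled boxes reproducible, which is convenient when debugging an augmentation pipeline.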
Plot some validation images