Panoptic segmentation, as a holistic real-world computer vision task, depends heavily on the quality of its datasets. Here we highlight 15 representative high-quality panoptic segmentation datasets, which can support applications such as autonomous driving, healthcare, remote sensing, agriculture, digital imaging, and smart cities.
Open Datasets
1. COCO panoptic dataset
The COCO dataset is one of the most prestigious image recognition datasets and can be used for in-the-wild semantic segmentation, object detection, keypoint estimation, and instance segmentation. Its panoptic version additionally assigns both a semantic label and an instance label to every pixel of each image, encoded as distinct colors, which makes it directly suitable for the panoptic segmentation task. COCO has 118,000 training images, 5,000 validation images, and 41,000 test images.
COCO can be used for panoptic segmentation in the wild, meaning that the scenery and objects are unconstrained. It has a total of 80 thing classes and 53 stuff classes.
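To make the color encoding concrete: in the COCO panoptic format, as we understand it, each segment id is stored in the RGB channels of a PNG as id = R + 256·G + 256²·B, and a companion JSON file maps each id to a category. The sketch below is a minimal pure-Python illustration of that decoding. The function names and the toy pixel grid are our own assumptions, not part of any official API, and real pipelines would typically read the PNG with an image library.

```python
from collections import Counter


def rgb_to_segment_id(rgb):
    """Decode one RGB pixel into a segment id: R + 256*G + 256^2*B."""
    r, g, b = rgb
    return r + 256 * g + 256 * 256 * b


def segment_id_to_rgb(segment_id):
    """Inverse mapping, e.g. for writing a panoptic PNG."""
    return (segment_id % 256, (segment_id // 256) % 256, (segment_id // 65536) % 256)


def segment_areas(pixels):
    """Count pixels per segment id in a 2D grid (list of rows) of RGB tuples."""
    return Counter(rgb_to_segment_id(p) for row in pixels for p in row)


# Toy 2x3 "image" with two segments (hypothetical values, not real COCO data).
toy = [
    [(1, 0, 0), (1, 0, 0), (0, 1, 0)],
    [(1, 0, 0), (0, 1, 0), (0, 1, 0)],
]
areas = segment_areas(toy)
print(dict(areas))
```

Whether a decoded segment is a thing or a stuff region is not visible in the PNG itself; that information comes from the category listed for the segment in the JSON annotations.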
A solid baseline method for the COCO panoptic dataset is Panoptic-Deeplab, which was introduced in our Methodology section; the code can be found at https://github.com/bowenc0221/panoptic-deeplab. Panoptic-Deeplab achieves 35.5 PQ on the test set.
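The PQ (panoptic quality) scores quoted for the baselines in this list follow the standard definition: a predicted segment and a ground-truth segment of the same class are matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5·FP + 0.5·FN. The sketch below computes this for a single class, with each segment represented as a set of pixel indices. That representation and the function names are simplifying assumptions for illustration; published numbers are averaged over classes and produced by the official evaluation tools.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments):
    """Single-class PQ: sum of matched IoUs / (TP + 0.5*FP + 0.5*FN)."""
    matched_pred = set()
    tp = 0
    iou_sum = 0.0
    for gt in gt_segments:
        for pi, pred in enumerate(pred_segments):
            if pi in matched_pred:
                continue
            score = iou(pred, gt)
            if score > 0.5:  # a match requires IoU strictly above 0.5
                matched_pred.add(pi)
                tp += 1
                iou_sum += score
                break
    fp = len(pred_segments) - len(matched_pred)  # unmatched predictions
    fn = len(gt_segments) - tp                   # unmatched ground truth
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


# Toy example: one good match (IoU 0.75), one missed segment, one spurious one.
gt = [{0, 1, 2, 3}, {4, 5}]
pred = [{0, 1, 2}, {6, 7}]
pq = panoptic_quality(pred, gt)
print(pq)
```

Because segments within one image do not overlap, a ground-truth segment can have at most one prediction with IoU above 0.5, which is why a simple greedy loop is enough here.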
Link: https://github.com/cocodataset/cocodataset.github.io
2. Mapillary Vistas Dataset
The Mapillary Vistas Dataset is a large-scale, traffic-related collection of segmented images that can be used for instance, semantic, and panoptic segmentation. It is a more challenging dataset, consisting of 25,000 street-scene images split into 18,000 training images, 2,000 validation images, and 5,000 test images. It has 65 classes in total, of which 28 are “stuff” classes and 37 are “thing” classes. Image sizes range from 1024 × 768 to 4000 × 6000. As shown in Fig 30, the Mapillary Vistas Dataset contains diverse street-level images with pixel‑accurate and instance‑specific human annotations for understanding street scenes.
Panoptic-Deeplab achieves state-of-the-art performance on the Mapillary Vistas Dataset: with a SWideRNet backbone, it reaches 44.8 PQ on the validation set. The code can be found at https://github.com/google-research/deeplab2. Notably, this repository is the official codebase for the DeepLab series of models, including Axial-Deeplab, Panoptic-Deeplab, and ViP-Deeplab.
Link: https://www.mapillary.com/dataset/vistas
3. Cityscapes Dataset
The Cityscapes Dataset is the most widely used dataset for panoptic segmentation and focuses on semantic understanding of urban street scenes. It collects street views from 50 cities over a span of several months. The dataset contains 5,000 images of egocentric driving scenes in urban environments, split into 2,975 training images, 500 validation images, and 1,525 test images. It provides dense pixel annotations for 19 classes, 8 of which also have instance-level masks.
PanopticFCN achieves 61.4 PQ on this dataset with a single-path framework; the code can be found at https://github.com/dvlab-research/PanopticFCN.
Link: https://www.cityscapes-dataset.com/
4. ADE20K Dataset
ADE20K is another in-the-wild image segmentation dataset, with 25K images containing many different types of objects. It is divided into a 20K training set, a 2K validation set, and a 3K test set. The dataset contains 50 stuff classes (sky, grass, etc.) and 100 thing classes (cars, beds, persons, etc.).
As shown in Fig 32, ADE20K labels each instance and background region with fine-grained, in-the-wild annotations.
Detectron2 (https://github.com/facebookresearch/detectron2) is a gold-standard code framework for computer vision that provides comprehensive tools and code for detection and segmentation tasks. Many panoptic segmentation methods, such as PanopticFCN and PanopticFPN, are built on Detectron2, so ADE20K panoptic segmentation is easy to implement on this codebase.
Link: https://groups.csail.mit.edu/vision/datasets/ADE20K/
5. Indian Driving Dataset
The Indian Driving Dataset (IDD) is a dataset for road scene understanding in unstructured environments. Unlike other urban scene understanding datasets, IDD consists of scenes that lack well-delineated infrastructure such as lanes and sidewalks. As a result, it has significantly more ‘thing’ instances per scene than other datasets, and only a small number of well-defined categories for traffic participants.
It consists of 10,000 images, finely annotated with 34 classes, collected from 182 drive sequences on Indian roads. The label set is expanded relative to popular benchmarks such as Cityscapes to account for new classes. The images were captured by a front-facing camera attached to a car driven around Hyderabad, Bangalore, and their outskirts. Most images are 1080p, but some are 720p or other resolutions.
Link: https://idd.insaan.iiit.ac.in/
6. BDD100K Panoptic Segmentation
BDD100K is a large driving video dataset captured in different cities in the US. It consists of 100,000 videos of roughly 40 seconds each, of which 10,000 have pixel-wise annotations. The annotations use 10 thing categories (mainly for non-stationary objects) and 30 stuff categories.
Link: https://doc.bdd100k.com/download.html
7. SemanticKITTI Panoptic Segmentation
SemanticKITTI is a dataset of lidar sequences of street scenes in Karlsruhe (Germany). It contains 11 driving sequences with panoptic segmentation labels. The labels use 6 thing and 16 stuff categories.
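For lidar data, panoptic labels are stored per point rather than per pixel. As far as we know, SemanticKITTI stores one unsigned 32-bit integer per point in its label files, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits. The sketch below decodes such a buffer with the standard library only. The function names and the toy class ids are illustrative assumptions, and the official development kit remains the reference for reading the real files.

```python
import struct


def split_panoptic_label(label):
    """Split a 32-bit point label into (semantic_id, instance_id)."""
    semantic_id = label & 0xFFFF  # lower 16 bits
    instance_id = label >> 16     # upper 16 bits
    return semantic_id, instance_id


def decode_label_buffer(buf):
    """Decode raw little-endian uint32 labels, one per lidar point."""
    count = len(buf) // 4
    raw = struct.unpack("<%dI" % count, buf[: count * 4])
    return [split_panoptic_label(value) for value in raw]


# Toy buffer with three points (hypothetical ids): two instances of class 10
# and one point of class 40 that carries no instance id.
toy = struct.pack("<3I", (1 << 16) | 10, (2 << 16) | 10, 40)
decoded = decode_label_buffer(toy)
print(decoded)
```

Points of stuff classes would simply carry instance id 0 under this layout, so thing and stuff points can share one array.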
Link: http://www.semantic-kitti.org/dataset.html#download
8. nuScenes-lidarseg
nuScenes is a large-scale autonomous driving dataset. It consists of 1,000 scenes, each 20 seconds long, of urban streets in Singapore and Boston. The dataset includes point clouds captured by a lidar sensor, as well as synchronized camera data. The nuScenes-lidarseg annotations use 23 thing and 9 stuff classes.
Link: https://www.nuscenes.org/nuscenes
9. COVID-19 X-Ray Dataset (V7)
This is V7’s original dataset, containing 6,500 AP/PA chest X-ray images with pixel-level polygonal lung segmentations. Of these, 517 are COVID-19 cases.
Lung annotations are polygons following pixel-level boundaries. You can export them in COCO, VOC, or Darwin JSON formats. Each annotation file contains a URL to the original full-resolution image and a reduced-size thumbnail.
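Polygon annotations usually have to be rasterized into masks before they can be used for segmentation training. The sketch below assumes the COCO convention of a flat coordinate list [x1, y1, x2, y2, ...] and marks a pixel as foreground when its center lies inside the polygon (even-odd rule). The function name and the toy square are our own illustrative choices, other export formats such as Darwin JSON lay out coordinates differently, and production code would normally rely on an imaging library for speed.

```python
def polygon_to_mask(flat_xy, width, height):
    """Rasterize a flat polygon [x1, y1, x2, y2, ...] into a binary mask.

    A pixel is set to 1 when its center is inside the polygon (even-odd rule).
    """
    points = list(zip(flat_xy[0::2], flat_xy[1::2]))
    edges = list(zip(points, points[1:] + points[:1]))
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        cy = row + 0.5
        for col in range(width):
            cx = col + 0.5
            inside = False
            for (x1, y1), (x2, y2) in edges:
                # Only edges that straddle the horizontal line y = cy can cross it.
                if (y1 > cy) != (y2 > cy):
                    x_cross = x1 + (cy - y1) * (x2 - x1) / (y2 - y1)
                    if cx < x_cross:
                        inside = not inside
            mask[row][col] = int(inside)
    return mask


# Toy polygon: an axis-aligned rectangle from (1, 1) to (4, 3) on a 5x4 grid.
square = [1, 1, 4, 1, 4, 3, 1, 3]
mask = polygon_to_mask(square, width=5, height=4)
for line in mask:
    print(line)
```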
For more details, check out: https://github.com/v7labs/covid-19-xray-dataset
10. NIH
The NIH chest X-ray dataset contains 100,000 chest X-rays with diagnoses, labels, and annotations.
For more details, check out: https://nihcc.app.box.com/v/ChestXray-NIHCC
11. OASIS
The Open Access Series of Imaging Studies (OASIS) is a project aimed at making neuroimaging data sets of the brain freely available to the scientific community.
For more details, check out: https://www.oasis-brains.org/
12. Pastis: Panoptic Agricultural Satellite TIme Series
Pastis is a dataset of agricultural satellite images. It contains 2,433 variable-length time series of multispectral images. In the images, 18 different kinds of parcels are annotated with their respective crop types.
For more details, check out: https://github.com/VSainteuf/pastis-benchmark
13. ScanNet v2
ScanNet is an RGB-D video dataset of indoor scenes containing 2.5 million views in 1513 scans. It uses 38 thing categories for items and furniture in the rooms and 2 stuff categories (wall and floor). It is not a complete panoptic dataset, as the labels only cover about 90% of all surfaces.
For more details, check out: https://github.com/ScanNet/ScanNet
Commercial datasets
maadaa.ai is committed to providing high-quality datasets ranging from instance segmentation and semantic segmentation to panoptic segmentation.
1. Panoptic Scenes Segmentation Dataset (MD-Image-039)
About 21.3k Internet-collected images, with resolutions ranging from 660 x 371 to 5472 x 3648. The dataset includes horizontal planes (desktop, ground, ceiling, etc.), vertical planes (wall, etc.), buildings, people, animals, furniture, etc.
Link: https://maadaa.ai/dataset/panoptic-scenes-segmentation/
2. Human And Multi-object Panoptic Segmentation (MD-Image-070)
About 8k Internet-collected images, with resolutions above 1280 x 700. The dataset covers most common natural scenery, people scenes, buildings, animals, etc.
Link: https://maadaa.ai/dataset/human-and-multi-object-panoptic-segmentation/