#OpenDataset #Fashion #personalization #AI #E-commerce
What’s at the core of AI technologies? We believe the answer is Data.
Most deep learning models are trained in a data-driven, supervised fashion, so diverse and accurately labeled datasets are the cornerstone of research in this field.
maadaa.ai has therefore collected several open and commercial datasets for you and your AI models.
Open Datasets
1. Watch and Buy (WAB)
The WAB dataset was collected by Alibaba from everyday clothing listings and live streams on Taobao. It contains 1,042,178 labeled images, 1,654,780 labeled bounding-box instances, and 70,000 transcribed and labeled video texts.
The annotations cover 23 clothing detection categories together with bounding-box positions, which supports research on object detection. Each box is also annotated with an instance number, and about 80,000 groups of matching products are constructed, which supports research on object retrieval and recognition.
Link: https://tianchi.aliyun.com/competition/entrance/531893/information
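As a rough illustration of how box-level instance numbers (IDs) can be turned into retrieval groups, the sketch below collects boxes that share an instance ID and enumerates query/match pairs. The record fields (`image`, `box`, `category`, `instance_id`) and the sample values are hypothetical; they are not the actual WAB annotation schema.

```python
from collections import defaultdict

# Hypothetical box records; field names and values are illustrative only,
# not the actual WAB annotation schema.
records = [
    {"image": "shop_001.jpg", "box": (10, 20, 110, 220), "category": "dress", "instance_id": 7},
    {"image": "live_frame_042.jpg", "box": (35, 40, 150, 260), "category": "dress", "instance_id": 7},
    {"image": "shop_002.jpg", "box": (5, 5, 90, 120), "category": "skirt", "instance_id": 9},
]


def group_by_instance(records):
    """Collect boxes that share an instance ID into one retrieval group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["instance_id"]].append((rec["image"], rec["box"]))
    return dict(groups)


def retrieval_pairs(groups):
    """Return (query, match) pairs from groups that hold at least two boxes."""
    pairs = []
    for members in groups.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.append((members[i], members[j]))
    return pairs


groups = group_by_instance(records)
pairs = retrieval_pairs(groups)
```

Groups with a single box produce no pairs, so only products seen in more than one image contribute to a retrieval benchmark built this way.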
2. DeepFashion
DeepFashion consists of four subsets. The Category and Attribute Prediction Benchmark targets classification; it includes 289,222 images, each annotated with a category, attributes, a bounding box (Bbox), and landmarks. The In-shop Clothes Retrieval Benchmark provides a total of 52,712 images in varied poses.
Each product ID has several pose images associated with it, and the main tasks are image retrieval and content retrieval. In the Consumer-to-shop Clothes Retrieval Benchmark, the folder for each product ID contains one seller image and several buyer images, for a total of 239,557 images. Its main tasks are likewise image retrieval and content retrieval.
The Fashion Landmark Detection Benchmark targets landmark detection. It includes 123,016 images, each annotated with landmarks and a Bbox, plus a category label: upper-body clothes, lower-body clothes, or full-body clothes.
Link: http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html
3. DeepFashion2
DeepFashion2 is a large-scale, open-source clothing dataset from the Chinese University of Hong Kong. It contains approximately 491,000 images with about 801,000 clothing items, each with detailed annotations.
DeepFashion2 is a further refinement of DeepFashion. Each DeepFashion2 image contains at least one and at most seven clothing items, whereas DeepFashion assigned just one category label per image. Each garment is manually annotated with a Bbox and landmarks. The items span 13 popular clothing categories and come from commercial stores and the Internet.
Link: https://github.com/switchablenorms/DeepFashion2
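To make the per-item annotation described above concrete, here is a minimal sketch of how one image with several garments might be represented in memory. The class names, field names, and the (x1, y1, x2, y2) box convention are assumptions made for illustration; this is not the dataset's actual annotation file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_ITEMS_PER_IMAGE = 7  # upper bound on items per image stated in the text


@dataclass
class ClothingItem:
    """One annotated garment (hypothetical structure)."""
    category: str
    bbox: Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel corners
    landmarks: List[Tuple[int, int]] = field(default_factory=list)  # (x, y) points


@dataclass
class ImageAnnotation:
    """All garments annotated in a single image (hypothetical structure)."""
    image_name: str
    items: List[ClothingItem]

    def __post_init__(self):
        # Enforce the one-to-seven items-per-image range described in the text.
        if not 1 <= len(self.items) <= MAX_ITEMS_PER_IMAGE:
            raise ValueError("expected between 1 and 7 clothing items per image")


# Illustrative values only.
ann = ImageAnnotation(
    image_name="000001.jpg",
    items=[
        ClothingItem("short sleeve top", (50, 60, 300, 400), [(60, 70), (290, 80)]),
        ClothingItem("trousers", (70, 380, 280, 800), [(80, 390), (270, 395)]),
    ],
)
categories = [item.category for item in ann.items]
```

Keeping the box and landmarks on each item, rather than on the image, is what allows several garments in one picture to carry separate labels.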
4. Chictopia10K
The Chictopia10K dataset contains 17,706 images collected from Chictopia Fashion. It has 18 categories: 12 clothing categories, 1 background category, and 5 human categories.
Link: https://files.is.tue.mpg.de/classner/gp/
Commercial Datasets
1. Clothing Classification Dataset (MD-Fashion-1)
MD-Fashion-1 is a clothing classification dataset of about 2 million images collected from e-commerce, fashion shows, social media, and other scenes. It is annotated with image category labels and Bboxes, covering 80 category tags across different clothing styles and scenes.
Link: https://maadaa.ai/dataset/clothing-classification-dataset/
2. Clothing Keypoints Dataset (MD-Fashion-5)
MD-Fashion-5 is a clothing keypoint dataset containing 1 million images. It covers 80 clothing types, annotated with keypoint coordinates and Bboxes.
Link: https://maadaa.ai/dataset/clothing-keypoints-dataset/
3. Scarf Segmentation Dataset (MD-Image-061)
MD-Image-061 is a scarf fabric segmentation dataset. It contains 2,000 images with resolutions between 504 x 678 and 192 x 2880, each with high-precision semantic segmentation annotations.
Link: https://maadaa.ai/dataset/scarf-segmentation/
4. Person And Clothes Semantic Segmentation Dataset (MD-Image-026)
MD-Image-026 is a semantic segmentation dataset for people and clothing. It contains about 197,000 images with a minimum resolution of 92 x 153 and a maximum resolution of 3024 x 5381. Categories include background, hat, hair, sunglasses, coat, skirt, pants, gloves, socks, and shoes, as well as body parts such as the face, left and right legs, and left and right arms. Compared with MD-Image-027, MD-Image-026 adds more body-part segmentation categories, such as the face.
Link: https://maadaa.ai/dataset/person-and-clothes-semantic-segmentation/
5. Clothes Segmentation Dataset (MD-Image-027)
MD-Image-027 is a clothing segmentation dataset collected mainly from the Internet. It contains about 14,300 images with resolutions between 183 x 275 and 3024 x 4032. Pixel-level semantic segmentation annotations cover about 30 target categories in total, including background, hat, hair, sunglasses, coat, skirt, pants, dress, tie, left shoe, right shoe, face, left leg, right leg, left arm, right arm, bag, scarf, mobile phone, and other large accessories. This makes the dataset valuable in e-commerce, visual entertainment, metaverse virtual humans, and many other scenarios.
Link: https://maadaa.ai/dataset/clothes-segmentation/
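Pixel-level semantic annotation of this kind is typically stored as a mask in which every pixel holds a class ID. The sketch below counts pixels per class on a tiny hand-written mask. The ID-to-name mapping and the mask values are hypothetical and chosen only for illustration; they do not reflect how the MD-Image datasets encode their labels.

```python
from collections import Counter

# Hypothetical class-ID mapping; the real datasets may use different IDs and names.
LABELS = {0: "background", 1: "hat", 2: "hair", 3: "coat", 4: "pants"}

# A tiny 4 x 4 label mask: each entry is the class ID of one pixel.
mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 0],
    [3, 3, 3, 0],
    [4, 4, 0, 0],
]


def class_pixel_counts(mask, labels):
    """Count how many pixels of the mask belong to each named class."""
    counts = Counter(pixel for row in mask for pixel in row)
    return {labels[class_id]: n for class_id, n in counts.items()}


counts = class_pixel_counts(mask, LABELS)
```

Per-class pixel counts like these are a quick way to check class balance before training a segmentation model.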
6. Human Body Parts Fine Segmentation Dataset (MD-Video-005)
MD-Video-005 covers diverse scenes such as dancing, talent shows, movies, and TV dramas. It includes 19 categories, among them background, face, hair, top, left arm, right arm, trousers, left leg, right leg, skirt, left shoe, right shoe, and bag.
Link: https://maadaa.ai/dataset/high-precision-human-body-segmentation/
7. Human Body Segmentation Dataset (MD-Image-016)
MD-Image-016 provides a coarse segmentation of the human body, with categories including the body, arms, hands, background, occlusions, hair, sunglasses and glasses, and skin in different areas.
Link: https://maadaa.ai/dataset/human-body-segmentation-2/
8. Clothing Pattern Classification Dataset (MD-Fashion-2)
MD-Fashion-2 is a clothing pattern classification dataset of 200,000 images. Unlike MD-Fashion-1, it focuses on classifying clothing pattern features. It is annotated with image classification labels covering 30 common categories.
Link: https://maadaa.ai/dataset/clothing-pattern-classification-dataset/
9. Clothing Segmentation and Fabrics Classification Dataset (MD-Fashion-4)
MD-Fashion-4 is a clothing fabric classification dataset of about 200,000 images. It provides classification labels and masks for clothing materials, covering 11 common fabric categories.
Link: https://maadaa.ai/dataset/fabrics-classification-dataset/
10. Nails Contour Segmentation Dataset (MD-Image-051)
MD-Image-051 contains about 5,900 images of human fingernails collected offline, at a resolution of 1920 x 1080.
Link: https://maadaa.ai/dataset/nails-contour-segmentation/
Further reading:
AI for virtual fitting: inspired by datasets (Open & Commercial)
AI for fake Detection in Fashion and E-commerce industries: The related open & commercial datasets
Face Parsing: use cases and open datasets