AI-powered personalization of E-commerce and Fashion: open and commercial datasets

  • Posted by maadaa
  • November 7, 2022
  • Updated 11:52 am

#OpenDataset   #Fashion     #personalization   #AI     #E-commerce

What’s at the core of AI technologies? We believe the answer is data.

Most deep learning models are trained in a data-driven, supervised way, so diverse and accurately labeled datasets are the cornerstone of research in this field.

We have therefore collected several open and commercial datasets for you and your AI models.


Open Datasets

1. Watch and Buy (WAB)

The WAB dataset was collected by Alibaba from everyday clothing images and live streams on Taobao. It contains 1,042,178 labeled images, 1,654,780 labeled detection box instances, and 70,000 transcribed and labeled video texts.

The annotations cover 23 clothing detection categories together with the box positions, which makes the data suitable for object detection research. Each box also carries an instance number, and about 80,000 groups of boxes showing the same product have been constructed, which makes the data suitable for object retrieval and recognition research.
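As a rough illustration of how box-level instance numbers can be turned into same-product groups for retrieval, here is a minimal sketch. The record layout and field names (`image`, `category`, `box`, `instance_id`) are assumptions made for this example and are not the actual WAB file format.

```python
from collections import defaultdict

# Hypothetical annotation records; the field names are assumptions for
# illustration and do not reflect the actual WAB file format.
annotations = [
    {"image": "shop_001.jpg", "category": "dress",
     "box": (34, 50, 210, 400), "instance_id": 7},
    {"image": "live_frame_118.jpg", "category": "dress",
     "box": (120, 80, 330, 460), "instance_id": 7},
    {"image": "shop_002.jpg", "category": "coat",
     "box": (15, 22, 190, 310), "instance_id": 9},
]


def group_by_instance(records):
    """Collect every box that shares an instance ID into one group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["instance_id"]].append((rec["image"], rec["box"]))
    return dict(groups)


groups = group_by_instance(annotations)

# Only groups with at least two boxes can supply query/gallery pairs.
retrieval_groups = {k: v for k, v in groups.items() if len(v) >= 2}
```

In this toy data only instance 7 appears in more than one image, so it is the only group that could be used to pair a live-stream frame with a shop image.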




2. DeepFashion

DeepFashion consists of four subsets. The Category and Attribute Prediction Benchmark targets classification. It includes 289,222 images, each annotated with a category, attributes, a bounding box (Bbox), and landmarks.

The In-shop Clothes Retrieval Benchmark provides a total of 52,712 images, with a variety of poses for each product ID. Its main tasks are image retrieval and content retrieval.

In the Consumer-to-shop Clothes Retrieval Benchmark, the folder for each product ID contains one seller (shop) image and several buyer (consumer) photos, for a total of 239,557 images. Its main tasks are also image retrieval and content retrieval.

The Fashion Landmark Detection Benchmark targets landmark detection. It includes 123,016 images, each with landmark and Bbox annotations, as well as a category label: upper-body clothes, lower-body clothes, or full-body clothes.




3. DeepFashion2

DeepFashion2 is a large-scale, open-source clothing dataset from the Chinese University of Hong Kong. It contains approximately 800,000 images with detailed annotations for clothing items.

DeepFashion2 is a further refinement of DeepFashion. Each DeepFashion2 image contains at least one and at most seven clothing items, whereas DeepFashion has just one category label per image. Every garment is manually annotated with a Bbox and landmarks. The roughly 801,000 images cover 13 popular clothing categories and come from commercial stores and the Internet.




4. Chictopia10K

The Chictopia10K dataset contains 17,706 images collected from Chictopia Fashion. It has 18 categories: 12 clothing categories, background, and 5 human body categories.



Commercial Datasets

1. Clothing Classification Dataset (MD-Fashion-1)

MD-Fashion-1 is a clothing classification dataset. It contains about 2 million images collected from e-commerce, fashion shows, social media, and other scenes. The images are annotated with image-level categories and Bboxes, covering a total of 80 category tags that span different clothing styles and scenes.



2. Clothing Keypoints Dataset (MD-Fashion-5)

MD-Fashion-5 is a clothing keypoint dataset containing 1 million images. It covers 80 clothing types, with keypoint coordinates and Bboxes as annotations.
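When keypoints and a Bbox are annotated together, a common preprocessing step is to express each keypoint relative to its box so that garments of different sizes become comparable. The sketch below assumes a box given as `(x_min, y_min, x_max, y_max)` and keypoints as `(x, y)` pixel pairs. That layout is an assumption for illustration and is not the documented MD-Fashion-5 format.

```python
def normalize_keypoints(box, keypoints):
    """Map pixel keypoints into [0, 1] coordinates relative to the box.

    box: (x_min, y_min, x_max, y_max) in pixels (assumed layout).
    keypoints: iterable of (x, y) pixel coordinates.
    """
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    return [((x - x_min) / width, (y - y_min) / height) for x, y in keypoints]


# Toy example: a 200 x 400 pixel box with three keypoints.
box = (100, 50, 300, 450)
points = [(100, 50), (200, 250), (300, 450)]
normalized = normalize_keypoints(box, points)
```

Here the top-left corner of the box maps to `(0.0, 0.0)`, its center to `(0.5, 0.5)`, and its bottom-right corner to `(1.0, 1.0)`.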



3. Scarf Segmentation Dataset (MD-Image-061)

MD-Image-061 is a scarf fabric segmentation dataset. It contains 2,000 images with resolutions between 504 x 678 and 192 x 2880. The scarf images are annotated with high-precision semantic segmentation.


4. Person And Clothes Semantic Segmentation Dataset (MD-Image-026)

MD-Image-026 is a semantic segmentation dataset for people and clothing. It contains 197,000 images with a minimum resolution of 92 x 153 and a maximum resolution of 3024 x 5381. Categories include background, hat, hair, sunglasses, coat, skirt, pants, gloves, socks, and shoes, as well as body parts such as the face, left and right legs, and left and right arms. Compared with MD-Image-027, MD-Image-026 adds more semantic segmentation categories for body parts, such as the face.




5. Clothes Segmentation Dataset (MD-Image-027)

MD-Image-027 is mainly a clothing segmentation dataset collected from the Internet. It contains 14,300 images with resolutions between 183 x 275 and 3024 x 4032. It provides pixel-level semantic segmentation annotations for about 30 target categories in total, including background, hat, hair, sunglasses, coat, skirt, pants, dress, tie, left shoe, right shoe, face, left leg, right leg, left arm, right arm, bag, scarf, mobile phone, and other large accessories. This makes the dataset valuable in many scenarios, such as e-commerce, visual entertainment, and metaverse virtual humans.
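To make the idea of pixel-level category annotation concrete, the sketch below computes how much of a mask each class occupies. The integer IDs in `LABELS` are assumptions chosen for this example. The category names come from the list above, but the dataset's actual encoding is not described here.

```python
from collections import Counter

# Hypothetical ID-to-name mapping using a few categories named above.
# The integer IDs are assumptions, not the dataset's actual encoding.
LABELS = {0: "background", 1: "hat", 2: "hair", 3: "coat", 4: "pants"}


def class_pixel_share(mask):
    """Return the fraction of pixels assigned to each class name."""
    counts = Counter(pixel for row in mask for pixel in row)
    total = sum(counts.values())
    return {LABELS[class_id]: n / total for class_id, n in counts.items()}


# A toy 4 x 4 mask: each entry is the class ID of one pixel.
mask = [
    [0, 0, 2, 2],
    [0, 3, 3, 0],
    [0, 3, 3, 0],
    [0, 4, 4, 0],
]
share = class_pixel_share(mask)
```

A summary like this is a quick way to spot class imbalance, for example a background class that dominates most images, before training a segmentation model.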



6. Human Body Parts Fine Segmentation Dataset (MD-Video-005)

MD-Video-005 covers diverse scenes such as dancing, talent shows, movies, and TV dramas. It includes 19 categories: background, face, hair, top, left arm, right arm, trousers, left leg, right leg, skirt, left shoe, right shoe, bag, etc.



7. Human Body Segmentation Dataset (MD-Image-016)

MD-Image-016 provides a coarse segmentation of the human body. Its categories include the human body, arms, hands, background, occlusions, hair, sunglasses and glasses, and skin in different areas.



8. Clothing Pattern Classification Dataset (MD-Fashion-2)

MD-Fashion-2 is a clothing pattern classification dataset containing 200,000 images. Unlike MD-Fashion-1, it focuses on classifying clothing pattern features. The images are annotated with image classification labels across 30 common categories.




9. Clothing Segmentation and Fabrics Classification Dataset (MD-Fashion-4)

MD-Fashion-4 is a clothing fabric classification dataset containing about 200,000 images. It provides classification labels and masks for clothing materials, covering 11 common fabric categories.



10. Nails Contour Segmentation Dataset (MD-Image-051)

MD-Image-051 contains about 5,900 images of human fingernails collected offline. The resolution is 1920 x 1080.



Further reading

👉 Face Parsing: use cases and open datasets

👉 Video face segmentation: use cases and open datasets

👉 AI for virtual fitting: inspired by datasets (Open & Commercial)

👉 AI for fake Detection in Fashion and E-commerce industries: The related open & commercial datasets

👉 Human Video Segmentation: use cases and enable datasets

👉 Autonomous Driving: Dash Cam Video Datasets (V1.0) by
