maadaa AI News & Open Dataset Share: Meta’s WhatsApp AI, Hugging Face’s Health AI Standard, Microsoft’s VASA-1 & More
June 12, 2024 (updated 9:05 am)

(maadaa AI News Weekly: April 16 to April 21)

1. Meta Introduces AI Image Generation To WhatsApp

News:

Meta’s new beta feature on WhatsApp lets US users create AI-generated images from text prompts in real time using Meta AI, which is built on the Meta Llama 3 model.

Key Points:

Real-Time Image Generation: As users type a prompt, the AI renders the scene and updates it in real time as more details are added.

Enhanced AI Model: Built on Meta Llama 3, the assistant now produces higher-quality images with better text rendering.

Animation Feature: Users can also convert any images they provide into animated GIFs, enhancing the interactivity and shareability of content within WhatsApp.

Why It Matters?

WhatsApp’s new feature of creating real-time AI-generated images emphasizes the significance of high-quality training data in AI model development. Accurate and consistent labeling of diverse and representative datasets can mitigate AI bias and ensure fair and accurate outcomes. High-quality training data fosters trust and confidence among users.

2. Hugging Face Unveils Benchmark For Healthcare AI

News:

Hugging Face has launched Open Medical-LLM, a benchmark test for healthcare-focused AI models created with Open Life Science AI and the University of Edinburgh. It includes MedQA, PubMedQA, and MedMCQA tests to cover various medical domains.

Key Points:

- Hugging Face has launched Open Medical-LLM, a benchmark for evaluating generative AI models in healthcare.

- The benchmark was created in collaboration with researchers from Open Life Science AI and the University of Edinburgh.

- Existing tests MedQA, PubMedQA, and MedMCQA were combined to create this comprehensive evaluation tool.

- Open Medical-LLM features various multiple-choice and open-ended questions from US and Indian medical licensing exams, along with college biology test banks.

Why It Matters?

The introduction of Open Medical-LLM underscores the crucial role of data in driving advancements in healthcare. Open Medical-LLM introduces a standardized benchmark to evaluate AI models in healthcare. It ensures diverse medical data is used, leading to better performance and more accurate predictions. This highlights the need for continuous testing and refinement of AI models to remain effective in real-world conditions.
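To make the idea of a standardized benchmark more concrete, here is a minimal Python sketch of how a multiple-choice test can be scored: each model answer is compared with the answer key, and the share of correct answers is reported as accuracy. The `MultipleChoiceItem` schema, the `accuracy` function, and the toy questions are assumptions made for illustration; they are not the Open Medical-LLM data format or its scoring code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class MultipleChoiceItem:
    """One benchmark question (hypothetical schema, not the benchmark's own format)."""
    question: str
    options: Sequence[str]
    answer_index: int  # position of the correct option in `options`


def accuracy(items: Sequence[MultipleChoiceItem],
             predict: Callable[[MultipleChoiceItem], int]) -> float:
    """Return the fraction of items for which `predict` picks the correct option."""
    if not items:
        raise ValueError("need at least one item to score")
    correct = sum(1 for item in items if predict(item) == item.answer_index)
    return correct / len(items)


# Toy items written for illustration only; they are not taken from any real exam.
toy_items = [
    MultipleChoiceItem("Which organ pumps blood through the body?",
                       ["Liver", "Heart", "Lung"], 1),
    MultipleChoiceItem("Which vitamin is produced in skin exposed to sunlight?",
                       ["Vitamin D", "Vitamin C", "Vitamin K"], 0),
    MultipleChoiceItem("How many chambers does the human heart have?",
                       ["Two", "Three", "Four"], 2),
    MultipleChoiceItem("Which cells carry oxygen in the blood?",
                       ["Platelets", "Red blood cells", "Neurons"], 1),
]


def always_first(item: MultipleChoiceItem) -> int:
    """A trivial stand-in for a model: always choose the first option."""
    return 0


if __name__ == "__main__":
    print(f"Baseline accuracy: {accuracy(toy_items, always_first):.2f}")
```

In practice, `predict` would wrap a call to the model being evaluated. A trivial baseline like `always_first` is useful as a sanity check, since a real model should score well above it.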

3. Microsoft’s VASA-1: Generating Lifelike Videos from Single Photos

News:

Microsoft has unveiled a new AI model, VASA-1, that can generate highly realistic talking head videos from a single still image and an audio clip. The model can produce nuanced expressions, natural head motions, and even singing performances.

Key Points:

- VASA-1 requires only a single photo and a speech audio file to create a lifelike talking video.

- The model can generate natural expressions, head motions, and realistic singing performances.

- Users can manipulate aspects of the generated video, such as eye gaze direction, head distance, and emotional tone.

Why It Matters?

The launch of VASA-1 emphasizes the importance of data in AI advancements, demonstrating its capability to produce highly realistic videos from a single image and audio clip. This showcases the impact of data quality and diversity on AI model performance, with potential applications ranging from virtual avatars to gaming enhancements. However, it also underscores the risks of data misuse, like deepfakes, stressing the need for strict data governance and ethics in AI.

Additional News:

  1. Meta has launched Llama 3, its new open-source AI Large Language Model. It’s ideal for developers, researchers, and businesses to build, experiment and scale their AI ideas.
  2. YouTube has introduced an AI-powered “Ask” button for its Premium users in the US. The feature lets users ask questions about a video in real time without pausing it.
  3. A new report indicates a 37% yearly increase in AI demand, with the global AI market expected to hit $1,811.75 billion by 2028. This growth is driven by AI’s expanding use in healthcare, manufacturing, and retail.
  4. Microsoft researchers have launched AutoDev, an AI-powered framework designed to streamline software engineering by automating complex tasks with autonomous AI agents.
  5. Nothing has launched two new earbud models featuring ChatGPT integration, enabling AI interaction through a simple squeeze.
  6. Google has restructured to boost AI development, forming a new ‘Platform and Devices’ team and combining Research with DeepMind.
  7. Mentee Robotics recently unveiled the Menteebot, an advanced humanoid robot designed to execute sophisticated tasks. It can learn and adapt through natural language commands, making it an innovative leap forward in robotics technology.

Open & Commercial AI Training Datasets

1. FASSEG Repository:

This collection offers datasets for frontal face segmentation (Frontal01 and Frontal02) and a dataset for faces in multiple poses (Multipose01). These datasets can be valuable for training models to perform face segmentation in different orientations and conditions.

https://massimomauro.github.io/FASSEG-repository/ 
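Models trained on segmentation data like this are commonly evaluated with intersection over union (IoU), which compares a predicted mask with the ground-truth mask for one class. The Python sketch below is illustrative only: the `iou` function, the label values, and the tiny masks are assumptions, not part of the FASSEG repository or its tooling.

```python
from typing import Sequence

# A mask is a grid of integer class labels, one per pixel (hypothetical encoding).
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, truth: Mask, label: int) -> float:
    """Intersection over union for one class label between two same-sized masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            in_pred = p == label
            in_truth = t == label
            if in_pred and in_truth:
                intersection += 1
            if in_pred or in_truth:
                union += 1
    if union == 0:
        # The label appears in neither mask; treating that as a perfect match
        # is a convention chosen for this sketch.
        return 1.0
    return intersection / union


# Tiny 2x3 masks for illustration: 1 marks a "face" pixel, 0 marks background.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]

if __name__ == "__main__":
    print(f"IoU for label 1: {iou(predicted, ground_truth, label=1):.2f}")
```

Here two pixels are labeled 1 in both masks and four pixels are labeled 1 in at least one of them, so the IoU for label 1 is 2 / 4 = 0.5.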

 

2. AI Photo-Video Editing Open Dataset

AI photo and video editing apps are increasingly popular tools for creating, editing, and enhancing photos and videos with ease. High-quality, finely segmented datasets are key to such apps, enabling functions such as matting, background virtualization, and inpainting. This dataset includes:

1. Specific Fine-Segmentation Datasets for Precise Object Detection and Editing

2. Human-Body Segmentation for Advanced Body-related Editing

3. Facial Segmentation for Realistic and Personalized Facial Editing

 

3. HowTo100M

HowTo100M is a large-scale dataset of narrated videos with an emphasis on instructional videos, in which content creators teach complex tasks with the explicit intention of explaining the visual content on screen. The dataset contains more than 136 million video clips with captions, sourced from 1.2M YouTube videos. It covers 23k activities in domains such as cooking, handcrafting, personal care, gardening, and fitness. Among large-scale datasets, HowTo100M has become a gold standard for video-language pre-training methods.

https://paperswithcode.com/dataset/howto100m 

 

4. Large-Scale Professional Domain Corpus Dataset — Chinese

Dataset Features:

- Licensed Data Authorization: All data are properly licensed to ensure copyright compliance during the training and application of generative AI models.

- Diverse Data Types: The dataset covers a wide range of large-scale data types, including text, images, videos, and audio, fully meeting the needs of multimodal AI model development.

- High-Quality Professional Annotation: The dataset includes image-text corpus, video-text corpus, etc., all of which are accurately semantically annotated and professionally calibrated to ensure the accuracy of Generative AI model training.

- Industry Domain Customizable: Covering nearly 100 industries and application scenarios with specialized datasets, supporting the customization of high-quality datasets for industry-specific Generative AI model development.

Typical Application Scenarios:

Generative AI-enabled search engine, chatbot, professional Q&A, professional assistants, domain-specific content generation, etc.

https://maadaa.ai/datasets/GenDatasetDetail/Large-Scale-Professional-Domain-Corpus-Dataset---Chinese  

For further information, please contact us.
