maadaa AI News Weekly: April 22 ~ April 28
1. Apple Releases OpenELM: Open-Source Large Language Models for On-Device AI
News:
Apple has released OpenELM, a family of eight open-source large language models designed for on-device use. The models are available on the Hugging Face Hub and aim to make AI apps more efficient and accurate without relying on cloud servers.
Key Points:
- Open Source Availability: The models, along with their training logs, code, and multiple versions, are accessible on the Hugging Face Hub.
- Enhanced Privacy and Efficiency: By operating on-device, these models reduce reliance on cloud servers, potentially increasing end-user privacy and speed.
- Comprehensive Release: Unlike previous releases, Apple has included the model weights, inference code, and the complete training framework and configurations.
Why It Matters?
Apple’s decision to release OpenELM as open source is a significant step towards transparency and collaboration in the AI field. It encourages the creation of diverse and inclusive training datasets, leading to fairer and more effective AI systems, and could inspire other tech giants to follow suit, transforming the AI ecosystem.
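For readers who want to try the models, here is a minimal sketch of loading one of the smaller checkpoints from the Hugging Face Hub with the transformers library. The model id apple/OpenELM-270M and the Llama-2 tokenizer pairing reflect the public release as we understand it; verify both against the model card before relying on them.

```python
# Minimal sketch: running an OpenELM checkpoint via transformers.
# Assumptions: the model id "apple/OpenELM-270M" is on the Hugging Face Hub,
# and OpenELM ships custom modeling code (hence trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M",
    trust_remote_code=True,  # OpenELM uses custom modeling code
)
# OpenELM does not bundle a tokenizer; the release pairs it with a
# Llama-family tokenizer (check the model card to confirm).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("On-device language models can", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```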
2. Moderna Partners with OpenAI to Accelerate mRNA Medicine Development with Generative AI
News:
Moderna has partnered with OpenAI to use AI and machine learning to speed up the development of mRNA-based medicines. This collaboration aims to bring new insights and innovations to the field of mRNA technology in healthcare.
Key Points:
- Moderna and OpenAI will collaborate to explore new frontiers in mRNA medicine by combining their expertise in mRNA research and development with advanced AI capabilities.
- The partnership will use AI and machine learning to optimize mRNA-based therapies for better treatments.
Why It Matters?
Moderna and OpenAI are partnering to advance AI-driven mRNA research, generating new data for the design, development, and manufacturing of mRNA-based therapies. This can help researchers train more accurate AI models, ultimately improving mRNA-based treatments and patient outcomes.
3. Wayve AI Unveils LINGO-2: A Groundbreaking Vision-Language-Action Model for Transparent Autonomous Driving
News:
Wayve AI has developed LINGO-2, an advanced driving model that combines vision, language, and action to provide real-time driving commentary and explain its decision-making process. LINGO-2 can adapt its actions and explanations based on various scene elements, offering a new level of transparency and control for autonomous driving systems.
Key Points:
- Driving Commentary: LINGO-2 can leverage language to explain its actions and decisions, shedding light on the AI’s decision-making process.
- Adaptability: The model can adapt its behavior and language-based explanations based on different driving scenarios and instructions.
- Weather Identification: LINGO-2 can accurately describe the weather conditions, ranging from “very cloudy” to “sunny” and “clear with a blue sky.”
- Limitations: While LINGO-2 represents significant progress, more work is needed to quantify the alignment between the model’s explanations and its actual decision-making.
Why It Matters?
LINGO-2 can improve transparency and trust in autonomous driving systems by providing natural language explanations. Understanding the model’s decision-making process can help identify biases and improve the fairness of AI systems.
Additional News:
- Chinese tech giant SenseTime has launched SenseNova 5.0, an AI model that surpasses GPT-4 Turbo in most benchmarks with 600 billion parameters. They also plan to release a text-to-video model that ensures stylistic consistency.
- Sanctuary AI unveiled the seventh-gen Phoenix humanoid robot with improved design, AI systems, and training process for longer operational times, lower production costs, and improved capabilities.
- Adobe’s VideoGigaGAN AI model can upscale blurry videos up to 8 times their original resolution, adding fine details.
- TikTok’s latest Android app version includes a new AI text-to-speech feature that lets users clone their voice.
- Tesla CEO Elon Musk says that the company’s humanoid robot, Optimus, could be available for purchase as soon as next year.
- The University of Luxembourg developed an AI named WARN that detects atrial fibrillation up to 30 minutes before onset. It can be integrated into smartphones to give timely alerts to patients.
- Google’s “Google for Startups Growth Academy: AI for Education” program will support tech startups in Europe, Africa, and the Middle East by providing them with workshops, mentorship, and networking opportunities.
Open & Commercial AI Training Datasets
1. WebVid
The WebVid dataset has two splits, WebVid-2M and WebVid-10M. The more widely used WebVid-2M comprises over two million videos with weak captions scraped from the internet. Its captions are manually generated and are, for the most part, well-formed sentences; by contrast, HowTo100M’s captions come from continuous narration, yielding incomplete sentences that lack punctuation. This makes WebVid the better fit for video-language pre-training aimed at learning open-domain cross-modal representations.
Source: Papers with Code - WebVid Dataset
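As a rough illustration of how the annotations are typically consumed: the splits ship as CSV files pairing a video URL with its caption. The file and column names below ("results_2M_train.csv", "name", "contentUrl") are assumptions based on common releases; check them against the files you download.

```python
import pandas as pd

# Minimal sketch of reading WebVid-style annotations: one row per clip,
# pairing a video URL with its scraped caption.
# File and column names are assumptions -- check the actual CSV header.
df = pd.read_csv("results_2M_train.csv")

for caption, url in zip(df["name"].head(3), df["contentUrl"].head(3)):
    print(f"{caption!r} -> {url}")
```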
2. HD-VILA
The High-resolution and Diversified VIdeo-LAnguage (HD-VILA) dataset is a recently introduced large-scale dataset for video-language pretraining. It is 1) the first high-resolution dataset of its kind, comprising 100 million video clip and sentence pairs drawn from 3.3 million videos (371.5K hours of 720p footage), and 2) the most diversified such dataset, covering 15 popular YouTube categories.
Source: microsoft/XPretrain (hd-vila-100m) on GitHub
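The 100M clips are distributed as metadata rather than raw video: each record points at a source YouTube video and lists clip time spans with their aligned sentences. The .jsonl layout and field names below are illustrative assumptions only; the exact schema is documented in the repository.

```python
import json

# Minimal sketch of walking HD-VILA-style .jsonl metadata.
# Field names ("video_id", "url", "clip", "span", "text") are hypothetical
# placeholders -- consult microsoft/XPretrain for the real schema.
with open("hdvila_part0.jsonl") as f:
    for line in f:
        record = json.loads(line)
        print(record["video_id"], record["url"])
        for clip in record.get("clip", []):
            print("  ", clip.get("span"), clip.get("text"))
        break  # peek at the first video only
```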
3. Sunny Day City Road Dash Cam Video Dataset
The “Sunny Day City Road Dash Cam Video Dataset” captures the vibrant dynamics of city roads under sunny conditions, essential for autonomous driving systems’ development.
The footage comes from high-resolution driving recorders, with resolutions of at least 1920 x 1080 and frame rates above 33 fps, ensuring crisp imagery and fluid motion capture. It includes bounding boxes and tags for more than 10 typical urban object categories, including humans, cars, electric bicycles, vans, trucks, and more, providing a rich training ground for AI to recognize and respond to the elements of sunlit urban environments.
Source: maadaa.ai - Sunny Day City Road Dash Cam Video Dataset
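To make the annotation format concrete, here is a sketch of filtering per-frame bounding boxes by category. The JSON layout (frame id, per-object category label, [x, y, w, h] box) is a hypothetical stand-in for illustration; the delivered format may differ.

```python
import json

# Minimal sketch: count labeled urban objects per frame.
# The schema (frame_id, objects, category, bbox) is hypothetical.
with open("annotations.json") as f:
    frames = json.load(f)

URBAN_CLASSES = {"human", "car", "electric bicycle", "van", "truck"}

for frame in frames:
    boxes = [obj for obj in frame["objects"] if obj["category"] in URBAN_CLASSES]
    print(frame["frame_id"], "->", len(boxes), "labeled objects")
```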
4. Fashion & E-commerce Open Dataset
This open dataset supports a wide range of use cases, from object detection and segmentation to pose estimation and beyond. It provides rich context for applications such as personalized recommendations, virtual fittings, beauty AI, and product recognition.
Source: maadaa.ai - Fashion & E-commerce Open Dataset
Citations:
- https://www.macrumors.com/2024/04/24/apple-ai-open-source-models/
- https://analyticsindiamag.com/doctors-use-apple-vision-pro-to-enhance-shoulder-arthroscopy-surgery/
- https://www.sensetime.com/en/news-detail/51167731