(maadaa AI News Weekly: July 23 ~ July 29)
1. OpenAI Launches SearchGPT: A New AI-Powered Search Engine to Challenge Google
News:
OpenAI has unveiled SearchGPT, an AI-driven search engine prototype designed to deliver timely and accurate answers with clear source attribution. Currently in testing with a limited user base, SearchGPT aims to enhance the search experience by integrating real-time information and conversational capabilities, positioning itself as a competitor to Google.
Key Points:
- Prototype Launch: SearchGPT is currently a prototype available to 10,000 test users.
- Real-Time Information: The search engine will provide answers with direct citations, improving the reliability of information.
- Publisher Partnerships: OpenAI collaborates with established news organizations to enhance content quality and user experience.
Why It Matters:
This development is significant as it enhances OpenAI’s training dataset by incorporating high-quality, real-time information from trusted publishers. This strategy not only improves the accuracy and relevance of search results but also fosters a more informed user experience, potentially reshaping the landscape of online search.
2. Runway AI Accused of Utilizing Pirated Content in Training Text-to-Video Generator
News:
Runway, an AI startup, trained its text-to-video generator on thousands of YouTube videos and pirated films, according to a report. The training dataset obtained by 404 Media includes links to channels belonging to major entertainment companies, popular creators, and news outlets.
Key Points:
- According to the report, Runway used a massive web crawler to download videos from various YouTube channels, including those owned by Netflix, Disney, Nintendo, Rockstar Games, and prominent creators like MKBHD and Linus Tech Tips.
- The dataset also contains links to piracy sites like KissCartoon, which allows users to watch animated content for free.
- It’s unclear if Runway used all the videos in the dataset to train its latest model, Gen-3 Alpha, which can “create videos in any style you can imagine.”
Why It Matters:
The news highlights how AI companies like Runway are expanding their training datasets to include a wide range of online content, including potentially copyrighted material. This approach can enhance the capabilities of AI models, but it also raises questions about the ethical and legal implications of such practices. The extensive dataset used by Runway suggests that the company is striving to develop a highly versatile and capable video generation tool, which could have significant implications for the future of content creation and distribution.
3. DeepMind’s AlphaProof AI Earns Silver Medal at International Mathematical Olympiad
News:
An AI system called AlphaProof, developed by Google DeepMind, has achieved a silver-medal-level score at the prestigious International Mathematical Olympiad (IMO), the first time an AI has reached podium standard in this competition for young mathematicians. Working alongside DeepMind's AlphaGeometry 2, it solved four of the six problems, spanning algebra, number theory, and geometry; the two combinatorics problems went unsolved.
Key Points:
- AlphaProof scored 28 out of 42 points, just one point shy of the gold medal threshold.
- The AI system uses a reinforcement learning approach, but the natural-language IMO problems first had to be translated into a formal language (Lean) that it could work with.
- While AlphaProof works slowly compared to human contestants, it is able to produce verified solutions that can be easily checked.
- The success of AlphaProof represents a significant milestone in the field of AI and mathematics, and could inspire more teams to enter the AI Mathematical Olympiad (AIMO) prize competition.
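To make "verified solutions that can be easily checked" concrete, here is a toy statement written in Lean 4, the formal language DeepMind says AlphaProof works in. It is not an IMO problem and not AlphaProof output; the theorem name is made up for illustration. The point is only that once a claim is stated formally, the proof checker either accepts the proof or rejects it.

```lean
-- Toy example only: a formally stated claim about natural numbers.
-- The name `toy_add_comm` is hypothetical; `Nat.add_comm` is a core library lemma.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If the proof term were wrong, Lean would refuse to compile the file, which is what makes machine-generated proofs cheap to verify.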
Why It Matters:
The success of AlphaProof in the IMO competition demonstrates the rapidly improving capabilities of AI systems in mathematics. DeepMind’s achievement could lead to further advancements in AI-powered mathematical problem-solving with important implications for scientific research, engineering, and technology development.
4. MIT Researchers Revolutionize Home Robot Training with Realistic iPhone-Powered Simulations
News:
Researchers at MIT CSAIL have developed a novel method for training home robots in simulation using iPhone scans of real home environments. This approach allows robots to practice millions of tasks in virtual homes, improving their adaptability to dynamic real-world settings.
Key Points:
- Robots struggle to operate in unstructured home environments due to the diversity of layouts, surfaces, and obstacles.
- Simulation-based training enables robots to practice tasks thousands or millions of times, without the risk and cost of real-world failures.
- Using iPhone scans to create accurate virtual home environments enhances the simulation’s ability to mimic real-world conditions.
- A robust database of diverse home environments makes the robots more adaptable to changes, such as furniture rearrangement or the presence of unexpected objects.
Why It Matters:
The news is significant because it demonstrates how leveraging accessible scanning technology can dramatically improve the training and performance of home robots. By creating highly realistic simulation environments, researchers can bridge the gap between virtual and physical worlds, enabling robots to learn and adapt more effectively to the complexities of real homes. This approach has the potential to accelerate the deployment of capable, versatile home robots that can assist with a wide range of everyday tasks.
Additional News:
1. Anthropic's ClaudeBot allegedly violated iFixit's anti-scraping policies by scraping the site. iFixit claims ClaudeBot hit its servers almost a million times in 24 hours, causing issues. Anthropic says its crawler respects robots.txt, the file that lets website owners opt out of crawling.
2. The U.S. Senate has unanimously passed the DEFIANCE Act to combat nonconsensual deepfake pornography, allowing victims to sue creators and distributors for up to $150,000 in damages.
3. Apple Music in iOS 18 will have a new feature to create AI-generated artwork for playlists using the “Image Playground” image generator.
4. OpenAI could face a $5 billion loss and might run out of cash within the next 12 months due to high costs for AI training and staffing.
5. Google is now the only search engine that can surface recent results from Reddit. Other search engines won't show results from the last week when you search Reddit using "site:reddit.com."
6. Meta has introduced the “Imagine Me” tool for Meta AI, which can turn your selfies and prompts into AI-generated self-images or portraits. Just add a prompt like “Imagine me as a hero” or “Imagine me in a surrealist painting” and the tool will do the rest.
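On the ClaudeBot item above: a rough sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `build_parser` helper are all hypothetical; this is not iFixit's actual file or Anthropic's crawler logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler checks each URL against the rules for its own user agent.
allowed_public = parser.can_fetch("ClaudeBot", "https://example.com/guide/1")
allowed_private = parser.can_fetch("ClaudeBot", "https://example.com/private/x")

# Crawl-delay asks the bot to wait this many seconds between requests.
delay = parser.crawl_delay("ClaudeBot")

print(allowed_public, allowed_private, delay)
```

Note that robots.txt is advisory: it only works if the crawler chooses to honor it, which is why rate-limit directives such as Crawl-delay matter in disputes like this one.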
maadaa.ai Shared Open and Commercial Datasets:
Open Dataset 1: RGB-D Object Dataset
The RGB-D Object Dataset introduces a new dimension for training multimodal generative AI models, incorporating depth data to provide 3D representations of scenes. This enables AI models to identify objects based on color and also understand their shape, size, and precise location. The dataset features meticulously calibrated RGB and depth channels and captures each object from multiple viewpoints, opening up new possibilities for tasks like 3D object recognition, precise object localization, and robotic grasping and manipulation.
URL: https://rgbd-dataset.cs.washington.edu/publications.html
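For readers wondering how a depth channel turns into "precise location": a minimal sketch of the standard pinhole back-projection, which maps a pixel plus its depth reading to a 3D point in the camera frame. The intrinsics below (focal lengths and principal point) are placeholder values for a 640 x 480 sensor, not the dataset's published calibration, and the function name is our own.

```python
from typing import Tuple

# Hypothetical camera intrinsics in pixels; substitute the real calibration.
FX, FY = 525.0, 525.0   # focal lengths
CX, CY = 320.0, 240.0   # principal point (image center)


def backproject(u: float, v: float, depth_m: float,
                fx: float = FX, fy: float = FY,
                cx: float = CX, cy: float = CY) -> Tuple[float, float, float]:
    """Map pixel (u, v) with depth in meters to a camera-frame point (x, y, z)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# A pixel at the image center lies on the optical axis.
center_point = backproject(320.0, 240.0, 1.5)

# A pixel 105 columns right of center, one meter away.
offset_point = backproject(425.0, 240.0, 1.0)

print(center_point, offset_point)
```

Applying this to every pixel of an aligned RGB and depth pair yields a colored point cloud, the kind of 3D representation the dataset description refers to.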
Open Dataset 2: Flickr30k
Flickr30k Entities is a valuable dataset with 31,000 curated images from Flickr, each accompanied by 5 reference sentences and bounding box annotations. It enables AI models to understand and annotate individual objects accurately by integrating textual and visual annotations, allowing for improved image descriptions and object identification within scenes.
URL: https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset
Commercial Dataset 1: Facial 17 Parts Segmentation Dataset
The “Facial 17 Parts Segmentation Dataset” is specifically compiled for the visual entertainment industry, featuring a range of internet-collected facial images with resolutions exceeding 1024 x 682 pixels. This dataset is dedicated to semantic segmentation, delineating 17 facial categories such as eyebrows, lips, eye pupils, and more. It also includes a selection of portrait images with occlusions, adding complexity and diversity to the dataset for more realistic application scenarios.
URL: https://maadaa.ai/datasets/DatasetsDetail/Facial-17-Parts-Segmentation-Dataset
Commercial Dataset 2: Medical Speech Dataset
Data Type: Audio
Volume: About 160k
Annotation Notes: 160 people, male-to-female ratio 1:1; medical terms command words, common words, difficult words, multi-disciplinary diagnosis results.
Application Scenarios: Medical Voice
URL: https://maadaa.ai/datasets/DatasetsDetail/Medical-Speech-Dataset