1. OpenAI Unveils “Voice Engine”: A Leap into the Future of Voice Cloning Amid Ethical Debates
OpenAI has recently introduced a voice cloning technology called “Voice Engine”. The technology has generated widespread interest due to its advanced capabilities and the ethical discussions it has sparked. Remarkably, it can create highly natural-sounding speech that reflects the voice of a specific individual from as little as a 15-second audio sample. Although the potential uses for this technology are vast and varied, OpenAI has chosen not to make it widely available at this time. This decision stems from concerns about the potential for abuse and the dangers of creating speech that closely mimics the voices of real people, especially in critical situations such as elections.
2. Microsoft and OpenAI Announce “Stargate” Initiative: A $100 Billion Investment in the Future of AI Supercomputing
OpenAI and Microsoft are joining forces on an unprecedented $100 billion venture to create “Stargate,” a supercomputer poised to redefine AI development. Slated for a 2028 launch, Stargate is the culmination of a five-phase plan and aims to host millions of AI chips in a U.S.-based data center. This colossal project, expected to consume power on par with several large data centers, represents a significant leap forward, with Microsoft backing the ambitious endeavor financially. Get ready for a future where AI’s potential is unleashed like never before.
3. Meta to Enhance Ray-Ban Smart Glasses with AI Features
Meta is launching AI features for its Ray-Ban smart glasses next month, enabling voice-activated translation, identification of objects, animals, and monuments, and information about the user’s surroundings. Users activate the assistant by saying “Hey Meta,” but the glasses have limitations, such as difficulty identifying distant or exotic subjects. Initially supporting five languages, the update marks Meta’s push to infuse AI into its offerings. Priced at $299.99, the glasses also take photos, record video, and play music, enhancing the user experience with real-time AI assistance.
4. xAI and Elon Musk Unveil Grok-1.5
xAI just announced Grok-1.5, the successor to its open-source Grok-1 large language model, boasting improved reasoning capabilities and a 128,000-token context length.
Grok-1.5 has notable improvements in coding and math, achieving high scores on benchmarks like MATH (50.6%), GSM8K (90%), and HumanEval (74.1%).
The model will soon be available to early testers and existing Grok users on X (formerly Twitter).
5. OpenAI Unleashes Sora’s Creative Potential as Filmmakers Craft Surreal Short Films
OpenAI has given select filmmakers early access to its new text-to-video AI model Sora. The filmmakers used Sora to create imaginative short films, praising its ability to generate high-quality, surreal visuals that bring impossible ideas to life. Notable creations include a film about a balloon-headed man, an alternate reality nature documentary, and experimental videos blending retro aesthetics with strange imagery. The filmmakers refined Sora’s outputs with clever prompting and some post-production. While demonstrating AI’s creative potential, the Sora films also raise questions about its future impact on the film industry.
To find out more, see https://openai.com/blog/sora-first-impressions
Recommended Datasets:
1. Multi-modal Generative AI Large Datasets — Licensed
Multi-modal large language models (MLLMs), known for their ability to understand and generate content across various data types, have garnered widespread interest from both the research community and the tech industry. maadaa.ai’s large dataset is developed specifically for state-of-the-art multi-modal large language models and includes various structured datasets such as image-text pairs, video-text pairs, and e-books in Markdown. Following the rules of international copyright authorization, this dataset brings authenticity and diversity to the training of Generative AI models, propelling them towards greater accuracy and innovation.
Product Highlights:
- Over 300 Million Image-Text pairs: covers an extensive range of high-resolution, professionally shot images, including humans, animals, scenes, photography, and vector images.
- More than 6 Million Video-Text pairs: provides rich text descriptions of characters, scenes, relationships, actions, etc.
- More than 2 million e-books and 15,000 journals: enriching the dataset with literary and academic depth.
- Genuine Media Reporting Data: Incorporating text data from major domestic media outlets ensures the inclusion of current and relevant content.
Product Statistics
[Figure: Image-Text Pairs Statistics]
[Figure: Video-Text Pairs Statistics]
2. Large-Scale Professional Domain Corpus Dataset — Chinese
Data Type:
Multi-modal corpus, markdown format, with embedded images
Data Collection Method:
Licensed or license-free e-books
Key Features:
120M Electronic Documents
2 PB of finely structured data
Most popular e-book formats
Hundreds of professional domains
Comprehensive Format Support: covers most popular e-book formats, such as PDF, EPUB, MOBI, AZW/AZW3, and DjVu.
Advanced OCR Engine for Formulas: equations and multiline formulas in PDFs are converted into LaTeX text with high accuracy.
Precise Layout Reproduction: Ensures the original formatting of PDFs is preserved, including text arrangement, headings, and diagrams.
Application Scenarios:
Generative AI-enabled search engine, chatbot, professional Q&A, professional assistants, domain-specific content generation, etc.
Citations:
- https://economictimes.indiatimes.com/tech/technology/openai-unveils-voice-cloning-tool-voice-engine-all-you-need-to-know/articleshow/108892414.cms
- https://www.reuters.com/technology/microsoft-openai-planning-100-billion-data-center-project-information-reports-2024-03-29/
- https://www.ray-ban.com/usa/electronics/RW4006ray-ban%20%7C%20meta%20wayfarer-black/8056597982788
- https://www.livemint.com/ai/artificial-intelligence/elon-musk-announces-launch-of-grok-1-5-ai-chatbot-on-x-next-week-how-will-it-change-social-media-11711731077996.html
- https://arstechnica.com/ai/2024/03/openai-shows-off-sora-ai-video-generator-to-hollywood-execs/