[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles
Summary
TLDR: This episode of ML News covers a range of significant developments in AI and machine learning over the past two weeks. Highlights include Google's release of Gemma, a set of open models named after Gemini with impressive performance metrics, and the controversy surrounding Gemini's image-generation biases. Additionally, Groq's new hardware for serving language models rapidly, Nvidia's supercomputer Eos, and the emergence of AI-generated images in peer-reviewed journals are discussed. Other topics include the EU's AI Act, OpenAI's video generation model Sora, and various innovations in AI applications, from image generation to the legal implications of chatbot errors. The episode also touches on recent AI research, including a technique called ring attention, and the potential impact of AI on various industries.
Takeaways
- 😀 Google released Gemma models - smaller, more efficient language models that outperform comparable models
- 🤖 Groq built a fast specialized card for serving language models, enabling new use cases
- 📈 Nvidia unveiled Eos, a supercomputer with 18.4 exaflops of FP8 performance to power AI
- 🎥 Sora by OpenAI can generate convincing 1-minute video clips
- 📚 Gemini 1.5 Pro shows strong performance across its 1 million token context
- 👀 A peer-reviewed paper included AI-generated nonsensical images
- 📜 The EU's AI Act categorizes applications into risk levels with regulations
- 🌍 Cohere released Aya, an open model covering 101 languages globally
- 🖼 Stability AI announced Stable Diffusion 3 for improved image generation
- 😊 Eyelash application robot uses CV and robotics to precisely apply lashes
Q & A
What new large language models did Google recently release?
-Google released Gemma, which are open models with 2 billion and 7 billion parameters. They outperform comparable LLMs like Llama 2 and are available for some commercial use.
What hardware development allows for faster language model inference?
-A company called Groq built a new card optimized for language models that allows over 500 tokens per second on a 7 billion parameter model, enabling new use cases.
What does Demis Hassabis say is needed in addition to scale to reach AGI?
-Demis believes you need several more innovations in addition to maximum scale to reach AGI, as scaling alone will not lead to new capabilities like planning, tool use, and agent-like behavior.
What does the EU's new AI law regulate?
-The EU AI Act categorizes applications into risk levels and ties requirements to those risk levels. The highest risk category, called unacceptable risk, bans certain uses of AI like inferring sensitive characteristics from biometric data.
What multilingual language model did Cohere release?
-Cohere launched Aya, an open source 7 billion parameter model covering 101 languages, along with a large accompanying multilingual dataset.
What new advancement in text-to-image models did Stability AI announce?
-Stability AI announced Stable Diffusion 3, which uses a diffusion transformer architecture for improved performance in areas like multi-prompt image generation and spelling ability.
What data licensing deal did Reddit make?
-Ahead of its IPO, Reddit signed a $60 million annual content licensing deal with an unnamed large AI company to make use of data from Reddit posts.
How could AI help visually impaired people?
-Robot guide dogs built with computer vision and other sensors to help with navigation and safely getting from point A to B could help address shortages in availability of service animals.
What new way to interact with computers does OS co-pilot explore?
-The OS co-pilot paper looks at an AI agent that can interact with a computer OS via natural language to open apps, fill out forms etc to behave more like an assistant.
What product is Apple developing to rival Github Copilot?
-Apple is reportedly developing AI auto-complete features inside Xcode, its iOS/Mac development environment, to compete with Github's Copilot coding assistant.
Outlines
🤖 Google releases smaller AI models Gemma
Google has released Gemma, smaller 2 billion and 7 billion parameter AI models that outperform equivalent Llama 2 models. They are openly accessible under limited commercial use. Google likely released these models as a marketing ploy to regain industry leadership.
💻 Groq AI hardware processes language models lightning fast
Startup Groq built a new hardware card optimized for natural language processing that achieves extremely high throughput on large models, but has limited onboard memory, requiring hundreds of cards to serve a single large model.
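The "hundreds of cards" claim can be sanity-checked with a quick back-of-envelope sketch. The ~230 MB of SRAM per chip and 8-bit weights used here are illustrative assumptions, not figures from the episode:

```python
import math

def cards_needed(n_params, bytes_per_param, sram_bytes_per_card):
    """Cards required just to hold the model weights; KV cache and
    activations (ignored here) push the real number higher."""
    return math.ceil(n_params * bytes_per_param / sram_bytes_per_card)

# A Llama 70B at 8-bit weights on ~230 MB of SRAM per chip
# (both figures are assumptions for illustration)
cards = cards_needed(70e9, 1, 230e6)  # → 305, i.e. hundreds of cards
```

The same arithmetic explains why a single GPU with tens of gigabytes of HBM can hold a model that takes racks of SRAM-based cards: the trade-off is capacity for bandwidth.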
🤔 Demis Hassabis says scale alone won't lead to AGI
DeepMind CEO Demis Hassabis believes scale is important but other innovations will be needed to achieve artificial general intelligence capabilities like planning and tool use.
😨 AI safety experts warn AI could destroy humanity soon
AI safety experts like Eliezer Yudkowsky continue making dramatic warnings about AI, but provide little concrete evidence, instead using rhetorical devices.
🎥 OpenAI's video generator Sora creates high-quality clips
OpenAI demonstrates Sora, their video generation model that can create high-quality 1-minute clips and manipulate video based on text prompts, but access remains extremely limited.
📄 Research article uses AI-generated images without disclosure
A peer-reviewed research article included AI-generated images without proper disclosure. Reviewers raised concerns but editors failed to enforce fixes before publication.
👍 Cohere releases massive 101 language AI model and dataset
AI startup Cohere publicly released Aya, a large multilingual language model trained on a new dataset spanning 101 languages, to advance global AI research.
🔒 Reddit signs big money AI content licensing deal
Ahead of its IPO, Reddit reportedly signed a ~$60 million annual deal to license content to an unnamed AI company, after restricting open data access.
🐕 AI-powered robot dogs to aid visually impaired people
Four-legged robots equipped with computer vision and other sensors could serve as lower-cost guidance aids for blind people given shortages of service animals.
Mindmap
Keywords
💡Gemma
💡Bias correction
💡LPU
💡EOS
💡Sora
💡AGI
💡Peer review
💡AI Act
💡AI-generated content
💡AI ethics
Highlights
Google released Gemma, smaller and more efficient open language models
Groq built a card that can serve language models extremely fast, over 500 tokens/sec
Nvidia unveiled a supercomputer with over 4,600 GPUs, #9 in world top 500 ranking
Sora by OpenAI can create impressive 1-minute video clips and manipulate existing videos
Gemini 1.5 Pro shows strong performance taking an entire movie into context and summarizing
EU AI Act categorizes AI risk, bans certain uses like inferring sexual orientation
Cohere released Aya, an open multilingual LM and dataset for 101 languages
Stability AI announced Stable Diffusion 3 for better image quality and spelling
Reddit signed $60M AI content licensing deal, restricting API access
Seeing eye robot dogs are shaping up to help visually impaired people
New OS co-pilot paper shows interacting with computer via agent prompts
Report suggests Apple plans AI coding features to rival GitHub Copilot
Anthropic introduced prompt filtering for election integrity
Robot uses CV and robotics to precisely apply fake eyelashes
Critics raise concerns about proximity to eyes and allergic reactions
Transcripts
hello everyone welcome to ml news we're
going to take a quick brief tiny look at
what happened in the last 2 weeks in the
world of machine learning AI and I guess
encompassing pretty much everything
nowadays so first I've already mentioned
this in a previous video a little bit
but Google released Gemma Gemma is I
guess a variant of the name Gemini these
are open models that are in smaller
sizes than kind of the largest language
models so they are 2 billion parameter
models and 7 billion parameter models
these outperform respective llama 2
models at the same sizes and even at a
little bit bigger sizes so they are
quite performant they are available
they're openly accessible you can use
them under some limited circumstances
for commercial activity and they do
release a technical report on how they
have built these now these these are not
the same as like Gemini 1.5 with the 1
million token context length these I
believe their context size is about
8,000 tokens so they are in architecture
quite similar to llama models as I said
previously this is essentially I believe
a marketing Ploy from Google right here
releasing these models they're already
topping the leaderboards so all in all
very good development I think although
all of this has been overshadowed last
week and if you've seen my last video
you will know by people discovering that
Gemini image generation is a bit wacky
when it comes to sort of bias correction
and representation of people so straight
up refusing to generate any white people
or any images of white people and and
things like that one interesting
development again watch that video If
you haven't seen it but one interesting
development here is that the person
these product lead from Google who
essentially came out and said oh we're
sorry we were made aware of some
historical inaccuracies we will fix
those has made their uh X account
private apparently or Twitter whatever
has made that account private now it's
totally conceivable that douchebags have
started sort of harassing the person or
just kind of bombing spamming and so on
so that's totally fine but it it was not
a good look like it was gaslighting in
the highest degree like oh yeah let's
just say the problem is historical
inaccuracies and not like the obvious
problem that was barely visible with the
thing so I just thought that was an
interesting development again if you
want to know more watch that video I
also found the development around this a
little bit funny so day later apparently
they refused to just generate images of
people like just Gemini saying I cannot
generate images of people that was their
hot fix their patch to say well no
images of people at all until we fix
this problem here then saying if you
then said I've seen you produce images
of people it has answers it is important
to clarify that I have never been able
to generate images directly so I'm not
sure it would be interesting to know
what the exact prompting behind this was
and the changes being done here also not
to say that this is true right this is
just an LLM doing its LLM thing but I do
find it interesting this new world uh
where software patches are essentially
sort of prompt
changes and then the interactions with
those just make for hilarious content
all right Groq is all the rage now Groq is
a company that is as far as I know
spun off from Google's TPU group if I'm
informed correctly and they have built a
card that can serve language models
really really really quickly so make
long novel so this is Mixtral 8x7B
and see it runs at like 532 tokens a
second so this is insane speed this
allows for new use cases to be
accessible uh by these language models
and very very cool so this is really
special Hardware I'm sure there are some
software tricks and algorithm tricks but
this is special hardware yeah people
talking about this insane insane
speed of Groq Groq says they have this LPU
this Language Processing Unit so
that's not a GPU it's an LPU and the
difference to something like an Nvidia
GPU is that they have a different kind
of memory so here you can see um latency
and throughput this is a benchmark from
a third party and they had to extend
their their axes here in order just to
accommodate how fast and how much
throughput grock has and how fast it is
it's pretty pretty insane however there
is a trade-off as I said they use
different memory than a a regular GPU
would use and therefore that makes it
such that each card only has a very
small amount of memory so you need a lot
of these cards in parallel to even serve
one of these big models now you can
achieve massive throughput obviously
economies of scale kicking but you can't
just get one of those cards and then
serve a large model and that's where
people quickly realized hey okay it
might not be the wonder weapon here it
is very cool but each chip only has
about 200 megabyte of SRAM and therefore
you would need I don't know hundreds of
cards in order just to serve this Mixtral
model that we've seen before again with
the higher throughput it might be
totally worth it if you're a data center
owner but throughput over time you see
the the graph on the top right here
that's Groq
that's pretty insane um people calculate
that you need about 320 of these cards
or two full racks to just serve a single
llama 70b and if you calculate the cost
of these cards then that'd be about $10
million US it's not the end-all that's
going to solve everything but it is
definitely definitely very cool
development in order to push language
model inference ahead at the same time
Nvidia unveils Eos TechPowerUp writes
this is essentially pulling together a
bunch of their djx systems to create a
super duper computer so
576 dgx h100 systems wired together into
one computer each of these dgx system
has 8 h100s making for a whopping
4,608 h100 gpus note each of these
puppies will cost you I don't know what
they cost right now like 20K or
something like this or or or north of
that this is massive it's ranked number
nine in top 500 supercomputers of the
world with a staggering staggering 18.4
exaflops of FP8 performance this website
here I found pretty cool gpulist.ai
it's by Andromeda AI and it's
essentially Craigslist but for gpus
people rent out their GPU capacity it's
also as shady as Craigslist right so
it's just a listing
and it just says well you'll get bare
metal access and sometimes it says okay
you get uh SSH access to it or something
like this but essentially just allows
you to contact these people and then
make out a deal of how you're going to
use these gpus this seems fairly large
so the common posting here actually
there's a lot of h100s here going around
I'm not sure where people get these from
but sometimes oh there's
849
these Okay this may be more common so um
Canada ethernet 1 H100 available it's an
Ubuntu VM and you get minimum one week so
if you want some gpus and you don't have
super confidential data because you are
going to use other people's Hardware
this might be a good option to find some
some good deals Wired has an interview with
Demis Hassabis on how far you get with scale
apparently I guess just on the future of
AI and if you read the interview it's
kind of a mix between yeah scale is
great we can do scale scale is awesome
Gemini is awesome these models are
awesome and but also scale only gets you
so far there needs to be something else
so Demis says my belief is to get
to AGI you're going to need probably
several more Innovations as well as the
maximum scale there's no letup in the
scaling we're not seeing an asymptote
yada yada yada there's still gains to be made
so my view is you've got to push the
existing techniques to see how far they
go but you're not going to get new
capabilities like planning or tool use
or agent-like behavior just by scaling
techniques it's not magically going to
happen it's very interesting because I
think that is a current contention like
it's very easy to say oh to get to AGI
you need something else because first of
all AGI isn't a defined term and
something else isn't a defined term so
you can redefine these two terms as you
wish and then you can always find
something that's still wrong or in a way
in which you're still correct if you do
that for long enough your name will
become Gary Marcus but other than that
these are fairly more concise
predictions saying okay you're not going
to get planning or tool use or
agent-like behavior they're not super
defined but they are and we're already
seeing tool use for example being built
into these large language models and get
better with scaling so it will be very
interesting to see whether Demis turns
out to be ultimately correct on his
predictions or whether one or the other
of these things will be available just
by scaling language model and kind of
training them on tool use data and so on
Tom's Hardware writes legendary chip
architect Jim Keller responds to Sam
Altman's plan to raise $7 trillion to
make AI chips saying I can do it for less
than 1
trillion we've gone off the rails so Jim
Keller apparently legendary CPU
developer now working at the company
that makes chips themselves he claims
that he could do it for a lot less yes I
guess um I don't know as soon as you go
into like money that's Way Beyond the
current total market value of chips I
feel many claims can be made it will be
interesting going forward to see kind of
who takes the lead in chip development
how that's going to be playing out in
any case I'm not sure if bickering about
1 trillion 2 trillion 7 trillion is
going to make make that big of a
difference from one legendary person to
another legendary person and legendary
here spelled with a capital l AI May
destroy humankind in just 2 years
expert says of course Eliezer Yudkowsky saying
if you put me against a wall and force me to
put probabilities on things I have a
sense that our current timeline looks
more like 5 years than 50 years could be
2 years could be 10 well could be
anything uh like with the trillions it
is absolutely useless to make these
speculations and then I don't know
saying things about a Terminator-like
apocalypse and Matrix hellscape the
difficulty is people do not realize we
have a shred of a chance that Humanity
survives oh yes of course of course Yudkowsky
has I think retracted statements on
bombing data
centers like that would that would
be useful in any case read this as you
would read like a comic book for
entertainment yeah I feel like that it's
at least makes you giggle otherwise this
uh serves no purpose at all Sora
continues to dominate headlines a video
generation model by open AI we've talked
a little bit about this in the last news
episode but this can create uh Clips
single shot Clips up to 1 minute I
believe and they look pretty pretty
awesome I have to say and more and more
kind of examples come out of Sora
creating pictures creating Clips open AI
marketing department in full gear no you
don't have access to this model yet a
select few have access to this model not
you you are just a pleb you're not the
The Chosen person so Marvel at other
people using the cool thing and uh open
AI marketing department having tight
control over exactly which things go out
to the public and which things don't
what is interesting is here's an example
of Sora scaling with compute so
essentially saying the more compute they
throw into one of these Generations the
better better quote unquote the more uh
realistic I guess it gets so base
compute 4x however like they've also
completely stopped to give us any sense
of the scale the absolute scale of
things so for now it's just like base
compute however much that is and then 4X
compute and then 16x compute yeah in any
case what we can infer from that is that
there is an iterative process very
probably to determine one of these
samples so it's not like single forward
pass of anything but iterative iterative
process like you would be used to from
diffusion doing many many steps across
the span of time to refine and refine
and refine the output what's also pretty
cool are demonstrations of changing
things like this being a base video and
not only can sort of generate things but
also kind of generate things according
in to some input like some input video
so in case here people changing the uh
surroundings of the car or the car
itself like the vibe of the video and so
on while keeping the general motion I
guess and the general concept clear so I
think that's pretty cool make it go
underwater yeah why not look at that or
that nice Rainbow Road keep the video
the same but make it winter animation
style charcoal drawing yeah should be
black and white not exactly but close
maybe it's one of those things where uh
it's actually black and white but your
eyes trick you but I I think I'm seeing
color
wait yeah no this is definitely color
uh actually not so sure now okay no this
definitely red this is definitely red
the the backlight yeah it's not fully
black and white cyber Punk medieval very
nice they had drones following carts in
medieval times also the horse legs they
they
look yeah why not
dinosaurs pixel art so many cool things
about Sora keep coming out and also many
cool things about Gemini 1.5 Pro keep
coming out especially obviously the
insanely large context size of Gemini
1.5 Pro people feeding very long things
inside of it and see whether it can
handle the long context an entire code
base and then instructing it to code
something based on top of that I think
this is probably going to be one of the
best applications for something like
this if you have very long yeah
something like a code base or a
reference documentation or something
like this like the important parts of
that would fit into a million tokens and
being able to sort of cross reference
things inside of that and then generate
based on that is probably a very good
use case I know they can retrieve well
across the 1 million tokens kind of like
point to individual things if they need
to retrieve them but it will still be
interesting to research how performant
it actually is when you put more and
more and more stuff into that context my
personal estimate would still be that
putting less things into the context is
more beneficial will make stuff more
accurate or what I can also Imagine is
that they trained it in such a way that
they could have achieved better
performance on small context compared to
large context but they traded it off to
have sort of equally performing but
worse performance across the entire
context length not yet clear but it will
be interesting to see this pretty cool
uh feeding an entire short movie
into this so what Gemini will do is it
will take the movie split it into frames
and then essentially use the frames as
tokens or tokenize the frames and you
can fit pretty long you can see here 44
minutes and 7 Seconds video you can fit
that into the context size of Gemini 1.5
Pro because it can also consume images
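The arithmetic behind fitting a 44-minute film into a 1-million-token context works out if you assume roughly one sampled frame per second and a few hundred tokens per frame. The 258 tokens/frame figure below is an illustrative assumption, not a confirmed number:

```python
def video_tokens(duration_s, fps=1.0, tokens_per_frame=258):
    """Assumed sampling: 1 frame per second, ~258 tokens per frame
    (illustrative figures for a back-of-envelope estimate)."""
    return int(duration_s * fps * tokens_per_frame)

movie = video_tokens(44 * 60 + 7)  # the 44:07 movie → 682926 tokens
```

Under these assumptions the whole film consumes under 700k tokens, comfortably inside the 1M-token window with room left for the prompt and the generated summary.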
and Matt Schumer here says went straight
from full movie to a summary in seconds
no transcription no intermediate steps
just visual tokens to summary now I've
seen other people have pointed out that
the summaries it makes aren't always
super duper accurate or well done but
it's still pretty impressive and it
speaks to what I said before right the
main question is going to be what are
the Dynamics and the characteristics of
performance across this entire context
window and currently you see Berkeley
coming out with a paper that's titled
World Model on Million-Length Video and
Language with Ring Attention this is an
actual research paper that is very
concurrent as I said to Gemini 1.5 Pro
doing retrieval experiments across very
long context with what's called ring
attention if you're interested we can
make an entire video on ring attention
that is in the makings so keep looking
for that but it's a cool new technique
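The core accumulation that ring attention distributes, blockwise attention with an online softmax so no device ever materializes the full attention matrix, can be sketched in a single process. This is a simplification of the idea, not the paper's implementation:

```python
import numpy as np

def ring_attention(q, kv_blocks):
    """Blockwise attention with an online softmax: process one KV shard
    at a time, keeping only running statistics (this is the accumulation
    that ring attention rotates around a ring of devices)."""
    m = np.full(q.shape[0], -np.inf)          # running row max of scores
    l = np.zeros(q.shape[0])                  # running softmax denominator
    acc = np.zeros((q.shape[0], kv_blocks[0][1].shape[1]))
    for k, v in kv_blocks:
        s = q @ k.T                           # scores against this shard
        m_new = np.maximum(m, s.max(axis=1))
        scale = np.exp(m - m_new)             # rescale old statistics
        p = np.exp(s - m_new[:, None])
        acc = acc * scale[:, None] + p @ v
        l = l * scale + p.sum(axis=1)
        m = m_new
    return acc / l[:, None]
```

In this single-process form the result matches full softmax attention exactly; the distributed engineering challenge is rotating KV shards between devices while overlapping that communication with compute.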
it is going to be some sort of
approximation to attention like it's not
the fact that we can now scale the
classic Transformers across this huge 1
million token context size so it's a
trick people have come up with many many
different tricks of sort of doing long
attention and this one seems quite
promising Phil Wang also known as
lucidrains already has an implementation of
ring attention up even though the paper
is super duper new yeah what's
interesting to see in the readme of
this repository is Phil saying I will be
running out of sponsorship early next
month if you'd like to see that this
project gets completed sponsor me or I
will be leaving the open source scene
for employment so just wanted to bring
this to people's attention by that I
mainly mean people like companies CTV
News Vancouver writes Air Canada's
chatbot gave BC man the wrong
information now the airline has to pay
for the mistake apparently a person went
on the chatbot for Air Canada that's
kind of powered by a large language model
and they went there looking for very
specific questions about what's called
bereavement fares these are reduced
fares provided in the event someone
needs to travel due to the death of an
immediate family member so the chatbot
uh this person had lost a family member
wanted to do air travel due to that and
inquired about you know cheaper rates
or specific fares specific prices for
this situation and the chatbot gave them
wrong information saying that he could
claim those even after the fact and when
he wanted to do that the customer
support people said nope that's not
possible you're not getting your money
now the question is who is responsible
and courts say that in fact Air Canada
is responsible for the things their
chatbot said and has to actually comply
with what the chatbot promised Air
Canada tried to push the
responsibility Air Canada suggests that
the chatbot is a separate legal entity
that is responsible for its own
actions what what oh no the piece of
software that we actively deployed on
our website is a separate legal entity
yeah no no I get that as a lawyer you
will have to argue and in this case it
was probably like the last remaining
thing that you could even conceivably
argue but it's so ridiculous no way
no way if you deploy a piece of software
you are responsible for what that
software does not if you program it not
if you make the open source Library
that's then part of it if you deploy it
and it interacts with your customers and
then it promises stuff to your customers
then you are responsible and that's the
same with every other piece of software
as well there's absolutely no difference
of whether this is an llm chatbot or
anything else that executes code it's
always the same and it's pretty funny
that Air Canada tried to weasel their
way out of this so no Air Canada must
honor refund policy invented by the
airlines chatbot so obviously companies
are trying to alleviate costs on their
customer success operations in this case
they may want to calculate if they don't
incur more costs due to stuff that's
invented by it could be a strategy like
I think the damages here to be paid
were about $600 and then I think about
$20 in tax it says it somewhere in
addition the airline was ordered to pay
uh $36.14 in prejudgment interest
interest not tax and $125 in fees and
probably the lawyers grabbed 10 20 50k
out of the people here so all in all who
succeeded the lawyers lawyers should be
fans
of LLMs like the amount of litigation and
the amount of contracting and so on they
have to do just because people want to
use or have used or mistakenly used llms
is going to be staggering staggering in
any case this was more about the
principle I guess than about the money
but it could be a thought you know as a
like just let an llm do your customer
success and if they promise something
that doesn't exist you just pay it it'll
be like 600 bucks in this case maybe
that's worth it maybe the saving
and not having to hire more people is
totally worth it for these companies I
don't know it'll be interesting to think
of a future I think right now everyone's
trying to guard rail everything because
they feel okay our customer success
operation should continue to be as is
like there is a completely defined
things okay here are the things we pay
here are the things we don't pay and so
on and you know must that cannot promise
anything else and so on what if the
mentality around that changes it will
just be like okay here's a set of
guidelines we know the thing is going to
hallucinate every now and then and when
it does we'll just sort of take it into
account like I feel there are still laws
against customers abusing that like if I
were to go to Air Canada chatbot and
kind of like prompt hack it into giving
me stuff I'm pretty sure a court would
side with Air Canada since that's
essentially kind of me emotionally
abusing a customer support rep until
they promise like give me what I want
but other than that could be totally
viable future and a fun future if if
these things are not so strict Kareem
Carr tweeting out or x-ing out I'm not sure
what it's called finally happened a
peer-reviewed journal article with what appear
to be nonsensical AI generated images so
this has become known as giant rat
balls the pictures they look from afar
like they could be in like a biology
journal but they make no sense right
the writing's mostly rat just
this rat yeah yeah we see that
so this article has pictures that
a bit interesting it's a bit more
interesting than that scientist a gast
at bizarre AI r with huge genit PE
reviewed article this is a fairly
reputable Journal where this is public
this is not just like a pay 5,000 bucks
and you will get published Journal this
is a fairly r beautiful Journal here is
another another picture you see it it
rarely makes sense if you actually look
closely second the images are created by
Midjourney and the authors acknowledge
this in the paper so the authors say the
images are generated by Midjourney
Third there were two reviewers and one
of the reviewers actually brought this
up and apparently also a reviewer I'm
not sure if it's the same reviewer or a
different reviewer said I was only
looking at the scientific content of the
work right and reviewed it based on that
and we have also statement from the
reviewer saying that they did raise
concerns about the images so they're
saying the journal says an investigation
is currently being conducted so this
article in Vice here details our
investigation revealed that one of the
reviewers raised valid concerns about
the figures and requested author
revisions even the reviewer saying okay
the figures
need to be revised the authors failed to
respond to these requests we are
investigating how processes failed to
act on the lack of author compliance
with the reviewer's requirements so it's
a bit more tricky than just oh a bunch
of researchers tried to get a fake paper
through and the reviewers didn't notice
it seems that the ultimate person who I
guess multiple people here always
contribute to the things but ultimately
it was the editors who didn't make sure
that the authors actually changed the
things that the reviewers were asking
them to change that made It ultimately
go through and being printed as a paper
I guess mistakes like this happen it's
also probably very common to just kind
of assume the authors will concur with
what the reviewers request to change I
don't know if they've even said yeah
okay we'll change it or something like
this but it is an interesting story and
the meme of the giant rat balls will
forever live on in our hearts Andrej Karpathy
said that he left OpenAI assuring that
not a result of drama or anything like
this it's just a change in scenery
saying that the last year in open AI was
really great the team is really strong
the people are wonderful the road map is
exciting and we all have a lot to look
forward to he says my immediate plans are to
work on my personal projects and see what
happens and immediately following this
up with a video explanation 2 hours on
tokenizers enlightening the bizarre
world of why reversing strings with llms
is really really difficult and why
different languages give you
different results and so on so if you
want to explore a so far I believe a bit
underexplored aspect of large language
models definitely look into Andrej's
tutorial on tokenizers very cool and with
Andrej's very very clear explanations
you will know a lot more after this
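A toy greedy tokenizer makes the string-reversal point concrete: the model manipulates token IDs, and reversing the token sequence is not the same as reversing the characters. The vocabulary here is made up purely for illustration:

```python
# Toy greedy tokenizer over a made-up vocabulary (purely illustrative,
# not any real model's vocab).
VOCAB = ["hel", "lo", " wor", "ld"]

def tokenize(text, vocab=VOCAB):
    tokens = []
    while text:
        # longest vocab entry matching at the current position
        match = max((t for t in vocab if text.startswith(t)), key=len)
        tokens.append(match)
        text = text[len(match):]
    return tokens

toks = tokenize("hello world")            # ['hel', 'lo', ' wor', 'ld']
token_reversed = "".join(reversed(toks))  # 'ld worlohel'
char_reversed = "hello world"[::-1]       # 'dlrow olleh'
```

An LLM that naively flips its token sequence produces the first string, not the second, which is one reason character-level tasks are surprisingly hard for subword models.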
nature writes what the EU's tough AI law
means for research and ChatGPT the EU AI
Act is the world's first major
legislation on artificial intelligence
and strictly regulates general purpose
models as you know the EU AI act has
been in development for a number of years now and is now finally coming into effect, rolling out, and so on. It's been changed quite a bit over the years; even I am not entirely sure what's in the current version and how much it's still going to change. But the approach is to broadly categorize applications into risk categories and then to tie what you have to do
according to the risk category. The most risky things are called unacceptable risk, and those are just banned, not allowed to do: for example, those that use biometric data to infer sensitive characteristics such as people's sexual orientation. This is banned. Also, what's hilarious: there's a limit for when you have to do something, and that is 10^25 FLOPs, a completely arbitrary number that is going to be meaningless probably even before the AI Act has really rolled out. Finally,
I could not make up worse advice for these policymakers if I wanted to. It's like: okay, let's pick a completely arbitrary number and say, here, here is where we draw the line. I believe you can entirely transparently see the lobbyists being like, okay, what can we do that our competitors can't do, and let's draw a nice line between the two as it is for the next 3 years, and we don't care about anything after that.
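For scale, that 10^25 figure can be sanity-checked with the common back-of-the-envelope estimate that training compute is roughly 6 × parameters × tokens. All the numbers below are illustrative, not official figures for any specific model:

```python
# Back-of-the-envelope training compute using the common approximation
# C ~ 6 * N * D (N = parameters, D = training tokens). All figures are
# illustrative; none are official numbers for any real model.
def train_flops(n_params: float, n_tokens: float) -> float:
    return 6.0 * n_params * n_tokens

EU_THRESHOLD = 1e25  # the AI Act's compute cutoff for extra obligations

for name, n, d in [
    ("7B params, 2T tokens", 7e9, 2e12),
    ("70B params, 15T tokens", 70e9, 15e12),
]:
    c = train_flops(n, d)
    print(f"{name}: ~{c:.1e} FLOPs, above 1e25: {c > EU_THRESHOLD}")
```

By this estimate even fairly large open models land well under the line, which is part of why a fixed FLOP count is a strange place to draw it: hardware and training efficiency shift the meaning of that number every year.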
All right, unacceptable risk: do you realize that a basic linear regression would fall under this? The EU effectively now bans drawing a straight line across a few data points, if those data points happen to coincide with the data categories collected here. This is the level these kinds of laws come to. Yes, I know, I am pulling it to the extreme here; I know this is meant for super duper Transformers informing these things and then sitting in automated systems that make decisions about people's lives, and so on. I see what the fear is, but I doubt whether what they're intending to do matches what the effect of this is going to be, and I still believe the effect is just going to be more monopolization of bigger companies, making it harder for newcomers to enter this market, and giving governments more control over things, which they will probably not do good things with. Just an opinion. Cohere For
AI launches Aya. Aya is an open-source, massively multilingual large language model and a dataset built over 101 different languages from all across the world, and this is one of the largest datasets of instruction data around. As I said, it's a dataset and a large language model all at once: the dataset is available, the model is open access, whatever that means. Right now, I guess, you can download the model, because there's a button that says download the model. I found this on Reddit and found it to be really interesting: regional prompting. This is a UI for a technique (GLIGEN, I believe), and I've linked to the repository; very cool to use
and very exciting, exciting new things that are possible. Aria Everyday Activities is a dataset, again released by Meta, that depicts, as you can see, everyday activities. So this has first-person-view data, location data, and so on.
Meta is actually pushing the metaverse and datasets around that, augmented reality, and so on, so they're collecting a lot of data. They have, as you can see right here, rolling-shutter RGB, a 110° field-of-view camera, 150° field-of-view cameras for SLAM and hand tracking, infrared illumination, barometer, magnetometer, environmental sensors, spatial microphones, and so on, and then annotated data: per-frame eye tracking, 3D trajectories. These datasets are collected to be quite universal, so maybe you don't want to use all of them at the same time, but they enable a lot of different applications, which is very,
very cool. Stability AI announces Stable Diffusion 3, a text-to-image model using a diffusion Transformer architecture for greatly improved performance in multi-subject prompts, image quality, and spelling abilities. They're not releasing anything yet; there is a waitlist for an early preview. They say this is for gathering insights and improving its performance and safety ahead of the open release. We've also come to know from Stability that open release is going to mean that you can use the model for research stuff, but if you want to use it for anything commercial, you have to give them a bit of money. You can
see a few examples here: nice apple, go big or go home; the astronaut riding on things has become a bit of a meme. I mean, the quality is getting absolutely insane with these text-to-image models. Aleksa Gordić released YugoGPT, a 7-billion-parameter large language model for Balkan languages (Serbian, Bosnian, and Croatian), and you can find it on Hugging Face right now. Very, very cool.
OpenMathInstruct by Nvidia is a math instruction dataset that you can freely use. Actually, "freely use" might be an overstatement: there is an Nvidia-specific license on it. I'm not a lawyer, I'm not going to tell you what this means; I personally think you can use it freely, but again, no legal advice.
This article from Interesting Engineering I found very cool: it's a system that identifies drug-combo problems, so interactions between different drugs, specifically as they cross the barrier in your gut. The problem is, researching any drug and what it does is already super expensive, but obviously every drug you add to the regimen of available drugs could have interactions with all the other drugs that exist. This system uses a combination of machine learning and actual mechanistic models of transmission, models of receptor behavior in the gut, to predict interactions between different drugs in terms of their uptake in the gut. So I
think that's very cool. Pushing in this direction, we already saw this with various DeepMind models: having some sort of expert model, some actual domain-informed, expert-informed model in a scientific domain, combining that with machine learning, and using the two together to draw conclusions is probably, I want to say, the next frontier. I feel the frontier of "we'll just throw a lot of data at stuff and it will give us results" has probably had its low-hanging fruit taken already, and now it's really the combination of expertise and machine learning that is going to push ahead. So, very, very cool.
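The general pattern here, a mechanistic model corrected by a learned component, can be sketched in a few lines. This is a toy illustration, not the actual system from the article: the saturating-uptake formula and the logistic interaction score are stand-ins.

```python
import math

# Toy sketch of the "mechanistic model + machine learning" pattern, not
# the system from the article: a simple saturating uptake model is
# attenuated by a learned drug-drug interaction score.
def mechanistic_uptake(dose: float, permeability: float) -> float:
    """Michaelis-Menten-style absorbed fraction: rises with dose, saturates."""
    return permeability * dose / (1.0 + dose)

def interaction_score(features, weights) -> float:
    """Stand-in for an ML model: logistic score over interaction features
    (e.g., shared transporter affinity between the two drugs)."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predicted_uptake(dose, permeability, features, weights) -> float:
    # Mechanistic baseline, scaled down by the predicted interaction risk.
    return mechanistic_uptake(dose, permeability) * (1.0 - interaction_score(features, weights))

print(predicted_uptake(1.0, 0.8, [0.0], [1.0]))  # baseline 0.4 halved to 0.2
```

The design point is that the mechanistic part encodes domain knowledge that would otherwise need enormous amounts of data to learn, while the learned part captures what the mechanism misses.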
Very excellent developments. Bloomberg writes: Reddit signs AI content licensing deal ahead of IPO. That being said, this is all "a person close to the matter said": a large unnamed AI company, a lot of dollars involved, about $60 million on an annualized basis, yada yada yada. So this is all, I guess you'd call it hearsay, but this is the chatter right now. Reddit obviously recently made headlines by sort of capping all of their API access, not being really open anymore to outside developers, in a clear move to protect their IP, which is users' posts on Reddit, so that you can't just go via API and grab all that data. And now the second move is that they themselves are going to make use of that data by licensing it out to other companies.
Again, this is all just "someone familiar with the matter said", yada yada yada, but still: Reddit realizes they sit on a treasure trove of information. That's already evident from people Googling "how to XYZ" and then just adding "reddit" to their Google search query, because they know they usually get okay answers on Reddit. This has had the counter-effect that now marketing representatives and so on will try to go and sort of poison Reddit threads by giving nice-looking answers that ultimately link back to their product. Interesting dynamics. In any case, Reddit data may become a staple of one of the big AI companies, so we'll soon have all kinds of AI redditors around. Isn't that a great future? New Atlas writes that the seeing-eye dog v2.0 is shaping up as a game-changer. This goes into the details
of strapping kind of assistive technologies on top of one of these four-legged robots in order to help blind and visually impaired people move around safely: safe passage from A to B, and so on. The article discusses that the main limitation here is actually the availability of service dogs in general, like guide dogs: there are way too few guide dogs around for all the visually impaired people; they are expensive, they are rare, they need to be trained, and so on. In this case, these robots obviously don't. So yeah, you can say they take away the jobs of good, hardworking guide dogs, I guess, but from all I can see here, actual guide dogs are still preferred to robot dogs; it's just that there aren't nearly enough of them. So these robot dogs are shaping up to become very capable and can help with a lot of things. Very cool developments.
This paper I found really cool: OS-Copilot, towards generalist computer agents with self-improvement. It uses agent-like behavior while interacting with your operating system, so it can do some stuff on your computer just by you prompting it: opening applications, interacting with applications, even doing kind of multi-step things. I think this is one of the ways we're going to interact more with computers in the future. Maybe; I don't think the keyboard and programming, you know, using text and so on, will ever go away, but probably kind of web browsing, or simple things like this, could be automated like this. I found this to be a lot more
understandable than just voice prompts, like just saying "Alexa, book a flight to XYZ". I find voice and sound to be kind of a wonky interface for that, but if at the same time someone shows me, look, I'm now going to this website, I'm going to do this and that, I feel that is a much more viable interface. But then again, you could just click it yourself. So: I find GitHub Copilot to be an extremely good mode of interacting with an LLM, so if we transport that to this world, it would be that largely I operate the computer, but I could tab-complete a lot of things. If there's a form to be filled out, yes, I know browsers will support me already, but I could maybe tab-complete a lot more of it; or if there are some standard interactions on a website, I just kind of tab-complete them away. That mode of interacting with computers I'm looking forward to a lot. I'm not looking forward to a single prompt that will then magically go and do something for me; I don't think that's going to be a thing of the near future, and I don't think you would be comfortable with a system like
that. Business Insider writes that a new report sheds light on Apple's upcoming AI features that will rival Microsoft's Copilot. Further down they say Microsoft's GitHub Copilot, writing code; so it's not the Microsoft Windows Copilot or the Microsoft 365 Copilot (there are too many Copilots nowadays), it is apparently the GitHub Copilot that Apple targets, inside of its Xcode environment. So if you write Swift apps, if you write iPad and iPhone apps, and maybe even macOS apps, that might be really cool to have available. I do feel GitHub Copilot does its job quite well for what it does; for everything else, I've never programmed Swift, so I can't say.
TechCrunch writes: Anthropic takes steps to prevent election misinformation. Yeah, sure. They're making a bit of PR, I feel; they're using the opportunity of the election to be like, oh, we have guardrails, we have Prompt Shield. Which, I guess, Prompt Shield, cool: the technology relies on a combination of AI detection models and rules. Sure: you have a regex and you have some prompt that says, if the user asks for voting information, go to this site. I guess it's good; if I were a company, I would try to use that as well.
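A rules-plus-patterns guardrail of that flavor can be sketched in a few lines. This is entirely hypothetical: Anthropic's actual Prompt Shield is not public, and the patterns and canned response here are made up.

```python
import re

# Hypothetical sketch of a regex-plus-rules election guardrail; the
# patterns and redirect text are made up, not Anthropic's Prompt Shield.
VOTING_PATTERNS = [
    re.compile(r"\b(where|how|when)\b.*\bvote\b", re.IGNORECASE),
    re.compile(r"\bpolling (place|station)\b", re.IGNORECASE),
    re.compile(r"\bvoter registration\b", re.IGNORECASE),
]

REDIRECT = ("For up-to-date voting information, please consult your "
            "official election authority.")

def guard(user_message):
    """Return a canned redirect for election-related queries, else None."""
    for pattern in VOTING_PATTERNS:
        if pattern.search(user_message):
            return REDIRECT
    return None  # no match: hand the message to the model as usual

print(guard("Where can I vote on Tuesday?"))  # the canned redirect
print(guard("Explain tokenizers to me."))     # None
```

The "AI detection models" part would sit next to this as a learned classifier; the regex layer is just the cheap, predictable first line.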
AI comes to the world of beauty, as an eyelash robot uses artificial intelligence to place fake lashes. This details how the robot can place fake eyelashes more precisely than humans could, and as far as I can see it's also a bit faster, or cheaper, or something like this. This is a purely mechanical task that so far humans did by hand and now a robot can do; artificial intelligence is a bit of an overstatement, as they use computer vision to detect where the eyelids and the corners of the eyes and so on are,
which is really cool. But then there is the "for", and then there's the "against", and the against is: oh no, we have to be very careful about this, there are potential risks; the device's proximity to a sensitive area could raise concerns about the risk of eye infections or allergic reactions to the materials used in the lash extensions. I guess they just had someone say anything generically bad about this, and this person was like, I guess you could be allergic to the materials, and they're like, oh yes, there are also potential risks. I'm not sure I'm buying that; you decide for yourself.
I feel having a machine do a purely mechanical task is fine; it's not going to steal a lot of jobs, I guess. All good. I just found it funny that the news article must have this structure: here is something new that technology can do, but there are also risks. With that being said, there's also a risk that this video gets too long, and with that, I'll finish it. Thank you for watching.
[Music]
bye-bye
[Music]