New HYBRID AI Model Just SHOCKED The Open-Source World - JAMBA 1.5
Summary
TLDR: AI21 Labs has released two new open-source large language models, Jamba 1.5 Mini and Jamba 1.5 Large, built on a unique hybrid SSM-Transformer architecture that combines the traditional Transformer with a structured state space model (SSM) to handle long sequences more efficiently. On complex tasks such as long-document summarization or multi-turn dialogue, the models deliver more accurate, meaningful responses at lower cost. The Jamba models score well on the new RULER benchmark, run fast with a low memory footprint, support multiple languages, and offer developer-friendly features such as JSON output and citation generation, making them well suited to enterprise applications.
Takeaways
- 🌟 AI21 Labs has released two new open-source large language models, Jamba 1.5 Mini and Jamba 1.5 Large, with a unique hybrid architecture.
- 🔧 The models use an SSM-Transformer architecture that combines the classic Transformer with a structured state space model (SSM) to handle long sequences more effectively.
- 🚀 Jamba models support longer context windows, a major advantage for complex tasks that require extensive context.
- 🏆 Jamba models perform strongly on the new RULER benchmark, outscoring well-known models such as Llama 3.1 70B and Llama 3.1 405B.
- 🔑 Mamba, a key component of the Jamba models, has a lower memory footprint and a more efficient attention mechanism, letting it handle long context windows with ease.
- 📈 AI21 Labs developed a new quantization technique, Experts Int8, which saves memory and compute by reducing the numeric precision used in the model's computations.
- 🌐 Jamba models support multiple languages, including Spanish, French, Portuguese, Italian, Dutch, German, Arabic, and Hebrew, making them suitable for global applications.
- 🛠️ Jamba models are developer-friendly, supporting structured JSON output, function calling, and citation generation for sophisticated AI applications in enterprise settings.
- 📊 Jamba 1.5 Large can run on a single 8-GPU node while using its full 256K context length, showing exceptional resource efficiency.
- 📈 Jamba 1.5 models are up to 2.5x faster than competitors on long contexts, well suited to enterprise applications that need fast responses.
- 📘 Jamba models are available on AI21 Studio, Google Cloud, Microsoft Azure, Nvidia NIM, and other platforms, making them easy to deploy and experiment with.
Q & A
What is AI21 Labs?
-AI21 Labs is an organization focused on developing advanced AI language models; it recently released two new open-source large language models, Jamba 1.5 Mini and Jamba 1.5 Large.
What are the main features of the Jamba 1.5 Mini and Jamba 1.5 Large models?
-Both models use a unique hybrid architecture that combines cutting-edge techniques to enhance AI performance, in particular the ability to process long text data more efficiently.
What is the SSM-Transformer architecture?
-SSM-Transformer is a new hybrid architecture that combines the traditional Transformer with a structured state space model (SSM) to process long sequences more efficiently.
Why do long context windows matter for AI models?
-The ability to handle long context windows is crucial for real-world applications, especially enterprise use cases that need large amounts of contextual information to produce accurate, meaningful responses.
How do Jamba models improve long-context handling?
-Jamba models use the Mamba component, which has a lower memory footprint and a more efficient attention mechanism, so they can handle longer context windows with ease.
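To make the efficiency contrast concrete, here is a toy sketch in pure Python (purely illustrative, not Jamba's or Mamba's actual math): an attention-style step re-reads the entire history at every token, while an SSM-style step only updates a fixed-size state, so total work grows quadratically in one case and linearly in the other.

```python
# Toy contrast between attention-style and SSM-style sequence processing.
# The 1-D "state" and the averaging are illustrative stand-ins only.

def attention_style(tokens):
    """Each step re-reads the entire history: O(n^2) total work."""
    work = 0
    outputs = []
    for i, tok in enumerate(tokens):
        context = tokens[: i + 1]   # must look at every past token
        work += len(context)
        outputs.append(sum(context) / len(context))
    return outputs, work

def ssm_style(tokens, decay=0.9):
    """Each step updates one fixed-size state: O(n) total work."""
    state = 0.0
    work = 0
    outputs = []
    for tok in tokens:
        state = decay * state + (1 - decay) * tok  # constant-size update
        work += 1
        outputs.append(state)
    return outputs, work

tokens = list(range(1, 101))          # a 100-token "sequence"
_, attn_work = attention_style(tokens)
_, ssm_work = ssm_style(tokens)
print(attn_work)  # 5050 = 100*101/2, grows quadratically with length
print(ssm_work)   # 100, grows linearly with length
```

At 100 tokens the gap is 5050 steps versus 100; at a 256K-token context the quadratic term is what makes plain Transformers bog down, which is the problem the hybrid design targets.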
What is the new RULER benchmark?
-RULER is a new benchmark used to evaluate models on tasks such as multi-hop tracing, retrieval, aggregation, and question answering.
How do Jamba 1.5 Mini and Jamba 1.5 Large perform on the RULER benchmark?
-On the RULER benchmark, Jamba 1.5 Mini and Jamba 1.5 Large consistently outperform models such as Llama 3.1 70B, Llama 3.1 405B, and Mistral Large 2.
What is the speed advantage of the Jamba 1.5 models?
-Jamba 1.5 models are up to 2.5x faster on long contexts than their competitors, making them practical for enterprise applications such as customer-support chatbots and AI-powered virtual assistants.
What is the Experts Int8 quantization technique developed by AI21 Labs?
-Experts Int8 is a new quantization technique that quantizes the model's weights to an 8-bit precision format and dequantizes them directly on the GPU at runtime, reducing model size and speeding up processing.
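As a rough illustration of the underlying idea, the sketch below shows a generic symmetric int8 scheme in pure Python; it is a simplified stand-in, not AI21's actual Experts Int8 implementation, which applies this kind of quantization to the mixture-of-experts layer weights and dequantizes on the GPU at runtime.

```python
# Generic symmetric int8 weight quantization, sketched in pure Python.
# One float scale per weight group; 1 byte per weight instead of 4
# for float32, so storage shrinks roughly 4x.

def quantize_int8(weights):
    """Map floats to int8 values in [-127, 127] plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights at use time."""
    return [qi * scale for qi in q]

weights = [0.25, -1.0, 0.5, 0.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)

print(q)       # small integers in [-127, 127]
print(approx)  # close to the original weights, within one scale step
```

The memory saving comes from storing the small integers; the dequantize step is cheap enough to do on the fly during inference, which is the trade the technique exploits.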
Which languages do the Jamba 1.5 models support?
-In addition to English, the Jamba 1.5 models support Spanish, French, Portuguese, Italian, Dutch, German, Arabic, and Hebrew, making them well suited to global applications.
What developer-friendly features do the Jamba 1.5 models offer?
-Both Jamba 1.5 Mini and Large have built-in support for structured JSON output, function calling, and even citation generation, letting developers build more sophisticated AI applications.
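As one hedged example of why structured JSON output matters downstream (the response string and field names below are invented for illustration, not real Jamba output), an application can parse and validate the model's reply before acting on it:

```python
import json

# A hypothetical model response; with structured JSON output the model
# is expected to return machine-parseable JSON, not free-form prose.
model_response = '{"intent": "refund", "order_id": "A-1042", "amount": 19.99}'

REQUIRED_FIELDS = {"intent", "order_id", "amount"}

def parse_structured_output(text):
    """Parse model output as JSON and check the expected fields exist."""
    data = json.loads(text)  # raises ValueError on malformed output
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data

result = parse_structured_output(model_response)
print(result["intent"])   # refund
print(result["amount"])   # 19.99
```

Guaranteed-structured output turns the model into a component that can safely feed function calls or business logic, instead of something whose prose has to be scraped with regexes.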
What is AI21 Labs' commitment to openness for the Jamba 1.5 models?
-AI21 Labs has committed to keeping the Jamba 1.5 models open; they are released under the Jamba Open Model License, which lets developers, researchers, and businesses experiment with them freely.
Outlines
🤖 AI21 Labs releases new open-source language models
AI21 Labs has released two new open-source language models, Jamba 1.5 Mini and Jamba 1.5 Large. The models use a unique hybrid architecture that combines cutting-edge techniques to boost AI performance. You can try them yourself on platforms such as Hugging Face, or run them on cloud services such as Google Cloud Vertex AI, Microsoft Azure, and Nvidia NIM. The Jamba models use an SSM-Transformer architecture that pairs the traditional Transformer with a structured state space model (SSM) built on older, more efficient neural-network techniques, allowing them to process much longer data sequences. That makes them well suited to context-heavy tasks such as complex reasoning or long-document summarization. Long-context capability is crucial for enterprise applications: it yields more accurate, meaningful responses, reduces repeated data processing, improves quality, and lowers cost.
🚀 The Jamba models' hybrid architecture and efficiency
Within the hybrid architecture, the Mamba component, developed with insights from researchers at Carnegie Mellon and Princeton, has a lower memory footprint and a more efficient attention mechanism, letting it handle long context windows with ease. Unlike a traditional Transformer, Mamba maintains a small state that it updates as it processes data, making it faster and less resource-intensive. AI21 Labs also developed a new quantization technique, Experts Int8, which quantizes the weights of the model's mixture-of-experts layers to an 8-bit precision format and dequantizes them directly on the GPU at runtime, shrinking the model and speeding up processing. Jamba 1.5 Large can use its full 256K context length on a single 8-GPU node, making it one of the most resource-efficient models available. The models also support multiple languages, including Spanish, French, Portuguese, Italian, Dutch, German, Arabic, and Hebrew, making them versatile choices for global applications. Jamba 1.5 ships with built-in support for structured JSON output, function calling, and citation generation, well suited to sophisticated AI applications in enterprise settings. AI21 Labs has committed to keeping the models open, allowing developers, researchers, and businesses to experiment freely, with broad deployment and experimentation options across multiple platforms and cloud partners.
Keywords
💡AI21 Labs
💡Jamba 1.5 Mini and Jamba 1.5 Large
💡Hybrid architecture
💡Transformer architecture
💡Structured state space model (SSM)
💡Long context window
💡Mamba component
💡RULER benchmark
💡Quantization (Experts Int8)
💡Multilingual support
💡Developer-friendly
Highlights
AI21 Labs has released two new open-source large language models, Jamba 1.5 Mini and Jamba 1.5 Large.
The models use a unique hybrid architecture that combines cutting-edge techniques to boost AI performance.
Jamba models can run on Hugging Face or on cloud services such as Google Cloud Vertex AI, Microsoft Azure, and Nvidia NIM.
Jamba models use an SSM-Transformer architecture that combines the classic Transformer with a structured state space model (SSM).
The SSM builds on older, more efficient techniques, such as neural networks and convolutional neural networks, to improve processing efficiency.
Jamba models can process longer data sequences without slowing down.
The ability to handle long context windows is crucial for enterprise generative AI applications.
By keeping more relevant information in memory, Jamba models reduce the need for repeated data processing.
Jamba models perform strongly on the RULER benchmark, outscoring other large AI models.
Jamba 1.5 Mini and Large are up to 2.5x faster than competitors on long-context tasks.
The Mamba component has a lower memory footprint and a more efficient attention mechanism.
AI21 Labs developed a new quantization technique, Experts Int8, to shrink the models and speed up processing.
Jamba 1.5 Large can run on a single 8-GPU node while using its full 256K context length.
Jamba models support multiple languages, including Spanish, French, Portuguese, Italian, Dutch, German, Arabic, and Hebrew.
Jamba models have built-in support for structured JSON output, function calling, and citation generation.
Jamba models are available on multiple platforms and through multiple cloud partners, making them easy for developers and researchers to use.
AI21 Labs is committed to keeping the Jamba models open, allowing free experimentation and deployment.
The hybrid architecture makes the Jamba 1.5 models more efficient, faster, and more versatile on complex, data-heavy tasks.
Transcripts
so AI 21 Labs the brains behind the
Jurassic language models has just
dropped two brand new open-source llms
called Jamba 1.5 mini and Jamba 1.5
large and these models are designed with
a unique hybrid architecture that
incorporates Cutting Edge techniques to
enhance AI performance and since they're
open source you can try them out
yourself on platforms like hugging face
or run them on cloud services like
Google Cloud vertex AI Microsoft Azure
and Nvidia Nim definitely worth checking
out all right so what's this hybrid
architecture all about okay let's break
it down in simple terms most of the
language models you know like the ones
used in chat GPT are based on the
Transformer architecture these models
are awesome for a lot of tasks but
they've got this one big limitation they
struggle when it comes to handling
really large context Windows think about
when you're trying to process a super
long document or a full transcript from
a long meeting regular Transformers get
kind of bogged down because they have to
deal with all that data at once and
that's where these new Jamba models from
AI 21 Labs come into play with a totally
new game-changing approach so AI 21 has
cooked up this new hybrid architecture
they're calling the SSM Transformer now
what's cool about this is it combines
the classic Transformer model with
something called a structured State
space model or SSM the SSM is built on
some older more efficient techniques
like neural networks and convolutional
neural networks basically these are
better at handling computations
efficiently so by using this mix the
Jamba models can handle much longer
sequences of data without slowing down
that's a massive win for tasks that need
a lot of context like if you're doing
some complex generative AI reasoning or
trying to summarize a super long
document now why is handling a long
context window such a big deal well
think about it when you're using AI for
real world applications especially in
businesses you're often dealing with
complex tasks maybe you're analyzing
long meeting transcripts or summarizing
a giant policy document or even running
a chatbot that needs to remember a lot
of past conversations the ability to
process large amounts of context
efficiently means these models can give
you more accurate and meaningful
responses Or Dagan the VP of product at
AI 21 Labs actually nailed it when he
said an AI model that can effectively
handle long context is crucial for many
Enterprise generative AI applications
and he's right without this ability AI
models often tend to hallucinate or just
make stuff up because they're missing
out on important information but with
the Jamba models and their unique
architecture they can keep more relevant
info in memory leading to way better
outputs and less need for repetitive
data processing and you know what that
means better quality and lower cost all
right let's get into the nuts and bolts
of what makes this hybrid architecture
so efficient so there's one part of the
model called Mamba which is actually
very important it's developed with
insights from researchers at Carnegie
melon and Princeton and it has a much
lower memory footprint and a more
efficient attention mechanism than your
typical Transformer this means it can
handle longer context windows with ease
unlike Transformers which have to look
at the entire context every single time
slowing things down Mamba keeps a
smaller state that gets updated as it
processes the data this makes it way
faster and less resource intensive
now you might be wondering how do these
models actually perform well AI 21 Labs
didn't just hype them up they put them
to the test they created a new Benchmark
called ruler to evaluate the models on
tasks like multihop tracing retrieval
aggregation and question answering and
guess what the Jamba models came out on
top consistently outperforming other
models like llama 3.1 70b llama 3.1 405b and
mistral large 2 on the arena hard
Benchmark which is all about testing
models on really tough tasks Jamba 1.5
mini and large outperformed some of the
biggest names in AI Jamba 1.5 mini
scored an impressive
46.1 beating models like mixtral 8x22b
and command R plus while Jamba 1.5 large
scored a whopping 65.4 outshining even
the big guns like llama 3.1 70b and
405b one of the standout features of
these models is their speed in
Enterprise applications speed is
everything whether you're running a
customer support chatbot or an AI
powered virtual assistant the model
needs to respond quickly and efficiently
the Jamba 1.5 models are reportedly up
to 2.5 times faster on Long context than
their competitors so not only are they
powerful but they're also super
practical for high-scale operations and
it's not just about speed the Mamba
component in these models allows them to
operate with a lower memory footprint
meaning they're not as demanding on
hardware for example Jamba 1.5 mini
can handle context lengths up to 140,000
tokens on a single GPU that's huge for
developers looking to deploy these
models without needing a massive
infrastructure all right here's where it
gets even cooler to make these massive
models more efficient AI 21 Labs
developed a new quantization technique
called experts int 8 now I know that
might sound a bit technical but here's
the gist of it quantization is basically
a way to reduce the Precision of the
numbers used in the model's computations
this can save on memory and
computational costs Without Really
sacrificing quality experts in eight is
special because it specifically targets
the weights in the mixture of experts or
Mo layers of the model these layers
account for about 85% of the models
weights in many cases by quantizing
these weights to an 8bit Precision
format and then de quantizing them
directly inside the GPU during runtime
AI 21 Labs managed to cut down the model
size and speed up its processing
the result Jamba 1.5 large can fit on a
single 8 GPU node while still using its
full context length of
256k this makes Jamba one of the most
resource efficient models out there
especially if you're working with
limited Hardware now besides English
these models also support multiple
languages including Spanish French
Portuguese Italian Dutch German Arabic
and Hebrew which makes them super
versatile for Global applications and
here's a cherry on top AI 21 Labs made
these models developer friendly both
Jamba 1.5 mini and large come with
built-in support for structured Json
output function calling and even
citation generation this means you can
use them to create more sophisticated AI
applications that can perform tasks like
calling external tools digesting
structured documents and providing
reliable references all of which are
Super useful in Enterprise settings one
of the coolest things about Jamba 1.5 is
AI 21 lab's commitment to keeping these
models open they're released under the
Jamba open model license which means
developers researchers and businesses
can experiment with them freely and with
availability on multiple platforms and
Cloud Partners like AI 21 Studio Google
Cloud Microsoft Azure Nvidia Nim and
soon on Amazon Bedrock databricks
Marketplace and more you've got tons of
options for how you want to deploy and
experiment with these models looking
ahead it's pretty clear that AI
models that can handle extensive context
windows are going to be a big deal in
the future of AI as Or Dagan from AI 21 Labs
pointed out these models are just better
suited for complex data heavy tasks that
are becoming more common in Enterprise
settings they're efficient fast and
versatile making them a fantastic choice
for developers and businesses looking to
push the boundaries in AI so if you
haven't checked out Jamba 1.5 mini or
large yet now's the perfect time to dive
in and see what these models can do for
you all right if you found this video
helpful smash that like button hit
subscribe and stay tuned for more
updates on the latest in AI tech thanks
for watching and I'll catch you in the
next one