Nvidia's Breakthrough AI Chip Defies Physics (GTC Supercut)
Summary
TLDR: The transcript discusses the rapid advancement of computing, highlighting a 1,000-fold increase in computation over 8 years and the move from the Hopper GPU to the new Blackwell platform. It emphasizes the transformative impact of these technologies on AI and robotics, showcasing the Jetson autonomous processor, the Omniverse simulation engine, and the potential for humanoid robotics. The speaker predicts a future where everything that moves will be robotic, requiring a digital platform like Omniverse for coordination and development.
Takeaways
- 🚀 Computing power has increased by 1,000 times in the last 8 years, far surpassing Moore's Law's prediction of 10x every 5 years.
- 🌟 NVIDIA has introduced Blackwell, the successor to Hopper, a GPU with 208 billion transistors that marks a significant leap in computing capabilities.
- 🔗 The Blackwell chip features a unique design where two dies are abutted, allowing for 10 terabytes per second of data transfer without memory locality or cache issues.
- 💡 Blackwell is part of a new platform that redefines what GPUs look like, offering unprecedented computation and networking capabilities.
- 🔧 With its new Transformer engine, Blackwell delivers 2.5 times Hopper's FP8 training performance per chip and introduces two new formats, FP6 and FP4, for improved inference capability.
- 📈 Blackwell's energy efficiency and networking bandwidth optimizations are set to save significant amounts of time and resources in computation.
- 🤖 NVIDIA envisions a future where everything that moves will be robotic, with a focus on AI robotics and the development of transformative technologies.
- 🏭 The company is building end-to-end systems for robotics, including the Jetson autonomous processor, Omniverse for simulation, and the DGX AI system for training.
- 🚗 NVIDIA's technology is being adopted by major automotive companies like Mercedes and BYD, with the Thor platform designed for the next generation of robotics.
- 🌐 The Omniverse platform is presented as the digital twin platform and operating system for the robotics world, essential for the next Industrial Revolution.
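The energy-efficiency takeaway can be checked with simple arithmetic. The sketch below uses the figures quoted in the talk for training a 1.8-trillion-parameter model over about 90 days: 8,000 Hopper GPUs at 15 megawatts versus 2,000 Blackwell GPUs at 4 megawatts. The function and variable names are illustrative, and the calculation assumes constant power draw for the whole run.

```python
HOURS_PER_DAY = 24


def training_energy_mwh(power_mw: float, days: float) -> float:
    """Energy in megawatt-hours for a run at constant power (an assumption)."""
    return power_mw * days * HOURS_PER_DAY


# Figures as quoted in the talk; both runs take about 90 days.
hopper_mwh = training_energy_mwh(power_mw=15, days=90)
blackwell_mwh = training_energy_mwh(power_mw=4, days=90)

energy_ratio = hopper_mwh / blackwell_mwh  # how much less energy Blackwell uses
gpu_ratio = 8000 / 2000                    # how many fewer GPUs are needed

print(f"Hopper run:    {hopper_mwh:,.0f} MWh")
print(f"Blackwell run: {blackwell_mwh:,.0f} MWh")
print(f"Energy ratio:  {energy_ratio:.2f}x, GPU ratio: {gpu_ratio:.0f}x")
```

On these numbers the saving comes almost entirely from needing a quarter of the GPUs; the quoted power per GPU is roughly the same in both cases.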
Q & A
How has the rate of computing advancement changed over the last 8 years compared to the past?
-Over the last 8 years, computing has increased by 1,000 times, which is a significant acceleration compared to the historical rate of 10 times every 5 years as per Moore's law during the PC revolution.
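To make the comparison concrete, both rates can be converted to an equivalent per-year growth factor. This is a minimal sketch using only the two figures quoted in the talk (10x every 5 years, 1,000x in 8 years); the helper name is illustrative.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)


moore_annual = annual_growth_factor(10, 5)      # "10x every 5 years"
recent_annual = annual_growth_factor(1000, 8)   # "1,000x in 8 years"

# What the older pace would have delivered over the same 8-year window.
moore_over_8_years = moore_annual ** 8

print(f"Moore's-law pace: {moore_annual:.2f}x per year")
print(f"Last 8 years:     {recent_annual:.2f}x per year")
print(f"Moore's-law pace over 8 years: {moore_over_8_years:.1f}x (vs 1,000x)")
```

Under these figures, the older pace would have produced roughly a 40-fold gain over 8 years instead of 1,000-fold.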
What is the significance of the Hopper chip in the context of GPU advancement?
-Hopper is described as the most advanced GPU in production at the time of the talk, and as the chip that "changed the world". The 208 billion transistors and the two-die design with 10 terabytes per second of data transfer belong to its successor, Blackwell, which was built because even Hopper was not fast enough.
What is the Blackwell chip and how does it differ from traditional GPUs?
-The Blackwell chip is a revolutionary GPU platform that has redefined what a GPU looks like. It is not just a single chip but a system that includes advanced features like two abutting dies that function as one, with no memory locality or cache issues, essentially forming one giant chip.
How does the new Transformer engine in the Blackwell chip enhance performance?
-Alongside the new Transformer engine, Blackwell features fifth-generation NVLink that is twice as fast as Hopper's. It introduces computation in the network, allowing faster information sharing and synchronization among multiple GPUs, which amplifies the effective bandwidth beyond the nominal 1.8 terabytes per second.
What is the FP6 format and how does it benefit the chip's performance?
-FP6 is a new number format that lets the chip store more parameters in memory while the computation speed stays the same. A second new format, FP4, effectively doubles the throughput, which is crucial for inference tasks.
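The memory effect of narrower number formats can be sketched with a bits-per-parameter calculation. The example below is a rough illustration only: it uses the 1.8-trillion-parameter model size mentioned elsewhere in the talk, counts weights alone, and ignores scaling factors, activations, and any other overhead. All names are illustrative.

```python
# Bits used to store one parameter in each format.
BITS_PER_PARAM = {"FP8": 8, "FP6": 6, "FP4": 4}

NUM_PARAMS = 1.8e12  # model size quoted in the talk


def weight_terabytes(num_params: float, bits: int) -> float:
    """Raw weight storage in terabytes (1 TB = 1e12 bytes), weights only."""
    return num_params * bits / 8 / 1e12


def params_gain_vs_fp8(bits: int) -> float:
    """How many more parameters fit in the same memory compared with FP8."""
    return BITS_PER_PARAM["FP8"] / bits


for name, bits in BITS_PER_PARAM.items():
    print(f"{name}: {weight_terabytes(NUM_PARAMS, bits):.2f} TB of weights, "
          f"{params_gain_vs_fp8(bits):.2f}x parameters per byte vs FP8")
```

Halving the bits per parameter from FP8 to FP4 also halves the data moved per parameter, which is consistent with the "doubles the throughput" remark in the talk.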
What is the significance of the NVLink Switch chip with 50 billion transistors?
-The NVLink Switch chip is significant because it enables every single GPU to communicate with every other GPU at full speed simultaneously. With four NVLinks, each capable of 1.8 terabytes per second, this chip is essential for building a system where GPUs can interact at maximum efficiency.
How does the DGX system exemplify the power of the Blackwell chip and NVLink technology?
-The DGX system, with its NVLink spine capable of 130 terabytes per second, showcases the power of the Blackwell chip and NVLink technology by providing a system that can handle immense computational tasks. It represents a leap from the 170 teraflops (0.17 petaflops) of the first DGX-1 to 720 petaflops, almost an exaflop for training, all in a single rack.
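The size of that leap is easier to see once both figures are in the same unit. The following sketch converts the quoted numbers and computes the ratio; the names are illustrative, and the two figures may be quoted at different numeric precisions, so the ratio is only a rough indication.

```python
TERA_PER_PETA = 1000
PETA_PER_EXA = 1000

dgx1_teraflops = 170        # first DGX-1, as quoted in the talk
blackwell_petaflops = 720   # Blackwell DGX rack (training), as quoted

dgx1_petaflops = dgx1_teraflops / TERA_PER_PETA
speedup = blackwell_petaflops / dgx1_petaflops
fraction_of_exaflop = blackwell_petaflops / PETA_PER_EXA

print(f"DGX-1:         {dgx1_petaflops} petaflops")
print(f"Blackwell DGX: {blackwell_petaflops} petaflops "
      f"({fraction_of_exaflop:.2f} exaflops)")
print(f"Ratio:         about {speedup:,.0f}x")
```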
What are the key components of the end-to-end system for AI robotics?
-The key components of the end-to-end system for AI robotics include the DGX system for training the AI, the AGX system as the autonomous processor, and the Omniverse as the digital representation of the world for the robot to learn and interact with.
How does the Jetson AGX system support autonomous systems?
-The Jetson AGX system is designed to be a low-power, high-speed processor for sensor processing and AI, making it ideal for autonomous systems that require real-time decision-making and action, such as self-driving cars or robots.
What is the potential impact of the Thor platform on the future of robotics?
-The Thor platform is designed for Transformer engines and is expected to be used in the next generation of robotics, including humanoid robots. It signifies a shift towards more generalized and adaptable robotic systems, capable of learning and functioning in a variety of environments with human-like adaptability.
How does the Isaac simulation engine contribute to the development of robotics?
-The Isaac simulation engine provides a virtual environment where humanoid robots can learn and adapt to the physical world. It serves as a 'gym' for robots, allowing them to develop and refine their capabilities in a controlled and safe digital setting before real-world application.
Outlines
🚀 Computing Power Advancements and the Introduction of Blackwell GPU
The paragraph discusses the significant advancements in computing power over the last eight years, highlighting a 1,000-fold increase. It contrasts this growth with the historical pace of Moore's Law during the PC Revolution. The Blackwell GPU is then introduced, with emphasis on its scale and capabilities. The Blackwell chip, with its 208 billion transistors and a design that lets two dies function as one, is described as a game-changer in the world of computing. The paragraph also notes that Blackwell is form-, fit-, and function-compatible with existing Hopper installations, which is expected to make the production ramp efficient.
🌐 The Evolution of Hopper and the Emergence of Generative AI
This paragraph covers the Hopper-compatible version of Blackwell for the current HGX configuration and a prototype board that pairs Blackwell with a Grace CPU. It introduces a new Transformer engine and the concept of computation in the network, which enhances the performance of the system. The paragraph also discusses the new FP6 format, which increases the number of parameters that fit in memory, and FP4, which doubles throughput. The focus shifts to the future of generative AI, emphasizing the creation of a processor designed for this era and the importance of content token generation. The potential for scaling up the technology is highlighted, with the mention of a new chip, the NVLink Switch, and its capabilities.
🤖 Robotics and AI Integration: The Next Wave of Automation
The paragraph envisions the future of robotics and AI, suggesting that the advancements in AI will soon be applied to robotics. It describes the development of end-to-end systems for robotics, including the AI system DGX, the autonomous system AGX, and the simulation engine Omniverse. The paragraph provides an example of how AI and Omniverse will work together in a robotics building, illustrating the interaction between autonomous systems and humans. The potential for humanoid robotics is also discussed, with the introduction of the Thor computer designed for Transformer engines, marking a significant step in the evolution of general robotics.
🌟 The Future of Robotics and AI: A Digital Platform for the Robotics World
The final paragraph outlines the broader implications of the advancements in computing and AI for the future of robotics. It predicts a new Industrial Revolution where every moving object will be robotic, emphasizing the safety and convenience of such a future. The paragraph also highlights the importance of a digital platform, Omniverse, as the operating system for the robotics world. The speaker concludes by thanking the audience and expressing excitement for the future of GTC and the transformative impact of these technologies.
Keywords
💡Computing Advancement
💡Hopper
💡Blackwell
💡Generative AI
💡Transformer Engine
💡FP4 Format
💡NVLink Switch
💡DGX System
💡Robotics
💡Jetson
💡Omniverse
Highlights
Computing has advanced by 1,000 times in the last 8 years, far surpassing Moore's Law.
A new chip, Hopper, has been developed, but there's a need for even bigger GPUs.
Introduction of a very large GPU platform named Blackwell.
Blackwell features 208 billion transistors and a unique design where two dies communicate as one chip with 10 terabytes per second data transfer.
The Blackwell chip eliminates memory locality and cache issues, presenting as a single, giant chip.
Blackwell's ambitions were considered beyond the limits of physics, but the engineering team overcame these challenges.
The Blackwell chip is used in two types of systems: one that is form-, fit-, and function-compatible with Hopper, and a prototype board with a Grace CPU.
With its new Transformer engine, Blackwell delivers 2.5 times Hopper's FP8 training performance per chip and introduces the new FP6 and FP4 formats.
The FP4 format amplifies token generation and inference capability, vital for the future of generative AI.
The new era of computing is focused on generative AI, with a processor designed specifically for this purpose.
Content token generation is a key part of the new generative AI era.
The NVLink Switch chip, with 50 billion transistors, enables every GPU to communicate at full speed simultaneously.
The DGX system, now with Blackwell, achieves 720 petaflops, almost an exaflop for training.
The DGX system in a single rack is the world's first exaflop AI system.
The DGX NVLink spine has a bandwidth of 130 terabytes per second, surpassing the aggregate bandwidth of the internet.
The entire DGX rack is liquid-cooled, saving significant energy and allowing for high-performance computing.
The future of robotics is discussed, with a focus on AI robotics and physical AI.
The Jetson autonomous processor and the OVX computer for running Omniverse are introduced as key components for robotics.
An example of a robotics building with autonomous systems, including humans and forklifts, is shared to demonstrate the future of AI and robotics integration.
The potential for humanoid robotics is discussed, with the necessary technology now available for generalized human robotics.
The project 'General Robotics 003' is introduced, aiming to bring about a new industrial revolution with robotics at its core.
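The rack-level power and cooling figures quoted in the talk (a 120-kilowatt rack, 20 kilowatts saved by avoiding optical transceivers, coolant entering at 25° C and leaving at 45° C at 2 liters per second) can be sanity-checked with a basic heat-flow estimate. The sketch below assumes the coolant behaves like plain water, which the talk does not state; the constants and names are illustrative.

```python
# Assumption: coolant is treated as plain water.
WATER_DENSITY_KG_PER_L = 1.0           # approximate
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186  # approximate


def heat_removed_kw(flow_l_per_s: float, t_in_c: float, t_out_c: float) -> float:
    """Heat carried away by the coolant, in kilowatts (Q = m * c * dT)."""
    mass_flow_kg_per_s = flow_l_per_s * WATER_DENSITY_KG_PER_L
    delta_t = t_out_c - t_in_c
    return mass_flow_kg_per_s * WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t / 1000


rack_power_kw = 120
transceiver_power_kw = 20

heat_kw = heat_removed_kw(flow_l_per_s=2.0, t_in_c=25.0, t_out_c=45.0)
transceiver_share = transceiver_power_kw / rack_power_kw

print(f"Heat removal capacity at quoted flow: {heat_kw:.1f} kW")
print(f"Transceiver saving as share of rack power: {transceiver_share:.1%}")
```

The estimate comes out somewhat above the 120-kilowatt rack power, which suggests the quoted flow and temperatures are rounded or describe a ceiling; it should be read as an order-of-magnitude check only.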
Transcripts
the rate at which we're advancing
Computing is insane over the course of
the last 8 years we've increased
computation by 1,000 times 8 years 1,000
times remember back in the good old days
of Moore's law it was 10x every 5 years 100
times every 10 years in the middle of
the PC
Revolution 100 times every 10 years in
the last eight years we've gone 1,000
times we have two more years to
go
the rate at which we're advancing
Computing is insane and it's still not
fast enough so we built another chip
Hopper is
fantastic but we need bigger
gpus and so ladies and gentlemen I would
like to introduce you to a very very big
GPU ladies and gentlemen enjoy
this
[Music]
Blackwell is not a chip Blackwell is the
name of a
platform uh people think we make
gpus and and we do but gpus don't look
the way they used to this is the most
advanced GPU in the world in production
today this is
Hopper this is hopper Hopper changed the
world this is
Blackwell
it's okay
Hopper 208 billion transistors and so so
you could see you I can see that there's
a small line between two dies this is
the first time two dies have abutted
like this together in such a way that
the two chip the two dies think it's
one chip there's 10 terabytes of data
between it 10 terabytes per second so
that these two these two sides of the
Blackwell Chip have no clue which side
they're on there's no memory locality
issues no cache issues it's just one
giant chip when we were told that
Blackwell's Ambitions were beyond the
limits of physics uh the engineer said
so what and so this is what happened and
so this is the Blackwell chip and it
goes into two types of systems the first
one is form fit function compatible to
Hopper and so you slide out Hopper and
you push in Blackwell that's the reason
why one of the challenges of ramping is
going to be so efficient there are
installations of Hoppers all over the
world and the same infrastructure same
design the power the electricity The
Thermals the software identical push it
right back and so this is a hopper
version for the current HGX
configuration the second Hopper looks
like this now this is a prototype board
and this is a fully functioning board
and I just be careful here this right
here is I don't know1
billion the second one's
five it gets cheaper after that so the
way it's going to go to production is
like this one here two Blackwell chips
and four Blackwell dies connected to a
Grace CPU the Grace CPU has a super
fast chip-to-chip link what's amazing is
this computer is the first of its kind
where this much computation fits into
this small of a place but we need a
whole lot of new features in order to
push the limits Beyond if you will the
limits of physics and so one of the
things that we did was We Invented
another Transformer engine and so this
new Transformer engine we have a fifth
generation NVLink
it's now twice as fast as Hopper
but very importantly it has computation
in the network and the reason for that
is because when you have so many
different gpus working together we have
to share our information with each other
we have to synchronize and update each
other having extraordinarily fast links
and being able to do mathematics right
in the network allows us to essentially
amplify even further so even though it's
1.8 terabytes per second it's
effectively higher than that and so it's
many times that of Hopper overall
compared to
Hopper it is two and a half times the
fp8 performance for training per chip it
also has this new format called fp6 so
that even though the computation speed
is the same the amount of parameters you
can store in the memory is now Amplified
fp4 effectively doubles the throughput
this is vitally important for inference
the amount of energy we save the amount
of networking bandwidth we save the the
amount of waste of time we save will be
tremendous the future is generative
which is the reason why we call it
generative AI which is the reason why
this is a brand new industry the way we
compute is fundamentally different we
created a processor for the generative
AI era and one of the most important
parts of it is content token generation
we call it this format is
fp4 that's a lot of computation
5x the token generation 5x the inference
capability of Hopper seems like enough
but why stop
there and so we would like to have a
bigger GPU even bigger than this one and
so we decided to scale it so we built
another
chip this chip is just an incredible
chip we call it the NVLink switch it's
50 billion transistors it's almost the
size of Hopper all by itself this switch
chip has four NVLinks in
it each 1.8 terabytes per second what is
this chip
for if we were to build such a chip we
can have every single GPU talk to every
other GPU at full speed at the same time
that's
insane
and as a result you can build a system
that looks like this this is what a DGX
looks like now remember just six years
ago I delivered the uh first DGX-1 to
OpenAI that DGX by the way was
170
teraflops that's
0.17 petaflops so this is
720 and so this is now 720 petaflops
almost an exaflop for training and the
world's first one exaflops machine in
one
rack just so you know there are only a
couple two three exaflops machines on
the planet as we
speak and so this is an exaflops AI
system in one single rack well let's
take a look at the back of it so this is
what makes it possible that's the back
that's the that's the back the DGX NVLink
spine 130 terabytes per
second goes through the back of that
chassis that is more than the aggregate
bandwidth of the internet we could
basically send everything to everybody
within a second 5,000 NVLink cables in
total two
miles now this is the amazing thing if
we had to use Optics we would have had
to use transceivers and retimers and those
transceivers and retimers alone would have
cost
20,000
watts 20 kilowatts of just transceivers
alone just to drive the NVLink spine as
a result we did it completely for free
over NVLink switch and we were able to
save the 20 kilowatts for computation this
entire rack is 120 kilowatts so that 20
kilowatts makes a huge difference it's
liquid cooled what goes in is 25° C
about room temperature what comes out is
45° C about your jacuzzi so room
temperature goes in jacuzzi comes out 2
L per second we could sell a
peripheral 600,000 Parts somebody used
to say you know you guys make gpus and
we do but this is what a GPU looks like
to me when somebody says GPU I see this
two years ago when I saw a GPU was the
HGX it was 70 lbs 35,000 parts our gpus
now are 600,000 parts and 3,000 lb okay
so 3,000 lb ton and a half so it's not
quite an elephant now let's see what it
looks like in operation if you were to
train a GPT model 1.8 trillion parameter
model it took about 3 to 5 months or so
uh with 25,000 Amperes uh if we were to
do it with Hopper it would probably take
something like 8,000 gpus and it would
consume 15 megawatt 8,000 gpus on 15
megawatts it would take 90 days about
three months if you were to use
Blackwell to do this it would only take
2,000
gpus 2,000 gpus same 90 days but this is
the amazing part only four megawatts of
power so from 15 yeah that's
right Blackwell would be the most
successful product launch in our history
and so I can't wait to see that let's
talk about the next wave of Robotics the
next wave of AI robotics physical
AI so far all of the AI that we've
talked about is one
computer data comes into one computer we
take all of the data we put it into a
system like dgx we compress it into a
large language model trillions of tokens
becomes billions of parameters these
billions of parameters becomes your AI
so I just described in very simple terms
essentially what just happened in large
language models except the ChatGPT
moment for robotics may be right around
the corner and so we've been building
the end to end systems for robotics for
some time I'm super super proud of the
work we have the AI system
DGX we have the lower system which is
called AGX for autonomous systems the
world's first robotics processor when we
first built this thing people are what
are you guys building it's an SoC so it's
one chip it's designed to be very low
power but it's designed for high-speed
sensor processing and AI and so so if
you want to run Transformers in a car or
anything um that moves uh we have the
perfect computer for you it's called the
Jetson and so the DGX on top for
training the AI the Jetson is the
autonomous processor and in the middle
we need another computer we need a
simulation engine that represents the
world digitally for the robot so that
the robot has a gym to go learn how to
be a robot we call that virtual world
Omniverse and the computer that
runs Omniverse is called OVX and OVX the
computer itself is hosted in the Azure
Cloud okay and so basically we built
these three things these three systems
on top of it we have algorithms for
every single one now I'm going to show
you one super example of how Ai and
Omniverse are going to work together the
example I'm going to show you is kind of
insane but it's going to be very very
close to tomorrow it's a robotics
building this robotics building is
called a warehouse inside the robotics
building are going to be some autonomous
systems some of the autonomous systems
are going to be called humans and some
of the autonomous systems are going to
be called forklifts and these autonomous
systems are going to interact with each
other of course autonomously and it's
going to be overlooked upon by this
Warehouse to keep everybody out of
Harm's Way the warehouse is essentially
an air traffic controller and whenever
it sees something happening it will
redirect traffic and give new waypoints
just new waypoints to the robots and
the people and they'll know exactly what
to do this warehouse this building you
can also talk to of course you could
talk to it hey and all of this is
running in real time what about all the
robots all of those robots you were
seeing just now they're all running
their own autonomous robotic stack let's
talk about robotics everything that
moves will be robotic there's no
question about that it's safer it's more
convenient and one of the largest
Industries is going to be Automotive
beginning of next year we will be
shipping in Mercedes and then shortly
after that JLR today we're announcing
that BYD the world's largest EV company
is adopting our next generation it's
called Thor Thor is designed for
Transformer engines Thor our next
generation AV computer will be used by
BYD the next generation of robotics will
likely be humanoid
robotics we now have the necessary
technology to imagine generalized humanoid
robotics in a way humanoid robotics is
likely easier and the reason for that is
because we have a lot more training data
that we can provide the robots because
we are constructed in a very similar way
it could be in video form it could be in
virtual reality form we then created a
gym for it called Isaac reinforcement
learning gym which allows the humanoid
robot to learn how to adapt to the
physical world and then an incredible
computer the same computer that's going
to go into a robotic car this computer
will run inside a humanoid robot called
Thor it's designed for Transformer
engines the soul of
Nvidia the intersection of computer
Graphics physics artificial intelligence
it all came to bear at this moment the
name of that project general robotics
003 I know
super
good super
good well I think we have some special
guests do
we hey
guys so I understand you guys are
powered by
Jetson they're powered by Jetson
little Jetson robotics computers inside
they learn to walk in Isaac
Sim ladies and gentlemen this this is
orange and this is the famous green they
are the bdx robots of
Disney amazing Disney
research come on you guys let's wrap up
let's go
five things where you
going what are you
saying no it's not time to
eat it's not time to
[Music]
eat I'll give I'll give you a snack in a
moment let me finish up real quick first
a new Industrial Revolution every data
center should be accelerated a trillion
dollars worth of installed data centers
will become modernized over the next
several years second the computer of
this revolution the computer of this
generation generative AI trillion
parameters this is what we announce to
you today this is Blackwell amazing
amazing processors NVLink switches
networking systems and the system design
is a miracle this is Blackwell and this
to me is what a GPU looks like in my
mind everything that moves in the future
will be robotic you're not going to be
the only one and these robotic systems
whether they are humanoid AMRs
self-driving cars forklifts manipulating
arms they will all need one thing giant
stadiums warehouses factories there are going
to be factories that are robotic
manufacturing lines that are robotics
building cars that are robotics these
systems all need one thing they need a
platform a digital platform a digital
twin platform and we call that Omniverse
the operating system of the robotics
world thank
you thank you have a great have a great
GTC thank you all for coming thank you