#JensenHuang Makes a Surprise Appearance at #Supermicro #CharlesLiang's #COMPUTEX Keynote | Full Highlights

CommonWealth Magazine (天下雜誌) video
5 Jun 2024 · 21:35

Summary

TL;DR: In this engaging keynote appearance, Nvidia CEO Jensen Huang discusses the transformative impact of AI and accelerated computing on data centers. Highlighting the convergence of efficiency and performance, he introduces the concept of 'Green Computing' and the emergence of generative AI. Huang emphasizes the need to modernize data centers to harness these technologies, an installed base projected to grow from roughly $1 trillion today to $3 trillion by 2030. He also showcases new products, including advanced liquid cooling systems, designed to optimize energy consumption and computational throughput, ultimately driving revenue in what he calls 'AI factories.' The talk underscores the importance of safety, technology, and policy advancements in the AI domain.

Takeaways

  • 🧠 AI is revolutionizing computing with the advent of accelerated computing and Green Computing, focusing on energy efficiency.
  • 📈 The demand for AI is soaring, with data centers needing to be modernized to handle the transition to generative AI, which is expected to impact every data center globally.
  • 💡 Green Computing is not just an environmentally friendly approach but also a cost-efficient one, aiming to reduce waste and energy consumption in data centers.
  • 🚀 Nvidia and Supermicro are at the forefront of this change, with Supermicro announcing 220 new products aimed at accelerating data centers.
  • 💧 Supermicro is shipping direct liquid cooling (DLC) in production to lower power consumption, freeing energy so more AI chips can be deployed.
  • 🔢 The scale of operations is massive: Supermicro is shipping up to 1,000 racks per month, with a goal of more than 10,000 this year.
  • 🔑 The importance of software cannot be overstated, as it plays a crucial role in the performance and efficiency of high-performance computing systems.
  • 🌐 Networking is evolving into a computing fabric, facilitating distributed computing and the efficient distribution of workloads across networks.
  • 🔄 Checkpoint restart is a vital feature for high uptime and utilization in AI systems, and new technologies like Grace CPU are designed with this in mind.
  • 🛡️ Safety is a paramount concern in AI, with the need for guardrails, monitoring systems, and good practices to ensure the responsible advancement of AI technology.
  • 🌟 The future of AI is bright, with the potential for significant revenue generation through the creation and utilization of intelligent tokens across various industries.

Q & A

  • What is the significance of the term 'Green Computing' as mentioned in the transcript?

    -In the context of the transcript, 'Green Computing' refers to energy-efficient computing. It's about making data centers more efficient and reducing wasted energy and costs, which is a key focus for Nvidia and the future of computing.

  • What is the current state of CPU scaling according to the transcript?

    -The transcript mentions that CPU scaling has slowed for many years, leading to an enormous amount of wasted energy and cost trapped inside data centers.

  • What is the role of accelerated computing in the context of data centers?

    -Accelerated computing is crucial for making data centers more efficient. It helps to release the trapped waste and use that energy for new purposes, such as accelerating every application and data center.

  • What does the term 'generative AI' refer to in the transcript?

    -Generative AI in the transcript refers to the process of AI generating new content such as text, images, and videos. It is a significant shift in computing that will affect every data center globally.

  • How does the speaker describe the impact of generative AI on data centers?

    -The speaker suggests that the transition to generative AI will impact every single data center in the world, necessitating the modernization of the roughly one trillion dollars' worth of data centers already installed.

  • What does the figure '1,000 R' mentioned in the transcript refer to?

    -The '1,000 R' figure refers to Supermicro shipping up to 1,000 racks per month, part of the effort to lower power consumption with liquid cooling and to scale up production of AI systems.

  • How does direct liquid cooling (DLC) contribute to AI chip deployment according to the transcript?

    -Direct liquid cooling (DLC) is shipping in production to lower power consumption, freeing energy budget for more AI chips and improving the efficiency and performance of data centers.

  • What is the goal for the year mentioned by the speaker in relation to shipping?

    -The goal for the year, as mentioned in the transcript, is to ship more than 10,000 racks, a significant increase in production and shipping targets.

  • What does the speaker mean by 'AI factories'?

    -The term 'AI factories' refers to data centers that directly generate revenue by producing AI output, such as generated tokens, rather than merely storing files or exchanging emails.

  • How does the speaker view the future of computing throughput and its relation to revenue?

    -The speaker views computing throughput as directly tied to revenue generation. The faster the generation of tokens (which embed intelligence), the higher the throughput, utilization, and consequently, the revenues.

  • What is the importance of software compatibility in high-performance computing as discussed in the transcript?

    -Software compatibility is crucial in high-performance computing because it allows for the seamless integration of systems and ensures that all components work together efficiently, which is vital for achieving high performance and efficiency in data centers.

  • What are the three important software stacks mentioned in the transcript?

    -The three important software stacks mentioned are CUDA, famous for parallel computing; the networking stack that turns the network into a computing fabric; and the distributed-computing layer (rendered in the transcript as 'DOA' and 'nickel', i.e. DOCA for the NICs and NCCL for the fabric), which distributes workloads across the network efficiently.

  • How does the speaker emphasize the importance of safety in AI?

    -The speaker emphasizes the importance of safety in AI by comparing it to autopilot in airplanes, stating that just as many technologies and practices were needed to keep autopilot safe, similar measures will be necessary for AI, including guardrails, monitoring systems, and good policies.

Outlines

00:00

🤖 AI and Green Computing Revolution

The speaker, Jensen Huang, CEO of Nvidia, discusses the transformative impact of AI and the concept of Green Computing. He emphasizes the arrival of accelerated computing to address the exponential growth in data processing and the inefficiency of CPU scaling. The talk highlights the potential for significant energy and cost savings in data centers through the adoption of accelerated computing. Huang also introduces the idea of generative AI, which involves creating new content such as text, images, and videos, and predicts its widespread influence on data centers globally. The conversation touches on the preparation of Supermicro to modernize these data centers with new systems and products, reflecting the urgent need for such advancements due to the growing demand and the potential for energy efficiency and cost savings.

05:01

🚀 Advancements in AI Chip Technology and Liquid Cooling

The discussion shifts to the specifics of AI chip production and the steps taken to improve power efficiency. The speaker mentions that direct liquid cooling (DLC) systems are now shipping in production to reduce power consumption, freeing energy for more AI chips. The conversation includes playful banter about mutual incomprehension (听不懂, tīng bù dǒng) between Chinese- and English-speaking colleagues, highlighting cultural and language barriers. The speaker also provides insight into the massive scale of operations, with up to 1,000 racks shipping per month, and the technological marvel of building the most advanced computers in the world. The summary underscores the importance of energy efficiency in data centers and the financial implications of these advancements, introducing the concept of token generation as a new commodity with monetary value.
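
The token-as-commodity idea in this section can be made concrete with a small calculation. This is a hedged sketch: the throughput, price, and utilization figures below are hypothetical illustrations, not figures from the talk.

```python
# Illustration of the "dollars per million tokens" framing: sustained token
# throughput, a market price, and utilization together determine revenue.
# All numbers below are hypothetical, chosen only to show the arithmetic.

def monthly_token_revenue(tokens_per_second, price_per_million_tokens, utilization):
    """Revenue from serving tokens over one 30-day month."""
    seconds_per_month = 60 * 60 * 24 * 30
    tokens = tokens_per_second * seconds_per_month * utilization
    return tokens / 1_000_000 * price_per_million_tokens

# e.g. a rack sustaining 100,000 tokens/s at $2 per million tokens, 80% utilized:
revenue = monthly_token_revenue(100_000, 2.00, 0.8)
print(f"${revenue:,.0f} per month")  # → $414,720 per month
```

This is why the talk treats throughput and utilization as direct revenue levers: each multiplies straight into the result.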

10:02

🏭 Transforming Data Centers into AI Factories

The speaker elaborates on the transition of traditional data centers into AI factories, which are revenue-generating entities. He explains that these AI factories are not merely for file retrieval or email exchanges but are directly involved in generating income through AI computations. The speaker discusses the significance of factors like reliability, throughput, and startup time in maximizing factory output and revenue. The conversation also covers the integration of systems into a rack scale for efficient operation and the importance of software compatibility. The speaker assures that all systems are ready to serve customers, emphasizing the readiness of Supermicro to meet the demands of the AI factory era.

15:03

🛠️ The Importance of Software Stacks in AI Development

The speaker delves into the crucial role of software stacks in the development and operation of AI systems. He mentions CUDA as a renowned software stack and discusses the importance of networking as a computing fabric, especially in the context of high-performance computing. The speaker highlights the advancements in networking speeds, moving from the megahertz era to 400 and 800 gigabit-per-second links, with 1,600 Gb/s coming next, and the significance of the software that runs on top of these networks for distributed computing. The conversation also touches on the integration of various chips and systems, such as the Grace CPU and Blackwell GPUs, and the energy efficiency of high-speed interconnects. The speaker concludes by emphasizing that energy efficiency translates into higher performance, the essence of Green Computing.
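
The interconnect-saving argument above is simple arithmetic: a per-link saving, multiplied across a rack, becomes power reclaimable for compute. The link count below is a hypothetical assumption for illustration; the per-link figure is the upper end of the 20-50 W range mentioned in the talk.

```python
# Back-of-envelope version of the interconnect power argument: small per-link
# savings compound across a dense rack into a meaningful compute budget.
watts_saved_per_link = 50    # upper figure from the talk's 20-50 W range
links_per_rack = 72          # hypothetical: one high-speed link per GPU in a rack
rack_saving = watts_saved_per_link * links_per_rack
print(rack_saving)           # → 3600 watts freed per rack for additional compute
```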

20:05

🛡️ Prioritizing Safety and Advancement in AI

In the final paragraph, the speaker addresses the importance of safety and continuous advancement in AI technology. He draws a parallel between the safety measures implemented in aviation, such as autopilot systems, air traffic control, and pilot monitoring, to the safety measures needed for AI. The speaker stresses the need for guardrails, monitoring systems, and good practices to ensure the safe operation of AI. He also underscores the importance of good policies and the collective responsibility to advance good science, engineering, business, and industrial practices. The speaker concludes with a humorous note, suggesting that buying more products equates to increased safety, highlighting the company's commitment to providing safe and advanced AI solutions.

Keywords

💡AI

AI, or Artificial Intelligence, refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the video, AI is central to discussions about the future of computing and data centers, as well as generative AI, which involves AI creating content such as text, images, and videos.

💡Accelerated Computing

Accelerated Computing is a concept where computational tasks are performed faster than they would be on traditional CPUs. The script mentions that accelerated computing has arrived at a time when CPU scaling has slowed down, and it is crucial for handling the exponential increase in data processing with more efficiency and less energy waste.
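
The "trapped waste" claim behind accelerated computing can be sketched numerically: if an accelerated node replaces many CPU nodes at a fraction of the total power, the freed power budget can host new workloads. All figures below are illustrative assumptions, not benchmarks.

```python
# Rough sketch of the trapped-waste argument: same work, fewer accelerated
# nodes, lower total power, and the difference is reclaimable for new work.
cpu_nodes, watts_per_cpu_node = 100, 500       # hypothetical CPU fleet
accel_nodes, watts_per_accel_node = 4, 3000    # hypothetical accelerated nodes
before = cpu_nodes * watts_per_cpu_node        # 50,000 W for the CPU fleet
after = accel_nodes * watts_per_accel_node     # 12,000 W accelerated
print(before - after)                          # → 38000 W freed for new purposes
```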

💡Green Computing

Green Computing is the practice of designing, manufacturing, using, and disposing of computers and computing resources efficiently and effectively to minimize environmental impact. In the script, green computing is associated with energy-efficient computing, which Nvidia is focusing on to reduce waste and save energy in data centers.

💡Generative AI

Generative AI is a subset of AI that focuses on creating new content rather than just recognizing or classifying existing content. The script discusses generative AI's potential to transform every data center by generating text, images, and videos, which is a significant shift from traditional inference tasks.

💡Data Centers

Data Centers are large facilities that house numerous servers used to store, process, and manage large amounts of data. The script highlights the need to modernize existing data centers, which are estimated to be worth a trillion dollars, with new technologies to improve efficiency and performance.

💡Supermicro

Supermicro is a company mentioned in the script that specializes in high-performance server technology, storage solutions, and computing. The CEO of Nvidia, Jensen Huang, discusses Supermicro's readiness to provide products and services for the modernization of data centers.

💡Liquid Cooling

Liquid Cooling is a method of cooling that uses a liquid coolant to absorb and dissipate heat from computer components. The script mentions that Supermicro is now shipping direct liquid cooling (DLC) products to reduce power consumption and enable more AI chips to be deployed.
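
The physical basis for liquid cooling's advantage is not stated in the talk, but follows from standard textbook values: water stores far more heat per unit volume than air, so far less coolant flow removes the same heat.

```python
# Rough physics behind liquid cooling (standard handbook values, not from the
# talk): volumetric heat capacity = specific heat (J/(kg*K)) x density (kg/m^3).
water_j_per_m3_per_k = 4186 * 997   # liquid water near room temperature
air_j_per_m3_per_k = 1005 * 1.2     # air at roughly sea-level conditions
ratio = water_j_per_m3_per_k / air_j_per_m3_per_k
print(round(ratio))                 # water carries ~3,500x more heat per volume
```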

💡GPU

A GPU, or Graphics Processing Unit, is a specialized electronic circuit originally designed to rapidly manipulate memory to accelerate the creation of images for display, and today also used for general-purpose parallel computation. In the script, GPUs are discussed as a critical component in the complex systems that make up modern data centers and AI factories.

💡CUDA

CUDA, which stands for Compute Unified Device Architecture, is a parallel computing platform and application programming interface (API) model created by Nvidia. It allows software developers to use Nvidia GPUs for general-purpose processing (GPGPU). The script refers to CUDA as a foundational software stack for building applications on top of Nvidia's technology.

💡Networking

In the context of the script, networking refers to the infrastructure that allows computers to communicate with each other within a data center or across the internet. The script discusses the evolution of networking from simple communication to a computing fabric that is essential for distributed computing and high-performance computing.
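
The "computing fabric" idea, scatter a workload, compute in parallel, and combine the results, can be sketched in miniature. This toy uses Python's thread pool purely for illustration; it is not Nvidia's actual networking software (DOCA, NCCL).

```python
# Toy scatter/compute/combine pattern, echoing the distributed-computing role
# the talk assigns to the network fabric. Threads stand in for networked nodes.
from concurrent.futures import ThreadPoolExecutor

def shard_work(shard):
    # Each worker handles its own shard (here: a simple sum of squares).
    return sum(x * x for x in shard)

def distributed_sum_of_squares(data, workers=4):
    shards = [data[i::workers] for i in range(workers)]   # scatter round-robin
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(shard_work, shards))     # compute in parallel
    return sum(partials)                                  # combine the partials

print(distributed_sum_of_squares(list(range(1000))))      # matches the serial sum
```

The efficiency argument in the talk is about exactly this combine step: on real clusters, how fast partial results move across the fabric bounds the whole computation.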

💡Checkpoint Restart

Checkpoint Restart is a technique used in computing to save the state of a process or computation at certain points (checkpoints) so that it can be resumed if it is interrupted. The script mentions the importance of checkpoint restart for high utilization and uptime in data centers, particularly during the training phase of AI models.
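
The checkpoint/restart pattern described above can be sketched in a few lines. This is a minimal file-based illustration; the filename, state layout, and training loop are hypothetical stand-ins, not any framework's real API.

```python
# Minimal checkpoint/restart sketch: periodically save training state so an
# interrupted run resumes from the last checkpoint instead of starting over.
import json
import os

CKPT = "train_state.json"  # illustrative filename

def save_checkpoint(step, state):
    with open(CKPT, "w") as f:
        json.dump({"step": step, "state": state}, f)

def load_checkpoint():
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            ckpt = json.load(f)
        return ckpt["step"], ckpt["state"]
    return 0, {"loss": None}  # fresh start when no checkpoint exists

def train(total_steps=100, checkpoint_every=10):
    step, state = load_checkpoint()       # resume if a checkpoint exists
    while step < total_steps:
        step += 1
        state["loss"] = 1.0 / step        # stand-in for a real training update
        if step % checkpoint_every == 0:
            save_checkpoint(step, state)  # bounds the work lost to a failure
    return step, state

step, state = train(total_steps=20, checkpoint_every=5)
print(step)  # → 20
```

The point the talk makes is that checkpoint frequency trades overhead against lost work; storing checkpoints in fast, low-power CPU memory (as described for Grace) lowers that overhead.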

Highlights

Nvidia CEO Jensen Huang shares his vision on AI and its impact on computing.

Accelerated Computing and Green Computing are two concurrent trends shaping the future of data centers.

Data processing needs have grown exponentially, while CPU scaling has slowed, leading to energy and cost inefficiencies.

Accelerating data centers can lead to significant savings by reducing waste.

Supermicro introduces 220 new products, underscoring its commitment to innovation.

Generative AI is emerging as a new form of computing, distinct from inference.

Generative AI will transform every data center, with the installed base projected to reach $3 trillion by 2030.

The demand for modernizing data centers with advanced systems is immense.

Supermicro is ready to support the modernization with their products and services.

Supermicro is shipping direct liquid cooling (DLC) systems to reduce power consumption.

The significance of energy efficiency in AI chip manufacturing and its impact on the industry.

Supermicro's goal to ship more than 10,000 racks this year, highlighting its growth and ambition.

The complexity and technological marvel of Nvidia's GPUs and the systems they are part of.

The importance of software in running advanced computer systems and its role in Nvidia's offerings.

Nvidia's focus on three key software stacks: CUDA, networking as a computing fabric, and distributed computing.

The innovative use of liquid cooling systems to eliminate costs and improve data center efficiency.

The concept of token generation as a new commodity in the AI industry, with direct revenue implications.

The transformation of data centers into AI factories, emphasizing their role in generating revenue.

The integration of systems into rack scale for efficient startup, utilization, and throughput.

Nvidia's commitment to advancing AI technology, safety, and policy to ensure responsible AI development.

Transcripts

play00:00

I know only

play00:01

some fortunately we are very lucky again

play00:05

to invite the AI

play00:07

genius our common

play00:13

friend our common friend is very busy

play00:15

huh Invidia found CEO Jensen to share

play00:20

his great vision with us

play00:23

[Applause]

play00:28

[Music]

play00:29

[Applause]

play00:29

[Music]

play00:33

thank

play00:33

you hi

play00:38

everybody now

play00:41

what that AI is changing minium because

play00:47

of you what's new

play00:51

today I have to admit just now when I

play00:55

was coming to your keynote in the car I

play00:58

fell asleep

play01:01

and so right now right now I'm a little

play01:04

bit groggy so if I say nonsense things

play01:08

please I let me apologize first no well

play01:12

let's see um uh Charles we've gone back

play01:15

a very long ways yeah and and um uh what

play01:20

are we doing oh I needed some water I

play01:23

need to spe up okay right my energy

play01:28

yeah they said I was on this side and

play01:30

you keep going on my

play01:33

side this is what happens when we don't

play01:36

practice you don't need to and you are

play01:39

no time you you don't need and so so um

play01:43

I uh what what were we saying um this is

play01:46

a very important time because we have a

play01:47

new agent Computing coming there are two

play01:50

things that are happening at the same

play01:51

time the first is accelerated Computing

play01:54

accelerated Computing has arrived at a

play01:57

time

play01:58

oh Green Computing

play02:01

yeah Green computer yeah okay

play02:11

Computing I think I think when you say

play02:14

Green Computing you mean energy

play02:15

efficient Computing right yes Nvidia is

play02:18

energy efficient Computing yes we have S

play02:21

we follow you all

play02:23

right look Green Computing and Green

play02:27

Computing all right so so um uh

play02:30

accelerated computing's time has come

play02:32

because for a very long time the amount

play02:35

of data processing has been increasing

play02:38

exponentially yeah and yet CPU scaling

play02:41

has slowed for many many years so we've

play02:44

been we have now an enormous amount of

play02:47

waste wasted energy and wasted cost

play02:50

trapped inside the data centers so when

play02:53

we accelerate the data centers the

play02:56

savings

play02:57

incredible because it has been sold long

play03:00

of waste

play03:02

trapped and so now we can release the

play03:05

waste and use that energy for a new

play03:10

purpose number one accelerate every

play03:12

application accelerate every data center

play03:15

these

play03:16

amazing servers here right so many new

play03:19

products so many new products you have

play03:21

220 new products unbelievable did he

play03:24

tell you that already no I W very high

play03:30

I came to announce super micros products

play03:33

and so that's the first thing the second

play03:35

thing is because the Energy Efficiency

play03:38

and the performance efficiency and the

play03:39

cost efficiency is so incredibly great

play03:42

with accelerated Computing a new way of

play03:44

doing Computing has emerged and it's

play03:46

called generative AI generative AI is an

play03:50

incredible thing people say generative

play03:53

AI inference it's related not the same

play03:56

inference

play03:58

recognizing C dog speech inference

play04:03

generation text Generation image

play04:06

Generation video generation that's what

play04:09

we call a generative AI the pressure of

play04:12

generative AI to not the pressure but

play04:15

the the transition to generative AI will

play04:18

affect every single data center in the

play04:19

world we have a trillion dat a trillion

play04:22

dollars worth of data centers in the

play04:23

world that's established $3 trillion

play04:26

probably by 2030 in another 6 years we

play04:29

have to to modernize all of them with

play04:32

these amazing systems yeah that's the

play04:35

reason why the demand is so great

play04:36

because all of these data centers has to

play04:38

be modernized and Charles and the super

play04:42

micro team is ready to take your

play04:47

order Json I'm your I'm your best sales

play04:51

guy thank you I work on commission no

play04:57

commission we buy more cheaper from you

play04:59

don't buy more

play05:00

[Laughter]

play05:05

chips so

play05:08

that's Jon sh Michael is now shipping

play05:12

data center uh liqu cooling DLC R inum

play05:16

production now to lower the power

play05:18

consumption so you can manufacture more

play05:21

AI chip yeah yeah thousand of how here

play05:25

you see

play05:31

[Applause]

play05:39

[Laughter]

play05:43

I have many American colleagues they

play05:45

don't understand my Chinese I have many

play05:47

Chinese colleagues they don't understand

play05:49

my

play05:52

Chinese hi

play05:56

y we are shipping up to 1,000 R per

play06:01

month now 1,000 R like it is multiply by

play06:05

ASP yeah you're going to be a gigantic

play06:08

company yeah thank

play06:11

you that's why I need a more

play06:14

CH did you guys all do the

play06:17

math Millions

play06:19

times thousands time 52 no no no you

play06:24

charging me $2 million more than $2

play06:26

million for d

play06:31

[Laughter]

play06:37

are we allowed to do this on TV are we

play06:40

on

play06:41

TV I I guess the well is

play06:49

this so we are shipping about 1,000

play06:53

that's incredible now this this uh

play06:55

600,000 Parts this is probably more than

play06:57

600,000 parts how many pounds oh I don't

play07:02

know can I move three I think it's 3,000

play07:06

lb more than 3,000 lb

play07:09

yeah it's incredible so yeah our goal

play07:13

this year is to ship more than 10,000

play07:17

record you know the Charles this is the

play07:19

thing that's really amazing uh people

play07:21

think that we're building

play07:22

gpus you know GP is a

play07:25

chip there are 72 chips in here and then

play07:28

there are six

play07:30

600,000 other

play07:32

parts

play07:33

it's 72 chips probably weighs one

play07:37

pound this is 3, 2,999 other

play07:42

pounds so the amount of Technology

play07:44

that's inside one of these RS is really

play07:46

quite extraordinary this is a technology

play07:48

Marvel the most most most complex most

play07:52

advanced computer the world's ever made

play07:54

yeah exactly the p in the world now yeah

play07:57

absolutely incredible and the software

play07:59

that it takes to run this is

play08:01

unbelievable yeah unbelievable isn't

play08:03

that right and so I think that that

play08:06

people now are starting to realize that

play08:08

when we say GPU server of course the

play08:11

brain is the GPU yeah but the system is

play08:14

much much more complex than that and

play08:15

super micro does amazing engineering

play08:18

thank

play08:20

[Laughter]

play08:27

you huh what I

play08:37

okay then we there's some Americans this

play08:40

year we are going to ship hopefully make

play08:44

when we're together sometimes we speak

play08:46

Taiwanese sometimes we speak Mandarin

play08:48

and then when we disagree we speak

play08:50

[Laughter]

play08:53

English we try to make a thlc mar share

play08:56

from 1% to 15%

play09:00

this year wow Save lot of power for your

play09:02

TB yeah yeah the Energy Efficiency is so

play09:05

much better the cost to the data center

play09:07

is cheaper cheaper that's right people

play09:09

don't realize this liquid cooled systems

play09:12

eliminates an enormous amount of cost in

play09:14

the data center yeah so that you can use

play09:17

that waste capture that waste and put it

play09:20

into Computing in the future in the

play09:23

future Computing throughput is

play09:26

revenues because it's token generation

play09:30

and token generation is dollars per

play09:34

million tokens just like

play09:37

energy dollars per kilowatt hour we have

play09:42

now invented a new commodity this is a

play09:44

very important idea for all of you this

play09:46

is a new commodity it has value and the

play09:50

faster you can generate it the higher

play09:53

throughput the greater utilization the

play09:56

higher your revenues it is absolutely

play09:59

true and it's directly measurable that's

play10:01

why this is a factory not a data center

play10:04

that's why this is a factory not a file

play10:06

server it's not a retrieval of files

play10:09

it's not used for exchanging emails this

play10:11

is directly generating revenues for

play10:14

factories that's why we call it AI

play10:16

factories and so

play10:18

powerful and only s million

play10:23

[Laughter]

play10:27

dollars a

play10:30

such okay so $3 million and you can

play10:34

generate who knows how much revenue per

play10:37

year right uh 3 million 1,000 and every

play10:42

year have how many

play10:44

months

play10:46

12 the the return on the return on large

play10:50

language model generation token

play10:52

generation is going to be very very good

play10:54

yeah be huge and the reason for that is

play10:56

because the token embeds intelligence

play10:58

yeah and the int could be used in so

play11:00

many different Industries and so the

play11:02

future is very important it's time to

play11:08

Startup yeah time to

play11:10

Startup throughput yeah

play11:15

utilization all matter so

play11:18

reliability has Revenue implication

play11:20

throughput has Revenue implication

play11:23

startup has Revenue implication yeah

play11:26

that's why it's so important that we

play11:27

integrate the whole s whole system into

play11:29

a rack scale get all the software

play11:32

working connected to all the all the

play11:34

networking so that and we build all of

play11:37

our own data centers we build our own

play11:39

supercomputers so that we know when you

play11:42

install this when you install super

play11:44

micro in your factories the startup time

play11:47

will be extremely fast your utilization

play11:50

will be extremely high and your

play11:52

throughput will be extremely high

play11:54

because your revenues depends on it

play11:57

Factory output is measured by all of

play12:00

those factors very complicated yeah and

play12:03

all of those R are Invidia sofware

play12:07

license all certified so the sound of

play12:10

that parking the cable and they can run

play12:13

and it runs that's right and all of the

play12:15

Nvidia Nims all of the large language

play12:17

models it just runs on all these systems

play12:19

yeah

play12:29

[Laughter]

play12:32

[Applause]

play12:37

we are shipping thousand R

play12:41

very

play12:52

yes very

play12:54

beautiful Charles Charles said that this

play12:57

is

play12:58

everything everything in here is NVIDIA

play13:01

for all the American citizens

play13:04

there

play13:06

from to H AI everything all Nvidia sare

play13:11

all all Green Computing all Green

play13:14

Computing all green computer all all

play13:17

support that's

play13:19

fantastic

play13:20

good let go through something detail

play13:29

okay okay

play13:32

okay H1 H2 B1 for you cooling wow

play13:39

shipping in B wow and this one your p200

play13:44

uhhuh fully ready beautiful for your

play13:46

chip beautiful beautiful this will be

play13:50

how many time faster than this so we

play13:53

have we have we have uh uh for Blackwell

play13:57

Blackwell has air cool

play14:00

liquor

play14:01

cooled

play14:03

x86

play14:05

Grace MV link 8 MV link 2 MV link 36 MV

play14:11

link 72 yeah so many different

play14:14

configurations yeah so that depending on

play14:16

the type of type of utilization type of

play14:19

use case you have the type of data

play14:21

center that you have uh Charles is ready

play14:23

to serve you immediately right

play14:25

immediately doesn't need to acheve yeah

play14:27

one hand we got to acheve second hand we

play14:30

Shi to C W thank goodness we only need

play14:33

two hands in two weeks in two

play14:39

weeks that's incredible and all of it

play14:41

software compatible this is really this

play14:44

is really the amazing thing certifi

play14:45

literally everything here is software

play14:47

compatible one% yeah and software as we

play14:50

know is the most complex part of high

play14:52

performance Computing yeah thank you for

play14:54

those great offering they are all ready

play14:57

to service our customer there are three

play14:59

very important software Stacks that we

play15:01

have in our company that everything is

play15:03

built on top of the first of course is

play15:05

Cuda very famous the second for all of

play15:07

the networking because networking is

play15:09

just not networking networking today

play15:12

networking today is a Computing

play15:15

fabric networking today is a Computing

play15:17

fabric not just for sending email to

play15:19

each

play15:20

other

play15:21

4 Mez a gigahertz megahertz this is not

play15:27

1980s

play15:35

be

play15:36

Mez

play15:38

kilohertz gahz gigahertz yes 400

play15:43

gigabits per second 800 gigabits per

play15:45

second and and then of course Next

play15:46

Generation coming 1600 but the important

play15:48

thing is all of the software that we

play15:51

have that runs on the networking for

play15:54

distributed computing is on top of two

play15:58

software Stacks one is called DOA for

play16:00

the nick nickel for the fabric yeah and

play16:04

it enables us to distribute the workload

play16:07

across the network very very efficiently

play16:10

because ethernet was was not designed

play16:12

for hyperform computing you make our job

play16:14

easier but still very py because you

play16:16

have so many

play16:19

great my job is to help give you

play16:23

[Laughter]

play16:27

job we

play16:29

and because because you do such a good

play16:31

job it becomes gives me job oh don't

play16:34

forget that your another

play16:40

baby yeah yeah yeah

play16:50

yeah inside

play16:53

here this this is an incredible

play16:57

incredible system in fact in

play17:00

fact in fact um these chips are all

play17:02

connected together using high-speed

play17:05

interconnect the world's fastest CIS the

play17:07

CIS is incredibly fast and very energy

play17:11

efficient and so we can connect this

play17:14

great CPU to dual Blackwell gpus and

play17:20

that's very important because in the

play17:22

training stage the memory system of

play17:26

Grace could be used for checkpoint

play17:27

restart checkpoint and restarting is

play17:30

very important for high utilization and

play17:32

high uptime and so checkpoint restart uh

play17:35

could be stored in the system memory

play17:37

that system memory is very low energy

play17:39

very low power and the link between

play17:42

Blackwell and Grace is very very high

play17:44

second during inference time as you know

play17:47

there's a concept called

play17:49

prompts context in context training

play17:54

prompting that prompt memory that

play17:56

context memory is right here this is the

play17:58

memory memory the thinking memory the

play18:00

working memory of AI and so this memory

play18:03

needs to be very high performance very

play18:05

low energy and so during training we

play18:07

have good use for gray gray CPU during

play18:10

inference we have excellent use for gray

play18:12

CPU and the interconnect is very very

play18:15

high speed very low power F optimiz and

play18:17

so the re the benefit is because we

play18:20

compress so many in one system yeah if

play18:23

we

play18:24

save 20 watts 50 Watts on the

play18:27

interconnect you multiply by the whole

play18:30

rack then we can take the energy and use

play18:32

it for computing y so Energy Efficiency

play18:36

translates to higher performance to

play18:39

that's right Green

play18:42

[Applause]

play18:45

Computing

play18:52

huh I am a super micro employee

play18:59

super micro

play19:06

employee where AI Control

play19:12

us of course not um we we have to we

play19:16

have to uh the most important thing of

play19:19

course at the moment is we have to make

play19:21

AI work

play19:23

well right now ai is of course uh

play19:26

working extremely well and in many

play19:28

applications AI has become good enough

play19:31

to good enough to become useful it has

play19:35

achieved the plateau of good enough very

play19:37

useful however we want it to be

play19:40

incredibly good we want it to be very

play19:42

functional everything from Guard railing

play19:45

for uh fine-tuning skill learning there

play19:48

are many different things that we still

play19:50

have to improve okay so we know that AI

play19:52

is AI still has long ways to go that's

play19:55

job number one is Advance the technology

play19:57

at the same time we have to advanced

play19:59

Safety technology as you know uh our the

play20:02

planes that we all flew on to come here

play20:05

has autopilot and autopilot is automatic

play20:08

technology in order for planes to be

play20:11

safe a great deal of Technology had to

play20:13

be invented to keep the plane safe yeah

play20:16

also practices to monitor the planes air

play20:19

traffic control other planes monitor the

play20:22

planes Pilots monitoring each other many

play20:24

different ways to keep uh AI uh keep

play20:28

autopilot safe in the future we'll do

play20:30

the same thing with AI there will be AIS

play20:32

that watch AIS there are people that

play20:33

watch AIS there's gu right guard rails

play20:36

that keep AI guard rail and so there's

play20:38

going to be a whole lot of different

play20:39

Technologies we need to create for

play20:41

safety technology for safety and then

play20:43

third of course we need to have good

play20:45

policies for safety good practices and

play20:47

good policies for safety talking about

play20:50

it is very important so that we can all

play20:52

remind each other that we have to do

play20:54

good science good engineering good

play20:57

business practice good policy practice

play20:59

good industrial practice all of those

play21:01

things has to advance so perfect

play21:03

strategy so the conclusion is one the

play21:07

more you buy the more

play21:10

safe the more you buy the more you safe

play21:12

the more you buy the more you safe

play21:15

yeah thank you Jas thank you so much

play21:18

good job thank you

play21:20

everybody thank you okay thank you thank

play21:25

you thank you thank you all right have a

play21:28

great

play21:29

thank you

play21:32

[Music]


Related Tags
AI Future, Green Computing, Tech Event, Innovation, Data Centers, Accelerated Computing, Generative AI, Energy Efficiency, Supermicro Products, Nvidia Technology