Future Computers Will Be Radically Different (Analog Computing)

Veritasium
1 Mar 2022 · 21:42

Summary

TL;DR: This script explores the resurgence of analog computing in the era of artificial intelligence. It explains how analog computers, once dominant for tasks like predicting eclipses and guiding anti-aircraft guns, fell out of favor with the advent of digital computers. However, factors like energy efficiency, speed, and the unique requirements of neural networks may signal a comeback for analog technology. The script delves into the history of AI, the limitations of perceptrons, and the breakthroughs that led to the current AI boom, highlighting the potential of analog computing to perform the matrix multiplications essential for AI with less power and greater efficiency.

Takeaways

  • 🕒 Analog computers were once the most powerful on Earth, used for predicting eclipses, tides, and guiding anti-aircraft guns.
  • 🔄 The advent of solid-state transistors led to the rise of digital computers, which have become the standard for virtually all computing tasks today.
  • 🔧 Analog computers can be reprogrammed by changing the connections of wires to solve a range of differential equations, such as simulating a damped mass oscillating on a spring.
  • 🌐 Analog computers represent data as varying voltages rather than binary digits, which allows for certain computations to be performed more directly and with less power.
  • ⚡ The efficiency of analog computers is highlighted by the fact that adding two currents can be done by simply connecting two wires, compared with the roughly 50 transistors needed to add two eight-bit numbers in a digital computer.
  • 🚫 However, analog computers are not general-purpose devices and cannot perform tasks like running Microsoft Word.
  • 🔄 They also suffer from non-repeatability and inexactness due to the continuous nature of their inputs and outputs and the variability in component manufacturing.
  • 🌟 The potential resurgence of analog technology is linked to the rise of artificial intelligence and the specific computational needs of neural networks.
  • 🤖 The perceptron, an early neural network model, demonstrated the potential of AI but was limited in its capabilities and faced criticism, leading to the first AI winter.
  • 📈 The development of deeper neural networks, like AlexNet, and the use of large datasets like ImageNet have significantly improved AI performance, but also increased computational demands.
  • 🔌 Modern challenges for AI, such as high energy consumption and the limitations of Moore's Law, suggest that analog computing could offer a more efficient solution for certain AI tasks.
  • 🛠️ Companies like Mythic AI are developing analog chips to run neural networks, offering high computational power with lower energy consumption compared to digital counterparts.

Q & A

  • What were analog computers historically used for?

    -Historically, analog computers were used for predicting eclipses, tides, and guiding anti-aircraft guns.

  • How did the advent of solid-state transistors impact the development of computers?

    -The advent of solid-state transistors marked the rise of digital computers, which eventually became the dominant form of computing due to their versatility and efficiency.

  • What is the fundamental difference between analog and digital computers in terms of processing?

    -Analog computers process information through continuous signals, like voltage oscillations, whereas digital computers process information in discrete binary values, zeros and ones.

  • How does the analog computer in the script simulate a physical system like a damped mass on a spring?

    -The analog computer simulates a damped mass on a spring by using electrical circuitry that oscillates like the physical system, allowing the user to see the position of the mass over time on an oscilloscope.

  • What is the Lorenz system and why is it significant in the context of analog computers?

    -The Lorenz system is a set of differential equations that model atmospheric convection and is known for being one of the first discovered examples of chaos. It is significant because analog computers can simulate such systems and visualize their behavior, like the Lorenz attractor.

  • What are some advantages of analog computers mentioned in the script?

    -Advantages of analog computers include their ability to perform computations quickly, requiring less power, and having a simpler process for certain mathematical operations like addition and multiplication.

  • What are the main drawbacks of analog computers as discussed in the script?

    -The main drawbacks of analog computers are their single-purpose nature, non-repeatability of exact results, and inexactness due to component variations, which can lead to errors.

  • Why might analog computers be making a comeback according to the script?

    -Analog computers may be making a comeback due to the rise of artificial intelligence and the need for efficient processing of matrix multiplications, which are common in neural networks, and the limitations being reached by digital computers.

  • What is the perceptron and how was it initially perceived in the context of AI?

    -The perceptron was an early neural network model designed to mimic how neurons fire in our brains. It was initially perceived as a breakthrough in AI, with claims that it could perform original thought and even distinguish between complex patterns like cats and dogs.

  • How did the development of self-driving cars contribute to the resurgence of AI and neural networks?

    -The development of self-driving cars, such as the one created by researchers at Carnegie Mellon using an artificial neural network called ALVINN, demonstrated the potential of neural networks to process complex data like images for steering, contributing to the resurgence of AI.

  • What is the significance of ImageNet and how did it impact AI research?

    -ImageNet, a database of 1.2 million human-labeled images created by Fei-Fei Li, was significant because it provided a large dataset for training neural networks. It led to the ImageNet Large Scale Visual Recognition Challenge, which spurred advancements in neural network performance and understanding.

  • What challenges are neural networks facing as they grow in size and complexity?

    -Neural networks face challenges such as high energy consumption for training, the Von Neumann Bottleneck due to data fetching taking up most of the computation time and energy, and the limitations of Moore's Law as transistor sizes approach atomic levels.

  • How does Mythic AI's approach to using analog technology for neural networks address these challenges?

    -Mythic AI uses analog chips that repurpose digital flash storage cells as variable resistors to perform matrix multiplications, which are common in neural networks. This approach can potentially offer high computational power with lower energy consumption and may bypass some of the limitations faced by digital computers.

  • What are some potential applications of Mythic AI's analog computing technology?

    -Potential applications of Mythic AI's technology include security cameras, autonomous systems, manufacturing inspection equipment, and smart home speakers for wake word detection, offering efficient AI processing with lower power requirements.

  • How does the script suggest the future of information technology might involve a return to analog methods?

    -The script suggests that the future of information technology might involve a return to analog methods due to their potential for efficient processing of tasks like matrix multiplications, which are central to AI and neural networks, and the limitations being reached by digital technologies.

Outlines

00:00

🌌 The Resurgence of Analog Computing

This paragraph introduces the historical significance of analog computers and their potential resurgence due to current technological challenges. Analog computers, once pivotal for predicting eclipses and guiding weaponry, fell out of favor with the rise of digital computers. However, factors such as energy efficiency, speed, and the unique requirements of artificial intelligence are creating a scenario where analog computing could regain relevance. The narrator demonstrates an analog computer's capability to solve differential equations and highlights its advantages, such as requiring fewer components for basic operations compared to digital computers. Despite these advantages, analog computers are limited by their single-purpose nature, non-repeatability, and inexactness, which led to their decline. The paragraph concludes by suggesting that the rise of AI might be the catalyst for a comeback of analog technology.

05:02

🤖 The Evolution of Artificial Neural Networks

The second paragraph delves into the history and evolution of artificial neural networks, beginning with the perceptron developed by Frank Rosenblatt in 1958. The perceptron aimed to mimic neural activity in the brain, using a grid of photocells to represent input neurons and adjustable weights to simulate synaptic connections. Despite initial optimism and media hype, the perceptron had limitations and could not perform complex tasks such as distinguishing between cats and dogs. This led to the first AI winter. However, in the 1980s, a resurgence occurred with the development of ALVINN, a neural network capable of steering a vehicle by recognizing road patterns. This was achieved through a process called backpropagation, which adjusted the network's weights to improve its performance. The paragraph illustrates the progress in AI and the shift towards more sophisticated neural networks with hidden layers, setting the stage for further advancements in the field.

10:03

🔍 The Impact of Data on Neural Network Advancements

This paragraph discusses the role of data in advancing neural networks. Fei-Fei Li's creation of ImageNet, a vast database of labeled images, revolutionized AI by providing a rich source of data for training neural networks. The ImageNet Large Scale Visual Recognition Challenge spurred significant improvements in neural network performance, as evidenced by the dramatic reduction in error rates. The key to this success was the scale and depth of the neural networks, which required substantial computational power and data to train effectively. The paragraph highlights the computational intensity of training neural networks, the use of GPUs to manage the large matrix multiplications involved, and the resulting improvements in AI's ability to recognize and classify images, even surpassing human performance levels.

15:03

🚀 The Potential of Analog Computing in AI

The fourth paragraph explores the potential of analog computing in addressing the challenges faced by digital computers in running AI algorithms, particularly neural networks. As digital computers reach their limits in energy consumption, data processing speed, and miniaturization, analog computers offer an alternative approach. The paragraph introduces Mythic AI, a startup developing analog chips for neural networks, which can perform matrix multiplications more efficiently. The company uses repurposed digital flash storage cells as variable resistors to perform multiplications, resulting in a compact chip capable of executing a vast number of operations per second with relatively low power consumption. This technology could be applied in various fields, including security cameras, autonomous systems, and manufacturing inspection, offering a promising solution for AI workloads.

20:04

🌐 The Future of Computing: Analog or Digital?

The final paragraph contemplates the future of computing, suggesting that analog computing might be better suited for the tasks we want computers to perform today. It discusses the challenges of analog computing, such as signal distortion and the need for conversion between analog and digital domains, while also acknowledging its potential in specific applications like AI. The narrator reflects on the historical progression from analog to digital and speculates that we might be returning to analog methods to achieve true artificial intelligence. The paragraph concludes with a personal note from the narrator about the value of hands-on learning and a promotion for Brilliant, an educational platform that offers interactive lessons on neural networks and problem-solving.

Keywords

💡Analog computer

An analog computer is a type of computing device that uses continuous signals instead of discrete digital signals. It is defined by its ability to perform calculations using the physical properties of its components, such as voltages that oscillate to represent phenomena like a mass on a spring. In the video, analog computers are discussed as having a resurgence due to their potential in handling tasks like matrix multiplication, which is fundamental to artificial neural networks.

💡Digital computer

A digital computer is a computing system that processes data in discrete form, using binary digits (bits) to represent information. The video script contrasts digital computers with analog computers, highlighting the shift from analog to digital with the advent of solid-state transistors and the limitations of digital computers in handling large-scale AI tasks due to energy consumption and the Von Neumann bottleneck.

💡Differential equations

Differential equations are equations that involve derivatives, which describe the rates of change in various quantities. In the context of the video, analog computers can be programmed to solve a range of differential equations, showcasing their ability to model and solve complex physical systems, such as simulating a damped mass oscillating on a spring.
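
As a rough illustration (not from the video), here is a minimal Python sketch that integrates the same damped mass-on-a-spring equation digitally, step by step; the parameter values are made up.

```python
# Minimal sketch: numerically integrate the damped mass-on-a-spring equation
#   m*x'' = -k*x - c*x'
# (illustrative parameter values; the analog computer solves this continuously)

m, k, c = 1.0, 4.0, 0.5      # mass, spring constant, damping coefficient
x, v = 1.0, 0.0              # initial position and velocity
dt, steps = 0.01, 2000       # time step and number of steps

positions = []
for _ in range(steps):
    a = (-k * x - c * v) / m  # acceleration from Hooke's law plus damping
    v += a * dt               # update velocity
    x += v * dt               # update position (semi-implicit Euler)
    positions.append(x)

# 'positions' traces the decaying oscillation an oscilloscope would display
```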

💡Oscilloscope

An oscilloscope is a device used to display and analyze the waveform of electronic signals. In the script, the oscilloscope is used to visualize the position of a mass over time in the analog computer's simulation of a spring-mass system, demonstrating the analog computer's ability to provide real-time graphical output of its calculations.

💡Lorenz system

The Lorenz system is a set of differential equations that describe atmospheric convection and are known for exhibiting chaotic behavior. The video mentions the Lorenz system as an example of a complex model that can be simulated on an analog computer, highlighting the system's famous 'butterfly shape' attractor, which is a visual representation of chaos theory.
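
For readers who want to reproduce the attractor, a minimal Python sketch of the Lorenz equations with the classic parameter values (the step size and number of steps are arbitrary choices) is:

```python
# Integrate the Lorenz system with the classic parameters
# (sigma=10, rho=28, beta=8/3); plotting x against z traces the butterfly attractor.

sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
x, y, z = 1.0, 1.0, 1.0
dt = 0.001
trajectory = []

for _ in range(50_000):
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    x, y, z = x + dx * dt, y + dy * dt, z + dz * dt
    trajectory.append((x, z))

# Small changes to the parameters or initial conditions visibly reshape
# the trajectory -- the chaotic sensitivity the video demonstrates live.
```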

💡Artificial Intelligence (AI)

Artificial Intelligence refers to the simulation of human intelligence in machines, which can involve learning, reasoning, and self-improvement. The video discusses the evolution of AI, from the early perceptron models to modern neural networks, and how AI's growing demand for computational power might lead to a resurgence of analog computing techniques.

💡Neuron

A neuron is a cell that is the fundamental unit of the nervous system, responsible for transmitting information through electrical and chemical signals. The video script uses the neuron as an analogy to explain the functioning of artificial neural networks, where individual units (neurons) can 'fire' based on weighted inputs, similar to biological neurons.

💡Perceptron

A perceptron is an early model of artificial neural networks, designed to mimic the behavior of biological neurons. The video script describes the perceptron's ability to perform basic image recognition tasks and its role in the development of AI, despite its limitations and the subsequent 'AI winter' that followed due to its inability to handle complex patterns.

💡Backpropagation

Backpropagation is a method used to train artificial neural networks by adjusting the weights of the connections between neurons based on the error in the network's output. The video mentions backpropagation in the context of training the ALVINN system, which was an early application of neural networks for self-driving cars.
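
The video deliberately skips the math, but as a loose illustration of the idea — nudge every weight in the direction that reduces the output error — here is a minimal two-layer network trained by backpropagation in Python; the data, layer sizes, and learning rate are all arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1, 4))            # one training input (4 features)
target = np.array([[1.0, 0.0]])   # desired output

W1 = rng.standard_normal((4, 3)) * 0.1   # input -> hidden weights
W2 = rng.standard_normal((3, 2)) * 0.1   # hidden -> output weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(1000):
    h = sigmoid(x @ W1)           # forward pass, hidden layer
    out = sigmoid(h @ W2)         # forward pass, output layer

    # backward pass: propagate the output error back to each layer's weights
    d_out = (out - target) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    W1 -= 0.5 * x.T @ d_h

# After training, 'out' has moved toward 'target' -- the same mechanism used
# to align ALVINN's steering output with the human driver's.
```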

💡ImageNet

ImageNet is a large-scale database of images that was created to provide a vast amount of labeled data for training and testing AI systems, particularly in the field of computer vision. The video script discusses ImageNet's role in advancing AI by enabling the training of deep neural networks and the subsequent improvement in their performance on visual recognition tasks.

💡Matrix multiplication

Matrix multiplication is a mathematical operation that takes a pair of matrices (two-dimensional arrays of numbers) and produces a new matrix by combining the values in a specific way. In the context of the video, matrix multiplication is a fundamental operation in neural networks, particularly for processing layers of neurons and their weights, and is a task where analog computing could offer advantages in terms of speed and energy efficiency.
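
As a small illustration (sizes and values assumed, not taken from the video), one layer's forward pass in Python is literally a matrix multiplication:

```python
import numpy as np

# Activations of the previous layer (a row vector) times the weight matrix
# gives the weighted input to every neuron in the next layer at once.

activations = np.array([[0.2, 0.9, 0.0, 0.5]])               # 4 input neurons
weights = np.random.default_rng(1).standard_normal((4, 3))   # 4 inputs -> 3 neurons
bias = np.zeros(3)

next_layer = activations @ weights + bias   # the operation GPUs (and analog chips) accelerate
print(next_layer.shape)                     # (1, 3)
```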

Highlights

Analog computers, once the most powerful on Earth, predicted eclipses and guided anti-aircraft guns before being overshadowed by digital computers.

A resurgence of analog technology is being driven by a perfect storm of factors, including the limitations of digital computers in handling AI workloads.

An analog computer can be programmed to solve a range of differential equations by connecting wires in specific ways, simulating physical phenomena like damped mass oscillation.

The Lorenz system, a basic model of atmospheric convection and an early example of chaos, can be simulated on an analog computer, demonstrating its ability to handle complex systems.

Analog computers offer advantages like fast computations and low power usage, but they are limited by their single-purpose nature and lack of precision.

The simplicity of analog computation allows for adding two currents by merely connecting wires, contrasting with the complexity and resource intensity of digital computation.

The drawbacks of analog computers include their inability to handle general-purpose computing tasks and the inherent inaccuracy due to continuous inputs and outputs.

Artificial intelligence's evolution has been marked by periods of progress and setbacks, with the perceptron being an early but limited model of neural networks.

The perceptron's training process involved adjusting weights based on the difference between expected and actual outputs, a precursor to modern neural network training.

The limitations of the perceptron and subsequent AI models led to the first AI winter, a period of reduced interest and funding in AI research.

The resurgence of AI in the 1980s was marked by the creation of ALVINN, an artificial neural network that enabled one of the first self-driving cars.

The success of modern AI, particularly in image recognition, is attributed to the availability of large datasets like ImageNet and the use of deep neural networks.

The increasing demand for larger neural networks poses challenges for digital computers, particularly in terms of energy consumption and computational efficiency.

Mythic AI is pioneering the use of analog chips to run neural networks, offering a solution to the energy and efficiency issues faced by digital computers.

Mythic AI's technology utilizes digital flash storage cells as variable resistors to perform matrix multiplications, a key operation in neural networks.

The potential applications of analog computing in AI include virtual reality, security cameras, autonomous systems, and manufacturing inspection equipment.

Analog computing may offer a more suitable approach for certain tasks, challenging the notion of digital as the optimal way of processing information.

The future of information technology might not see digital as the endpoint but rather as a starting point, with analog computing potentially playing a significant role in achieving true artificial intelligence.

Transcripts

play00:00

- For hundreds of years,

play00:01

analog computers were the most powerful computers on Earth,

play00:05

predicting eclipses, tides, and guiding anti-aircraft guns.

play00:09

Then, with the advent of solid-state transistors,

play00:12

digital computers took off.

play00:14

Now, virtually every computer we use is digital.

play00:18

But today, a perfect storm of factors is setting the scene

play00:21

for a resurgence of analog technology.

play00:24

This is an analog computer,

play00:27

and by connecting these wires in particular ways,

play00:30

I can program it to solve a whole range

play00:32

of differential equations.

play00:34

For example, this setup allows me to simulate

play00:37

a damped mass oscillating on a spring.

play00:40

So on the oscilloscope, you can actually see the position

play00:43

of the mass over time.

play00:45

And I can vary the damping,

play00:48

or the spring constant,

play00:51

or the mass, and we can see how the amplitude

play00:54

and duration of the oscillations change.

play00:57

Now what makes this an analog computer

play01:00

is that there are no zeros and ones in here.

play01:03

Instead, there's actually a voltage that oscillates

play01:06

up and down exactly like a mass on a spring.

play01:10

The electrical circuitry is an analog

play01:12

for the physical problem,

play01:14

it just takes place much faster.

play01:16

Now, if I change the electrical connections,

play01:19

I can program this computer

play01:20

to solve other differential equations,

play01:22

like the Lorenz system,

play01:24

which is a basic model of convection in the atmosphere.

play01:27

Now the Lorenz system is famous because it was one

play01:29

of the first discovered examples of chaos.

play01:32

And here, you can see the Lorenz attractor

play01:35

with its beautiful butterfly shape.

play01:38

And on this analog computer,

play01:39

I can change the parameters

play01:42

and see their effects in real time.

play01:46

So these examples illustrate some

play01:47

of the advantages of analog computers.

play01:50

They are incredibly powerful computing devices,

play01:53

and they can complete a lot of computations fast.

play01:56

Plus, they don't take much power to do it.

play02:01

With a digital computer,

play02:02

if you wanna add two eight-bit numbers,

play02:05

you need around 50 transistors,

play02:08

whereas with an analog computer,

play02:09

you can add two currents,

play02:12

just by connecting two wires.

play02:15

With a digital computer to multiply two numbers,

play02:18

you need on the order of 1,000 transistors

play02:20

all switching zeros and ones,

play02:23

whereas with an analog computer,

play02:24

you can pass a current through a resistor,

play02:28

and then the voltage across this resistor

play02:31

will be I times R.

play02:34

So effectively,

play02:35

you have multiplied two numbers together.
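
A trivial worked example of that idea, with assumed values:

```python
# Multiplication "for free" via Ohm's law: push a known current through a
# known resistance and the voltage across it is the product of the two.
current = 0.003                   # amps (the first operand)
resistance = 2_000                # ohms (the second operand)
voltage = current * resistance    # V = I * R  ->  6.0 volts
```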

play02:40

But analog computers also have their drawbacks.

play02:42

For one thing,

play02:43

they are not general-purpose computing devices.

play02:46

I mean, you're not gonna run Microsoft Word on this thing.

play02:49

And also, since the inputs and outputs are continuous,

play02:52

I can't input exact values.

play02:55

So if I try to repeat the same calculation,

play02:58

I'm never going to get the exact same answer.

play03:01

Plus, think about manufacturing analog computers.

play03:04

There's always gonna be some variation

play03:06

in the exact value of components,

play03:08

like resistors or capacitors.

play03:10

So as a general rule of thumb,

play03:12

you can expect about a 1% error.

play03:15

So when you think of analog computers,

play03:17

you can think powerful, fast, and energy-efficient,

play03:20

but also single-purpose, non-repeatable, and inexact.

play03:25

And if those sound like deal-breakers,

play03:28

it's because they probably are.

play03:30

I think these are the major reasons

play03:31

why analog computers fell out of favor

play03:33

as soon as digital computers became viable.

play03:36

Now, here's why analog computers may be making a comeback.

play03:41

(computers beeping)

play03:43

It all starts with artificial intelligence.

play03:46

- [Narrator] A machine has been programmed to see

play03:48

and to move objects.

play03:51

- AI isn't new.

play03:52

The term was coined back in 1956.

play03:55

In 1958, Cornell University psychologist,

play03:58

Frank Rosenblatt, built the perceptron,

play04:01

designed to mimic how neurons fire in our brains.

play04:05

So here's a basic model of how neurons in our brains work.

play04:08

An individual neuron can either fire or not,

play04:12

so its level of activation can be represented

play04:14

as a one or a zero.

play04:16

The input to one neuron

play04:18

is the output from a bunch of other neurons,

play04:21

but the strength of these connections

play04:22

between neurons varies,

play04:24

so each one can be given a different weight.

play04:27

Some connections are excitatory,

play04:29

so they have positive weights,

play04:30

while others are inhibitory,

play04:32

so they have negative weights.

play04:34

And the way to figure out

play04:35

whether a particular neuron fires,

play04:37

is to take the activation of each input neuron

play04:40

and multiply by its weight,

play04:42

and then add these all together.

play04:44

If their sum is greater than some number called the bias,

play04:47

then the neuron fires,

play04:49

but if it's less than that, the neuron doesn't fire.

play04:53

As input, Rosenblatt's perceptron had 400 photocells

play04:57

arranged in a square grid,

play04:59

to capture a 20 by 20-pixel image.

play05:02

You can think of each pixel as an input neuron,

play05:04

with its activation being the brightness of the pixel.

play05:07

Although strictly speaking,

play05:09

the activation should be either zero or one,

play05:11

we can let it take any value between zero and one.

play05:15

All of these neurons are connected

play05:18

to a single output neuron,

play05:20

each via its own adjustable weight.

play05:23

So to see if the output neuron will fire,

play05:25

you multiply the activation of each neuron by its weight,

play05:28

and add them together.

play05:30

This is essentially a vector dot product.

play05:33

If the answer is larger than the bias, the neuron fires,

play05:36

and if not, it doesn't.

play05:38

Now the goal of the perceptron

play05:40

was to reliably distinguish between two images,

play05:43

like a rectangle and a circle.

play05:45

For example,

play05:46

the output neuron could always fire

play05:48

when presented with a circle,

play05:49

but never when presented with a rectangle.

play05:52

To achieve this, the perceptron had to be trained,

play05:55

that is, shown a series of different circles

play05:58

and rectangles, and have its weights adjusted accordingly.

play06:02

We can visualize the weights as an image,

play06:05

since there's a unique weight for each pixel of the image.

play06:09

Initially, Rosenblatt set all the weights to zero.

play06:12

If the perceptron's output is correct,

play06:14

for example, here it's shown a rectangle

play06:16

and the output neuron doesn't fire,

play06:19

no change is made to the weights.

play06:21

But if it's wrong, then the weights are adjusted.

play06:23

The algorithm for updating the weights

play06:25

is remarkably simple.

play06:27

Here, the output neuron didn't fire when it was supposed to

play06:30

because it was shown a circle.

play06:32

So to modify the weights,

play06:33

you simply add the input activations to the weights.

play06:38

If the output neuron fires when it shouldn't,

play06:40

like here, when shown a rectangle,

play06:42

well, then you subtract the input activations

play06:45

from the weights, and you keep doing this

play06:48

until the perceptron correctly identifies

play06:50

all the training images.
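
The rule described above can be sketched in a few lines of Python; the 20×20 input size matches the perceptron's photocell grid, while everything else (the bias value, helper names, training images) is illustrative.

```python
import numpy as np

n_pixels = 20 * 20
weights = np.zeros(n_pixels)     # Rosenblatt started with all weights at zero
bias = 0.0

def fires(image):
    # weighted sum of pixel activations vs. the bias -- a vector dot product
    return image @ weights > bias

def train_step(image, should_fire):
    global weights
    fired = fires(image)
    if fired == should_fire:
        return                   # correct output: leave the weights alone
    if should_fire and not fired:
        weights = weights + image   # missed a circle: add the input activations
    else:
        weights = weights - image   # fired on a rectangle: subtract them
```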

play06:52

It was shown that this algorithm will always converge,

play06:55

so long as it's possible to map the two categories

play06:58

into distinct groups.

play07:00

(footsteps thumping)

play07:02

The perceptron was capable of distinguishing

play07:04

between different shapes, like rectangles and triangles,

play07:07

or between different letters.

play07:09

And according to Rosenblatt,

play07:10

it could even tell the difference between cats and dogs.

play07:14

He said the machine was capable

play07:15

of what amounts to original thought,

play07:18

and the media lapped it up.

play07:20

The "New York Times" called the perceptron

play07:22

"the embryo of an electronic computer

play07:25

that the Navy expects will be able to walk, talk,

play07:28

see, write, reproduce itself,

play07:30

and be conscious of its existence."

play07:34

- [Narrator] After training on lots of examples,

play07:36

it's given new faces it has never seen,

play07:39

and is able to successfully distinguish male from female.

play07:43

It has learned.

play07:45

- In reality, the perceptron was pretty limited

play07:47

in what it could do.

play07:48

It could not, in fact, tell apart dogs from cats.

play07:52

This and other critiques were raised

play07:53

in a book by MIT giants, Minsky and Papert, in 1969.

play07:58

And that led to a bust period

play08:00

for artificial neural networks and AI in general.

play08:03

It's known as the first AI winter.

play08:06

Rosenblatt did not survive this winter.

play08:09

He drowned while sailing in Chesapeake Bay

play08:12

on his 43rd birthday.

play08:14

(mellow upbeat music)

play08:17

- [Narrator] The NAV Lab is a road-worthy truck,

play08:19

modified so that researchers or computers

play08:22

can control the vehicle as occasion demands.

play08:25

- [Derek] In the 1980s, there was an AI resurgence

play08:28

when researchers at Carnegie Mellon created one

play08:30

of the first self-driving cars.

play08:32

The vehicle was steered

play08:33

by an artificial neural network called ALVINN.

play08:36

It was similar to the perceptron,

play08:37

except it had a hidden layer of artificial neurons

play08:41

between the input and output.

play08:43

As input, ALVINN received 30 by 32-pixel images

play08:47

of the road ahead.

play08:48

Here, I'm showing them as 60 by 64 pixels.

play08:51

But each of these input neurons was connected

play08:54

via an adjustable weight to a hidden layer of four neurons.

play08:57

These were each connected to 32 output neurons.

play09:01

So to go from one layer of the network to the next,

play09:04

you perform a matrix multiplication:

play09:06

the input activation times the weights.

play09:10

The output neuron with the greatest activation

play09:12

determines the steering angle.
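
A rough sketch of that forward pass in Python, with random placeholder weights and an assumed activation function (the video does not specify one):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random(30 * 32)                 # 960 input activations (pixel brightness)
W_hidden = rng.standard_normal((960, 4)) * 0.01   # input -> 4 hidden neurons
W_out = rng.standard_normal((4, 32)) * 0.01       # hidden -> 32 output neurons

hidden = np.tanh(image @ W_hidden)          # layer-to-layer step: a matrix multiply
outputs = hidden @ W_out
steering_index = int(np.argmax(outputs))    # 0..31, mapped to a steering angle
```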

play09:15

To train the neural net,

play09:16

a human drove the vehicle,

play09:18

providing the correct steering angle

play09:20

for a given input image.

play09:22

All the weights in the neural network were adjusted

play09:24

through the training

play09:25

so that ALVINN's output better matched that

play09:27

of the human driver.

play09:30

The method for adjusting the weights

play09:31

is called backpropagation,

play09:33

which I won't go into here,

play09:34

but Welch Labs has a great series on this,

play09:37

which I'll link to in the description.

play09:40

Again, you can visualize the weights

play09:41

for the four hidden neurons as images.

play09:44

The weights are initially set to be random,

play09:46

but as training progresses,

play09:48

the computer learns to pick up on certain patterns.

play09:51

You can see the road markings emerge in the weights.

play09:54

Simultaneously, the output steering angle coalesces

play09:58

onto the human steering angle.

play10:00

The computer drove the vehicle at a top speed

play10:03

of around one or two kilometers per hour.

play10:06

It was limited by the speed

play10:07

at which the computer could perform matrix multiplication.

play10:12

Despite these advances,

play10:13

artificial neural networks still struggled

play10:15

with seemingly simple tasks,

play10:17

like telling apart cats and dogs.

play10:19

And no one knew whether hardware

play10:22

or software was the weak link.

play10:24

I mean, did we have a good model of intelligence,

play10:26

we just needed more computer power?

play10:28

Or, did we have the wrong idea

play10:30

about how to make intelligent systems altogether?

play10:33

So artificial intelligence experienced another lull

play10:36

in the 1990s.

play10:38

By the mid 2000s,

play10:39

most AI researchers were focused on improving algorithms.

play10:43

But one researcher, Fei-Fei Li,

play10:45

thought maybe there was a different problem.

play10:48

Maybe these artificial neural networks

play10:50

just needed more data to train on.

play10:52

So she planned to map out the entire world of objects.

play10:56

From 2006 to 2009, she created ImageNet,

play10:59

a database of 1.2 million human-labeled images,

play11:02

which at the time,

play11:03

was the largest labeled image dataset ever constructed.

play11:06

And from 2010 to 2017,

play11:08

ImageNet ran an annual contest:

play11:10

the ImageNet Large Scale Visual Recognition Challenge,

play11:13

where software programs competed to correctly detect

play11:16

and classify images.

play11:17

Images were classified into 1,000 different categories,

play11:21

including 90 different dog breeds.

play11:23

A neural network competing in this competition

play11:25

would have an output layer of 1,000 neurons,

play11:28

each corresponding to a category of object

play11:30

that could appear in the image.

play11:32

If the image contains, say, a German shepherd,

play11:34

then the output neuron corresponding to German shepherd

play11:37

should have the highest activation.

play11:39

Unsurprisingly, it turned out to be a tough challenge.

play11:43

One way to judge the performance of an AI

play11:45

is to see how often the five highest neuron activations

play11:48

do not include the correct category.

play11:50

This is the so-called top-5 error rate.
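
A small sketch of how a top-5 error rate could be computed from a network's output activations; the scores and labels below are made up.

```python
import numpy as np

def top5_error(scores, labels):
    # scores: one row per image, one column per category; labels: correct index per image
    top5 = np.argsort(scores, axis=1)[:, -5:]      # indices of the 5 largest activations
    missed = [label not in row for row, label in zip(top5, labels)]
    return np.mean(missed)                         # fraction of images missed

scores = np.random.default_rng(2).random((4, 1000))   # 4 fake images, 1,000 categories
labels = np.array([3, 250, 999, 42])
print(top5_error(scores, labels))
```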

play11:53

In 2010, the best performer had a top-5 error rate

play11:56

of 28.2%, meaning that nearly 1/3 of the time,

play12:01

the correct answer was not among its top five guesses.

play12:04

In 2011, the error rate of the best performer was 25.8%,

play12:09

a substantial improvement.

play12:11

But the next year,

play12:12

an artificial neural network

play12:13

from the University of Toronto, called AlexNet,

play12:16

blew away the competition

play12:17

with a top-5 error rate of just 16.4%.

play12:22

What set AlexNet apart was its size and depth.

play12:25

The network consisted of eight layers,

play12:27

and in total, 500,000 neurons.

play12:30

To train AlexNet,

play12:31

60 million weights and biases had to be carefully adjusted

play12:35

using the training database.

play12:37

Because of all the big matrix multiplications,

play12:40

processing a single image required 700 million

play12:43

individual math operations.

play12:45

So training was computationally intensive.

play12:48

The team managed it by pioneering the use of GPUs,

play12:51

graphical processing units,

play12:52

which are traditionally used for driving displays and screens.

play12:56

So they're specialized for fast parallel computations.

play13:00

The AlexNet paper describing their research

play13:02

is a blockbuster.

play13:04

It's now been cited over 100,000 times,

play13:07

and it identifies the scale of the neural network

play13:10

as key to its success.

play13:12

It takes a lot of computation to train and run the network,

play13:16

but the improvement in performance is worth it.

play13:19

With others following their lead,

play13:20

the top-5 error rate

play13:22

on the ImageNet competition plummeted

play13:23

in the years that followed, down to 3.6% in 2015.

play13:28

That is better than human performance.

play13:31

The neural network that achieved this

play13:32

had 100 layers of neurons.

play13:35

So the future is clear:

play13:36

We will see ever increasing demand

play13:38

for ever larger neural networks.

play13:40

And this is a problem for several reasons:

play13:43

One is energy consumption.

play13:45

Training a neural network requires an amount

play13:47

of electricity similar to the yearly consumption

play13:49

of three households.

play13:50

Another issue is the so-called Von Neumann Bottleneck.

play13:54

Virtually every modern digital computer

play13:55

stores data in memory,

play13:57

and then accesses it as needed over a bus.

play14:00

When performing the huge matrix multiplications required

play14:02

by deep neural networks,

play14:04

most of the time and energy goes

play14:05

into fetching those weight values rather

play14:07

than actually doing the computation.

play14:10

And finally, there are the limitations of Moore's Law.

play14:13

For decades, the number of transistors

play14:14

on a chip has been doubling approximately every two years,

play14:18

but now the size of a transistor

play14:20

is approaching the size of an atom.

play14:21

So there are some fundamental physical challenges

play14:24

to further miniaturization.

play14:26

So this is the perfect storm for analog computers.

play14:30

Digital computers are reaching their limits.

play14:32

Meanwhile, neural networks are exploding in popularity,

play14:35

and a lot of what they do boils down

play14:38

to a single task: matrix multiplication.

play14:41

Best of all, neural networks don't need the precision

play14:44

of digital computers.

play14:45

Whether the neural net is 96% or 98% confident

play14:48

the image contains a chicken,

play14:50

it doesn't really matter, it's still a chicken.

play14:52

So slight variability in components

play14:54

or conditions can be tolerated.

play14:57

(upbeat rock music)

play14:58

I went to an analog computing startup in Texas,

play15:01

called Mythic AI.

play15:03

Here, they're creating analog chips to run neural networks.

play15:06

And they demonstrated several AI algorithms for me.

play15:10

- Oh, there you go.

play15:11

See, it's getting you. (Derek laughs)

play15:13

Yeah. - That's fascinating.

play15:14

- The biggest use case is augmented and virtual reality.

play15:17

If your friend is in a different place,

play15:19

they're at their house and you're at your house,

play15:20

you can actually render each other in the virtual world.

play15:24

So it needs to really quickly capture your pose,

play15:27

and then render it in the VR world.

play15:29

- So, hang on, is this for the metaverse thing?

play15:31

- Yeah, this is a very metaverse application.

play15:35

This is depth estimation from just a single webcam.

play15:38

It's just taking this scene,

play15:39

and then it's doing a heat map.

play15:41

So if it's bright, it means it's close.

play15:43

And if it's far away, it makes it black.

play15:45

- [Derek] Now all these algorithms can be run

play15:47

on digital computers,

play15:48

but here, the matrix multiplication is actually taking place

play15:52

in the analog domain. (light music)

play15:54

To make this possible,

play15:55

Mythic has repurposed digital flash storage cells.

play15:59

Normally these are used as memory

play16:01

to store either a one or a zero.

play16:03

If you apply a large positive voltage to the control gate,

play16:07

electrons tunnel up through an insulating barrier

play16:10

and become trapped on the floating gate.

play16:12

Remove the voltage,

play16:13

and the electrons can remain on the floating gate

play16:15

for decades, preventing the cell from conducting current.

play16:18

And that's how you can store either a one or a zero.

play16:21

You can read out the stored value

play16:22

by applying a small voltage.

play16:25

If there are electrons on the floating gate,

play16:26

no current flows, so that's a zero.

play16:29

If there aren't electrons,

play16:30

then current does flow, and that's a one.

play16:33

Now Mythic's idea is to use these cells

play16:36

not as on/off switches, but as variable resistors.

play16:40

They do this by putting a specific number of electrons

play16:42

on each floating gate, instead of all or nothing.

play16:45

The greater the number of electrons,

play16:47

the higher the resistance of the channel.

play16:49

When you later apply a small voltage,

play16:52

the current that flows is equal to V over R.

play16:55

But you can also think of this as voltage times conductance,

play16:59

where conductance is just the reciprocal of resistance.

play17:02

So a single flash cell can be used

play17:04

to multiply two values together, voltage times conductance.

play17:09

So to use this to run an artificial neural network,

play17:11

well they first write all the weights to the flash cells

play17:14

as each cell's conductance.

play17:16

Then, they input the activation values

play17:19

as the voltage on the cells.

play17:21

And the resulting current is the product

play17:23

of voltage times conductance,

play17:25

which is activation times weight.

play17:28

The cells are wired together in such a way

play17:30

that the current from each multiplication adds together,

play17:34

completing the matrix multiplication.
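
A sketch of that idea in Python: each cell's conductance stores a weight, the applied voltage carries an activation, each cell's current is conductance times voltage, and the currents summed down each output wire complete one dot product. The numbers are illustrative, not Mythic's.

```python
import numpy as np

conductances = np.array([[0.2, 0.7],        # weights, written once as conductances
                         [0.5, 0.1],        # rows: inputs, columns: output neurons
                         [0.9, 0.4]])
voltages = np.array([1.0, 0.5, 0.2])        # input activations applied as voltages

cell_currents = voltages[:, None] * conductances   # I = V * G at every cell
output_currents = cell_currents.sum(axis=0)        # summing currents on a shared wire

# Same result as the digital matrix product:
assert np.allclose(output_currents, voltages @ conductances)
```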

play17:36

(light music)

play17:39

- So this is our first product.

play17:40

This can do 25 trillion math operations per second.

play17:45

- [Derek] 25 trillion.

play17:47

- Yep, 25 trillion math operations per second,

play17:49

in this little chip here,

play17:50

burning about three watts of power.

play17:52

- [Derek] How does it compare to a digital chip?

play17:54

- The newer digital systems can do anywhere

play17:57

from 25 to 100 trillion operations per second,

play18:00

but they are big, thousand-dollar systems

play18:02

that are spitting out 50 to 100 watts of power.

play18:06

- [Derek] Obviously this isn't

play18:07

like an apples-to-apples comparison, right?

play18:09

- No, it's not apples to apples.

play18:10

I mean, training those algorithms,

play18:13

you need big hardware like this.

play18:15

You can just do all sorts of stuff on the GPU,

play18:17

but if you specifically are doing AI workloads

play18:20

and you wanna deploy 'em, you could use this instead.

play18:22

You can imagine them in security cameras,

play18:25

autonomous systems,

play18:26

inspection equipment for manufacturing.

play18:29

Every time they make a Frito-Lay chip,

play18:30

they inspect it with a camera,

play18:32

and the bad Fritos get blown off of the conveyor belt.

play18:36

But they're using artificial intelligence

play18:37

to spot which Fritos are good and bad.

play18:40

- Some have proposed using analog circuitry

play18:42

in smart home speakers,

play18:43

solely to listen for the wake word, like Alexa or Siri.

play18:47

They would use a lot less power and be able to quickly

play18:49

and reliably turn on the digital circuitry of the device.

play18:53

But you still have to deal with the challenges of analog.

play18:56

- So for one of the popular networks,

play18:58

there would be 50 sequences

play19:00

of matrix multiplies that you're doing.

play19:02

Now, if you did that entirely in the analog domain,

play19:05

by the time it gets to the output,

play19:06

it's just so distorted

play19:07

that you don't have any result at all.

play19:10

So you convert it from the analog domain,

play19:12

back to the digital domain,

play19:14

send it to the next processing block,

play19:15

and then you convert it into the analog domain again.

play19:18

And that allows you to preserve the signal.

play19:20

- You know, when Rosenblatt was first setting

play19:22

up his perceptron,

play19:23

he used a digital IBM computer.

play19:26

Finding it too slow,

play19:28

he built a custom analog computer,

play19:30

complete with variable resistors

play19:32

and little motors to drive them.

play19:35

Ultimately, his idea of neural networks

play19:37

turned out to be right.

play19:39

Maybe he was right about analog, too.

play19:43

Now, I can't say whether analog computers will take

play19:46

off the way digital did last century,

play19:48

but they do seem to be better suited

play19:51

to a lot of the tasks that we want computers

play19:53

to perform today,

play19:55

which is a little bit funny

play19:56

because I always thought of digital

play19:58

as the optimal way of processing information.

play20:01

Everything from music to pictures,

play20:03

to video has all gone digital in the last 50 years.

play20:07

But maybe in a hundred years,

play20:09

we will look back on digital,

play20:11

not as the end point of information technology,

play20:15

but as a starting point.

play20:17

Our brains are digital

play20:19

in that a neuron either fires or it doesn't,

play20:21

but they're also analog

play20:24

in that thinking takes place everywhere, all at once.

play20:28

So maybe what we need

play20:30

to achieve true artificial intelligence,

play20:32

machines that think like us, is the power of analog.

play20:37

(gentle music)

play20:42

Hey, I learned a lot while making this video,

play20:44

much of it by playing with an actual analog computer.

play20:47

You know, trying things out for yourself

play20:48

is really the best way to learn,

play20:50

and you can do that with this video sponsor, Brilliant.

play20:53

Brilliant is a website and app

play20:54

that gets you thinking deeply

play20:56

by engaging you in problem-solving.

play20:58

They have a great course on neural networks,

play21:00

where you can test how it works for yourself.

play21:02

It gives you an excellent intuition

play21:04

about how neural networks can recognize numbers and shapes,

play21:07

and it also allows you to experience the importance

play21:09

of good training data and hidden layers

play21:11

to understand why more sophisticated

play21:14

neural networks work better.

play21:15

What I love about Brilliant

play21:16

is it tests your knowledge as you go.

play21:19

The lessons are highly interactive,

play21:20

and they get progressively harder as you go on.

play21:23

And if you get stuck, there are always helpful hints.

play21:26

For viewers of this video,

play21:27

Brilliant is offering the first 200 people

play21:29

20% off an annual premium subscription.

play21:32

Just go to brilliant.org/veritasium.

play21:35

I will put that link down in the description.

play21:37

So I wanna thank Brilliant for supporting Veritasium,

play21:40

and I wanna thank you for watching.


Related tags
Analog Computers, AI Technology, Matrix Multiplication, Neural Networks, Artificial Intelligence, Digital Limitations, Energy Efficiency, Historical Computing, Tech Innovation, AI Training, Hardware Advancement