Neuromorphic Intelligence: Brain-inspired Strategies for AI Computing Systems

SynSense
11 Jul 2022 · 27:40

Summary

TL;DR: Giacomo Indiveri from the University of Zurich and ETH Zurich discusses brain-inspired strategies for developing low-power artificial intelligence computing systems. He highlights the limitations of current AI algorithms, which consume significant energy and are less versatile than natural intelligence. Indiveri introduces neuromorphic engineering as a promising approach, emphasizing the importance of emulating the brain's structure and function to create efficient, compact, and intelligent devices. He showcases the Dynamic Neuromorphic Asynchronous Processor as an example of this technology and its applications in fields like industrial monitoring and machine vision.

Takeaways

  • 🧠 The talk by Giacomo Indiveri from the University of Zurich and ETH Zurich focuses on brain-inspired strategies for low-power artificial intelligence computing systems.
  • 📈 The success of AI algorithms and neural networks, which originated in the late 80s, has recently surged due to advancements in hardware technologies, availability of large datasets, and improvements in algorithms.
  • 🔋 A significant challenge with current AI algorithms is their high energy consumption, with estimates suggesting the ICT industry could consume about 20% of the world's energy by 2025.
  • 💡 The high power usage is largely due to the extensive data and memory resources required, particularly the energy spent moving data between memory and processing units.
  • 🌐 The narrow specialization of AI networks is highlighted as a fundamental issue, contrasting with the general-purpose capabilities of natural intelligence found in animal brains.
  • 🚀 Neuromorphic engineering, inspired by the structure and function of the brain, is presented as a promising approach to overcome the limitations of current AI systems.
  • 🏫 The term 'neuromorphic' has been adopted by various communities, including those designing CMOS circuits to emulate brain functions, those developing practical devices for problem-solving, and those working on emerging memory technologies.
  • 🔬 The biological neural networks differ significantly from simulated ones, utilizing time dynamics and the physics of their elements, with memory and computation co-localized within each neuron.
  • 💡 The key to low-power computation lies in parallel arrays of processing elements with co-localized memory and computation, avoiding the need for data transfer between separate memory and processing units.
  • 🌟 The potential of neuromorphic systems is demonstrated through various applications such as ECG anomaly detection, vibration anomaly detection, and intelligent machine vision, showcasing their potential for practical, energy-efficient solutions.

Q & A

  • What is the main focus of Giacomo Indiveri's talk?

    -The main focus of Giacomo Indiveri's talk is on brain-inspired strategies for low-power artificial intelligence computing systems.

  • Why have artificial intelligence algorithms and networks only recently started outperforming conventional approaches?

    -Artificial intelligence algorithms and networks have only recently started outperforming conventional approaches due to advancements in hardware technologies providing enough computing power, the availability of large datasets for training, and improvements in algorithms making networks more robust and performant.

  • What is the estimated energy consumption of the ICT industry by 2025 in relation to the world's total energy?

    -By 2025, it is estimated that the ICT industry will consume about 20 percent of the entire world's energy.

  • Why are current AI algorithms considered to be power-hungry?

    -Current AI algorithms are power-hungry because they require large amounts of data and memory resources, and the energy cost of moving data from memory to computing and back is very high.

  • How do neuromorphic computing systems differ from traditional artificial neural networks?

    -Neuromorphic computing systems differ from traditional artificial neural networks by emulating the brain's structure and function more closely, using parallel arrays of processing elements with computation and memory co-localized, and leveraging the physics of the devices for computation.

  • What is the significance of the term 'neuromorphic' in the context of Giacomo Indiveri's work?

    -In the context of Giacomo Indiveri's work, 'neuromorphic' refers to the design of systems that mimic the neural structure and computational strategies of the brain, aiming to create compact, intelligent, and energy-efficient devices.

  • What are the three main strategies Giacomo Indiveri suggests for creating low-power artificial intelligence systems?

    -The three main strategies suggested for creating low-power artificial intelligence systems are using parallel arrays of processing elements with co-localized computation and memory, leveraging the physics of analog circuits for computation, and matching the temporal dynamics of the system to the signals being processed.

  • How does Giacomo Indiveri's approach to neuromorphic computing address the issue of device variability and noise?

    -Giacomo Indiveri's approach addresses device variability and noise by using populations of neurons and averaging over time and space, which can reduce the effect of device mismatch and noise, and by exploiting the variability as an advantage for robust computation.
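
The effect of this space-and-time averaging can be shown with a toy numerical model. The sketch below is my own illustration, not the chip measurement from the talk: each simulated neuron gets a fixed gain error with a roughly 20% spread (mimicking device mismatch), each readout window adds independent temporal noise, and the script reports how the coefficient of variation (CV) of the averaged readout shrinks as the population and the number of windows grow. The "equivalent bits" figure uses a rough log2(1/CV) conversion, which is my assumption for turning a CV into a bit count; the talk does not give the exact formula.

```python
import numpy as np

rng = np.random.default_rng(0)
gains = 1.0 + rng.normal(0.0, 0.20, size=4096)      # fixed-pattern mismatch, ~20% spread per neuron

def readout_cv(pop_size, n_windows, trials=3000):
    readings = []
    for _ in range(trials):
        idx = rng.choice(gains.size, size=pop_size, replace=False)
        # temporal noise is independent per neuron and per window, so it averages out over both
        noise = rng.normal(0.0, 0.10, size=(n_windows, pop_size))
        readings.append((gains[idx][None, :] + noise).mean())
    readings = np.asarray(readings)
    return readings.std() / readings.mean()

for n_neurons, n_windows in [(1, 1), (2, 1), (16, 25), (64, 50)]:
    cv = readout_cv(n_neurons, n_windows)
    print(f"{n_neurons:3d} neurons x {n_windows:2d} windows -> CV = {cv:5.3f} (~{np.log2(1/cv):.1f} equivalent bits)")
```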

  • What are some of the practical applications of neuromorphic computing systems discussed in the talk?

    -Some practical applications of neuromorphic computing systems discussed include ECG anomaly detection, vibration anomaly detection, industrial monitoring, intelligent machine vision, and consumer applications.

  • What is the Dynamic Neuromorphic Asynchronous Processor (DYNAP) and what is its significance?

    -The Dynamic Neuromorphic Asynchronous Processor (DYNAP) is an academic prototype built at the University of Zurich and ETH Zurich. It is significant because it demonstrates the feasibility of neuromorphic computing with roughly a thousand neurons organized in four cores of 256 neurons each, showing the potential for low-power edge computing applications.

  • How does Giacomo Indiveri's research contribute to the field of neuromorphic intelligence?

    -Giacomo Indiveri's research contributes to the field of neuromorphic intelligence by developing new architectures, packaging systems, and memory devices that are inspired by the brain's computational principles, aiming to create more efficient and powerful computing systems.

Outlines

00:00

🧠 Introduction to Brain-Inspired AI and Energy Efficiency

Giacomo Indiveri from the University of Zurich and ETH Zurich introduces the concept of brain-inspired strategies for low-power artificial intelligence computing systems. He discusses the history of AI, highlighting the resurgence of interest due to hardware advancements that provide the necessary computing power. The talk emphasizes the unsustainable energy consumption of current AI systems, which is projected to reach 20% of the world's energy by 2025. Indiveri points out the inefficiency of data movement between memory and processing units as a significant contributor to this energy consumption.

05:00

🌟 The Emergence of Neuromorphic Computing

The concept of neuromorphic computing is explored, which was first coined by Carver Mead in the 1980s. Neuromorphic computing aims to emulate the brain's efficiency in processing information. Three main communities are identified: those designing CMOS circuits to emulate brain functions, those building practical devices for problem-solving using spiking neural networks, and those developing nanoscale devices for non-volatile memory. The talk suggests that by combining new materials, architectures, and theories, neuromorphic computing could lead to more efficient and powerful AI systems.

10:01

🔄 The Temporal Dynamics of Biological vs. Artificial Neural Networks

Indiveri contrasts the temporal dynamics and physical computation of biological neural networks with the algorithmic simulation of artificial neural networks. He explains that while artificial networks simulate neuron properties, biological networks use the physical properties of neurons and synapses for computation. The talk emphasizes the importance of understanding these differences to develop more efficient computing systems, suggesting that the structure of biological networks is inherently the algorithm, unlike in artificial networks.

15:03

🔋 Strategies for Low-Power Neuromorphic Systems

The talk outlines strategies for creating low-power neuromorphic systems, focusing on the co-localization of computation and memory, the use of analog circuits for efficient computation, and the importance of temporal dynamics matching the processing needs. Indiveri argues for a paradigm shift in computing, moving from traditional CPU-memory architectures to brain-inspired architectures that interweave memory and computation in a distributed system, leading to significant power savings.

20:05

🤖 Practical Implementations and Applications of Neuromorphic Systems

Indiveri discusses practical implementations of neuromorphic systems, including the development of analog circuits with slow, non-linear dynamics that operate in parallel. He highlights the benefits of device mismatch and variability, which can be harnessed for robust computation. The talk also covers the advantages of using analog circuits over digital ones, such as avoiding the need for clock circuits, reducing the need for data conversion, and leveraging the physics of devices for complex operations. Indiveri presents examples of neuromorphic systems developed at the University of Zurich and ETH Zurich, including chips that can perform edge computing applications with low power consumption.

25:06

🚀 Advancing Neuromorphic Intelligence and Its Applications

The talk concludes with a look at the future of neuromorphic intelligence, emphasizing the potential to transfer academic knowledge into practical technology through startups. Indiveri mentions applications such as industrial monitoring, machine vision, and consumer electronics as promising areas for neuromorphic technology. He also discusses the Dynamic Neuromorphic Asynchronous Processor, an academic prototype that demonstrates the potential of neuromorphic systems for edge computing. The talk ends with a call to action for further development and application of neuromorphic intelligence in society.

Keywords

💡Artificial Intelligence (AI)

Artificial Intelligence refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the context of the video, AI is a central theme, with a focus on how AI algorithms and networks have evolved over time and their current impact on various fields, particularly in machine vision. The script mentions the historical development of AI, dating back to the late 80s, and the recent advancements that have led to AI systems outperforming traditional methods.

💡Convolutional Neural Networks (CNNs)

Convolutional Neural Networks are a class of deep neural networks most commonly applied to analyzing visual imagery. The video script highlights the breakthroughs in CNNs starting from 2009, where they began to achieve impressive results in the field of machine vision, leading to a surge in AI's popularity and application.

💡Backpropagation

Backpropagation is the technique used to train artificial neural networks via gradient-descent optimization. It is mentioned in the script as the fundamental algorithm behind current AI (together with its backpropagation-through-time variant for sequences) which, despite its success, has limitations when compared to the natural intelligence found in biological systems. The speaker suggests that incremental improvements to backpropagation may not lead to breakthroughs in achieving natural intelligence.

💡Neuromorphic Computing

Neuromorphic Computing is an approach to computing that mimics the neuro-biological architectures present in the nervous system. The video discusses neuromorphic computing as a promising strategy for developing low-power AI systems, drawing inspiration from the brain's efficient use of resources and its ability to perform complex computations with minimal energy.

💡Energy Consumption

Energy Consumption in the context of the video refers to the significant amount of power required to train and run AI algorithms, particularly neural networks. The script estimates that by 2025, the ICT industry's energy consumption could account for 20 percent of the world's total energy use, highlighting the need for more sustainable and energy-efficient computing strategies.

💡Memory-Computation Co-location

Memory-Computation Co-location is a design principle where memory and computation occur in the same physical location, reducing the need to transfer data between separate memory and processing units. The video emphasizes this concept as a key strategy in neuromorphic computing to reduce energy consumption, drawing an analogy to how information is processed within neurons in the brain.

💡Spiking Neural Networks (SNNs)

Spiking Neural Networks are a type of artificial neural network that mimic the way biological neurons communicate using 'spikes' or brief electrical pulses. The script discusses SNNs as part of the neuromorphic approach, where digital circuits simulate the spiking behavior of neurons to create more efficient computing systems.
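
As a concrete, highly simplified picture of a spiking neuron array with memory and computation co-localized, the Python sketch below is my own illustration and not code for any of the chips discussed in the talk: each unit holds its own weights and membrane state, all units update in parallel at every time step, and outputs are all-or-none spike events. The 1,024 units loosely echo the four cores of 256 neurons mentioned for the DYNAP prototype.

```python
import numpy as np

class LIFArray:
    """Parallel array of leaky integrate-and-fire units; each unit holds its own weights and state."""

    def __init__(self, n_inputs, n_neurons, tau=0.05, dt=0.001, v_th=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0.0, 0.5, size=(n_neurons, n_inputs))  # "synaptic" weights, stored locally
        self.v = np.zeros(n_neurons)                                # membrane potentials, stored locally
        self.decay = np.exp(-dt / tau)                              # leak set by the membrane time constant
        self.v_th = v_th

    def step(self, x):
        # every neuron integrates its weighted input in parallel; no weights are shipped elsewhere
        self.v = self.decay * self.v + self.w @ x
        spikes = self.v >= self.v_th                                # all-or-none spike events
        self.v[spikes] = 0.0                                        # reset the neurons that fired
        return spikes

net = LIFArray(n_inputs=16, n_neurons=1024)
rng = np.random.default_rng(1)
for _ in range(200):
    spikes = net.step(0.05 * rng.random(16))
print("neurons spiking on the last step:", int(spikes.sum()))
```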

💡In-Memory Computing

In-Memory Computing is a paradigm where data is processed in the memory subsystem itself, reducing the need for data movement between memory and processing units. The video script describes how in-memory computing technologies can be used to implement long-term, non-volatile memories, contributing to the development of neuromorphic systems.
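
A minimal sketch of the in-memory multiply-accumulate idea, with illustrative numbers of my own choosing: the weight matrix lives in the array as conductances, input voltages are applied to the rows, and Ohm's and Kirchhoff's laws deliver the weighted sums as column currents, so no weights ever travel to a separate processor.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 1e-4, size=(128, 64))   # stored weights as conductances (siemens), 128 rows x 64 columns
V = rng.uniform(0.0, 0.2, size=128)           # input voltages (volts) applied to the rows

# Each column current is the sum over rows of G[row, col] * V[row]: a full
# matrix-vector multiply carried out by the physics of the array in one step.
I = G.T @ V
print("first five column currents (A):", np.round(I[:5], 9))
```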

💡Analog Circuits

Analog Circuits are electronic circuits that process signals in a continuous and variable manner, as opposed to digital circuits that process signals in discrete levels. The video discusses the use of analog circuits in neuromorphic computing to emulate the physics of biological neurons and synapses, enabling more efficient and low-power computation.
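
One example of "letting the physics compute", sketched with illustrative parameter values of my own since the talk gives none: in weak inversion (sub-threshold operation), a MOS transistor's drain current depends roughly exponentially on its gate voltage, so a single device provides an exponential non-linearity that would otherwise require a dedicated digital block.

```python
import numpy as np

# Illustrative sub-threshold model: Id ≈ I0 * exp(kappa * Vg / Ut). The constants below are
# typical textbook-scale values, not measurements of any particular process.
I0, kappa, Ut = 1e-15, 0.7, 0.025              # leakage scale (A), gate coupling factor, thermal voltage (V)
Vg = np.linspace(0.0, 0.45, 10)                # gate voltages kept within the weak-inversion region
Id = I0 * np.exp(kappa * Vg / Ut)

for v, i in zip(Vg, Id):
    print(f"Vg = {v:0.2f} V  ->  Id ≈ {i:.2e} A")
```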

💡Temporal Dynamics

Temporal Dynamics refers to the time-dependent behavior of a system. In the video, the importance of matching the temporal dynamics of computing systems to the characteristics of the signals being processed is emphasized. For instance, the video mentions how the brain's temporal dynamics are integral to its low-power and efficient computation.
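
A small sketch of what "matching time constants to the signal" means in practice, using a toy on/off input of my own rather than real speech: a leaky integrator with a roughly 50 ms time constant, the phoneme timescale quoted in the talk, smooths and follows the slow envelope, while a microsecond-scale integrator is effectively memoryless at these rates.

```python
import numpy as np

dt = 0.001                                                 # 1 ms simulation steps
t = np.arange(0.0, 0.5, dt)
envelope = (np.sin(2 * np.pi * 4 * t) > 0).astype(float)   # crude on/off pattern at a speech-like rate

def leaky_integrator(x, tau):
    decay, y, out = np.exp(-dt / tau), 0.0, []
    for sample in x:
        y = decay * y + (1.0 - decay) * sample             # first-order low-pass with time constant tau
        out.append(y)
    return np.asarray(out)

matched = leaky_integrator(envelope, tau=0.050)            # ~50 ms: integrates over a phoneme-like window
memoryless = leaky_integrator(envelope, tau=1e-6)          # ~1 us: just tracks the instantaneous sample
print("matched output range:", matched.min().round(3), "to", matched.max().round(3))
print("microsecond-tau output equals the raw input:", bool(np.allclose(memoryless, envelope)))
```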

💡Fault Tolerance

Fault Tolerance is the ability of a system to continue operating properly in the event of hardware or software faults. The script discusses how neuromorphic systems, with their distributed and parallel nature, can be inherently fault-tolerant, allowing them to continue functioning even if some components fail, which is a stark contrast to traditional computing systems.
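
A toy numerical sketch of the graceful degradation being described, with made-up numbers rather than data from any chip: when a value is represented redundantly by a population of noisy units, knocking out a growing fraction of the units shifts the decoded estimate only gradually instead of breaking the computation outright.

```python
import numpy as np

rng = np.random.default_rng(7)
true_value = 0.6
population = true_value + rng.normal(0.0, 0.15, size=256)    # redundant, mismatched units encoding one value

for failed_fraction in (0.0, 0.1, 0.3, 0.5):
    alive = rng.random(population.size) >= failed_fraction   # randomly "break" a fraction of the units
    estimate = population[alive].mean()
    print(f"{int(failed_fraction * 100):2d}% of units failed -> decoded value = {estimate:.3f} (true value {true_value})")
```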

Highlights

The talk focuses on brain-inspired strategies for low-power artificial intelligence computing systems.

Artificial intelligence algorithms and networks have their roots in the late 80s but only recently started outperforming conventional approaches.

The success of AI is attributed to advancements in hardware technologies, availability of large data sets, and improvements in algorithms.

AI algorithms require significant memory and energy resources, leading to concerns about sustainability.

The ICT industry is projected to consume about 20 percent of the world's energy by 2025.

The movement of data between memory and computing units is a major factor in energy consumption.

AI algorithms are narrow and specialized, unlike the general-purpose nature of animal brains.

The backpropagation algorithm is a fundamental component of AI but differs from the brain's computational principles.

Neuromorphic engineering aims to emulate the brain's efficiency and could lead to breakthroughs in AI.

Neuromorphic systems use analog circuits and in-memory computing to reduce power consumption.

Biological neural networks use time dynamics and the physics of their elements, unlike simulated networks.

The brain's structure and architecture are intertwined with its computational processes, unlike traditional computer systems.

Neuromorphic systems use parallel arrays of processing elements with computation and memory co-localized.

Analog circuits in neuromorphic systems can perform complex operations more efficiently than digital counterparts.

The temporal domain is crucial in neuromorphic systems, with dynamics matched to the signals being processed.

Neuromorphic systems can achieve fast reaction times even with slow elements due to parallel processing.

The University of Zurich and ETH Zurich have been developing neuromorphic systems for edge computing applications.

Neuromorphic chips can be used for practical applications like industrial monitoring and intelligent machine vision.

Startups are now utilizing neuromorphic technology to solve real-world problems with low-power, high-efficiency AI systems.

Transcripts

Hello. This is Giacomo Indiveri from the University of Zurich and ETH Zurich, at the Institute of Neuroinformatics. It is going to be a pleasure for me to give you a talk about brain-inspired strategies for low-power artificial intelligence computing systems.

The term artificial intelligence has become very popular in recent times. In fact, artificial intelligence algorithms and networks go back to the late 80s, and although the first successes of these networks were demonstrated in the 80s, only recently have these algorithms and computing systems started to outperform conventional approaches for solving problems. In the field of machine vision, from 2009 on, and in 2011 in fact, the first convolutional neural networks trained using backpropagation achieved impressive results that made the whole field explode. The reason for the success of this approach, even though, as I said, it was started many years ago, is that only recently have we been able to achieve really impressive performance, because the hardware technologies started to provide enough computing power for these networks to actually perform well. In addition, there is now the availability of large data sets that can be used to train such networks, which were also not there in the 80s. And finally, several tricks, hacks, and improvements in the algorithms have been proposed to make these networks very robust and very performant.

However, they do have some problems. Most of these algorithms require a large amount of resources in terms of memory and energy to be trained. In fact, if we do an estimate and try to see how much energy is required by all of the computational devices in the world to implement such neural networks, it is estimated that by 2025 the ICT industry will consume about 20 percent of the entire world's energy. This is clearly a problem which is not sustainable. One of the main reasons these networks are extremely power hungry is that they require large amounts of data and memory resources, and in particular they need to move data from the memory to the computing and from the computing back to the memory. Typically, memory is kept in DRAM chips, and these DRAM accesses are at least a thousand five hundred times more costly than any compute (MAC) operation in these CNN accelerators. So it is really not the fact that we are doing lots of computation; it is really the fact that we are moving bits back and forth that is burning all of this energy.

The other problem is more fundamental. It is not only related to technology; it is related to the theory behind these algorithms. These algorithms, as I said, are very powerful in terms of recognizing images and solving problems, but they are very narrow, in the sense that they are very specialized to only a very specific domain. These networks are programmed to perform a limited set of tasks and they operate within a predetermined and predefined range. They are not nearly as general purpose as animal brains are. So even though we do call it artificial intelligence, it is really different from natural intelligence, the type of intelligence that we see in animals and in humans. The backbone of these artificial intelligence algorithms is the backpropagation algorithm or, if we are looking at time series and sequences, the backpropagation through time (BPTT) algorithm. This is really an algorithmic limitation, even though it can be used to solve very powerful problems. Trying to improve BPTT by making incremental changes is probably not going to lead to breakthroughs in understanding how to go from artificial intelligence to natural intelligence. The way the brain works is actually quite different from backpropagation through time: if you look at neuroscience, and if you study real neurons, real synapses, and the computational principles of the brain, you will realize that there is a big difference.

This problem has been recognized by many communities and many agencies. There is, for example, a recent paper by John Shalf that shows how to go beyond these problems and try to improve the performance of computation. If we look at the particular path that tries to put together new architectures and new packaging systems with new memory devices and new theories, one of the most promising approaches is the one listed there as neuromorphic. So what is this neuromorphic? This is the bulk of this talk, where I am going to show you what we can do at the University of Zurich, but also at the startup that comes out of the University of Zurich, SynSense, with this type of approach, which, as I said, takes the best of the new materials and devices, new architectures, and new theories, and tries to go really beyond what we have today.

The term neuromorphic was coined many years ago by Carver Mead in the late 80s, and it is now being used to describe different things. There are at least three big communities using the term neuromorphic. The original one, which goes back to Carver Mead, referred to the design of CMOS electronic circuits used to emulate the brain, basically as a basic research attempt to understand how the brain works by building circuits that are equivalent, trying to really reproduce the physics. Because of that, these circuits were using sub-threshold analog transistors for the neural dynamics and the computation, and asynchronous digital logic for communicating spikes across chips and across cores. It was really fundamental research. The other big community that has now started to use the term neuromorphic is the community building practical devices for solving practical problems. That community is building chips that can implement spiking neural network accelerators or simulators, not emulation; at this point it is more an exploratory approach, used to try to understand what can be done with digital circuits that simulate spiking neural networks. Finally, another large community that has started to use the term neuromorphic is the one developing emerging memory technologies, looking at nanoscale devices to implement long-term non-volatile memories, or, if you like, memristive devices. This community also started using the term neuromorphic because these devices can store a change in conductance, which is very similar to the way real synapses work when they change their conductance, when they change their synaptic weight. This allows them to build in-memory computing architectures that are also, as you will see, very similar to the way real biological neural networks work, and it can create high-density arrays.

So by using analog circuits, the approach of simulating spiking neural networks with digital circuits, and in-memory computing technologies, the hope is that we create a new field, which I am calling here neuromorphic intelligence, that will lead to the creation of compact, intelligent, brain-inspired devices. To really understand how to build these brain-inspired devices, it is important to look at the brain, to go back to Carver Mead's approach and do fundamental research studying biology, and try to get the best out of all of these communities: of the devices, of the computing principles using simulations and machine learning approaches, but also of neuroscience and studying the brain.

Here I would like to highlight the main differences between simulated artificial neural networks and the biological neural networks, those that are in the brain. In simulated artificial neural networks, as you probably know, there is a weighted sum of inputs: the inputs all come into a point neuron, which basically just does the sum or the integral of the inputs, multiplying each of them by a weight. So it is really characterized by a big weight multiplication, or matrix multiplication, operation, followed by a non-linearity: either a spiking non-linearity if it is a spiking neural network, or a thresholding non-linearity if it is an artificial neural network. In biology, the neurons are also integrating all of their synaptic inputs with different weights, so there is this analogy of weighted inputs, but it is all happening through the physics of the devices; the physics plays an important role in the computation. The synapses are not just doing a multiplication; they are actually implementing temporal operators, integrating, applying non-linearities, dividing, summing; it is much more complicated than just a weighted sum of inputs. In addition, the neuron has an axon, and it sends its output through the axon using an all-or-none event, a spike, through time: the longer the axon, the longer it will take for the spike to travel and reach its destination, and depending on how thick the axon is and how much myelination there is, it will be slower or faster. So here too the temporal dimension is really important.

In summary, the big difference is that artificial neural networks, the ones being simulated on computers and GPUs, are algorithms that simulate some basic properties of real neurons, whereas biological neural networks really use time, dynamics, and the physics of their computing elements to run the algorithm. Actually, in these networks the structure of the architecture is the algorithm: there is no distinction between the hardware and the software, everything is one. Understanding how to build these types of architectures, wetware or hardware, using CMOS, using memristors, maybe even using alternative approaches such as DNA computing, will hopefully and probably lead to much more efficient and powerful computing systems compared to artificial neural networks.

If we want to understand how to do this, we really need a radical paradigm shift in computing. Standard computing architectures are based on the von Neumann system, where you have a CPU on one side and memory on the other, and, as I said, transferring data back and forth between the CPU and the memory is actually what burns all the power; doing the computation inside the CPU is much more energy efficient and less costly than transferring the data. In brains, what happens is that inside the neuron there are synapses which store the value of the weight, so memory and computation are co-localized. There is no transfer of data back and forth; everything happens at the synapse, at the neuron, and there are many distributed synapses and many distributed neurons, so the memory and the computation are intertwined together in a distributed system. This is really a big difference. If we want to understand how to really save power, we have to look at how the brain does it; we have to use these brain-inspired strategies.

The main three points that I would like you to remember are these. First, we have to use parallel arrays of processing elements that have computation and memory co-localized, and this is radically different from time-multiplexing a circuit. For example, if we have one CPU, two CPUs, or even 64 CPUs to simulate thousands of neurons, we are time-multiplexing the integration of the differential equations on those 64 CPUs. If instead we follow these brain-inspired strategies and we want to emulate a thousand neurons, we really have to have a thousand different circuits laid out in the layout of the chip, of the wafer, and then run them through their physics, through the physics of the circuits, analog or digital; but they have to be many parallel circuits that operate in parallel with the memory and the computation co-localized. That is really the trick to save power. Second, if we have analog circuits, we can use the physics of the circuits to carry out the computation: instead of abstracting away some differential equations and integrating them numerically, we really use the physics of the device to carry out the computation, which is much more efficient in terms of power, latency, time, and area. And finally, the temporal domain is really important: the temporal dynamics of the system have to be well matched to the signals that we want to process. If we want very low-power systems and, for example, we want to process speech, we have to have elements in our computing substrate, in our brain-like computer, that have the same time constants. Speech, for example phonemes, has time constants on the order of 50 milliseconds, so we have to slow down silicon to have dynamics and time constants on the order of 50 milliseconds. Our chips will be firing at hertz or maybe hundreds of hertz, but definitely not at the megahertz or gigahertz rates of our CPUs or GPUs. And by having parallel arrays of very slow elements we can still get very fast computation: having slow elements does not mean we do not have a very fast, reactive system, because they work in parallel, so at any moment there will always be one or two of these elements that are about to fire whenever the input arrives, and we can have microsecond or nanosecond reaction times even though we have millisecond dynamics. This is another key trick to remember if we want to understand how to make this radical paradigm shift.

At the University of Zurich and ETH, at the Institute of Neuroinformatics, we have been building these types of systems for many years, and we are now also building them at our new startup, SynSense. The type of systems are shown here: basically we create arrays of neurons with analog circuits. These circuits, as I told you, are slow; they have slow, temporal, non-linear dynamics. As I told you, they are massively parallel; we do massively parallel operations, all of the circuits work in parallel. The fact that they are analog brings the feature of device mismatch: all the circuits are inhomogeneous, they are not equal, and this can actually be used as an advantage to carry out robust computation. It is counter-intuitive, but I will show you that it is actually an advantage to have variability in your devices, and this is also very nice for people working on memristive devices, which are typically very variable. The other features are that they are adaptive: all of these circuits have negative feedback loops, they have learning, adaptation, and plasticity, and the learning actually helps in creating robust computation out of the noisy and variable elements. By construction there are many of them working in parallel, so even if some of them stop working, the system is fault tolerant; you do not have to throw away the chip, as you would with a standard processor if one transistor breaks. Performance will probably degrade smoothly, but at least the system will be fault tolerant. And because we use the best of both worlds, analog circuits for the dynamics and digital circuits for the communication, we can program the routing tables and configure these networks, so we have the flexibility of being able to program these dynamical systems, like you would program a neural network on a CPU, on a computer. Of course it is more complex; we still have to develop all the tools, and SynSense and other colleagues around the world are still busy developing the tools to program these dynamical systems. It is not nearly as well developed as having a Java or a C or a Python piece of code, but there is very promising work going on.

Now the question always comes: why do you do it? If analog is noisy, annoying, and inhomogeneous, why go through the effort of building these analog circuits? Let me try to explain that there are several advantages. Think of large networks in which you have many elements working in parallel, for example memristive devices in a crossbar array, and you want to send data through them. These memristive devices, if you use their physics, use analog variables, so if you just send these variables in an asynchronous mode you do not need to use a clock. You can avoid digital clock circuitry, which is actually extremely expensive in terms of area requirements in large, complex chips, and extremely power hungry; so avoiding clocks is really useful. If we do not use digital, if we stay analog all the way from the input to the output, we do not need to convert from digital to analog to run the physics of these memristors, and we do not need to convert back from analog to digital; these ADCs and DACs are very expensive in terms of size and power, so again, by not using them, we save in size and power. If we use transistors to compute, for example, exponentials, we do not need very complicated digital circuitry to do that; we can use a single device that, through its physics, does a complex non-linear operation, and that saves area and power as well. And finally, if we have analog variables, like variable voltage amplitudes, variable voltage pulse widths, and other types of variable currents, we can control the properties of the memristive devices that we use. Depending on how strongly we drive them, we can make them volatile or non-volatile; we can use their intrinsic non-linearities; depending on how strongly we drive them, we can even make them switch with a probability, so we can use their intrinsic stochasticity to do stochastic gradient descent, or to build probabilistic graphical networks and do probabilistic computation. And we can also use them in their standard, non-volatile mode of operation, as long-term memory elements, so we do not need to shift data back and forth from peripheral memory; we can just store the value of the synapses directly in these memristive devices.

So if we use analog circuits for our neurons and synapses in CMOS, we can then best benefit from the use of future emerging memory technologies, reducing power consumption. At the last ISCAS conference, which was just a few weeks ago, we showed with the PCM-trace algorithm and experiments that we can exploit the drift of PCM devices, shown here in a picture from IBM, to implement eligibility traces, which is a very useful feature to have for reinforcement learning. So if we are interested in building reinforcement learning algorithms, for example for behaving robots that run with brains implemented using these chips, we can take advantage of properties of these PCM devices, ones that are typically thought of as non-idealities, and use them to our advantage for computation.

Now, analog circuits are noisy; I told you they are variable and inhomogeneous. For example, if you take one of our chips, stimulate all 256 different neurons with the same current, and see how long it takes each neuron to fire, not only are these neurons slow, but they are also variable: the time at which they fire can greatly change depending on which circuit you are using. There is this noise, where typically you have a spread of about 20 percent relative to the mean, so the coefficient of variation is about 20 percent. So the question is: how can you do robust computation using this noisy computational substrate? The obvious answer, the easiest thing people do when they have noise, is to average. We can do that: we can average over space and we can average over time. If we use populations of neurons, not single neurons, we can take two, three, four, six, eight neurons and look at the average time it took for them to spike, or, if they are spiking periodically, we can look at the average firing rate. And then, if we integrate over long periods of time, we can average over time. These two strategies are useful for reducing the effect of device mismatch if we do need precise computation.

We are doing experiments on this in these very days. This is actually a very recent experiment where we took these neurons and put two of them together, four of them together, eight, sixteen; the cluster size is basically the number of neurons we are using for the average over space. Then we compute the firing rate over two milliseconds, five milliseconds, 50 milliseconds, 100, and so on, and we calculate the coefficient of variation, basically how much device mismatch there is: the larger the coefficient of variation, the more noise; the smaller, the less noise. So we can go from a very large coefficient of variation of, say, 18 or 20 percent, and, by integrating over long periods of time or by averaging over large numbers of neurons, decrease it all the way to about 0.9 percent. You can take this coefficient of variation and calculate the equivalent number of bits: if we were using digital circuits, how many bits would this correspond to? Just by integrating over a larger number of neurons and over longer periods, we can find, for example, a sweet spot where we have eight-bit resolution just by using 16 neurons and integrating, for example, over 50 milliseconds. This can be changed at runtime: if we want a very fast reaction time and only a coarse idea of what the result is, we can use only two neurons and integrate over only two milliseconds.

Then there are many false myths. When we use spiking neural networks, people tell us, "but if you have to wait until you integrate enough, it is going to be slow", or "if you have to average over many neurons, it is going to cost area". All of these are actually false myths that can be debunked by looking at neuroscience. Neuroscience has been studying how the brain works: the brain is extremely fast, it is extremely low power, and we do not have to wait long periods of time to make a decision. If you use populations of neurons to average out, it has been shown experimentally, with real neurons, that populations of neurons have reaction times that can be even 50 times faster than single neurons; so by using populations we can really speed up the computation. If we use populations of neurons, we do not need every neuron to be highly accurate: they can be noisy and very low precision, but by using populations and sparse coding we can have very accurate representations of data, and there is a lot of work, for example by Sophie Denève, showing how to do this by training populations of neurons. It should also be known in technology that if you have variability, it actually helps in transferring information across multiple layers. What I am showing here is data from one of our chips, where we use 16 of these neurons per core, we provide some desired target as the input, we drive a motor with a PID controller, and we minimize the error. It is just to show you that by using spikes we can have very fast reaction times in robotic platforms, using these types of chips that you saw in the previous slides.

In fact, we have been building many chips for many years; at SynSense, too, the colleagues are building chips. The latest one that we have built at the university, as an academic prototype, is called the Dynamic Neuromorphic Asynchronous Processor (DYNAP). It was built using a very old technology, 180 nanometer, as I said, because it is an academic exercise, but it still has roughly a thousand neurons: four cores of 256 neurons each. We can do very interesting edge computing applications with just a few hundred neurons. And then, of course, the idea is to use the best of both worlds: analog circuits where we really need low power, digital circuits to verify principles and solve practical problems quickly, and by combining analog and digital we can also get the best of both worlds.

To conclude, I would just like to show you examples of applications that we built using this approach. This is a long list of applications, but if you have the slides you can click on the references and get the papers. The ones highlighted in red have been done by SynSense, by our colleagues from SynSense, on ECG anomaly detection; the detection of vibration anomalies was done by SynSense and by the University of Zurich in parallel, independently. Let me just go to the last slide, where I basically tell you that we are now at a point where we can use all of the knowledge from the university about brain-inspired strategies to develop this neuromorphic intelligence field, transfer all of this know-how into technology, and, with new startups, use this know-how to solve practical problems and find the best market for it. As I said, industrial monitoring, for example for vibrations, is one such area, but something that can also be done using both sensors and processors is really what the SynSense company has been developing: it has been really successful at intelligent machine vision, putting both the sensing and the artificial intelligence algorithm on the same chip, with very low power, on the order of tens or hundreds of microwatts of power dissipation, for solving practical problems that can be useful in society, and in fact even for consumer applications.

So with this, sorry, I went a bit over time, but I would just like to thank you for your attention.


Related Tags
AI Computing, Neuromorphic Systems, Energy Efficiency, Artificial Intelligence, Machine Vision, Neural Networks, Hardware Technology, University Research, Innovative Algorithms, Sustainable Tech