Customized Hardware, but for AI

TechTechPotato
6 Mar 2024 (07:30)

Summary

TL;DR: The video discusses a new startup called Talas that aims to revolutionize AI hardware efficiency. Talas proposes designing a dedicated AI chip for each specific model, optimized for performance and cost-effectiveness. By combining highly optimized models with structured ASICs (a hybrid between FPGAs and ASICs), Talas claims 10 to 1,000x better efficiency than existing solutions. The approach targets edge and embedded devices, enabling AI capabilities in resource-constrained environments where traditional GPUs or internet connectivity may not be feasible. With $50 million in funding, Talas plans to demonstrate a proof of concept with a chip tape-out by the end of the year.

Takeaways

  • The video discusses a new startup called Talas that aims to create highly efficient, dedicated AI chips for specific models, promising 10-1,000x better performance and efficiency than current hardware.
  • Talas' approach involves designing custom silicon optimized for an individual AI model, rather than using general-purpose hardware like GPUs or CPUs.
  • The technology sits between fully programmable hardware (CPUs/GPUs) and fully configurable hardware (FPGAs), using a concept called structured ASICs, which Intel markets as 'eASICs'.
  • By hardening the final few metallization layers of the chip, Talas can achieve dedicated ASIC-like speeds while retaining some configurability across different models.
  • According to the video, the cost of AI hardware could become a bottleneck, and Talas' approach aims to reduce this cost, especially for edge devices and embedded systems.
  • The video suggests that AI models will become ubiquitous, present in devices like smart meters, cars, and consumer electronics, necessitating efficient, dedicated hardware.
  • Talas, founded by Tenstorrent founder Ljubisa Bajic, has raised $50 million in funding and plans to tape out its first chip by the end of the year, with customer deployments expected the following year.
  • While AI models and architectures are rapidly evolving, Talas targets models that are well defined and unlikely to change significantly over 10-30 years.
  • The video positions Talas' technology as a solution for edge and edge-inference applications, rather than large-scale training workloads.
  • Overall, the video presents Talas as an innovative startup aiming to disrupt the AI hardware landscape with highly efficient, dedicated silicon for specific models.

Q & A

  • What is the main topic discussed in the script?

    -The script discusses a new startup called Talas that claims to achieve 10 to 1,000 times better efficiency for AI hardware by developing dedicated AI chips optimized for specific machine learning models.

  • Why is the development of dedicated AI chips considered innovative?

    -The idea of developing dedicated AI chips tailored to specific machine learning models is innovative because it departs from the current approach of using general-purpose hardware (like GPUs) or reconfigurable hardware (like FPGAs) for running various AI models. Dedicated chips can potentially provide better performance and efficiency for specific models.

  • What is the key advantage of Talas' approach according to the script?

    -The script suggests that Talas' approach of developing dedicated AI chips for specific models can lead to better performance, better efficiency, and lower hardware costs compared to using general-purpose or reconfigurable hardware for running AI models.

  • How does Talas' approach differ from using FPGAs?

    -While FPGAs offer fully configurable hardware, Talas' approach involves structured ASICs, or what Intel calls "eASICs". These chips have a reconfigurable part, but in the final few layers of metallization some pathways are hardened (fixed), providing ASIC-like speeds while retaining some configurability.

  • What is the potential market for Talas' dedicated AI chips?

    -The script suggests that Talas' dedicated AI chips could be useful for edge and edge-inference applications, especially in devices that don't connect to the internet and require efficient, low-power AI processing for tasks like power management, image correction, or voice interaction.

  • What is the significance of the name 'Talas'?

    -The script mentions that 'Talas' means 'locksmith' in Hindi, likely a nod to the company's goal of developing dedicated, optimized hardware 'keys' for specific AI model 'locks'.

  • Who is the founder of Talas, and what is their background?

    -The script states that Talas was founded by Ljubisa Bajic, who previously founded Tenstorrent (a company focused on AI hardware). Bajic left Tenstorrent about a year ago, and Jim Keller now runs that company.

  • What is the current status of Talas' development efforts?

    -According to the script, Talas is expecting a chip tape-out (a completed chip design ready for manufacturing) by the end of the year, and aims to have the technology proliferated to customers the following year.

  • What is the potential impact of widespread adoption of dedicated AI chips?

    -The script suggests that if dedicated AI chips become ubiquitous, they could be used in various devices like smart meters, cars, and electronic devices for tasks like advanced power management, image correction, or voice interaction, even in devices that don't connect to the internet.

  • What is the significance of the statement "AI and machine learning is still such a rapidly developing market"?

    -This statement highlights the rapidly evolving nature of the AI and machine learning field: which models end up in which devices is still being worked out, which is why Talas targets models that are already well defined and stable enough to commit to dedicated silicon.

Outlines

00:00

The Landscape of AI Hardware and the Emergence of a New Startup

The video script surveys the current state of AI hardware, with various companies offering different chips like TPUs, GPUs, CGRAs, and specialized cores. It then introduces a startup called Talas (meaning 'locksmith' in Hindi), founded by Ljubisa Bajic, the founder of Tenstorrent, with about $50 million in funding. Talas proposes a novel approach to AI hardware design: dedicated AI chips tailored to specific models, with the potential to achieve 10 to 1,000 times better efficiency than existing solutions.

05:01

๐Ÿ” Talas' Innovative Approach to Dedicated AI Chips

The script delves into Talas' approach, which involves creating dedicated AI chips optimized for specific models that are not expected to change for an extended period (10-30 years). This approach aims to address the high costs associated with designing chips for frequently changing models. Talas proposes a middle ground between standard AI accelerators and fully reconfigurable hardware, using a technique called structured ASICs or eASICs. This technique combines the benefits of dedicated ASICs and the flexibility of reconfigurable hardware, potentially reducing costs and improving efficiency for applications like edge and edge inference, where dedicated, low-power AI chips could be ubiquitous.
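
The cost trade-off described above can be sketched with some back-of-the-envelope arithmetic. The dollar figures below are purely illustrative assumptions (the video gives no costings): a full-custom chip must amortize its entire non-recurring engineering (NRE) cost per model, while a structured ASIC shares the base silicon across models and only pays per-model NRE for the final metal layers.

```python
# Back-of-the-envelope NRE amortization. All dollar figures are
# hypothetical assumptions for illustration, not from the video.

def per_unit_cost(nre_dollars, unit_cost_dollars, volume):
    """Effective cost per chip once NRE is spread over the production run."""
    return nre_dollars / volume + unit_cost_dollars

# Full-custom ASIC: each model pays the full design and mask-set cost.
full_custom = per_unit_cost(nre_dollars=50_000_000, unit_cost_dollars=20, volume=1_000_000)

# Structured ASIC: base layers are shared, so each model only pays NRE
# for the final metallization layers that harden its pathways.
structured = per_unit_cost(nre_dollars=2_000_000, unit_cost_dollars=25, volume=1_000_000)

print(f"full-custom: ${full_custom:.2f}/chip")  # $70.00/chip
print(f"structured:  ${structured:.2f}/chip")   # $27.00/chip
```

Under these invented numbers the per-model NRE, not the unit cost, dominates; that is the lever the structured-ASIC approach pulls.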

Keywords

AI Chips

AI chips are specialized hardware designed to accelerate machine learning and artificial intelligence workloads. The script mentions various types of AI chips from companies like Google (TPUs), NVIDIA (GPUs), SambaNova (CGRAs), Cerebras (massive wafer-scale chips), and Tenstorrent (Tensix cores). These chips are optimized for efficiently running AI models and computations, enabling faster performance than general-purpose processors.

Machine Learning Models

Machine learning models are algorithms or mathematical representations that are trained on data to learn patterns and make predictions or decisions. The script discusses the rapid development of new machine learning models, architectures, and optimizations, highlighting the need for configurable hardware to support these evolving models. Examples mentioned include transformers, which require dedicated hardware acceleration.

Dedicated AI Chips

Dedicated AI chips refer to specialized hardware designed and optimized for running a specific machine learning model or a set of models. The script introduces the concept of having silicon dedicated to a well-defined, highly optimized model that is not expected to change significantly for an extended period. This approach aims to extract maximum efficiency by tailoring the hardware to the exact requirements of the model.

Talas

Talas is a startup based in Toronto, Canada, founded by Ljubisa Bajic, the founder of Tenstorrent. Talas has raised $50 million in funding and is pursuing the idea of a dedicated AI chip per model. The goal is to develop chips that are hyper-optimized for specific machine learning models, potentially providing 10 to 1,000 times better efficiency than existing solutions.

FPGAs

FPGAs (Field-Programmable Gate Arrays) are reconfigurable hardware devices that can be programmed to implement various digital circuits and algorithms. The script mentions the FPGA route as a potential solution for optimizing machine learning models, where the hardware can be reconfigured with a different bitstream to adapt to different code paths and optimized models.

Structured ASICs

Structured ASICs (Application-Specific Integrated Circuits) combine aspects of reconfigurable logic (like FPGAs) and hardened, dedicated circuits (like traditional ASICs). The script suggests that Talas may be exploring a structured-ASIC approach, where the final few metallization layers are fixed to harden specific pathways, providing ASIC-like speeds while retaining some configurability.
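
As a loose software analogy (mine, not the video's): hardening the final metal layers is a bit like partially specializing a generic function ahead of time. The generic machinery is built once, the per-model choices are baked in, and only the data input remains variable.

```python
from functools import partial

# "Configurable" kernel: the weights arrive with every call, the software
# analogue of routing a model through general-purpose FPGA fabric.
def generic_mac(weights, activations):
    """Multiply-accumulate of two equal-length sequences."""
    return sum(w * a for w, a in zip(weights, activations))

# "Hardened" kernel: the model's weights are fixed up front, like fixing
# the final metallization layers; only the activations still vary.
model_weights = [2, -1, 3]                    # hypothetical frozen model
hardened_mac = partial(generic_mac, model_weights)

print(hardened_mac([1, 1, 1]))  # 4
```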

Edge Inference

Edge inference refers to running machine learning models and making predictions or inferences on edge devices, such as embedded systems, smartphones, or IoT devices, rather than in the cloud or data centers. The script discusses the potential of dedicated AI chips for edge inference applications, where power efficiency, cost, and specialized hardware are crucial factors.
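
To make "well-defined, unchanging edge workload" concrete, here is a minimal sketch of inference for a tiny frozen model with baked-in weights; the model and its parameters are invented for illustration. Because nothing in this computation ever changes, it is exactly the kind of fixed dataflow that could, in principle, be hardened into dedicated silicon.

```python
import math

# A tiny frozen classifier: one dense neuron plus a sigmoid, with the
# parameters fixed at "design time" (values are hypothetical).
WEIGHTS = [0.8, -0.5, 0.3]
BIAS = 0.1

def infer(sensor_readings):
    """Run the frozen model on one vector of sensor inputs, returning 0..1."""
    z = sum(w * x for w, x in zip(WEIGHTS, sensor_readings)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

print(round(infer([1.0, 2.0, 0.5]), 4))  # 0.5125
```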

Ubiquitous AI

The script suggests that AI and machine learning models are becoming ubiquitous, meaning they will be present in various electronic devices and appliances, from smart meters and cars to cameras and ring lights. This ubiquity of AI highlights the need for efficient, low-power, and cost-effective dedicated AI chips that can be proliferated across diverse applications and edge devices.

Hyper-optimized Models

Hyper-optimized models are machine learning models that have been extensively optimized and refined for specific tasks or applications. The script discusses Talas' approach of combining hyper-optimized models with hyper-optimized silicon (dedicated AI chips) to achieve significant performance and efficiency gains.

Tape Out

Tape-out is the semiconductor-industry term for the final step in the chip design process, when the complete design data is transferred to a manufacturing facility for fabrication. The script mentions that Talas expects a chip tape-out by the end of the year, indicating plans to have a physical chip ready for manufacturing.

Highlights

The startup Talas, which means 'locksmith' in Hindi, aims to create dedicated AI chips per model, with the goal of achieving 10 to 1,000x better efficiency for AI hardware.

Talas is proposing a solution that sits between a standard AI accelerator and fully reconfigurable hardware, called a structured ASIC or 'eASIC', which combines the configurability of FPGAs with the dedicated, hardened pathways of ASICs.

The key benefit of Talas' approach is the ability to extract dedicated ASIC-like speeds while retaining some configurability, potentially reducing hardware costs and improving efficiency.

Talas believes their approach is the future of machine learning, especially for edge and edge inference applications, where dedicated, ultra-low power, and cheap AI hardware is needed.

Talas is aiming for its first chip tape-out by the end of the year and proliferation to customers the following year, showcasing a proof of concept for the technology.

The company has raised $50 million in funding and was founded by Ljubisa Bajic, the founder of Tenstorrent, who left that company about a year ago.

Designing a chip for each AI model is typically expensive, but Talas' approach aims to reduce costs by leveraging structured ASICs.

AI models are expected to become ubiquitous, appearing in devices like smart meters, cars, and electronic devices for tasks like power management and image processing.

The founder of Talas, Ljubisa Bajic, previously founded Tenstorrent and left the company about a year ago; Jim Keller now runs Tenstorrent.

Talas is based in Toronto, Canada.

The transcript discusses the rapid pace of development in machine learning and AI, with new models, architectures, and optimizations being introduced frequently.

The need for configurable and programmable hardware has increased to handle the diversity of AI models and optimizations.

GPUs and dedicated AI accelerators have been used to handle various AI workloads, but Talas aims to create specialized hardware for specific, well-defined models.

FPGAs offer fully configurable hardware but with overhead, while Talas' structured ASICs aim to combine configurability with dedicated ASIC-like speeds.

The transcript mentions the potential for machine learning in various devices, such as cameras, ring lights, and other electronic devices, for tasks like power optimization and image correction.

Transcript

00:00

We're currently in a fast-paced world where there are tons of AI chips in the market. We have Google with TPUs, we have Nvidia with GPUs, we have companies like SambaNova with CGRAs and Cerebras with massive wafer scale, down to Tenstorrent with its Tensix cores, and then all the embedded markets with all the little analog and neuromorphic cores and all those different sorts of cores that you've seen on this channel before. What if I were to tell you that there is another way? In this video we're going to speak about a new startup who says that they can go forward another 10 to 1,000x in efficiency for AI hardware.

00:45

So what this company is doing is actually quite innovative. It's not necessarily a brand new idea, but it's definitely being applied in a brand new context. Now machine learning and AI currently, whether it's hardware or software, is pretty fast-paced. We've got new models being developed all the time, new architectures for those models, how they're applied, how they're used, how they perform, how they're optimized; everything is moving at a fairly fast pace. As a result we've needed a lot of configurable hardware, or at least programmable hardware: the ability to use it in different ways in order to extract performance, regardless of whether it's doing this little niche or that little niche or this new optimization. The big changes are things like transformers, which have required almost a new dedicated sort of hardware on top. But the whole point is, if you use a GPU you can pretty much do anything, and if you use one of these new dedicated AI ASICs, as long as you're using TensorFlow or PyTorch it's usually pretty okay.

01:50

What if some of those models were very well defined? The ability to look at a model and say: this isn't going to change for 10, 20, 30 years; it's highly optimized; we know the code paths; we know exactly how it's going to perform. What if we had silicon dedicated to that exact model? This is what this startup is going to do. It's called Talas, which means locksmith in Hindi, and it's a new startup with about $50 million in funding, founded by Ljubisa Bajic. You may remember Ljubisa as the founder of Tenstorrent; he left Tenstorrent about a year ago, Jim Keller now runs that company, and I do a bit of work with them. His new startup, based in Toronto, Canada, is called Talas (I keep wanting to call it Talis, but it's Talas), and their whole point is: what if you had dedicated AI chips per model?

02:45

Now this seems a bit far-fetched. I mean, designing a chip takes millions and millions and millions of dollars, which means if you have one chip per model and the models are changing frequently, that's going to be millions for that chip, millions for that chip, millions for that chip. Now there is one sort of solution here you could go down: the FPGA route. An FPGA is fully configurable hardware, and in that instance you get essentially a fully optimized version of your code path, and you can change it with a different bitstream in order to change those pathways for your optimized model. What we think Talas is proposing here is something that sits in between a standard AI accelerator and that fully reconfigurable hardware: something called a structured ASIC, or what Intel calls its eASIC business. What you have here is reconfigurable hardware like an FPGA, but in the final few layers of metallization you fix some of those pathways to be what's called hardened. That means there's no overhead for reconfigurable logic; you get dedicated ASIC-like speeds, but with the configurability of having the same chip being manufactured and then optimized in different ways. It's not something that we necessarily speak about a lot in the industry, just because we either have things like CPU cores and GPUs, which are fully programmable logic, or FPGAs, which are reconfigurable hardware. This is essentially trying to meet in the middle, with the benefits of an ASIC on top.

04:26

What Talas and Ljubisa are saying here is that they can extract 10 to 1,000x better performance, or better efficiency as well, and maybe bring the cost of the hardware down. What they say is going to be an issue moving forward with some of this machine learning stuff is the cost of the hardware. Not everybody can buy GPUs. Perhaps you don't want a GPU in your small embedded device; you need a dedicated AI accelerator with dedicated AI pathways that's super efficient. Maybe the device it's going in will never connect to the internet for 10, 20, 30 years. You know what code is going to be on there, and as a result you can have dedicated hardware just for the model you're running on that piece of hardware.

05:14

Now we could go into a conversation here about the proliferation of AI models. I expect AI models to be almost as ubiquitous as electricity. It's going to be in the small chips in your smart meters, in your cars, in any electronic device. It's going to be doing things like advanced power management. It may not be doing things like large language models, but maybe you do have a small handheld device that doesn't connect to the internet and has to interact with voice or with language in some way, or, you know, image generation. Like I say, AI and machine learning is still such a rapidly developing market; it depends what you're going to be using these devices for. I mean, right now I've got a camera in front of me. The camera doesn't connect to the internet, but it could use machine learning for power optimization or for image correction, because goodness knows this image is probably terrible. But then behind it I've got a ring light, and maybe the ring light needs machine learning, again for adapting some of the LEDs; perhaps the performance of the LEDs changes over time, and perhaps you can put that into a dedicated ASIC that's cheap to make, cheap to proliferate, ultra low power, and could potentially become ubiquitous.

06:25

This is essentially what Talas are doing. They're calling it Talas Foundry: a combination of hyper-optimized models and hyper-optimized silicon producing this net benefit. They think that's the future of where machine learning has to go, especially when we're talking about edge and edge inference; we're not necessarily talking about the big training right now, this is mainly an edge and edge inference play. So good luck to Ljubisa and his team. They're expecting a chip tape-out by the end of the year and proliferation into customers by next year, as a generalized language model chip; I say generalized, but it's going to be dedicated for a specific customer, I'm pretty sure. They've got to show off a proof of concept and show that this technology works. We'll be staying tuned and we'll keep on top of whatever announcements they make in the future.


Related Tags
AI hardware, Talas startup, Efficiency leap, Machine learning, Silicon innovation, Dedicated chips, Edge inference, FPGA, ASIC, Technology future