WTF is an AI PC?

Framework
12 Sept 2024 · 16:05

Summary

TLDR: In this video, Nirav Patel, founder and CEO of Framework, introduces large language models and how to run them on laptops. He discusses how local models like Meta's Llama 3.1 can deliver intelligent answers without sending any data online, and explores the privacy and customization benefits of local models, as well as the rapid progress in AI and machine learning.

Takeaways

  • 🤖 Nirav Patel, founder and CEO of Framework, discusses AI in a YouTube video.
  • 🧠 There is a lot of noise around AI, but Patel focuses on real, working use cases.
  • 💡 Patel demonstrates running Meta's Llama 3.1 model locally on a Framework laptop.
  • 🔍 Llama 3.1 answers complex questions quickly, showing how far AI has come on consumer hardware.
  • 🆚 Giant language models like GPT are proprietary and expensive, unlike smaller open models such as Llama 3.1.
  • 🌐 Running models locally keeps your data fully under your control and lets you customize the model.
  • 🏆 Llama 3.1 ranks in the top 10 of the Chatbot Arena, proving that open models are competitive.
  • 🔢 A model's parameters (its internal weights) determine how much it can learn and how well it can answer.
  • 💾 Reducing the bit depth of a model's parameters lets larger models fit in limited GPU memory (see the arithmetic sketch after this list).
  • 🖥️ GPUs are well suited to running AI models thanks to their matrix processing throughput and high memory bandwidth.
  • 📝 Ollama is a simple tool for running language models on a personal machine without deep technical knowledge.
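
To make the parameter and bit-depth takeaways concrete, here is a minimal back-of-the-envelope sketch in plain Python. The 8-billion-parameter size and the 8 GB of video memory come from the video; the exact set of bit depths compared is an illustrative assumption.

```python
# Rough memory footprint of a model: parameter count * bytes per parameter.
# Mirrors the video's example: an 8B-parameter model on a GPU with 8 GB of
# video memory (the Radeon RX 7700S in the Framework Laptop 16).

PARAMS = 8_000_000_000  # Llama 3.1 8B
GPU_VRAM_GB = 8

for bits in (16, 8, 6, 4):  # full precision vs. common quantized bit depths
    gb = PARAMS * bits / 8 / 1e9  # bits -> bytes -> gigabytes
    fits = "fits" if gb <= GPU_VRAM_GB else "does not fit"
    print(f"{bits:>2}-bit weights: ~{gb:.0f} GB -> {fits} in {GPU_VRAM_GB} GB of VRAM")

# 16-bit: ~16 GB (too big), 8-bit: ~8 GB (borderline, since activations and
# the KV cache also need room), 6-bit: ~6 GB (fits comfortably), matching the
# video's point that dropping bit depth is what makes the model fit.
```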

Q & A

  • What does Nirav Patel identify as the problem with the hype around AI?

    - He says there is a lot of "noise and BS" around trendy topics, and that he prefers to focus on what is real and what works.

  • What is Nirav Patel's goal when he talks about AI?

    - He wants to avoid hyped-up use cases and talk about what actually works with AI on real hardware.

  • Why does Nirav Patel say he is not an AI expert?

    - He describes himself as a hobbyist, so his approach is to share what he has learned so far rather than speak with an expert's authority.

  • Which AI model did he run on a Framework laptop?

    - He ran Meta's Llama 3.1 model.

  • What question did he ask Llama 3.1 during the demonstration?

    - He asked the model what is important about the right to repair.

  • What are the advantages of running an AI model locally?

    - The advantages include complete control, data privacy (your data never leaves the machine), and the ability to modify the model or inspect what is happening under the hood.

  • What is the difference between giant language models like ChatGPT and smaller models?

    - Large models like ChatGPT have hundreds of billions of parameters and run on expensive servers, while smaller models are designed to run on consumer-level GPUs.

  • Which AI models are starting to compete with the largest ones?

    - Meta's Llama 3.1 models are mentioned as being competitive with the largest models while being small enough to run on a laptop.

  • What is the role of parameters in an AI model?

    - Parameters are the weights, the items inside the model; roughly speaking, the more of them there are, the smarter the model can be.

  • Why are GPUs effective at running AI models locally?

    - GPUs have a lot of memory bandwidth and matrix multiplication capability and throughput, which is exactly what machine learning needs, much like the way they are used for gaming.

  • Which tool did Nirav Patel use to run language models on his laptop?

    - He used a tool called Ollama, made by the group of the same name, to run language models (the short Python sketch after this list shows the same workflow driven from code).

  • What other types of AI models does Nirav Patel mention?

    - He mentions vision models, such as LLaVA and llava-llama3, which can analyze images and interact with the user based on the image.

  • What is interesting about image generation with AI models?

    - Image generation is another interesting application of AI, where models like Stable Diffusion or Flux can create images from scratch.
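
Following up on the Ollama question above: here is a minimal sketch of asking a locally running model a question from Python rather than the terminal. It assumes the `ollama` Python client (`pip install ollama`), a running Ollama install, and that `ollama pull llama3.1` has already been done; no data leaves the machine.

```python
# Minimal sketch: ask a locally running Llama 3.1 a question through Ollama.
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "What's important about the right to repair?"}],
)
print(response["message"]["content"])  # the model's full answer, generated locally
```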

Outlines

00:00

🤖 Introduction to AI with Nirav Patel

Nirav Patel, founder and CEO of Framework, approaches the topic of AI while steering clear of hyped-up use cases. He wants to focus on what actually works today. He notes that he is not an AI or machine learning expert and is approaching the subject as a hobbyist. He gives a concrete demonstration by running Meta's Llama 3.1 model on a Framework laptop, answering a complex question about the right to repair.

05:01

🔍 Explaining AI models and their advantages

The video explains the differences between very large AI models such as ChatGPT, which run on expensive servers, and smaller models that run on consumer GPUs. It highlights the advantages of Meta's Llama 3.1, a mid-sized model that runs on consumer GPUs and offers good intelligence while remaining modifiable and inspectable by the user. The model ranks in the top 10 of the Chatbot Arena, a global competition between AI models.

10:02

📊 The power of parameters and GPU memory

The video explores the concept of parameters in AI models: they are the model's weights and determine how much it can learn and how well it can answer. It explains how reducing the bit depth of the parameters can make larger models fit in less memory. It also highlights why GPUs are effective at running these models, thanks to their matrix multiplication throughput and high memory bandwidth.

15:03

💬 Using language models, and future applications

The video presents various uses of language models, from text interaction to image recognition with the llava-llama3 model. It also mentions other types of AI models beyond LLMs, such as image generators, and how they could be set up to run locally on a laptop. It concludes by noting that the state of the art in AI is advancing rapidly and that we will hear more about it in the future.

Keywords

💡AI

AI, or Artificial Intelligence, refers to the ability of computers to perform tasks that usually require human intelligence. In the video, AI is presented as a hyped topic, but the focus is on real, working applications rather than speculation or excessive marketing. The example given is using a local AI model to answer a complex question about the right to repair.

💡Hype

'Hype' refers to excessive marketing or media coverage that can overstate the reach or usefulness of something. In the video, the speaker says he avoids the hype around AI and concentrates on what actually works, drawing a line between reality and exaggeration.

💡Framework Laptop

The Framework Laptop is a laptop designed for modularity and repairability. In the video, it serves as the platform for running AI models locally, highlighting the power and flexibility of the platform for technical experimentation.

💡Discrete GPU

A discrete GPU is a standalone graphics processing unit, separate from the CPU. It is essential for running resource-intensive workloads such as machine learning. In the video, the presence of a discrete GPU is called out as the key ingredient for running AI models on a laptop.

💡Meta's Llama 3.1

Meta's Llama 3.1 is an AI language model developed by Meta (formerly Facebook). It is mentioned in the video as an example of a mid-sized model that strikes a good balance between size, compute requirements, and intelligence. It is used to demonstrate how increasingly capable AI models can run locally on consumer hardware.

💡Proprietary Model

A proprietary model is software or an algorithm developed by a company that keeps it secret and fully under its control. In the video, proprietary models like ChatGPT are contrasted with open models like Llama 3.1 in terms of privacy, modifiability, and access to your data.

💡Right to Repair

The right to repair is a movement aimed at letting consumers repair their own products rather than having to discard them and buy new ones. In the video, it is used as the complex question posed to the AI model, illustrating the model's ability to handle topics of public interest.

💡Matrix Multiplication

Matrix multiplication is a common mathematical operation in computing and the core calculation inside artificial neural networks. In the video, it is cited as one of the reasons GPUs are effective at running AI models locally: they are built to perform these operations very quickly. A small illustration follows below.
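
As a tiny illustration of the point above, a single neural-network layer is essentially one matrix multiplication, which is exactly the bulk operation GPUs are built for. A minimal NumPy sketch, with sizes chosen arbitrarily for illustration:

```python
# One "layer" of a neural network boils down to a matrix multiply plus a bias.
# GPUs run huge batches of exactly this, which is why they suit ML so well.
import numpy as np

batch, d_in, d_out = 32, 4096, 4096          # arbitrary illustrative sizes
x = np.random.randn(batch, d_in)             # activations coming in
W = np.random.randn(d_in, d_out)             # the layer's weights ("parameters")
b = np.zeros(d_out)                          # bias vector

y = x @ W + b                                # the matrix multiplication itself
print(y.shape)                               # (32, 4096): activations going out
```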

💡Open Source

'Open source' refers to software whose source code is freely available, allowing anyone to modify and improve it. In the video, openness is presented as an advantage for AI models, offering customization, modification, and full control over your data and software.

💡Chatbot Arena

The Chatbot Arena is a competition mentioned in the video that evaluates AI language models based on their answers to questions. It serves to compare the performance of different models, including open models like Llama 3.1, against the large proprietary ones.

Highlights

Nirav Patel, founder and CEO of Framework, discusses AI and its practical applications.

AI hype often leads to noise and confusion, so Framework focuses on real, workable use cases.

Demonstration of running Meta's Llama 3.1 model locally on a Framework laptop to answer complex questions.

Comparison between large, proprietary AI models like ChatGPT and smaller, open models.

Advantages of running AI models locally include data privacy and the ability to modify and inspect the model.

Emergence of mid-sized AI models that are smart enough to be useful and can run on consumer GPUs.

Llama 3.1's performance in the Chatbot Arena, competing with proprietary models.

Parameters define a model's capability, with more parameters generally producing smarter models.

GPUs are well suited to running AI models due to their memory bandwidth and matrix multiplication capabilities.

Practical demonstration of asking Llama 3.1 about the best burritos in San Francisco.

Exploration of large language models beyond text, including models with vision support.

Demonstration of image analysis using the llava-llama3 model to describe the contents of an image.

Knowledge cutoff of AI models and their reliance on training data for information.

Potential future improvements in AI models' knowledge and capabilities.

Introduction to other AI applications like image generation using models such as Stable Diffusion and Flux.

The rapid advancement of AI and machine learning technology and its impact on consumer electronics.

Framework's commitment to discussing AI and machine learning as the technology evolves.

The model live-generates a script for a YouTube video explaining how to run large language models locally on a laptop.

Transcripts

[00:00] I'm Nirav Patel, I'm the founder and CEO of Framework, and today we're going to talk about everyone's favorite topic: AI. We haven't talked a lot about AI, and part of the reason for that is there's just a ton of hype around it, and when there's hype around a topic there tends to be a lot of noise and BS in general. We try to avoid that. We try to focus on what's real, we try to focus on what works. So when we talk about AI, we're going to avoid the hypey use cases and really just talk through what works today on real, running hardware.

[Music]

[00:35] Let's talk about what AI is in the context of a Framework Laptop. One thing I want to call out from the start is that I'm not an AI expert, I'm not a machine learning expert, so I'm really approaching this from the perspective of a hobbyist. Before we go deep into what I've learned so far, we're actually just going to run something. We're going to open up Ollama and run Meta's Llama 3.1 model, and we're just going to ask it a question to show what AIs can actually do running locally on a Framework Laptop, or really any consumer laptop with a discrete GPU. So we're just going to ask it: what's important about the right to repair? An important question, a very important question. And you can see it's just dumping out an answer pretty quickly. Quickly reading through this: it's an issue that's gained significant attention in recent years, environmental impact, cost savings, increased product lifespan. Some of these are a little iffier: job creation, economic growth. Obviously we're hiring people, so maybe we've created some jobs with this, that's great. Product design innovation. So this is actually a pretty good answer. It's pretty fast at dumping this out, and it's a pretty smart answer, probably better than what you'd get in a few minutes of Googling around, and it's nicely summarized. So just as a starting point, this model running locally on a Framework Laptop 16 was able to answer quite a complex question with a reasonable answer, pretty quickly. That's the starting point.

[02:02] So let's talk about what that just was, what just happened here running locally on this computer, and how it compares to some of the other AIs that you've seen or used in the past, like ChatGPT. The thing about ChatGPT, of course, is that it's this enormous proprietary model that OpenAI has developed. There are similar models like Claude, and there are others from Google and xAI and other companies. The key thing is that these are enormous models running on very expensive servers in data centers, and they are proprietary and inaccessible in the sense that if you wanted to modify one, or inspect what's happening under the hood, you actually can't. If you've got concerns around privacy or security, you basically have to trust that those companies are doing the right things with the queries and data you're providing. So obviously the advantage of being able to run a model locally is that you have complete control. You can trust that your data is not leaving your machine, you can modify the model, you can inspect what's happening under the hood. It's all open and available to you. The trade-off, of course, is whether it's smart enough.

[03:04] As we look at those giant models like ChatGPT, they have hundreds of billions of parameters, and they're running on multi-million dollar machines filled with Nvidia GPUs that oftentimes you can't even get access to. Versus, let's say, the types of models that have been getting hype in the PC space: things running on the tiny bit of silicon area dedicated to NPUs on recent-generation processors. There's this massive gulf between those two ends, the giant models like ChatGPT on one side and the tiny little models that run on 40 to 50 TOPS on your processor on the other. But the cool thing is that there's a middle ground starting to emerge, where models that fit on consumer-level GPUs are actually getting smart enough to be useful. The model we just ran here was Meta's latest and greatest, called Llama 3.1, and they actually have multiple sizes of it, ranging from too big to reasonably run on a laptop to perfectly sized to run in 8 gigs of graphics memory, which is exactly what we have here with the Radeon RX 7700S in the Framework Laptop 16.

[04:16] So we're actually just going to look at some of these models. This is a site called LMArena. There's a competition called the Chatbot Arena, which is a global competition between both proprietary and open large language models, basically to see which one is the smartest, which one delivers the best responses to questions. You can see the stack ranking that's been generated over the course of a few years now. Obviously the top models, the ones giving the best answers, are these very, very large proprietary models that have the most data going into them, the largest number of parameters, the largest model size overall, and that are closed, of course. But the cool thing is that as you scroll down just a little bit, right there in the top 10 we've got an open model, this model called Llama 3.1 that Meta has been investing in over a few generations now. The exciting thing is that it is open: it's available under an open license, and you can actually download the entire model locally and play with it. The license, admittedly, is a little bit tricky. It is an open model in the sense that it's open and accessible, but there are some restrictions, primarily for the sake of safety, and we're not going to address that in this video.

[05:30] One thing to call out here is that as you look down this ranking, you're going from giant models with hundreds of billions of parameters, and as you scroll down the models start to get a bit smaller. You've got Llama 3 70B, and you've got Llama 3 8B, which is the one we just ran, and which is still in the top 50. And if you look at some of the models around it, these were best-in-class proprietary models just a generation ago. So this frontier in AI and machine learning is moving incredibly quickly: open models small enough to run on a consumer laptop are now competitive with the most advanced models in the world, models that were closed and proprietary just a generation ago. Which is just such a cool place to be, and part of why machine learning and AI genuinely are interesting, and why there is real substance beyond the hype.

[06:25] I mentioned parameters a few times, so just to explain a bit about what they are: parameters are basically the weights, the number of items inside the model, and as a rough approximation, the more parameters there are, the smarter the model can be. That's why Meta offers three different model sizes: the biggest model is the smartest, and capability goes down as you shrink the model. Normally you'd store, let's say, 16 bits per parameter, two bytes per parameter, so for this 8 billion parameter model you'd need a 16 GB chunk of memory to run it. You can, though, shrink the bit depth down to 8-bit or 6-bit or even smaller without substantially making the model dumber. So oftentimes I'll run a model at 8-bit or even 6-bit just to make it fit, and that way we can take these 8 billion parameter models and fit them in 8 gigs of video memory, which is cool.

[07:25] One other thing to call out: when we ran that demo asking a question (so I'll just say "tell me more"), you can see that it answers pretty quickly. One thing that makes GPUs so effective at running these models locally, versus something like an NPU running within the silicon area of your CPU or APU, is that there's actually quite a lot of memory bandwidth and a lot of matrix multiplication capability and throughput inside a GPU. They're crunching polygons, they're crunching shaders, things that are just matrix multiplication, which very conveniently is largely what machine learning is. So what made GPUs perfect for gaming has translated really, really well to machine learning. In general, these 8 billion parameter models, because of that matrix multiplication throughput and the memory bandwidth that's available, can output answers basically faster than you can read them. For general conversational speed you'd need about five to seven words per second, or tokens per second, for it not to feel sluggish, and with these models running on the RX 7700S like we have here, you can get up to about 30 to 35 tokens per second, which is basically faster than you can read. So it's really quite usable while still delivering answers that are pretty smart. We've entered this kind of cool sweet spot here.
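
The 30 to 35 tokens per second figure quoted above is easy to check yourself. Here is a minimal sketch that streams a reply from a local model and measures throughput, again assuming the `ollama` Python client and a pulled `llama3.1` model; each streamed chunk is roughly one token, so the count is approximate.

```python
# Sketch: measure local generation speed in (approximate) tokens per second.
import time
import ollama

start, chunks = time.time(), 0
for part in ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "What's important about the right to repair?"}],
    stream=True,  # yield the answer incrementally instead of all at once
):
    print(part["message"]["content"], end="", flush=True)
    chunks += 1  # roughly one token per streamed chunk

elapsed = time.time() - start
print(f"\n~{chunks / elapsed:.1f} tokens/sec")  # compare with the ~30-35 from the video
```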

[08:51] Going back to focusing on real use cases instead of hype, the cool thing here is that we're really just approaching this like we would any other piece of software. You can choose what you want to run. This isn't like Copilot, baked into your PC with no control over it; this is literally open source software. The specific tool I've been playing with here, showing these demos, is a tool called Ollama, from a group also called Ollama, and it's probably the easiest way to get up and running with large language models. Actually, that's literally the one sentence on their homepage: "Get up and running with large language models." So I just hit the download button, installed it, it popped open Windows Terminal, and then I just ran "ollama run". The cool thing is that they've made all of these open models super, super simple to run. You can go to their website, open the models list, and see Llama 3.1, Gemma 2 from Google, Mistral (which is a big AI company, I think based out of France), DeepSeek Coder, CodeGemma, all these different models that have different specializations. And you can just run any one of them, so you can just say "run Gemma 2". The cool thing here is that it doesn't take any programming knowledge, it doesn't take really deep technical knowledge of any kind; it's literally just you interacting in text with the model. But Ollama, and a lot of the open source tools around it, are flexible enough that if you want to go deeper as a developer or tinkerer, you can go in and write Python interfaces, or other interfaces, to download data, have the model interact with the internet, or interact with data on your machine. Again, all of that is under your control, in a way where you can see the code or write the code, modify the model if you want to, and have complete ownership over it, without having to wonder what's happening with your data out in the cloud. So there's really cool stuff happening locally these days.
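
As one concrete instance of the "write Python interfaces ... interact with data on your machine" idea above, here is a minimal sketch that feeds a local file to a local model. The file name `notes.txt` is hypothetical, and the `ollama` client is assumed as before; nothing here touches the network beyond the local Ollama daemon.

```python
# Sketch: have a locally running model summarize a file on your machine.
import ollama

with open("notes.txt") as f:  # hypothetical local file
    text = f.read()

reply = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": f"Summarize this in three bullets:\n\n{text}"}],
)
print(reply["message"]["content"])  # summary generated without leaving the machine
```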

[10:54] So what else can you do with large language models? I've been treating it a bit like Google, but you can go a bit deeper; you can actually be friends with the virtual friend in your Framework Laptop. So we can pick a couple of other examples. Maybe we ask Llama 3.1: what's the best burrito in San Francisco? And just see. Yeah, it's dumping out this answer super quickly. I live in San Francisco, I love burritos, and I have to say these are pretty good answers, actually. These are good burritos. The stack ranking, of course, is controversial, but the specific few that it picks are good, good burritos.

[11:30] We've been focused on large language models so far, which are basically models you can feed text into; they'll crunch it and output an answer back as text. But as we look at machine learning and AI, some of the coolest stuff happening goes outside of text, outside of LLMs, and into other types of models. One of the cool things about Ollama is that there's actually vision support: you can load a model that you can feed an image, the model will parse the image and understand what's happening in it, and then you can interact with the model, ask it questions, interrogate it about the image, or have a conversation based on the image. So I'm going to download an image (I just picked this one off of our website), it's now downloaded, and I'm going to run this model called LLaVA. Actually, I'm going to run a variant of it called llava-llama3, which uses Meta's Llama 3 in combination with LLaVA to give some pretty good results. I'm just going to feed it this image, conveniently called image.png, and ask it what's in this image, and we'll see how smart it is. So: "In this image, a young woman is engrossed in her work on a silver..." a silver what? A silver MacBook laptop? Hold on. That's not a MacBook. Other than that, it was pretty accurate: behind her there's a window that offers a serene view of lush green plants outside, in addition to the ambience... the woman seems to be enjoying her work, and I think she probably was. Wow, this is even worse: I told it that's not a MacBook, and now it thinks it's an Acer Aspire or a Toshiba Satellite. I don't even know what those are.
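
For reference, the image demo above maps to a few lines in the same Python client: Ollama's vision-capable models accept an `images` list alongside the prompt. A minimal sketch, assuming `llava-llama3` has been pulled and `image.png` sits in the working directory:

```python
# Sketch: ask a local vision model what's in an image (the demo from the video).
import ollama

response = ollama.chat(
    model="llava-llama3",  # LLaVA combined with Meta's Llama 3
    messages=[{
        "role": "user",
        "content": "What's in this image?",
        "images": ["image.png"],  # path to the local image file
    }],
)
print(response["message"]["content"])
```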

[13:19] So one key factor here is that there is actually a knowledge cutoff, because this model is not touching the internet. It's running entirely locally, so it only has knowledge of whatever was baked in at the point in time when it was fed the data used to train it. Obviously that model has no knowledge of what Framework is as a company; it might even have been trained before Framework existed as a company. So unfortunately we can't ask it about the Framework Laptop; we can only ask it about Acer Aspires, Toshiba Satellites, or MacBooks. It's okay though, I'm not offended by llava-llama3. In the future, of course, these models will know a bit more about the right to repair, and hopefully a bit more about Framework.

[14:00] There's other cool stuff you can do too, like image generation. There are models called Stable Diffusion, and a newer one called Flux, which are open image generation models. They're quite a bit more complex to set up, so we're not going to play with them today, but they've actually gotten a lot better on AMD. Historically this space has been largely Nvidia-focused in terms of what has been open to run locally on a laptop or desktop, but AMD has been closing the gap really, really quickly with their ROCm framework. As you can see, with Ollama it just runs straight out of the box; some of the image generation stuff takes a little more work, but it's getting there pretty quickly.

[14:38] So of course, this was the first time we've talked about AI, the first time we've talked about machine learning, but the state of the art is advancing really quickly, the frontier is just moving incredibly fast, so this is something we're definitely going to be talking more about as we go. "Write me a script for a YouTube video explaining how to run large language models locally on a laptop." Here's the script. Oh wow. So, a 30-second intro: "Hey there, tech enthusiasts, welcome back to the Framework YouTube channel. Today we're going to talk about something that's been making waves in the tech community: running large language models locally on your laptop. Yeah, you heard that right. With the rise of AI and machine learning, having a powerful language model at your fingertips..." oh my god, it's writing faster than I can read, and it's scrolling... "a powerful language model at your fingertips can be super useful for developers, researchers, even hobbyists like ourselves. So let's dive into it." And this is still going. So we've got a 12-minute script that ends with, oh, "closing shot with a friendly smile: thanks for watching, everyone, we'll catch you in the next video."

[Music]


Related Tags: AI, Machine Learning, Language, Laptop, Models, Tutorial, Framework, Local, Development, Tech