Has Generative AI Already Peaked? - Computerphile

Computerphile
9 May 2024 · 12:47

Summary

TLDR: The video discusses the idea that using generative AI to produce new sentences and images, and its ability to understand images and other inputs, could lead to general intelligence. A recent paper challenges this, arguing that the amount of data needed to achieve general zero-shot performance on never-before-seen tasks would be astronomically large, and possibly unattainable. The study examines downstream tasks, such as classification or recommendation, built on CLIP embedding systems, which use large vision transformers and text encoders. The findings suggest that for hard problems and concepts that are under-represented in the training datasets, the model will not be effective unless a massive amount of data is available. This raises a debate about whether a generalist AI can be reached simply by scaling up data and models, or whether a new strategy or approach to machine learning will be needed to improve performance on complex tasks.

Takeaways

  • 📈 The idea behind generative AI models is that with enough image-text pairs, the model will learn to distill what is in an image into that kind of language.
  • 🤖 It has been argued that by adding more and more data, or larger models, we will eventually reach general intelligence, or an extremely effective AI that works across all domains.
  • 🧪 However, science does not hypothesize about what will happen; it justifies claims experimentally, so any claim of continued improvement must be tested empirically.
  • 📉 A recent paper suggests that the amount of data needed to achieve general zero-shot performance (on never-before-seen tasks) is astronomically vast and potentially unattainable.
  • 📚 CLIP embedding models use a shared embedding space so that an image and its describing text map to similar numerical representations, trained across many image-text pairs (see the sketch after this list).
  • 🚀 These techniques are used for downstream tasks such as classification and recommendation, for example in streaming-service recommender systems.
  • 🚧 The paper shows that without massive amounts of data behind them, these downstream tasks cannot be applied effectively to hard problems.
  • 📉 The paper's findings suggest that performance scales logarithmically and flattens out as data increases, indicating a possible saturation point.
  • 🌳 The distribution of classes and concepts within the datasets is not uniform, so some concepts, such as specific tree species, are heavily under-represented.
  • 🛠 Although larger models and human feedback can improve performance, the paper questions whether simply accumulating more data will be enough for hard tasks.
  • ⚖️ The challenge is to find other ways to tackle hard tasks that are under-represented in general internet text and searches, beyond collecting more data.
  • 🔮 Future progress in AI will depend on overcoming the current limits of Transformer-based models and finding more effective machine learning strategies.
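
As a rough illustration of that shared image-text embedding space, here is a minimal sketch assuming the open_clip package and its publicly available "ViT-B-32" checkpoint (neither is named in the video, and the image file is hypothetical): an image and two candidate captions are encoded into the same space and compared by cosine similarity.

```python
# Minimal sketch of the shared image-text embedding idea behind CLIP.
# Assumes the open_clip package and the ViT-B-32 checkpoint are installed.
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)   # hypothetical image file
texts = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(texts)
    # Normalise so a dot product is a cosine similarity in the shared space.
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    similarity = img_emb @ txt_emb.T   # higher value = image and caption match

print(similarity)
```

A matching image-caption pair should score noticeably higher than a mismatched one; that single property is what the downstream classification and recommendation tasks discussed below rely on.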

Q & A

  • What is a CLIP embedding and how does it relate to generative AI?

    -A CLIP embedding is a numerical representation that captures the meaning of an image and a piece of text, learned from image-text pairs. It is used in generative AI to produce new sentences, images and so on, and to understand the relationship between language and images.

  • Why is the idea behind CLIP embeddings that general intelligence will eventually be reached?

    -The idea is that if enough image-text pairs are analysed, the model will learn to distill the essence of an image into corresponding language. With enough images and text, the hope is that the model reaches a level of general intelligence that lets it work effectively across all domains.

  • What does the recent research argue against the possibility of general intelligence through adding more data and bigger models?

    -The research suggests that the amount of data needed to achieve general zero-shot performance (performance on new, never-seen tasks) is astronomically vast, to the point of being unattainable with current resources.

  • How are concepts defined in the study, and how do they relate to downstream task performance?

    -Concepts are defined as simple ideas, such as 'cat' or 'person', or more complex ones, such as a specific species of cat or a specific disease. Around 4,000 different concepts are examined; their prevalence in the datasets is measured, and performance is then tested on downstream tasks such as zero-shot classification or recommender systems (a toy prevalence count is sketched below).
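
To make "prevalence of a concept" concrete, here is a hedged sketch of the kind of counting involved. The captions and concept list are invented for illustration only; the paper's actual frequency estimation is considerably more involved than a substring match.

```python
# Hedged sketch: estimating how often each concept appears in a caption corpus.
# Captions and concepts here are made up for illustration.
from collections import Counter

concepts = ["cat", "dog", "oak tree", "siamese cat"]
captions = [
    "a cat sleeping on a sofa",
    "a dog chasing a ball in the park",
    "a siamese cat sitting by the window",
    "an old oak tree in a field",
    "a cat and a dog playing together",
]

prevalence = Counter()
for caption in captions:
    text = caption.lower()
    for concept in concepts:
        if concept in text:           # naive substring match; real matching is fuzzier
            prevalence[concept] += 1

for concept, count in prevalence.most_common():
    print(f"{concept}: {count} captions")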

  • What does the research find about the relationship between the amount of data and downstream task performance?

    -The research shows that the relationship is neither linear nor exponential but logarithmic: as more data is added, the performance gains become smaller and smaller, until a plateau is reached.
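
To illustrate what a logarithmic trend implies, here is a small sketch that fits performance against log10 of the number of examples. The numbers are invented purely to show the shape; they are not taken from the paper.

```python
# Illustrative only: fitting performance = a + b*log10(n) to made-up points.
import numpy as np

n_examples  = np.array([1e2, 1e3, 1e4, 1e5, 1e6, 1e7])
performance = np.array([0.20, 0.35, 0.46, 0.55, 0.62, 0.68])  # invented values

b, a = np.polyfit(np.log10(n_examples), performance, deg=1)
print(f"performance ~ {a:.2f} + {b:.2f} * log10(n)")
print(f"about {b:.2f} gained per 10x increase in examples")
```

Under such a fit, each fixed improvement costs roughly ten times more data than the previous one, which is the diminishing-returns behaviour the paper reports.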

  • Why could recommender systems like Spotify's or Netflix's benefit from CLIP embeddings?

    -Because CLIP embeddings provide a shared representation space for images and text. Using that representation, such systems could recommend programmes whose embeddings are similar to those of the programmes the user has already watched.
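
As a sketch of the general idea (an assumption about how such a system could work, not how Spotify or Netflix actually implement theirs): embed the catalogue and the user's watch history into one space, then rank unwatched items by cosine similarity to a taste profile.

```python
# Hedged sketch of an embedding-based recommender. Embeddings are random
# stand-ins here; in practice they would come from a CLIP-style encoder.
import numpy as np

rng = np.random.default_rng(0)
catalogue = {title: rng.normal(size=64) for title in
             ["Show A", "Show B", "Show C", "Show D", "Show E"]}
watched = ["Show A", "Show C"]

def normalize(v):
    return v / np.linalg.norm(v)

# Represent the user's taste as the mean of their watched-item embeddings.
profile = normalize(np.mean([catalogue[t] for t in watched], axis=0))

scores = {t: float(normalize(v) @ profile)
          for t, v in catalogue.items() if t not in watched}
for title, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(title, round(score, 3))
```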

  • How does an uneven distribution of classes and concepts in a dataset affect a model's ability to perform hard tasks?

    -The uneven distribution means some concepts are over-represented and others under-represented, so the model performs worse on tasks involving the under-represented concepts, because there is not enough data to train on them.

  • What happens when a large language model is asked about a topic that is under-represented in its training set?

    -The model starts producing less accurate answers and begins to 'hallucinate', that is, to generate information that is not well supported by the training data, degrading its performance.

  • What are the implications of finding that adding more data and bigger models does not significantly improve performance on hard tasks?

    -It implies that to improve performance on hard tasks, new machine learning strategies or new ways of representing data will be needed, going beyond the current limits of Transformer-based models.

  • What does the speaker suggest for improving performance on hard tasks that are under-represented in the datasets?

    -The suggestion is that instead of simply collecting more and more data, we must find other ways of tackling these hard tasks, possibly using more advanced machine learning techniques or different data-modelling strategies.

  • Why might it be inefficient to keep increasing the amount of data and the size of the models to improve performance on specific tasks?

    -Because there is a point of diminishing returns where the cost of adding more data and growing the model outweighs the performance gains, especially for concepts that are under-represented in current datasets.

Outlines

00:00

🤖 Generative AI and its potential for artificial intelligence

The first section discusses using generative AI to create new sentences and images. It explores the idea that by analysing enough image-text pairs, the AI could learn to turn what is in an image into corresponding language. It also questions the belief that by adding more data and bigger models, AI will reach general intelligence. A recent paper is mentioned that argues the opposite: the amount of data needed to achieve general zero-shot performance is astronomically large and possibly unattainable.

05:00

📈 Data analysis and key concepts in AI

The second section focuses on the paper's analysis. Simple concepts are defined and their prevalence in the datasets is measured. Performance on downstream tasks, such as zero-shot classification or recommender systems, is then evaluated as a function of how much data is available for each concept. Plotting the number of training examples against task performance shows that performance tends to level off despite adding more data, suggesting a possible flattening in AI improvement.

10:01

🌐 Difficulties and solutions in data representation for AI

The third section addresses the difficulty of handling objects or concepts that are poorly represented in the training datasets. Examples are given of how AI models underperform when asked to do complex tasks that are not widely represented in the data they were trained on. It argues that improving performance on hard tasks will require new ways of representing data or new machine learning strategies. It also notes that companies with more resources may improve models through human feedback and other methods.

Keywords

💡clip embeddings

CLIP embeddings are vector representations that let AI models relate images to text. The video explains that, by training on many image-text pairs, models learn to 'distill' the information in an image into a linguistic representation. This is fundamental to understanding and generating content in both formats, and is a central topic of the video.

💡generative AI

Generative AI refers to the ability of AI systems to create original content, such as new sentences or images. In the context of the video, it is discussed how generative AI can be used to produce new text and image representations, and how this may lead to a deeper understanding of visual information.

💡Vision Transformer

A Vision Transformer is a type of deep learning model used for image understanding. In the video it is mentioned as part of the CLIP embedding training pipeline, where it processes and encodes the visual information in images.

💡text encoder

A text encoder is the part of a language model that turns text into a numerical representation the model can then use for learning and understanding. The video highlights how the text encoder works together with the Vision Transformer to create a shared representation space for text and images.

💡zero shot performance

Zero-shot performance refers to an AI model's ability to carry out tasks it has never seen before without further training. The video questions whether adding more data and bigger models will significantly improve this kind of performance, which is crucial for generalisation (a toy classification sketch follows).
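
A minimal sketch of the zero-shot classification mechanism, using random stand-in vectors in place of real CLIP embeddings (the prompts and dimensions are arbitrary choices): the predicted class is the text prompt whose embedding is closest to the image embedding.

```python
# Hedged sketch of zero-shot classification with precomputed, L2-normalized
# embeddings. Random vectors stand in for a real CLIP encoder's outputs.
import numpy as np

rng = np.random.default_rng(1)

def unit(v):
    return v / np.linalg.norm(v)

class_prompts = ["a photo of a cat", "a photo of a dog", "a photo of an oak tree"]
text_embs = np.stack([unit(rng.normal(size=64)) for _ in class_prompts])
image_emb = unit(rng.normal(size=64))   # would come from the image encoder

similarities = text_embs @ image_emb    # cosine similarities
predicted = class_prompts[int(np.argmax(similarities))]
print(predicted, similarities.round(3))
```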

💡data set

A data set is an organised collection of data used to train and evaluate machine learning models. The discussion argues that the amount of data needed to reach generalised performance on new tasks is astonishingly large, which is a critical point in AI research.

💡recommender system

A recommender system is an AI application that suggests content based on a user's history or preferences. The video suggests that CLIP embeddings could improve these systems by recommending content whose vector representations are similar to what the user has already watched.

💡classification

Classification is the process of labelling or categorising data items into predefined groups. In the video, classification is one of the downstream tasks CLIP embeddings are applied to, allowing the model to identify and categorise different objects or concepts in images.

💡overfitting

Overfitting occurs when a machine learning model fits its training data too closely, which can lead to worse performance on unseen data. Although not mentioned explicitly in the video, the concept is implicit in the discussion of balancing the amount of data and model complexity.

💡representation learning

Representation learning is the process of teaching an AI model to build internal representations of data that capture relevant, useful information. This concept is central to the video, since the quality of CLIP embeddings depends on the model's ability to learn good representations of images and text.

💡plateau

A plateau in machine learning refers to a point at which a model's performance stops improving despite adding more data or increasing model complexity. The video suggests that AI performance may be approaching such a plateau, which would imply a limit on how much can be gained simply by adding more data (a back-of-the-envelope argument follows).
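
A back-of-the-envelope way to see why logarithmic scaling behaves like a plateau in practice (this derivation is an addition, not an argument made explicitly in the video):

```latex
\[
  p(n) \;\approx\; a + b\log n
  \quad\Longrightarrow\quad
  \frac{dp}{dn} \;\approx\; \frac{b}{n} \;\to\; 0 \quad (n \to \infty),
\]
% so a fixed improvement $\Delta p$ requires multiplying the amount of data
% by the constant factor $e^{\Delta p / b}$: the curve never turns downward,
% but each further gain costs exponentially more examples.
```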

Highlights

Exploration of clip embeddings and their role in understanding the relationship between images and text.

Discussion on the potential of generative AI to produce new sentences and images.

The concept that analyzing pairs of images and text can lead to a distilled representation of an image's content in language.

Argument that with enough training data and a large network, AI could achieve general intelligence across domains.

The importance of experimental justification over hypothetical claims in scientific inquiry.

Recent paper arguing against the idea that simply adding more data and bigger models will solve complex AI tasks.

The paper suggests that achieving general zero-shot performance on new tasks requires an astronomical amount of data.

Introduction of clip embeddings, which use a shared embedded space for images and text to match their meanings.

Potential applications of clip embeddings in classification, image recall, and recommender systems.

The paper's findings that massive amounts of data are needed to effectively apply downstream tasks for difficult problems.

The challenge of classifying specific subcategories like breeds of cats or tree species due to insufficient data.

The paper's experiments on various concepts, models, and downstream tasks, showing a consistent trend.

Evidence suggesting a plateau in performance improvement despite increasing data and model sizes.

The need for alternative strategies beyond Transformers for better performance on underrepresented tasks.

The paper's analysis of the prevalence of different concepts in datasets and their impact on downstream task performance.

The issue of class imbalance within datasets, leading to varied performance on different tasks.

The potential for companies with more resources to improve models through better data and human feedback.

The anticipation of future developments in AI and whether performance will plateau or continue to improve.

Sponsorship message and invitation to participate in programs run by Jane Street, with a link to their website.

Transcripts

00:00

So we looked at CLIP embeddings, right, and we've talked a lot about using generative AI to produce new sentences, to produce new images and so on, and to understand images, all these kinds of different things. And the idea was that if we look at enough pairs of images and text, we will learn to distill what it is in an image into that kind of language. So the idea is you have an image, you have some text, and you can find a representation where they're both the same. The argument has gone that it's only a matter of time before we have so many images that we train on, and such a big network and all this kind of business, that we get this kind of general intelligence, or we get some kind of extremely effective AI that works across all domains. That's the implication, right? The argument is, and you see it a lot in the tech sector from some of these big tech companies who, to be fair, want to sell products, that if you just keep adding more and more data, or bigger and bigger models, or a combination of both, ultimately you will move beyond just recognizing cats and you'll be able to do anything. That's the idea: you show enough cats and dogs and eventually the elephant is just implied.

01:07

As someone who works in science, we don't hypothesize about what happens, we experimentally justify it. So I would say, if you're going to say to me that the only trajectory is up, it's going to be amazing, I would say go on and prove it and do it, and then we'll see. We'll sit here for a couple of years and we'll see what happens. But in the meantime, let's look at this paper, which came out just recently. This paper is saying that that is not true. This paper is saying that the amount of data you will need to get that kind of general zero-shot performance, that is to say performance on new tasks that you've never seen, is going to be astronomically vast, to the point where we cannot do it. That's the idea. So it basically is arguing against the idea that we can just add more data and more models and we'll solve it. Now this is only one paper, and of course your mileage may vary if you have a bigger GPU than these people and so on, but I think this is actual numbers, which is what I like, because I want to see tables of data that show a trend actually happening or not happening. I think that's much more interesting than someone's blog post that says "I think this is what's going to happen."

02:18

So let's talk about what this paper does and why it's interesting. We have CLIP embeddings: we have an image, we have a big Vision Transformer, and we have a big text encoder, which is another Transformer, a bit like the sort you would see in a large language model, which takes text strings, "my text string today", and we have some shared embedded space. And that embedded space is just a numerical fingerprint for the meaning in these two items, and they're trained, remember, across many many images, such that when you put in the same image and the text that describes that image, you get something in the middle that matches. And the idea then is you can use that for other tasks: you can use that for classification, you can use it for image recall. If you use a streaming service like Spotify or Netflix, they have this thing called a recommender system. A recommender system is where you've watched this program, this program, this program, so what should you watch next? And you might have noticed that your mileage may vary on how effective that is, but actually I think they're pretty impressive, given what they have to do. But you could use this for a recommender system, because you could say basically: what programs have I got that embed into the same space as all the things I just watched, and recommend them that way. So there are downstream tasks like classification and recommendations that we could use based on a system like this.

03:28

What this paper is showing is that you cannot apply these downstream tasks effectively for difficult problems without massive amounts of data to back it up. And so the idea that you can apply this kind of classification on hard things, so not just cats and dogs but specific cats and specific dogs, or subspecies of tree, or difficult problems where the answer is more difficult than just the broad category, there isn't enough data on those things to train these models. (Anyway, I've got one of those apps that tells you what specific species a tree is, so is it not just similar to that?) No, because they're just doing classification, or some other problem; they're not using this kind of generative giant AI. The argument has been: why do that silly little problem when you can do a general problem and solve all your problems? And the response is: because it didn't work. That's why we're doing it. So there are pros and cons for both. I'm not going to say that no generative AI is useful, or that these models aren't incredibly effective for what they do, but I'm perhaps suggesting that it may not be reasonable to expect them to do very difficult medical diagnosis, because you haven't got the data set to back that up.

04:44

So how does this paper do this? Well, what they do is they define these core concepts. Some of the concepts are going to be simple ones like a cat or a person; some of them are going to be slightly more difficult, like a specific species of cat, or a specific disease in an image, or something like this. And they come up with about 4,000 different concepts, and these are simple text concepts, these are not complicated philosophical ideas; I don't know how well it embeds those. And what they do is they look at the prevalence of these concepts in these data sets, and then they test how well the downstream task of, let's say, zero-shot classification, or recall, or recommender systems works on all of these different concepts, and they plot that against the amount of data that they had for that specific concept.

05:32

So let's draw a graph and that will help me make it more clear. Let's imagine we have a graph here like this, and this is the number of examples in our training set of a specific concept, so let's say a cat, a dog, something more difficult, and this is the performance on the actual task of, let's say, a recommender system, or recall of an object, or the ability to actually classify it as a cat. Remember we talked about how you could use this for zero-shot classification by just seeing if it embeds to the same place as the text "a picture of a cat", that kind of process. So this is performance. The best-case scenario, if you want to have an all-powerful AI that can solve all the world's problems, is that this line goes very steeply upwards. This is the exciting case. This is the kind of AI-explosion argument that basically says we're on the cusp of something that's about to happen, whatever that may be, where the scale is going to be such that this can just do anything. Okay, then there's the perhaps slightly more reasonable, should we say pragmatic, interpretation, just call it balanced, which is a sort of linear movement. So the idea is that we have to add a lot of examples, but we are going to get a decent performance boost from it, so we just keep adding examples, we'll keep getting better, and that's going to be great. And remember that if we ended up up here, we'd have something that could take any image and tell you exactly what's in it under any circumstance. That's kind of what we're aiming for. And similarly for large language models, this would be something that could write with incredible accuracy on lots of different topics, or for image generation it would be something that could take your prompt and generate a photorealistic image of that with almost no coercion at all. That's kind of the goal.

07:16

This paper has done a lot of experiments on a lot of these concepts, across a lot of models, across a lot of downstream tasks, and let's call this the evidence. What are you going to call it, pessimistic? It is pessimistic, also: it's logarithmic, so it basically goes like this, it flattens out. Now this is just one paper; it doesn't necessarily mean that it will always flatten out. But the argument is, I think, and it's not an argument they necessarily make in the paper, the paper's very reasonable, I'm being a bit more cavalier with my wording, the suggestion is that you can keep adding more examples, you can keep making your models bigger, but we are soon about to hit a plateau where we don't get any better, and it's costing you millions and millions of dollars to train this. At what point do you go: well, that's probably about as good as we're going to get with this technology? And then the argument goes: we need something else, we need something in the Transformer, or some other way of representing data, or some other machine learning strategy, or some other strategy that's better than this in the long term, if we want to have this line go up here, or this line go up here. That's kind of the argument. And so this is essentially evidence, I would argue, against the kind of explosion possibility of "you just add a bit more data and we're on the cusp of something". We might come back here in a couple of years, you know, if you still allow me on Computerphile after this absolute embarrassment of these claims that I made, and say, okay, actually the performance has improved massively. Or we might say we've doubled the number of data sets to 10 billion images and we've got 1% more on the classification, which is good, but is it worth it? I don't know.

08:53

This is a really interesting paper because it's very, very thorough. There's a lot of evidence, there are a lot of curves, and they all look exactly the same: it doesn't matter what method you use, it doesn't matter what data set you train on, it doesn't matter what your downstream task is, the vast majority of them show this kind of problem. And the other problem is that we don't have a nice even distribution of classes and concepts within our data set. So for example cats, you can imagine, are over-emphasized, or over-represented, in the data set by an order of magnitude, whereas specific planes or specific trees are incredibly under-represented, because you just have "tree". I mean, trees are probably going to be less represented than cats anyway, but then specific species of tree are very, very under-represented, which is why when you ask one of these models "what kind of cat is this?" or "what kind of tree is this?" it performs worse than when you ask it "what animal is this?", because that's a much easier problem. And you see the same thing in image generation: if you ask it to draw a picture of something really obvious like a castle, where that comes up a lot in the training set, it can draw you a fantastic castle in the style of Monet and it can do all this other stuff, but if you ask it to draw some obscure artifact from a video game that's barely even made it into the training set, suddenly it's starting to draw something of a little bit less quality. And the same with large language models. This paper isn't about large language models, but the same process you can see actually already happening. If you talk to something like ChatGPT, when you ask it about a really important topic from physics or something like this, it will usually give you a pretty good explanation of that thing, because that's in the training set. But the question is what happens when you ask it about something more difficult, when you ask it to write that code which is actually quite difficult to write, and it starts to make things up, it starts to hallucinate, and it starts to be less accurate. And that is essentially the performance degrading because it's under-represented in the training set.

10:46

The argument, I think, at least it's the argument that I'm starting to come around to thinking, is that if you want performance on hard tasks, tasks that are under-represented in just general internet text and searches, we have to find some other way of doing it than just collecting more and more data, particularly because it's incredibly inefficient to do this. On the other hand, these companies have got a lot more GPUs than me; they're going to train on bigger and bigger corpuses, better quality data, they're going to use human feedback to better train their language models and things, so they may find ways to improve this, you know, up this way a little bit as we go forward. But it's going to be really interesting to see what happens, because will it plateau out? Will we see ChatGPT 7 or 8 or 9 be roughly the same as ChatGPT 4, or will we see another state-of-the-art performance boost every time? I'm kind of trending this way, but you know, it'll be exciting to see if it goes this way.

11:37

Take a look at this puzzle devised by today's episode sponsor, Jane Street. It's called Bug Byte, inspired by debugging code, that world we're all too familiar with, where solving one problem might lead to a whole chain of others. We'll link to the puzzle in the video description; let me know how you get on. And speaking of Jane Street, we're also going to link to some programs that they're running at the moment. These events are all expenses paid and give a little taste of the tech and problem solving used at trading firms like Jane Street. Are you curious? Are you a problem solver? Are you into computers? I think maybe you are. If so, well, you may well be eligible to apply for one of these programs. Check out the links below, or visit the Jane Street website and follow these links. There are some deadlines coming up for ones you might want to look at, and there are always more on the horizon. Our thanks to Jane Street for running great programs like this and also supporting our channel, and don't forget to check out that Bug Byte puzzle.


Related Tags
Artificial Intelligence, Generative AI, Data Models, Image Classification, Natural Language Processing, AI Development, AI Technology, AI Studies, Content Recommendations, Machine Learning, Class Distribution