GPT-4o, AI overviews and our multimodal future

Mixture of Experts
17 May 2024 · 41:04

Summary

TL;DR: This episode of 'Mixture of Experts', hosted by Tim Hwang, brings together AI experts to debate the week's biggest developments. They discuss two showcase demos from Google and OpenAI of multimodal models that let users get real-time answers through their phone's camera. They explore what these technologies mean for enterprise and consumer use, as well as the importance of latency and cost in building AI agents. They also examine Google's announcement of AI overviews in search results and its impact on the market and the web economy, raising questions about the future of search and of the web as a dynamic, healthy platform.

Takeaways

  • 🤖 The importance of multimodality: Companies are focused on AI models that can take video input and make sense of it, which could drastically change how AI is used in the future.
  • 🚀 The impact of latency and cost: Cheaper, faster AI models have the potential to dramatically affect downstream applications of AI.
  • 🔍 The introduction of AI overviews in Google Search: Google announced that it will begin showing AI-generated summaries in search results, a significant change in the user experience.
  • 🛠️ Applying AI to enterprise workflows: AI is starting to be integrated into business processes, which can improve efficiency and decision-making.
  • 👓 A vision of a multimodal-device future: The panel discusses how devices like Google Glass could one day enable more fluid, efficient interaction with AI.
  • 📱 Competition between AI-assistant UX and traditional apps: Interacting with an AI can be more satisfying than navigating through dozens of buttons in an app.
  • 💬 Lower latency enables more human conversations: Low latency is essential for interactions with AI to feel natural and fluid.
  • 🔑 Personalization of AI-powered search results: AI could enable more effective personalization of search results, improving the user experience.
  • 🌐 How AI overviews may transform the web: The panel reflects on how integrating AI summaries into Google's search results could change the way we interact with the web.
  • 📈 The potential commoditization of search: Search may become a more generic, commodity service accessed through AI assistants rather than traditional browser-based search.

Q & A

  • Who is the host of the show Mixture of Experts?

    -The host of Mixture of Experts is Tim Hwang.

  • What kinds of guests appear on Mixture of Experts?

    -The show brings together a team of researchers, product experts, engineers, and more.

  • Which companies were discussed in the episode in connection with their recent announcements?

    -The companies discussed in the episode are OpenAI and Google, in connection with their recent announcements in the field of artificial intelligence.

  • What are the three main themes discussed in the episode around the Google and OpenAI announcements?

    -The three main themes are multimodality, latency and cost, and Google's announcement of AI overviews in search results.

  • What is multimodality and how does it relate to AI models?

    -Multimodality refers to the ability of AI models to take different types of input, such as video, and to understand and process that information effectively.

  • Why is reducing latency and cost in AI models significant for downstream uses of AI?

    -Lower latency and cost can have a big impact on downstream uses of AI, because they make models more accessible and let them be integrated into faster, more efficient applications.

  • What significant change did Google announce regarding AI overviews in search results?

    -Google announced that Google Search users will start seeing AI summaries at the top of their search results, a significant change in the user experience.

  • What is Gemini and how does it relate to Google's demo?

    -Gemini is an AI model Google presented in its demo, showing the ability of AI to interact in real time with a mobile device's camera.

  • How could multimodal technology impact enterprise workflows, according to Shobhit Varshney?

    -According to Shobhit Varshney, multimodal technology could significantly impact enterprise workflows by turning phones into extensions of our senses, making it easier to automate tasks and improving efficiency in processes such as planogram audits.

  • What is a planogram and how does it relate to the enterprise application of AI discussed in the episode?

    -A planogram is a visual representation of how products should be placed on a store's shelves. AI can help automate the planogram audit process by comparing images of the shelves against the correct product positions (see the sketch after this list).

  • What changes to the user experience are expected from AI overviews in Google search results?

    -Significant changes to the user experience are expected: instead of having to navigate through multiple search results, users will be able to receive direct, personalized answers to their queries in Google's search results.
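
To make that planogram audit concrete, here is a minimal sketch in Python, assuming a vision model has already turned a shelf photo into per-slot detections; the planogram layout, product names, and detection data are invented for illustration:

```python
# Compare detected shelf contents against an expected planogram layout.
PLANOGRAM = {  # expected layout: (shelf, slot) -> product id
    (0, 0): "cola-12oz",
    (0, 1): "cola-12oz",
    (1, 0): "chips-bbq",
    (1, 1): "chips-salt",
}

def audit(detections: dict) -> list:
    """Return a list of human-readable planogram violations."""
    issues = []
    for pos, expected in PLANOGRAM.items():
        found = detections.get(pos)
        if found is None:
            issues.append(f"{pos}: missing {expected}")
        elif found != expected:
            issues.append(f"{pos}: expected {expected}, found {found}")
    return issues

# Hypothetical output of a vision model run on one shelf photo.
detections = {(0, 0): "cola-12oz", (1, 0): "chips-salt", (1, 1): "chips-salt"}
for issue in audit(detections):
    print(issue)
```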

Outlines

00:00

🤖 Introduction to Mixture of Experts

The episode opens with host Tim Hwang introducing 'Mixture of Experts', which each week brings together a team of AI experts to debate the biggest developments of the week. This episode focuses on the rivalry between OpenAI and Google, discussing their recent announcements and the impact on the industry. The panel includes Shobhit Varshney, a senior consulting partner for AI; Chris Hay, a distinguished engineer and CTO of Customer Transformation; and Bryan Casey, Director of Digital Marketing, who join to analyze the key topics of the week.

05:04

🔍 Multimodality and its impact on the future

This segment covers multimodality, with emphasis on how both tech giants, Google and OpenAI, are working on models that can handle multiple input types, such as video and images, and produce outputs that combine text, audio, and other modalities. The panel notes that this technology is not just impressive; it can unlock new products and significantly change workflows in daily life and in business, such as automating enterprise processes, improving efficiency and the interaction with end users.

10:08

🛠️ Enterprise applications of multimodal technology

The panel explores enterprise applications of multimodal technology, highlighting how AI models that can interpret and respond to video data in real time could revolutionize specific tasks and workflows. In the past, automation required codifying processes and documenting them in detail; with recent advances in AI, a more dynamic and less structured interaction is possible, allowing in-flight tasks to be interrupted and adapted. The discussion also covers retail scenarios and how the technology can support decision-making and in-store product audits.

15:09

🕒 The importance of latency and cost in AI development

This section addresses latency and the costs associated with AI models. Models that respond quickly are key to fluid interactions and natural conversations with users. Larger models, while capable of deeper reasoning, have higher latency, whereas smaller, faster models may not reason as well. The panel debates how to balance these factors for effective deployment of the technology in practical applications.

20:11

📈 Enterprise strategies for adopting AI models

The panel discusses strategies enterprises can use to integrate AI models into their operations. Rather than completely replacing existing workflows, specific steps can be improved using smaller, specialized AI models. Evaluating factors such as latency, cost, and security is important when selecting the right AI model for a given task, and a clear understanding of the processes and expected outcomes is essential for a successful deployment.

25:14

🎤 A marketing perspective on interacting with AI

From a marketing perspective, the panel looks at how AI technology is changing user experience and expectations around interaction. Latency matters: a fast response from an AI system can significantly improve the user experience, making interaction with a virtual agent more acceptable and effective. The ability to converse with an AI and receive immediate answers could be a key factor in market adoption of this technology.

30:17

🌐 The evolution of search with the introduction of AI-generated overviews

The panel debates Google's announcement of AI-generated overviews in search results. This move could change the economics of the web and the way users interact with information online. The discussion considers the impact on SEO and on how users will find and consume content in the future, suggesting that AI overviews could offer a richer, more personalized experience than traditional search.

35:17

📚 Final thoughts on AI and search

In the final segment, the panelists share their closing thoughts on how AI and virtual assistants will change the way we interact with search and with information in general. Search may become more of a commodity, with assistant interactions becoming the new way to get information. They highlight the importance of personalization and of assistants' ability to understand and meet users' needs more efficiently than traditional search engines.

Keywords

💡Mixture of Experts

Mixture of Experts is the name of the show on which these AI topics are discussed. It is a forum where experts from different fields come together to debate and analyze the latest news and trends in AI. In the transcript, it is the vehicle for covering topics such as the Google and OpenAI announcements and their impact on the industry.

💡OpenAI

OpenAI is an artificial intelligence research and development organization mentioned in the transcript as one side of the week's 'showdown'. The panel discusses its announcements and how it is competing or collaborating with Google in the AI field, highlighting its 4o model and its ability to process multiple input modalities.

💡Google

Google is one of the leading technology companies and appears in the transcript as the other half of the 'showdown' with OpenAI. The panel discusses its recent announcements, how its Gemini model compares with OpenAI's models, and its focus on reducing latency and cost in its AI models.

💡multimodality

Multimodality refers to the ability of AI models to handle and understand multiple types of input, such as video, audio, and text. The transcript discusses how both Google and OpenAI are focused on improving multimodality in their models to enable richer, more dynamic interaction with users.

💡latency

Latency is the time a system takes to respond to an input or request. The transcript highlights the importance of reducing latency in AI models to enable more fluid and natural interactions, which is crucial for adoption in both end-user and enterprise settings.

💡costs

The costs of compute and of using AI models are an important theme in the transcript. Companies are working on smaller, more efficient models to reduce the cost per token or per interaction, which can have a significant impact on how the technology is deployed and adopted.

💡AI summaries

AI summaries are a feature announced by Google whereby Google Search users will start seeing AI-generated summaries in their search results. The transcript questions whether this is a good thing and explores the impact it could have on the user experience and on the economics of the web.

💡enterprise

The term 'enterprise' is used in the transcript to refer to applications of AI in the business sector. The panel discusses examples of how improvements in AI technology can optimize business processes, such as automating workflows and improving accuracy on specific tasks.

💡consumer

The 'consumer' context is contrasted with 'enterprise' in the transcript when discussing uses and applications of AI. The panel covers how multimodal technology can affect both end consumers and businesses, and asks where the greater impact or earlier adoption might come.

💡search engine

Search engines are a central theme of the transcript, especially in relation to Google's announcement of AI overviews. The panel reflects on how this change could affect the way users interact with the web and whether it could displace the need for a traditional search engine.

Highlights

Mixture of Experts podcast discusses the latest AI news and developments.

OpenAI and Google's recent announcements indicate a focus on multimodal AI models that can process video inputs.

Debate on the potential of AI to reduce latency and costs, making AI more accessible and faster.

Google's introduction of AI summaries in search results represents a significant shift in user interaction with the web.

Shobhit Varshney emphasizes the impact of AI advancements on enterprise workflows, particularly in knowledge transfer.

Chris Hay discusses the potential of multimodal AI in transforming consumer and enterprise interactions.

Bryan Casey brings a marketer's perspective on the importance of AI's ability to provide real-time responses.

The conversation explores the balance between model size, reasoning ability, and response latency.

Enterprises are focusing on applying AI at the sub-task level to improve specific steps in a workflow.

Discussion on the potential of AI to automate complex decision-making processes, such as planning events or making purchases.

Bryan Casey argues that Google's integration of AI overviews is crucial for maintaining the web's dynamism and health.

Shobhit Varshney predicts a future where AI can hyper-personalize search results based on individual user data.

Chris Hay envisions a future where search becomes a commodity, and AI assistants handle user queries.

The panelists agree that AI's impact on search and user experience will be significant and disruptive.

Final thoughts on the importance of feedback mechanisms in AI to improve personalization and user experience.

Transcripts

play00:09

Hello and welcome to Mixture of Experts.

play00:11

I'm your host, Tim Hwang.

play00:12

Each week, Mixture of Experts brings together a world class team of

play00:15

researchers, product experts, engineers, uh, and more to debate and distill down

play00:20

the biggest news of the week in AI.

play00:22

Today on the show, the OpenAI and Google showdown of the week.

play00:25

Who's up, who's down, who's cool, who's cringe?

play00:28

What matters, and what was just hype?

play00:30

We're going to talk about the huge wave of announcements coming out of

play00:33

both companies this week, and what it means for the industry as a whole.

play00:37

So, for panelists today on the show, I'm ably supported by an incredible

play00:40

panel, uh, two veterans who have joined the show before, and a new,

play00:44

uh, contestant has joined the ring.

play00:47

Um, so, first off, uh, Shobhit Varshney, he's the Senior Partner Consulting

play00:51

for AI in US, Canada, and LATAM.

play00:54

Shobhit, welcome back to the show.

play00:55

Thanks for having me back, Tim.

play00:56

Love this.

play00:57

Yeah, definitely.

play00:58

Glad to have you here.

play00:59

Chris Hay, who is a distinguished engineer and the CTO of Customer Transformation.

play01:03

Chris, welcome back.

play01:05

Hey, nice to be back.

play01:06

Yeah, glad to have you back.

play01:08

And joining us for the first time is Bryan Casey, who is the Director

play01:11

of Digital Marketing, who has promised a 90 minute monologue.

play01:15

Uh, on AI and search summaries, which I don't know if we're going to get to,

play01:18

but we're going to have him have a say.

play01:19

Bryan, welcome to the show.

play01:21

We'll have to suffer through Shobhit and Chris for a little bit, and

play01:23

then we'll get to the monologue.

play01:24

Thanks for having me.

play01:25

The good stuff.

play01:26

Yeah, exactly.

play01:28

Um, well, great.

play01:29

Well, let's just go ahead and jump right into it.

play01:31

So obviously there were a huge number of announcements this week.

play01:35

OpenAI came out of the gate with its kind of raft of announcements.

play01:39

Uh, Google I.O. is going on and they did their set of announcements.

play01:43

And so, really, more things, I think, were debuted, promised, coming

play01:47

out, than we're going to have the chance to cover on this episode.

play01:50

But sort of from my point of view, and I think I wanted to use this as a way

play01:53

of organizing the episode, there were kind of three big themes coming out

play01:57

of Google and OpenAI this week that we'll sort of take in turn and use

play02:01

to kind of make sense of everything.

play02:03

So I think the first thing is multimodality, right?

play02:05

Both companies are sort of obsessed with their models taking video input and being

play02:10

able to make sense of it and going from, you know, image to audio, text to audio.

play02:15

Um, and I want to talk a little bit about that.

play02:17

Second thing is latency and costs, right?

play02:19

Everybody touted the fact that their models are going to be cheaper, and

play02:22

they're going to be way faster, right?

play02:24

And, you know, I think if you're from the outside, you might say, well,

play02:27

it's kind of a difference in kind.

play02:28

Things get faster and cheaper.

play02:29

But I think what's happening here really potentially might have a huge

play02:32

impact on downstream uses, uh, of AI.

play02:35

And so I want to talk a little bit about that dimension and sort of what it means.

play02:39

Um, and then finally, uh, I've already kind of previewed a little bit.

play02:43

Um, Google made this big announcement that I think is almost literally

play02:46

going to be like many people's very first experience with LLMs in full

play02:51

production, uh, Google basically announced that going forwards, uh, the U.S. market and then globally, uh, those users of Google search will start

play02:59

seeing AI summaries at the top of each of their sort of search results.

play03:03

Um, that's a huge change.

play03:04

We're gonna talk a little bit about what that means and, um, if it's good.

play03:08

I think is a really good question, uh, so looking forward to diving into it all.

play03:17

So let's talk a little bit about multi modal first.

play03:21

So there's two showcase demos from Google and OpenAI, and I think both of them

play03:25

kind of roughly got at the same thing, which is that in the future you're going

play03:28

to open up your phone, you're going to turn on your camera, and then you can

play03:31

wave your camera around, and your AI will basically be responding in real time.

play03:36

And so, Shobhit, I want to bring you in because you were the one who

play03:38

kind of flagged this being like, we should really talk about this, because

play03:41

I think the big question that I'm sort of left with is like, you know,

play03:44

where do we think this is all going?

play03:45

Right?

play03:45

It's a really cool feature, but like, what kind of products do we

play03:48

think it's really going to unlock?

play03:49

And maybe we'll start there, but I'm sure, I mean, this topic goes

play03:52

into all different places, so I'll give you the floor to start.

play03:54

Tim, Monday and Tuesday were just phenomenal inflection points

play03:57

for the industry altogether.

play03:59

It's getting to a point where an AI can make sense of all

play04:03

these different modalities.

play04:04

It's an insanely tough problem.

play04:06

We've been at this for a while and we've not gotten it right.

play04:09

We spent all this time trying to create pipelines to do each of these speech to

play04:12

text and understand and then text to back.

play04:14

It takes a while to get all of the processing done.

play04:17

The 2024 we were able to do this, what a time to be alive man.

play04:22

I just feel that we are getting, finally getting to a point where.

play04:25

Your phone becomes an extension of your eyes, of your listening

play04:28

in and stuff like that, right?

play04:30

And that is a, that has a profound impact on some of the

play04:33

workflows in our daily lives.

play04:34

Now, within IBM, I focus a lot more on enterprises.

play04:37

So I'll give you more of an enterprise view of how these

play04:40

technologies are actually going to make a, make a difference or not.

play04:44

In both cases, Gemini and OpenAI's 4o. And by the way, in my case, the 'o' in 4o does not stand for Omni. For me, 4o means, oh my God, it is really, really that good.

play04:55

So, um, we're getting to a point where there are certain workflows

play04:59

that we do with enterprises, like you are looking at transferring

play05:03

knowledge from one person to the other.

play05:04

And usually you're looking at a screen and you have a bunch of here's

play05:07

what I did, how I solved for it.

play05:09

We used to spend a lot of time trying to capture all of that

play05:11

and what happened in the desktop.

play05:13

Classic BPO processes, these are billions of dollars of work that happens, right?

play05:17

Yeah, and I think if I could pause you there, like I'm curious if you can

play05:20

explain, because again, this is not my world, I'm sure a lot of listeners

play05:22

aren't, it isn't their world as well.

play05:24

How did it used to be done?

play05:25

Right?

play05:26

Like, so if you're, you're trying to like automate a bunch of these

play05:28

workflows, is it just people writing scripts for every single task?

play05:31

Or like, I'm just kind of curious about what it looks like.

play05:33

Yeah.

play05:33

So Tim, let's, let's pick a more concrete example.

play05:36

Uh, say you have outsourcing a particular piece of work and your finance documents

play05:40

coming in, you're comparing it against other things, you're finding errors,

play05:43

you're going to go back and send an email, things of that nature, right?

play05:46

So we used to spend a lot of time documenting the current process.

play05:50

And then we look at that 7...20...9 step process and say I'm going to call an

play05:54

API, I'm going to write some scripts, and all kinds of issues used to happen

play05:57

along the way, unhappy paths and so forth.

play05:59

So the whole process used to be codified in some level of code, and

play06:04

then it's deterministic, it does one thing in a particular flow really well.

play06:07

And you can't interrupt it, you can't just barge in and say, no, no, no, this is not

play06:10

what I wanted, can you do something else?

play06:12

So we're now finally getting to a point where that knowledge work, that work

play06:16

that is to get done in a process, that'll start getting automated significantly with

play06:20

announcements from both Google and OpenAI.

play06:23

So far people would solve it as a decision step by step flowchart.

play06:26

But now we're in a paradigm shift where I can, in the middle of it,

play06:29

interrupt an action to say, Hey, see what's on my desktop and figure it out.

play06:33

I've been playing, I've been playing around with, uh, with OpenAI's 4o,

play06:36

its ability to go look at a video of a screen and things of that nature.

play06:40

It's pretty outstanding.

play06:41

We are coming to a point where the, the speed at which the

play06:43

inference is happening is so quick.

play06:45

And now you can physically, we can actually bring them into your

play06:47

workflows. Earlier, it used to just take so long.

play06:50

It was very clunky, it was very expensive.

play06:52

So you couldn't really justify adding AI into those workflows.

play06:55

It'll be, you do labor arbitrage or things of that nature

play06:58

versus trying to automate it.

play07:00

So these are the kinds of workflows; infusing AI in doing this entire

play07:03

process is a phenomenal unlock.

play07:06

One of my clients is a big CPG company.

play07:09

And as we walk into the aisles, they do things like planograms where you're

play07:12

looking at a picture of the shelf. Um, and these consumer product goods companies

play07:17

would give you a particular format in which you want to keep different

play07:21

chips and drinks and so forth.

play07:23

And each of those labels are turned around or they are in a different place.

play07:26

You have to audit and say, Am I placing things on the shelf the right way?

play07:30

Like the consumer product goods model.

play07:31

That's called planogram.

play07:33

Realogram, planogram, that's the idea.

play07:35

So earlier we used to take pictures, a human would go in and note things

play07:39

and say, yes, I have enough of the bottles in the right order.

play07:41

Then we started to take pictures and analyzing it.

play07:43

You start to run into real world issues.

play07:46

You don't have enough space to back up and take a picture, or

play07:48

you go to the next aisle, the lighting is very different and stuff like that.

play07:51

So AI never quite scaled.

play07:53

And this is the first time now we're looking at models like Gemini and

play07:56

others where I can just walk past it and create a video and just feed the whole

play08:01

five minute video in with this context length of two million plus and stuff.

play08:05

It can just actually ingest.

play08:06

Right, just process it all.

play08:07

On the fly.

play08:08

It was missing.

play08:08

Yeah.

play08:08

That's right.

play08:09

Yeah.

play08:09

Those kind of things that were very, very difficult to do for us earlier.

play08:14

Those are becoming a piece of cake.
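
To make the shelf-audit idea concrete, here is a rough sketch of the "walk the aisle and feed the whole video in" workflow. The frame sampling uses OpenCV, which is a real library; ask_multimodal_model is a hypothetical stand-in for whichever long-context multimodal API (Gemini, GPT-4o, etc.) would actually be called, and the file name is invented:

```python
import cv2  # pip install opencv-python

def sample_frames(video_path: str, every_n: int = 30) -> list:
    """Grab every Nth frame from a shelf walkthrough video."""
    cap = cv2.VideoCapture(video_path)
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video (or file not found)
            break
        if i % every_n == 0:
            frames.append(frame)
        i += 1
    cap.release()
    return frames

def ask_multimodal_model(prompt: str, frames: list) -> str:
    # Hypothetical stand-in: replace with a real long-context multimodal call.
    return f"(model would audit {len(frames)} frames here)"

frames = sample_frames("aisle_walkthrough.mp4")  # invented example path
print(ask_multimodal_model(
    "Audit these shelf frames against my planogram and list misplaced items.",
    frames))
```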

play08:16

Um, how do I make sure that the AI, the multimodal stuff that we are

play08:21

seeing, is grounded in enterprise?

play08:23

So it's my data, my planogram style, or my processes, my documents, not

play08:29

getting knowledge from elsewhere.

play08:31

So in all the demos, one of the things that I was missing was, how do I make it

play08:35

go down a particular path that I want?

play08:37

Right.

play08:38

If the answer is not quite right, how do I control it?

play08:39

So I think a lot more around.

play08:41

How do I bring this to my enterprise clients and deliver value for them?

play08:45

Those are some of the open questions.

play08:46

Chris, do you have something similar to that?

play08:48

Totally, I do want to get into that.

play08:49

I see Chris coming off mute though, so I don't want to break his role.

play08:51

I don't know if Chris, you got kind of a view on this, or if you

play08:53

disagree, you're like, ah, it's actually not that impressive.

play08:56

Google Glass is back, baby.

play08:58

Yeah.

play09:01

So I, I think, I think multimodality is a huge thing, as Shobhit

play09:05

covered it correctly, right?

play09:06

There's so many use cases.

play09:08

Yeah.

play09:08

in the enterprise, but also in consumer based scenarios.

play09:13

And I think one of the things we really need to think about is we've been

play09:16

working with LLMs for so long now, which has been great, but the 2D text

play09:21

space isn't enough for generative AI.

play09:24

It's, it's, we want to be able to interact real time.

play09:28

We want to be able to interact with audio.

play09:30

Um, you know, and you can take that to things like contact centers where you

play09:33

want to be able to transcribe that audio.

play09:35

You want to then have AIs be able to respond back in a human way.

play09:38

And you want to chat with the assistants.

play09:40

Like you'd like you saw in the open AI demo.

play09:43

Uh, you know, you don't want to be sitting there and go, well, you

play09:45

know, my conversation is going to be as fast as my fingers can type.

play09:49

You want to be able to say, Hey.

play09:50

you know, what do you think about this?

play09:51

What about that?

play09:52

And you want to imagine new scenarios.

play09:55

So you want to say, what does this model look like?

play09:57

What does this image look like?

play09:59

You know, tell me what this is.

play10:00

And you want to be able to interact with the world around you.

play10:03

And to be able to do that, you need multimodal, uh, models.

play10:07

And, and therefore, Like in the Google demo, where, you know, yeah, she picked

play10:13

up the glasses again, you know, so I jokingly said Google Glass is back,

play10:17

but, but it really is, it's, if you're going and having a shopping experience,

play10:21

retail, and you want to be able to look at what the price of a mobile phone

play10:25

is, for example, you're not going to want to stop getting phone out, type,

play10:28

type, type, you just want to be able to interact with an assistant there and

play10:32

then, or see in your glasses what the price is, and I give the mobile phone

play10:36

example for a reason, which is, the price that I pay for a mobile phone isn't

play10:42

the same price as you would pay, right?

play10:44

Because it's all contract rates.

play10:45

And if I go and speak, if I want to get the price of how much am I paying

play10:49

for that phone, it takes an advisor like 20 minutes, because they have to

play10:53

go look up your contract details, etc.

play10:56

They have to look up what the phone is, and then they do a deal.

play10:59

In a world of multi modality, where you've got something like

play11:01

glasses on, it can recognize the object, it knows who you are.

play11:05

And then it can go and look up what, uh, what the price of the phone is for you.

play11:10

And then be able to answer questions that are not generic questions, but

play11:13

specific about you, your contract.

play11:15

To you.

play11:16

Right.

play11:16

Exactly.

play11:17

That, that is where multi modality is going to start, start to come in.

play11:21

I mean, it kind of sounds like, right, yeah, totally.

play11:24

I mean, Chris, if I have you right, I mean, this is one of the questions I want

play11:26

to pitch to both you, Shobhit, and you, Chris, on this is, You know, I actually,

play11:30

my mind goes directly back to Google Glass, like the bar where the guy got

play11:34

beat up for wearing Google Glass years ago, that was like around the corner from

play11:36

where I used to live in San Francisco.

play11:39

And you know, there's just been this dream and obviously all the open

play11:42

AI demos and Google demos for that matter are all very consumer, right?

play11:46

You're walking around with your glasses and you're looking around

play11:48

the world and you know, get prices and that kind of thing.

play11:51

This has been like a longstanding Silicon Valley dream and it's

play11:53

been very hard to achieve.

play11:55

And I guess the one thing I wanted to run by you is like, And the answer

play11:58

might just be both, or we don't know, is like, if you're more bullish on the

play12:01

B2B side or on the B2C side, right?

play12:03

Because I hear what Shobhit's saying, and I'm like, oh, okay, I can see

play12:06

why enterprises really get a huge bonus from this sort of thing.

play12:09

Um, and, and I guess it's really funny to me, because I think there's one

play12:12

point of view, which is everybody's talking about the consumer use case,

play12:15

but the actual near term impact may actually be more on the enterprise side.

play12:19

But I don't know if you guys buy that, or if you really are, like,

play12:21

this is the era of Google Glass, you know, it's back, baby, so.

play12:25

I can start first, Tim.

play12:26

Um, like I've, we've been working with Apple Vision quite a bit, um,

play12:30

uh, within IBM, with our clients.

play12:32

And a lot of those are enterprise use cases in a very controlled environment.

play12:36

So things that where things break in the consumer world, you don't

play12:40

have a controlled environment.

play12:41

You have corner cases that happen a lot, right?

play12:44

In an enterprise setting, if I'm wearing my Vision Pros for two

play12:49

hours at a stretch doing, I'm a mechanic, I'm fixing things, right?

play12:53

That's a place where I need additional input and I can't go look at other things

play12:58

like pick up my cell phone and work on it.

play13:00

I'm underneath, I'm fixing something in the middle of it, right?

play13:03

So those use cases, because the environment is very controlled, I can do

play13:07

AI with higher accuracy, it's repeatable.

play13:11

I know I can start cross-checking the answers because I have enough

play13:13

data coming out from it, right?

play13:14

So you're not trying to solve every problem.

play13:16

But I think we'll see a higher uptake of these devices.

play13:20

By the way, I love the Ray Ban glasses from Meta as well.

play13:23

Great to do something quick, but when you don't want to switch.

play13:27

But I think we're moving to a point where enterprises will go deliver these

play13:32

at scale, the tech starts to get better.

play13:35

And adoption is going to come over on the B2C side.

play13:38

But in the consumer goods, we'll have multiple attempts at this.

play13:40

Like we had with Google Glasses and stuff.

play13:42

It'll take a few attempts to get better.

play13:44

On the enterprise side, we will learn and make the models a lot better.

play13:47

But I think there's an insane amount of value that we're delivering

play13:50

to our clients with Apple Vision Pro today in enterprise settings.

play13:54

I think it's going to follow that.

play13:55

Totally.

play13:56

Yeah.

play13:56

And it's actually interesting.

play13:57

I hadn't really thought about this.

play13:58

And Chris, I'll let you get in.

play13:58

It's like, um, basically like the phone is almost not as big of

play14:02

competition in the enterprise setting.

play14:05

Right.

play14:05

Whereas like the example that Chris gave was like literally there you're trying to

play14:07

be like, is this multimodal device faster than using my phone in that interaction?

play14:13

Which is like a real competition, but if it's something like a mechanic,

play14:16

you know, they don't have, they don't, they can't just pull out their phone.

play14:18

Um, Chris, any final thoughts on this?

play14:19

And then I want to move us to our next topic.

play14:21

Yeah.

play14:21

And I was just going to give another kind of use case scenario.

play14:24

I, I often think of things like the oil rigs example.

play14:28

So a real sort of enterprise space where you're wandering around and you have to

play14:32

go and do safety checks on various things.

play14:36

Most of their time, if you think of the days before the mobile phone or

play14:39

before the tablet, what they would have to do is go look at the part, do the

play14:42

inspection, the visual inspection, and then walk back to a PC to go fill that in.

play14:47

And then these days, you do that with a tablet on the rig, right?

play14:50

But, but then actually, you need to find the component you're going to look

play14:53

at, you have to do the defect analysis.

play14:55

You want to be able to take pictures of that.

play14:57

You need the geo location of where that part is so that

play15:01

the next person can find it.

play15:02

And then you want to be able to see the notes that they had before on this.

play15:06

And then you've got to fill in the safety form, right?

play15:08

So they have to fill in a ton of forms.

play15:11

So there's a whole set of information.

play15:13

If you just think about AI, just having,

play15:15

you know, even your phone or glasses, pick either, to be able to look at

play15:19

that part, be able to have the notes contextualized in that geospatial

play15:23

space, be able to fill in that form, be able to do an analysis with AI.

play15:27

It's, it's got a huge impact on enterprise cases and probably multimodality in that

play15:33

sense has probably got a bigger impact, I would say, in the enterprise cases

play15:37

than the consumer spaces even today.

play15:39

And I, and I think that's something we really need to think about.

play15:42

The other one is.

play15:44

And again, I know you wanted this to be quick there, Tim,

play15:47

is the clue in generative AI is the generative part, right?

play15:51

So actually I can create images, I can create audio, I could create

play15:57

music, things that don't exist today.

play15:59

So, and with the text part of something like an LLM, then I

play16:03

can create new creative stuff.

play16:05

I can create DevOps pipelines, Docker files, whatever.

play16:08

So there comes a part where I want to visualize the thing that I create.

play16:13

I don't want to be copying and pasting from one system to another, right?

play16:18

That's not any different from the oil rig scenario.

play16:21

So as I start to imagine new business processes, new pipelines,

play16:26

new, uh, tech processes, I then want to be able to have the real time

play16:30

visualization of that at the same time, or be able to interact with that.

play16:33

And that's why multi modality is, is really important, probably

play16:36

more so in the enterprise space.

play16:38

Yeah, that's right.

play16:39

I mean, I think some of the experiments you're seeing with, like,

play16:41

dynamic visualization generation are just, like, very cool, right?

play16:45

Uh, cause then you basically have You can say, like, here's how I

play16:48

want to interact with the data.

play16:49

The system kind of just generates it, right, on the fly, which I

play16:52

think is very, very exciting.

play16:59

All right, so next up, I want to talk about latency and cost.

play17:01

So this is another big trend, you know, I think it was very interesting that

play17:05

both companies went out of their way to be like, we've got this offering

play17:09

and it's way cheaper for everybody, which I think suggests to me that, you

play17:12

know, these big, huge competitors in AI all recognize that like your, your per

play17:16

token cost is gonna be this huge bar to getting the technology more distributed.

play17:21

Um, so certainly one of the ways they sold 4o was that it was

play17:25

cheaper and as good as GPT-4, right?

play17:28

And everybody was kind of like, okay, well why do I pay for Pro anymore if

play17:31

I'm just gonna get this for, for free?

play17:32

And then Google's bid, of course, was Gemini 1.5 Flash, right?

play17:35

Which is okay, it's gonna be cheaper and faster again.

play17:38

Um, and I know Chris, you threw this.

play17:41

Uh, sort of topic out, so I'll kind of let you have the first say, but I think

play17:44

the main question I'm left with is, like, what are the downstream impacts of

play17:47

this, right, for someone who's not really paying attention to AI very closely,

play17:51

like, is this just a matter of, like, it's getting cheaper, or do you think,

play17:54

like, these are actually, these economics are kind of changing how the technology

play17:57

is actually going to be rolled out?

play17:59

I think latency and smaller models and tokens are probably one of the most

play18:05

interesting challenges we have today.

play18:07

So if you think about like the GPT 4 and everybody was

play18:11

talking like, oh, that's a 1.8 trillion model or whatever it is, that's great.

play18:16

But the problem with these large models is every layer that you have in the neural

play18:23

network is adding time to get a response back, and not only time, but cost.

play18:30

So if you look at the demo that OpenAI did, for example, what was really cool

play18:35

about that demo was the fact that when you were speaking to the assistant, it was

play18:40

answering pretty much instantly, right?

play18:43

And that is the real important part.

play18:45

And when we look at previous demos, what you would have to do if you were

play18:49

having a voice interaction is you'd be stitching together kind of three

play18:53

different pipelines. You need to do, uh, speech to text, then you're going to run

play18:58

that through the model, and then you're going to do text to speech back the

play19:01

way, so you're getting latency, latency, latency, before you get a response, and

play19:05

that timing that it would take, because it's not in the sort of 300 millisecond

play19:10

mark, it was too long for a human being to be able to interact, so you got this

play19:14

massive pause, so actually, the, latency and the kind of tokens per second becomes

play19:20

the most important thing if you want to be able to interact with models quickly

play19:25

and be able to have those conversations.
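
Chris's arithmetic is worth spelling out. Here is a minimal sketch, with made-up stage timings rather than measurements from either demo, of why a stitched speech-to-text, LLM, text-to-speech pipeline misses a conversational budget that a single multimodal model can hit:

```python
# Rough latency budget for voice interaction; ~300 ms is the pause
# humans tolerate in conversation (the figure Chris cites).
CONVERSATIONAL_BUDGET_MS = 300

# Illustrative numbers only: each stage of a stitched pipeline adds delay.
stitched = {"speech_to_text": 400, "llm_first_token": 500, "text_to_speech": 300}
# One multimodal model goes straight from audio in to audio out.
single_multimodal = {"audio_to_audio_first_token": 250}

for name, stages in [("stitched", stitched), ("multimodal", single_multimodal)]:
    total = sum(stages.values())
    verdict = "ok" if total <= CONVERSATIONAL_BUDGET_MS else "too slow"
    print(f"{name}: {total} ms -> {verdict}")
```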

play19:26

And that's sort of why also multimodality is really important, because if I

play19:31

can do this in one model as well, then it means that I'm not sort

play19:35

of jumping pipelines all the time.

play19:37

So the smaller you can make the model, the faster it's going to be.

play19:41

Now, if you look at the GPT-4o model.

play19:44

I don't know if you've played with just the text mode.

play19:47

It is lightning fast when it comes.

play19:49

Very fast now.

play19:50

Yeah, it is.

play19:51

It's noticeably so like, it's just like, it feels like every time I'm in,

play19:54

there's like these improvements, right?

play19:56

Yeah.

play19:57

And this is what you're doing.

play19:58

You're sort of trading off reasoning versus, uh, speed of the model, right?

play20:03

And, and as we move into kind of agentic platforms, as we

play20:08

move into the multimodality,

play20:10

you need that latency to be super, super sharp, because you're not

play20:14

going to be waiting all the time.

play20:15

So, there is going to be scenarios where you want to move back to

play20:18

a bigger model, that is fine.

play20:20

Um, but you're going to be paying the cost.

play20:22

And that cost is going to be the cost, uh, the price of the tokens in the first

play20:27

place, but also the speed of the response.

play20:29

And I think this is the push and pull that model creators

play20:32

are going to be playing against.

play20:34

All of the time and, and, and therefore if you can get a similar result from a

play20:40

smaller model and you can get a similar result from a faster model and a cheaper

play20:45

model, then you're going to go for that.

play20:47

But in those cases where it's not, then you may need to go to the

play20:50

larger model to kind of reason.

play20:52

So this, this is really

play20:53

Totally.

play20:53

Yeah.

play20:54

I think there's a bunch of things to say there.

play20:55

I mean, I think one thing that you've pointed out clearly is that like

play20:58

this makes conversation possible.

play20:59

Right, like that you and I can have a conversation in part

play21:02

because I have low latency is kind of the way to think about it.

play21:05

And like now that we're reaching kind of human like parity on latency,

play21:08

you know, finally these models can kind of converse in a certain way.

play21:11

The other one is actually, I really thought about that there is kind of

play21:13

this almost like thinking fast and slow thing where basically like the

play21:17

models can be faster, but they're just not as good at reasoning.

play21:20

Um, and then there's kind of this like deep thinking mode, which

play21:23

actually is like slower in some ways.

play21:25

So Tim, uh, the way we are helping

play21:27

enterprise clients, again, have that kind of focus in life.

play21:31

There is a split.

play21:32

There's a, there's a, there are two ways of looking at applying

play21:34

Gen AI in the industry right now.

play21:36

One is at the use case level.

play21:38

You're looking at the whole workflow end to end, seven different steps.

play21:42

The other is going and looking at it at a sub task level.

play21:46

Right, so I'll just pick an example and walk you through it.

play21:48

So say I have an invoice that comes in and I'm taking an application,

play21:51

I'm pulling something out of it, I'm making sure that that's as per the

play21:55

contract, I'm going to send you an email saying your invoice is paid, right?

play21:59

So some sort of a flow like that, right?

play22:01

So say it is seven steps, just very simplified, right?

play22:05

I'm going to pull things from the back end systems using APIs.

play22:08

Step number three, I'm going to go to a fraud detection model that has

play22:11

been working great for three years.

play22:13

Step number four, I'm extracting things from a paper.

play22:16

Right, an invoice that came in.

play22:18

That extraction I used to be doing with OCR.

play22:21

85 percent accuracy, humans will do the overflow of it.

play22:25

At that point we're taking a pause and saying, we have reason to believe

play22:28

that LLMs today can look at an image and extract this with higher accuracy.

play22:31

Yeah.

play22:32

Say we get up to 94%.

play22:34

So that's nine points, uh, higher accuracy of pulling things out.

play22:38

So we pause at that point and say, let's create a set of constraints for step

play22:42

number four to find the right athletes.

play22:44

And the constraint could be, what's the latency?

play22:47

Like we just spoke, how quickly I need the result, or can this take

play22:50

30 seconds and I'll be okay with it?

play22:52

Yeah.

play22:52

Second could be around cost.

play22:53

If I'm doing this a thousand times, I have a cost envelope to

play22:56

work with versus a human doing it.

play22:58

If I'm doing it a million times, I can invest a little bit more if I

play23:01

can get accuracy out of it, right?

play23:03

So the ROI becomes important.

play23:04

Then you're looking at security constraints around, does this data

play23:07

have any identifiable PHI data, PII, where I have to bring things closer.

play23:13

Or is this something that is military grade secrets and

play23:15

has to be on prem, right?

play23:16

So you have certain constraints around that.

play23:18

So you come up with a list of five, six constraints, and then

play23:21

that lets you decide

play23:22

what kind of an LLM will actually check off all these different

play23:25

constraints, and then you start comparing and bringing it in.
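
As a rough illustration of that checklist, here is a minimal sketch of constraint-based model selection for a single sub-task; the candidate models and every number are invented:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    accuracy: float       # eval score on the sub-task (before/after evals)
    latency_ms: int       # how quickly the result comes back
    cost_per_call: float  # dollars per invocation
    on_prem: bool         # can it run inside the security boundary?

# The constraint envelope for, say, step number four of the workflow.
CONSTRAINTS = {"min_accuracy": 0.90, "max_latency_ms": 30_000,
               "max_cost_per_call": 0.01, "requires_on_prem": True}

def fits(c: Candidate) -> bool:
    """Check a candidate against every constraint in the envelope."""
    return (c.accuracy >= CONSTRAINTS["min_accuracy"]
            and c.latency_ms <= CONSTRAINTS["max_latency_ms"]
            and c.cost_per_call <= CONSTRAINTS["max_cost_per_call"]
            and (c.on_prem or not CONSTRAINTS["requires_on_prem"]))

candidates = [
    Candidate("big-multimodal", 0.96, 12_000, 0.05, False),
    Candidate("small-extractor", 0.94, 2_000, 0.004, True),
    Candidate("legacy-ocr", 0.85, 500, 0.001, True),
]
# Among models that check off every constraint, take the most accurate.
best = max((c for c in candidates if fits(c)), key=lambda c: c.accuracy)
print(best.name)  # -> small-extractor
```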

play23:28

So the split that we're seeing in the market is, one way, with LLM agents and

play23:33

with these multi modal models, they're trying to accomplish the entire

play23:37

workflow end to end, like you saw with Google's returning the shoes.

play23:40

It's taking an image of it, it's going and looking at your Gmail to find the receipt,

play23:44

starting the return, giving you a QR code with the whole return process done.

play23:48

So it just figures out how to go create the entire end to end workflow.

play23:52

But where the enterprises are still focused is more on the sub task level.

play23:56

At that point we are saying, this step number four is worth switching, and I have enough

play24:00

evals before and after, I have enough metrics to understand, and I can control

play24:05

that, I can audit that much better.

play24:07

The thing is that, from an enterprise perspective, these end to end multimodal

play24:10

models, it'll be difficult for us to explain to the SEC, for example, why

play24:15

we rejected somebody's benefits on a credit card, things of that nature.

play24:19

So I think in the, in the enterprise world, we're going to go down the

play24:22

path of, Let me define the process.

play24:25

I'm going to pick small models to Chris's point to do that particular piece better.

play24:29

And then eventually start moving over to, and now let me make sure that those, that

play24:33

framework evals and all that stuff can be applied to end to end multimodal models.

play24:38

I guess I do want to maybe bring in Bryan here.

play24:40

You, like, release the Bryan on this conversation.

play24:43

Um, cause I, I'm curious about like, kind of like the marketer's view on all this.

play24:47

Right.

play24:47

Cause I think there's one point of view, which is yes, yes, Chris, show a

play24:50

bit like this is all nerd stuff, right?

play24:52

Like I, you know, it's like latency and cost and speed and whatever.

play24:56

The big thing is that you can actually talk to these AIs.

play24:58

Right.

play24:59

And I guess I'm kind of curious from your point of view about like, I mean,

play25:02

one really big thing that came out of like the open AI announcements was.

play25:05

Yeah.

play25:06

We're going to use this latency thing largely to kind of create this

play25:09

feature that just feels a lot more human and lifelike, um, than, you

play25:13

know, typing and chatting with an AI.

play25:15

And I guess I'm kind of curious about, like, you know, what you

play25:18

think about that move, right?

play25:20

Like, is that ultimately, like, Like going to help the adoption of AI?

play25:23

Is it just kind of like a weird sci fi thing that OpenAI wants to do?

play25:26

And also, I mean, I think if you've got any thoughts on, you know, how

play25:29

it impacts the enterprise as well, which is like, do companies suddenly

play25:32

say, Oh, I understand this now, right?

play25:34

It's because it's like the AI from Her.

play25:35

I can buy this.

play25:36

Um, just kind of interesting thinking about like the, the sort of surface part

play25:40

of this, because it actually will really have a big impact on the market as well.

play25:43

It's kind of like the technical advances are driving the marketing of this.

play25:46

I mean, I do think when you, when you look at like some of the initial

play25:50

reviews of, I don't know, honestly, like the Pin and the Rabbit, like I remember one

play25:55

of the one of the scenarios that was being demoed was I think they were I

play25:59

think he was looking at a car and he was asking a question about it and the

play26:03

whole interaction took like 20 seconds there and he went through it, it was

play26:06

just showing that he could do the whole thing on his phone in the same amount of

play26:09

time but the thing that I was thinking about when I was watching that was like

play26:12

He just did like 50 steps on his phone.

play26:14

That was awful.

play26:15

As opposed to just pushing a button and asking a question.

play26:17

And it was like, it was very clear that the UX interaction of just like, like

play26:22

asking the question and looking at the thing, was a way better experience than

play26:26

pushing the 50 buttons on your phone.

play26:28

But the 50 buttons still won just because it was faster to do 50 buttons than

play26:32

to, you know, deal with the latency impact of, um, of where we were before.

play26:36

And so it actually, it reminded me a lot of, just the way I used to hear,

play26:41

remember hearing Spotify talk early about the way that they thought about

play26:45

latency and the things that they did to just make the first 15 seconds of a

play26:49

song land, um, essentially, so that it felt like, you know, like a file

play26:54

that you had on your device, because I think from their perspective, They,

play26:57

it felt like every time you wanted to listen to a song that was buffering as

play27:00

opposed to sitting on your device, you were never going to really adopt that

play27:03

thing because it's a horrible experience relative to just having the file locally.

play27:07

And so they put in all this work so that it felt the same.

play27:11

And that wound up being a huge part of how the technology ended up getting and

play27:14

the product ended up getting adopted.

play27:16

And, you know, I do think there's a lot of, a lot of stuff we're doing that.

play27:21

It's almost like, I don't want to say back office, but like just enterprise

play27:25

processes around how people do things, operational things, but there are plenty

play27:30

of ways where people are thinking about the way that we do more with like agents

play27:33

in terms of how that involves like customer experience, whether it's support

play27:36

interactions, whether it's like bots on the site, you can just clearly imagine

play27:41

that that's going to play a bigger role in customer experience going far forward.

play27:45

And if you feel like every time you ask a question that you're waiting 20

play27:48

seconds to get a response from this thing.

play27:50

Like you're just getting the other person on the end of that interaction

play27:52

is just getting madder and madder and madder the entire time.

play27:55

Where the more it feels like you're talking to a person

play27:58

and that they're responding to you as fast as you're talking.

play28:00

I think the more likely it is that people are going to accept

play28:02

that as an interaction model.

play28:04

Um, and so I do think that that latency and like making that feel to you,

play28:09

like to your point about human beings being zero latency, um,

play28:14

I think that's a necessary condition for a lot of these interaction models.

play28:16

And so it's going to be super important going forward.

play28:18

And to me, it's also when I think about the Spotify thing, it's

play28:20

like, are people going to do interesting things to solve for the

play28:23

first 15 seconds of an interaction as opposed to the entire interaction?

play28:28

Like, you know, can you get, there was a lot of talk about like

play28:31

OpenAI's small model, I want to say, like responding with like, sure, or just

play28:35

like some space filling entry point.

play28:37

Um, so it like, it could catch up with the rest of the dialogue.

play28:41

So I think it, I think people will prioritize that a lot

play28:43

because it don't matter a lot.

I mean, I love the idea that, to save cost, basically, OpenAI goes: for the first few turns of the conversation, we deliver the really fast model, so it feels like you're having a nice, flowing conversation. And then, once you've built confidence, they fall back to the slower model that has better results, so you come away thinking, oh, this person is a good conversationalist, but they're also smart, too. That's kind of what they're trying to do by playing with model delivery.
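To make the speculation concrete: a minimal sketch of turn-based model routing of the kind described above. Nothing here is a confirmed OpenAI design; the `chat` callable, the model names, and the turn budget are all invented for illustration.

```python
# Hypothetical sketch of turn-based "model delivery": serve early turns
# from a fast model for conversational feel, later turns from a stronger
# (slower) model for answer quality. Model names are made up.

FAST_MODEL = "small-fast-model"      # low latency, weaker answers
STRONG_MODEL = "large-strong-model"  # higher latency, better answers
FAST_TURN_BUDGET = 3                 # first few turns prioritize flow

def route_model(turn_index: int) -> str:
    """Pick a model for this turn: fast early, strong once rapport is built."""
    return FAST_MODEL if turn_index < FAST_TURN_BUDGET else STRONG_MODEL

def reply(history: list[dict], chat) -> str:
    """`chat` is any chat-completion callable: chat(model=..., messages=...)."""
    user_turns = sum(1 for m in history if m["role"] == "user")
    return chat(model=route_model(user_turns), messages=history)
```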

Um, so, we've got to talk about search. But Chris, I saw you go off mute, so do you want to do a final quick hit on the question of latency before we move on?

No, I was just going to pick up on what Brian was saying there, and what you were saying, Tim; I totally agree. It was always doing this "hey," and then repeating the question, so I wonder if, underneath the hood, as you say, there's a much smaller classifier model that is just doing that "hey" piece, and then probably a slightly larger model actually analyzing the real thing. So I do wonder if there are two models in between there for that interaction, a small one and a slightly larger one. It's super interesting.
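That two-model guess maps onto a familiar latency-masking pattern: a tiny model fires the instant acknowledgment while a larger model computes the real answer in parallel. A toy sketch, with both models stubbed out; this is speculation about the demo, not a documented architecture.

```python
import asyncio

async def tiny_ack_model(question: str) -> str:
    # Stand-in for a small classifier/templater: instant filler response.
    return f"Hey! So you're asking: {question}"

async def large_answer_model(question: str) -> str:
    await asyncio.sleep(2.0)  # stand-in for the big model's inference time
    return f"Here's a considered answer to: {question}"

async def respond(question: str) -> None:
    # Start the slow model first, then speak the filler while it runs.
    answer = asyncio.create_task(large_answer_model(question))
    print(await tiny_ack_model(question))  # lands almost immediately
    print(await answer)                    # arrives when inference finishes

asyncio.run(respond("What's the capital of France?"))
```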

But maybe the thing I wanted to add is that we don't have that voice model in our hands today; we only have the text model. So I wonder, once we get out of the demo environment and, in maybe three weeks' time or whatever, we have that model, whether it's going to be super annoying when, every time we ask a question, it goes "Hey" and then repeats the question back. It's cool for a demo, but I wonder if it will actually be super annoying in two weeks' time.

Alright, so, the last topic we've got a few minutes on, and this is, like, Brian's big moment. So, Brian, get yourself ready for this.

We've hyped this too much.

I mean, Chris, you should get yourself ready, cause apparently Brian's gonna, you know.

Everyone else can leave the meeting.

Yeah, yeah, he's going to take our eyebrows off here with his rant.

So, the setup for this is that Google announced that AI-generated overviews will be rolling out to U.S. users, and then to everybody in the near future. And I think there are two things to set you up, Brian. The first one is: this is what we've been talking about, right? Like, is AI going to replace search? Here it is, consuming the preeminent search engine. So we're here, right? This is happening. And the second one is: I'm a little nostalgic, you know, as someone who grew up with Google. The ten blue links, the search engine, were a big part of how I experienced and grew up with the web. And this seems to me like kind of a big shift in how we interact with the web as a whole. So I do want you to first talk a little about what you think it means for the market, and how you think it's going to change the economy of the web.

So, I follow two communities pretty closely online. I follow the tech community, and, as somebody who works in marketing, I follow the SEO community. And they have very different reactions to what's going on. On your first question, though, of whether this is the equivalent of swallowing the web: what's funny is that from the minute ChatGPT arrived on the scene, people were proclaiming the death of search. Now, for what it's worth, if you've worked in marketing or on the internet for a while, people have proclaimed the death of search as, like, an annual event for the last 25 years. So on some level this is just par for the course.

But what's interesting to me is that you had this product, ChatGPT, the fastest-growing consumer product ever, reaching a hundred million users faster than anybody else. It basically speed-ran the growth cycle that usually takes years or decades. Well, maybe not decades, but it takes a long time for most consumer companies to do what they did. And the interesting thing about that is, if it was going to totally disrupt search, you would have expected that to show up and happen sooner than it would with other products on a slower growth trajectory. But that didn't happen.

As somebody who watches their search traffic super closely: there's been no chaotic drop. People have continued to use search engines. And one of the reasons I think that happened is that people misunderstood ChatGPT and Google as competitors with one another. Google and OpenAI probably are, on some level, but I don't know that those two products are.

And the reason I was thinking about that is: if ChatGPT didn't disrupt Google within basically the time frame we've had so far, the question is, why didn't that happen? And I think you could have a couple of different hypotheses. One, you could say the form factor wasn't right: it wasn't text that was going to do it, it was Scarlett Johansson on your phone that was going to do it, and they're leaning into that thought process a little bit. You could say it was hallucinations: the content's just not accurate, so that's a possibility. You could point to learned consumer behavior: people have been using this stuff for 20 years, and it's going to take a while to get them to do something different. You could point to Google's advantages in distribution: they're on the phone, they've got browsers, and it's really hard to get the level of penetration they have. I think all of those probably play some role.

But my biggest belief is that it's actually impossible to separate Google from the internet itself. Google is kind of like the operating system for the web. So to disrupt Google, you're not actually disrupting search; you have to disrupt the internet. And it turns out that's an incredibly high bar. Because you're not only dealing with search, you're dealing with the capabilities of every single website that sits on the other end of the internet, whether it's banks or airlines or retail, whatever it is. And that's an enormous amount of capability built up there. So I look at that and say: for as much as I think this technology has brought to the table, it hasn't done that thing yet. And because it hasn't, there hasn't been some dramatic shift.

The thing that Google Search is not good at, though, and I think you see it a little bit in how they described what they think the utility of AI overviews will be, is complex, multi-part questions. If you're doing anything from a buying decision for a large enterprise product to planning your kid's birthday party, you're going to have to do, like, 25 queries along the way, and you've just accepted and internalized those 25. Basically, search is one-shot, right? You just say it and responses come back. So there's no...

Yeah, sorry, go ahead.

Yeah, and so the way I was thinking about LLMs is that they're kind of like Internet SQL, in a way, where you can ask a much more complicated question and then actually describe the way you want the output to look: I want to compare these three products on these three dimensions, go get me all this data. That would have been 40 queries at one point, but now you can do it in one. And search is terrible at doing that right now; you have to go cherry-pick each one of those data points. But the interesting thing is that that's also maybe the most valuable query to a user, because you save 30 minutes.
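A sketch of that "Internet SQL" framing: one structured request that describes both the question and the desired output shape, standing in for what used to be dozens of searches. The `complete` callable is any text-generation function you supply; nothing here is vendor-specific.

```python
import json

def compare_products(products: list[str], dimensions: list[str], complete) -> dict:
    """One LLM call in place of ~40 searches: ask for a structured comparison."""
    prompt = (
        "Compare the following products on the given dimensions. "
        'Respond with JSON only, shaped as {"product": {"dimension": "summary"}}.\n'
        f"Products: {', '.join(products)}\n"
        f"Dimensions: {', '.join(dimensions)}"
    )
    return json.loads(complete(prompt))  # the output shape was part of the query

# Usage, with whatever LLM client you have on hand:
# table = compare_products(
#     ["Laptop A", "Laptop B", "Laptop C"],
#     ["battery life", "price", "repairability"],
#     complete=my_llm_call,  # hypothetical; supply your own
# )
```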

And so I think Google looks at that and says: if we cede that particular space of complex queries to some other platform, that's a long-term risk for us. And if it's a long-term risk for them, it ends up being a long-term risk for the web, I think. So I actually think it was incredibly important that Google bring this type of capability into the web, even if it ends up being a little disruptive from a publisher's perspective. Because what it does is at least preserve some of the dynamic we have now, of the web still being an important thing. And I hope that continues, to your point; I have, like, present and past nostalgia.

Yeah, exactly.

So I think it's important that it continues to evolve, if we all want the web to continue to persist as a healthy, dynamic place.

Yeah, for sure. No, I think that's a great take on it. And you know, Google always used to say: look, we measure our success based on how fast we get you off our website, right? And, Brian, what you're pointing out, which I think is very true, is that what they never said was: there's this whole set of queries we never surface, that you really have to keep searching for, right? And that ends up being the search volume of the future that everybody wants to capture.

Well, so, Brian, I think we also had a little intervention from AI, the thumbs-up thing. We were joking about that before the show.

Yeah. I didn't know that. My ranking for worst AI feature of all time. But, yeah, the thumbnail on the video.

That's right. Exactly. Well, great. So we've got just a few minutes left. Shobhit, Chris, any final parting shots on this topic?

Sure. I'm very bullish. I think AI overviews have a lot of future, as long as there's a good mechanism for incorporating feedback and hyper-personalizing even a simple query, like: I want to go have dinner tonight, and say I tell you I want to look for a Thai restaurant. If I go on OpenTable or Yelp or Google and try to find that, there's a particular way in which I think through it; the filters I apply are very different from how Chris would do it, right? So if somebody's making that decision for me, the way I would make it, great. That's the reason TikTok works so much better than Netflix, on average, I think. I was listening to a video by Scott, and he mentioned that we spend about 155 minutes a week browsing Netflix, on average, in the U.S., something of that nature. It's a pretty exorbitant amount of time.

Versus TikTok, which has just completely taken that fallacy of choice out for you. When you go on TikTok, there are just so many data points in the videos they serve: with 17-second videos averaged across 16 minutes of viewing time, you have so many data points coming out of it, seven to ten of them, every few seconds, right? So they have hyper-personalized it based on how you interact with things. Because they're not asking you to go pick a channel or make a choice of that nature; they're just showing you the next, next, next thing in the sequence. Hence the stickiness. They've understood the brains of teenagers, and that demographic, really, really well.
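The mechanism being described, implicit signals harvested every few seconds instead of explicit filters and menus, reduces to a very small loop. A toy sketch; the signal names and weights are invented for illustration, not anything TikTok has published.

```python
from collections import defaultdict

SIGNAL_WEIGHTS = {          # invented weights for illustration
    "watched_to_end": 3.0,
    "rewatched": 4.0,
    "skipped_early": -2.0,
}

class ImplicitProfile:
    """Per-user topic scores built from passive viewing signals."""

    def __init__(self) -> None:
        self.topic_scores: defaultdict[str, float] = defaultdict(float)

    def observe(self, topic: str, signal: str) -> None:
        # Every few seconds of viewing emits a signal; fold it in.
        self.topic_scores[topic] += SIGNAL_WEIGHTS.get(signal, 0.0)

    def pick_next(self, candidates: dict[str, str]) -> str:
        # candidates maps video id -> topic; no menus, no filters.
        return max(candidates, key=lambda vid: self.topic_scores[candidates[vid]])

profile = ImplicitProfile()
profile.observe("cooking", "watched_to_end")
profile.observe("finance", "skipped_early")
print(profile.pick_next({"vid1": "cooking", "vid2": "finance"}))  # -> vid1
```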

I think that's the direction Google will go in. It'll start hyper-personalizing based on all the content. If they're reading and finding the receipt for my shoes, they know what I actually ended up ordering at the restaurant I went to. So with the full feedback loop coming into the Google ecosystem, I think it's going to be brilliant if they get to a point where they just make a prediction on which restaurant is going to work for me, based on everything they know about me.

That's right, yeah. I mean, the future is they're just going to book it for you, and a car is going to show up, and you're going to get in, and it's going to take you someplace, right?

Uh, so Chris...

They'll send a confirmation from your email.

Yeah, exactly, right. Chris, 30 seconds. You've got the last word.

30 seconds: search is going to be a commodity, and I think we're entering the AI assistant era.

How dare you.

Yeah. But it will be a commodity, because we are going to interact with search via these assistants. It's going to be the Siri on my phone, enhanced by AI technology; it's going to be Android and Gemini's version on there. We are not going to be interacting with Google Search the way we do today with browsers. That is going to be commoditized, and we're going to be dealing with our assistants, who are going to go and fetch those queries for us. So I think that's going to be upended, and at the heart of it is going to be latency and multimodality, as we said. So I think they either get it, or they're going to be disrupted.

Yeah, I was going to say: if that happens, what's interesting is that all of the advantage Google has actually vanishes, and then it's an even playing field against every other LLM, which is a very interesting market situation at that point.

Yeah, I'm going to pick that up next week; that's a very, very good topic, and we should get more into it. Great. Well, we're out of time. Shobhit, Chris, thanks for joining us on the show again. Brian, we hope to see you again sometime. And to all of you out there in Radioland, if you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere. And we'll see you next week for Mixture of Experts.
