Open sourcing the AI ecosystem ft. Arthur Mensch of Mistral AI and Matt Miller

Sequoia Capital
26 Mar 2024 · 26:14

Summary

TL;DR: Arthur Mensch, founder and CEO of Mistral AI, discusses the company's rapid growth and focus on open-source AI models. Despite being a young company, Mistral has made significant strides by releasing high-quality models approaching GPT-4 and forming strategic partnerships. Arthur emphasizes the importance of balancing open-source contributions with commercial interests and shares insights on the future of AI, including multilingual and multimodal models. He also highlights the advantages of building a business in Europe and the company's vision for AI's role in the next five years.

Takeaways

  • 🚀 Arthur, the founder and CEO of Mistral AI, has led the company to release high-quality models approaching GPT-4 in just nine months since its inception.
  • 🌟 The company's success is attributed to its open-source approach and its ability to develop models efficiently with a lean team of experienced people.
  • 🤝 Mistral AI has established strategic partnerships with major companies like Microsoft and Snowflake, indicating a strong go-to-market strategy.
  • 🧠 The decision to start Mistral AI was driven by the founders' desire to see AI progress through open exchanges, which they felt had been missing from the field since 2022.
  • 🌐 The company aims to bring AI technology to every developer, offering a more open platform than its competitors and accelerating adoption through that strategy.
  • 🔄 The balance between open-source and commercial models is managed by offering two families of models, maintaining leadership in open source while driving commercial adoption.
  • 💡 Mistral AI's rapid development is attributed to hands-on work with data and a team willing to engage in the less glamorous aspects of machine learning.
  • 🔍 The company is focused on improving its large models and developing open-source models for specific vertical domains.
  • 🌍 Being a European company gives Mistral AI unique advantages, such as access to a strong talent pool, strength in European languages, and geographical opportunities.
  • 🔮 Looking ahead, Mistral AI envisions a future where AI infrastructure is open, enabling the creation of assistants and autonomous agents accessible to all users.
  • 💼 For founders in the AI space, the advice is to stay ambitious and be prepared to build and reinvent from scratch every day, as the AI landscape is ever-evolving.

Q & A

  • What motivated Arthur and his co-founders to start Mistral AI?

    -Arthur and his co-founders were inspired by the open exchanges between academic and industrial labs that drove the progress of AI. They were disappointed that this openness stopped early in the AI journey and wanted to push the field back towards open source contributions, especially given the rapid advancement of AI technology.

  • How does Mistral AI balance open source contributions with commercial interests?

    -Mistral AI maintains two families of models - one focused on open source, to lead in that domain, and another for commercial purposes. They aim to stay relevant by continuously producing open source models while also developing better commercial models available on various cloud providers.

  • What is the advantage of being a European AI company like Mistral AI?

    -Being a European company allows Mistral AI to tap into a strong pool of junior talent from countries like France, Poland, and the UK. Additionally, they benefit from support at the state level and have a geographical advantage in serving the European market, including having what is probably the strongest French-language model.

  • What are some of the challenges Mistral AI faces in maintaining its position in the AI field?

    -Mistral AI faces the challenge of staying ahead in the rapidly evolving AI field. They need to balance contributing to the open source community with securing commercial adoption and enterprise deals to sustain their business model.

  • What is Mistral AI's strategy for the future in terms of model development?

    -Mistral AI is working on improving its existing models and developing open source models for various vertical domains. They are also focusing on multilingual and multimodal models and plan to make customization and fine-tuning part of their platform.

  • How does Arthur view the future of AI technology?

    -Arthur envisions a future where AI technology becomes more autonomous, with the ability to create assistants and autonomous agents that can perform a wider range of tasks. He expects AI to become so controllable through human language that creating such agents will be a common skill learned at school.

  • What are some of the most exciting applications that Mistral AI has seen built on their models?

    -Mistral AI has seen startups in the Bay Area using their models for fine-tuning and fast application development. They have also seen web search companies and enterprises using their models for knowledge management, marketing, and other standard enterprise applications.

  • How does Mistral AI support its community of developers?

    -Mistral AI invests in developer relations, creating guides and gathering use cases to showcase what can be built with their models. They encourage developers to engage with them to discuss use cases, advertise their applications, and provide insights for future evaluations and improvements to their models.

  • What is Mistral AI's approach to partnerships with companies like Snowflake and Databricks?

    -Mistral AI believes that AI models become stronger when connected to data. They have formed partnerships to run natively in the clouds of these companies, allowing customers to deploy Mistral AI's technology where their data resides, which they see as important for the future of AI deployment.

  • How does Mistral AI decide on the size of the models they develop?

    -The decision on model size is based on scaling laws and depends on the compute resources available, the desired training and inference costs, and the balance between latency and reasoning capabilities. Mistral AI aims to have a family of models ranging from small ones to very large ones, as sketched below.
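
To make the scaling-law tradeoff concrete, here is a minimal back-of-the-envelope sketch. It assumes the widely cited Chinchilla approximations (training compute C ≈ 6 · N · D FLOPs for N parameters and D training tokens, and roughly 20 tokens per parameter at the compute-optimal point); the budgets and outputs are illustrative only, not Mistral AI's internal recipe.

```python
# Back-of-the-envelope, Chinchilla-style model sizing (illustrative only).
# Assumes C ~= 6 * N * D training FLOPs and a compute-optimal ratio of
# roughly 20 training tokens per parameter (Hoffmann et al., 2022).

def compute_optimal(flops_budget: float, tokens_per_param: float = 20.0):
    """Return (parameters, training_tokens) for a training FLOPs budget."""
    # C = 6 * N * D with D = r * N  =>  N = sqrt(C / (6 * r))
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

if __name__ == "__main__":
    for budget in (1e21, 1e23, 1e25):  # hypothetical training budgets
        n, d = compute_optimal(budget)
        print(f"C={budget:.0e}: ~{n / 1e9:.1f}B params, ~{d / 1e12:.2f}T tokens")
```

Weighting inference cost more heavily, as Arthur describes, pushes away from this optimum: the more the training cost is amortized over serving, the more it pays to train a smaller model on more tokens than the compute-optimal ratio suggests.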

  • What advice does Arthur have for founders in the AI space?

    -Arthur advises founders to always act as if it's day one, to be ambitious, and to dream big. He emphasizes the importance of continuous exploration and innovation while also leveraging existing achievements to stay relevant in the rapidly evolving AI field.

Outlines

00:00

🎤 Introduction and Background of Mistral AI

The speaker, Arthur, is introduced as the founder and CEO of Mistral AI, a company that has made significant strides in the AI field despite being only nine months old. The introduction highlights the company's success in releasing high-quality models approaching GPT-4 and its open-source approach. Arthur's background at DeepMind and his work on the Chinchilla paper are mentioned, setting the stage for his discussion of the philosophy behind starting Mistral AI and the opportunities it presents in the AI ecosystem.

05:01

🚀 The Genesis of Mistral AI and the Open Source Mission

Arthur shares the story behind the establishment of Mistral AI, emphasizing the importance of open exchanges in AI development. He discusses the shift in the field in 2022, when open contributions declined, and how this motivated him and his co-founders to create Mistral AI. The company's mission is to democratize AI by making it accessible to every developer, in contrast to the closed approach of competitors. Arthur also outlines the company's rapid development and benchmark achievements, attributing its success to a lean team of experienced people.

10:02

🤝 Balancing Open Source and Commercial Models

The discussion shifts to how Mistral AI balances its open-source offerings with its commercial strategy. Arthur explains the company's approach of maintaining leadership in open source while evolving its commercial models. He acknowledges the tension between community contribution and commercial adoption, highlighting the need for constant adaptation and strategic planning. Arthur also touches on the company's partnerships with Microsoft, Snowflake, and Databricks, and how these collaborations have contributed to Mistral AI's trajectory.

15:04

🌍 Geographic Advantages and Future Plans

Arthur discusses the benefits and challenges of building Mistral AI in Europe, particularly France. He highlights the strong talent pool, government support, and the advantages of being a European company, though he also notes regulatory challenges. Looking ahead, Arthur envisions a future where AI infrastructure is open, with Mistral AI becoming a platform for creating assistants and autonomous agents. He predicts that in five years, AI technology will be accessible enough for any user to create their own assistant or agent.

20:05

💡 Engaging with the Community and Future Directions

The conversation focuses on Mistral AI's engagement with the developer community and the importance of feedback for model improvement. Arthur invites the community to share their use cases and collaborate for mutual benefit. He also discusses the company's future plans, including the development of multilingual and multimodal models and the expansion of the platform to include customization features. Arthur emphasizes the company's commitment to remaining the best solution for developers and staying relevant in the open-source world.

25:06

🌟 Final Thoughts and Advice for AI Entrepreneurs

In the concluding segment, Arthur reflects on Mistral AI's rapid growth and the company's strategic approach to the AI ecosystem. He shares his perspective on the balance between exploration and exploitation, emphasizing the need for continuous innovation while maintaining a strong product and business focus. For aspiring AI entrepreneurs, Arthur advises maintaining an ambitious mindset and embracing the challenge of building from scratch every day, encapsulating the spirit of entrepreneurship.

Keywords

💡Open Source

Open Source refers to something that is freely available for everyone to view, use, modify, and distribute. In the context of the video, it is a core value proposition of the company Mistral AI, which aims to make AI technology accessible to developers by providing high-quality models through open source contributions. This approach is intended to accelerate the adoption of AI and democratize its usage.

💡AI

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think and learn like humans. In the video, AI is the central theme, with discussions around its progression, the development of foundational models, and the balance between open source and commercial models.

💡Foundation Models

Foundation models are large-scale machine learning models that are pre-trained on a wide variety of data and can be fine-tuned for specific tasks. They serve as a foundation or starting point for building AI applications. In the video, the company's goal is to create models that approach the quality of GPT-4, which is a type of foundation model.

💡Community

In the context of the video, 'community' refers to the group of developers, researchers, and users who engage with, contribute to, and use the open source AI models provided by Mistral AI. The community is essential for the feedback, improvement, and adoption of the technology.

💡Commercial Models

Commercial models refer to AI models that are developed for sale or licensing to customers, often including additional features, support, or customizations beyond what is available in the open source versions. These models are part of a company's revenue-generating strategy.

💡Benchmarking

Benchmarking is the process of evaluating and comparing the performance of a product, service, or model against a standard or other competitors. In the context of AI, it often involves testing how well a model performs on specific tasks or datasets.
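
As a concrete (and deliberately tiny) illustration of benchmarking, the sketch below scores a model's answers against fixed references with exact-match accuracy. The `generate` callable and the three question-answer pairs are hypothetical stand-ins, not a real evaluation suite.

```python
# Minimal benchmark harness sketch: exact-match accuracy over fixed
# (prompt, reference) pairs. Any model API could back `generate`.

from typing import Callable, List, Tuple

def exact_match_accuracy(
    generate: Callable[[str], str],
    dataset: List[Tuple[str, str]],
) -> float:
    """Fraction of prompts whose normalized output equals the reference."""
    hits = 0
    for prompt, reference in dataset:
        prediction = generate(prompt).strip().lower()
        hits += prediction == reference.strip().lower()
    return hits / len(dataset)

if __name__ == "__main__":
    toy_benchmark = [
        ("Capital of France?", "Paris"),
        ("2 + 2 = ?", "4"),
        ("Opposite of 'open'?", "closed"),
    ]
    # Trivial canned "model" so the sketch runs end to end.
    canned = dict(toy_benchmark)
    score = exact_match_accuracy(lambda p: canned[p], toy_benchmark)
    print(f"exact-match accuracy: {score:.2%}")
```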

💡Fine-tuning

Fine-tuning is a process in machine learning where a pre-trained model is further trained or adjusted on a specific dataset to improve its performance for a particular task. It is a technique used to adapt foundation models to specific applications or industries.
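
As an illustration, here is a minimal sketch of parameter-efficient (LoRA-style) fine-tuning of an open-weights model. It assumes the Hugging Face `transformers` and `peft` libraries and the publicly released `mistralai/Mistral-7B-v0.1` checkpoint; the rank, target modules, and omitted training loop are placeholder choices, not a recommended recipe.

```python
# LoRA fine-tuning sketch: train small low-rank adapters instead of all
# base weights. Loading a 7B model requires substantial RAM/GPU memory.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"  # released open-weights checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(
    r=8,                                  # adapter rank (placeholder)
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of weights
```

A standard supervised training loop (for example, `transformers.Trainer` over tokenized task examples) would then update only the small adapter matrices, which is what makes this practical for adapting a foundation model to a specific application.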

💡Multimodal Models

Multimodal models are AI models that can process and understand multiple types of data inputs, such as text, images, and audio. These models aim to mimic human-like understanding by integrating various forms of information.

💡Developer Relations

Developer relations refer to the practices and strategies a company uses to build and maintain a positive relationship with its developer community. This includes providing resources, support, and platforms that enable developers to create and innovate.

💡Autonomous Agents

Autonomous agents are systems or entities that can operate independently, making decisions and performing tasks without constant human intervention. In the context of AI, this refers to AI systems that can act on their own with some level of autonomy.
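
Below is a minimal sketch of the orchestration pattern Arthur describes in the talk, where a larger model plans steps and cheaper, lower-latency models (or tools) execute them. Both "models" here are hypothetical stand-in functions rather than real LLM calls.

```python
# Outer-loop agent sketch: a planner model decomposes a task and a
# low-latency worker model executes each step. Stand-in functions only.

def large_model(task: str) -> list[str]:
    """Stand-in planner; a real system would call a large LLM here."""
    return [f"search: {task}", f"summarize: {task}"]

def small_model(step: str) -> str:
    """Stand-in low-latency worker for simple, high-volume steps."""
    return f"done({step})"

def run_agent(task: str) -> list[str]:
    results = []
    for step in large_model(task):         # orchestrator plans the steps
        results.append(small_model(step))  # cheap model executes each one
    return results

if __name__ == "__main__":
    print(run_agent("compare two GPU vendors"))
```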

Highlights

Arthur Mensch, the founder and CEO of Mistral AI, shares insights on the company's mission and achievements.

Mistral AI, despite being only nine months old, has released high-quality AI models approaching GPT-4.

The company's founding story began with a desire to continue the tradition of open exchanges in AI research and development.

Arthur and his co-founder Timothée were prompted by the decline of open contributions in AI in 2022.

Mistral AI's vision is to bring AI capabilities to every developer through an open-source platform.

The company has balanced open-source contributions with commercial partnerships, such as with Microsoft and Snowflake.

Mistral AI's approach to model development includes both large and small models to cater to different needs and use cases.

The company is focused on fast development, reaching benchmark levels more efficiently than other foundation model companies.

Mistral AI's strategy involves a lean team of experienced people who are willing to do the 'dirty work' of machine learning.

Arthur discusses the economic opportunity in AI and the company's plans for multilingual and multimodal models.

Mistral AI aims to become a platform for AI infrastructure, enabling the creation of assistants and autonomous agents.

The company's location in Europe provides unique advantages in talent and regional opportunities.

Mistral AI's partnerships with data providers like Snowflake and Databricks allow its models to run natively in their clouds.

Arthur shares his thoughts on the future of open source versus commercial models and how Mistral AI plans to stay relevant.

The company's approach to model sizes is driven by scaling laws, training costs, and inference needs.

Mistral AI is working on an enterprise off-the-shelf solution to help businesses integrate AI more easily.

Arthur advises founders to always be ambitious and to view each day as a new opportunity to build from scratch.

Transcripts

[00:03] Host: I'm excited to introduce our first speaker, Arthur from Mistral. Arthur is the founder and CEO of Mistral AI. Despite being just nine months old as a company, and having many fewer resources than some of the large foundation model companies so far, I think they've really shocked everybody by putting out incredibly high quality models, approaching GPT-4 caliber, out into the open. So we're thrilled to have Arthur with us today, all the way from Paris, to share more about the opportunity behind building in open source. Interviewing Arthur will be my partner Matt Miller, who is dressed in his best French wear to honor Arthur today, and who helps lead our efforts in Europe. So please welcome Matt and Arthur.

[Applause]

[00:52] Matt Miller: With all the efficiency of a French train, right? Right on time. We were sweating a little bit back there, because you just walked in the door. But good to see you. Thanks for coming all this way, thanks for being with us here at AI Ascent today.

[01:06] Arthur Mensch: Thank you for hosting us.

[01:08] Matt Miller: Absolutely. I'd love to maybe start with the background story of why you chose to start Mistral, and maybe just take us to the beginning. We all know about your successful career at DeepMind, your work on the Chinchilla paper. But share with us, because we always love to hear it at Sequoia, and I know our founder community also loves to hear it: that spark that gave you the idea to launch, to break out and start your own company.

[01:36] Arthur Mensch: Yeah, sure. So we started the company in April, but I guess the idea was out there for a couple of months before. Timothée and I were in masters together, Guillaume and I were in school together, so we knew each other from before, and we had been in the field for like ten years, doing research. And we loved the way AI progressed because of the open exchanges that occurred between academic labs and industrial labs, and how everybody was able to build on top of one another. And it was still the case, I guess, even in the beginning of the LLM era, where OpenAI and DeepMind were actually contributing to one another's roadmaps, and this kind of stopped in 2022. Basically, one of the last papers making important changes to the way we train models was Chinchilla, and that was the last model that Google ever published, the last important model in the field that Google published. And so for us it was a bit of a shame that the field stopped doing open contributions that early in the AI journey, because we are very far away from finishing it. So when we saw ChatGPT at the end of the year, I think we reflected on the fact that there was some opportunity for doing things differently, for doing things from France, because in France, as it turned out, there were a lot of talented people that were a bit bored in big tech companies. And so that's how we figured out that there was an opportunity for building very strong open source models, going very fast with a lean team of experienced people, and trying to correct the direction that the field was taking. We wanted to push open source models much more, and I think we did a good job at that, because we've been followed by various companies in our trajectory.

[03:37] Matt Miller: Wonderful. And so the open source movement was really a lot of the drive behind starting the company?

[03:45] Arthur Mensch: Yeah, that was one of the drivers. Our intention, and the mission that we gave ourselves, is really to bring AI into the hands of every developer. The way it was done, and the way it is still done by our competitors, is very closed, and so we want to push a much more open platform, and we want to spread and accelerate adoption through that strategy. So that's very much at the core of why we started the company, indeed.

[04:14] Matt Miller: Wonderful. And fast forward to today: you released Mistral Large, and you've been on this tear of amazing partnerships, with the Microsoft, Snowflake, and Databricks announcements. So how do you balance what you're going to do open source with what you're going to do commercially, and how do you think about the tradeoff? Because that's something that many open source companies contend with: how do they keep their community thriving, but then how do they also build a successful business to contribute to their community?

[04:42] Arthur Mensch: It's a hard question, and the way we've addressed it is currently through two families of models, but this might evolve with time. We intend to stay the leader in open source, so that kind of puts a pressure on the open source family, because there are obviously some contenders out there. Compared to how various software providers playing this strategy developed, we need to go faster, because AI develops faster than software, faster than databases. MongoDB played a very good game at that, and this is a good example of what we could do, but we need to adapt faster. So yeah, there's obviously this tension, and we're constantly thinking about how we should contribute to the community, but also how we should start getting some commercial adoption, enterprise deals, etc. For now I think we've done a good job at it, but it's a very dynamic thing to think through, so basically every week we think about what we should release next in both families.

[05:44] Matt Miller: And you have been the fastest in developing models, the fastest in reaching different benchmark levels, and one of the leanest in expenditure to reach those benchmarks out of any of the foundation model companies. What do you think is giving you that advantage, to move quicker and more efficiently than your predecessors?

[06:04] Arthur Mensch: Well, I think we like to get our hands dirty. Machine learning has always been about crunching numbers, looking at your data, doing a lot of extract-transform-load and things that are oftentimes not fascinating. And so we hired people that were willing to do the dirty stuff, and I think that has been critical to our speed, and that's something that we want to keep.

[06:33] Matt Miller: Awesome. And in addition to the large model, you also have several small models that are extremely popular. When would you tell people that they should spend their time working with you on the small models, and when on the large models? And where do you think the economic opportunity for Mistral lies: in doing more of the big, or more of the small?

[06:52] Arthur Mensch: I think this is an observation that every LLM provider has made: one size does not fit all. When you make an application, you typically have different large language model calls, and some should be low latency because they don't require a lot of intelligence, but some can be higher latency and require more intelligence. An efficient application should leverage both of them, potentially using the large models as an orchestrator for the small ones. And I think the challenge here is: how do you make sure that everything works? You end up with a system that is not only a model; it's really two models plus an outer loop of calling your model, calling systems, calling functions. So some of the developer challenges that we also want to address are: how do you make sure that this works, and that you can evaluate it properly? How do you make sure that you can do continuous integration? How do you move from one version of a model to another and make sure that your application has actually improved and not deteriorated? All of these things are addressed by various companies, but these are also things that we think should be core to our value proposition.

[08:04] Matt Miller: And what are some of the most exciting things you see being built on Mistral? What are the things that you get really excited about, that you see the community or customers doing?

[08:13] Arthur Mensch: I think pretty much every young startup in the Bay Area has been using it for fine-tuning purposes, for fast application making. Really, one part of the value of Mixtral, for instance, is that it's very fast, so you can make applications that are more involved. We've seen web search companies using us, and we've seen all of the standard enterprise stuff as well, like knowledge management and marketing. The fact that you have access to the weights means that you can pour in your editorial tone much more. So we see the typical use cases, but the value of the open source part is that developers have control: they can deploy everywhere, they can have very high quality of service because they can use dedicated instances, and they can modify the weights to suit their needs and bump the performance to a level which is close to the largest models, while being much cheaper.

[09:14] Matt Miller: And what's the next big thing we're going to see from you? Can you give us a sneak peek of what might be coming soon, of what we should be expecting from Mistral?

[09:22] Arthur Mensch: Yeah, for sure. So Mistral Large was good but not good enough, so we are working on improving it quite heavily. We have interesting open source models in various vertical domains that we will be announcing very soon. The platform is currently just serverless APIs, so we are working on making customization part of it, the fine-tuning part. And obviously, like many other companies, we're heavily betting on multilingual data and multilingual models, because as a European company we're well positioned, and this is a demand from our customers that I think is higher than here. And then eventually, in the months to come, we will also release some multimodal models.

[10:15] Matt Miller: Exciting; we look forward to that. As you mentioned, many of the people in this room are using Mistral models; many of the companies we work with every day here in the Silicon Valley ecosystem are already working with Mistral. How should they work with you and with the company? What's the best way for them to work with you?

[10:32] Arthur Mensch: Well, they can reach out. We have developer relations people who are really pushing the community forward, making guides, and also gathering use cases to showcase what you can build with Mistral models, so we're investing a lot in the community. Something that basically makes the models better, and that we are trying to set up, is ways for us to get evaluations, benchmarks, actual use cases on which we can evaluate our models. Having a mapping of what people are building with our models is also a way for us to make a better generation of new open source models. So please engage with us to discuss your use cases and how we can help; we can advertise them, and we can also gather insight into new evaluations that we should add to our evaluation suite, to verify that our models are getting better over time. And on the commercial side, our models are available on our platform; the commercial models are actually working better than the open source ones. They're also available on various cloud providers, which facilitates adoption for enterprises. And customization capabilities like fine-tuning, which really made the value of the open source models, are coming very soon.

[11:51] Matt Miller: Wonderful. And you touched briefly on the benefits of being in Europe. You're already this global example of the great innovations that can come from Europe, and are coming from Europe. Talk a little bit more about the advantages of building a business in France, and building this company from Europe.

[12:09] Arthur Mensch: The advantages and drawbacks, I guess. One advantage is that you have a very strong junior pool of talent. There are a lot of people coming out of masters programs in France, in Poland, in the UK, that we can train in like three months and get up to speed, basically producing as much as a million-dollar engineer in the Bay Area for ten times less cost. So that's kind of efficient.

[12:36] Matt Miller: Shh, don't tell them all that; they're going to hire people in France.

[12:40] Arthur Mensch: Sure. So the workforce is very good engineers and machine learning engineers. Generally speaking, we have a lot of support from the state, which is actually more important in Europe than in the US. They tend to over-regulate a bit too fast; we've been telling them not to, but they don't always listen. And then, generally, European companies like to work with us because we are European, and we are better in European languages, as it turns out. In French, Mistral Large is actually probably the strongest French-language model out there. I guess that's not an advantage per se, but at least there are a lot of opportunities that are geographical, and we're leveraging them.

[13:25] Matt Miller: Wonderful. And paint the picture for us five years from now. I know this world is moving so fast; just think of all the things you've gone through, and it's not even two years old as a company, almost two years old as a company. But five years from now, where does Mistral sit? What do you think you will have achieved, and what does this landscape look like?

[13:44] Arthur Mensch: So our bet is that, basically, the platform and the infrastructure of artificial intelligence will be open, and based on that we'll be able to create assistants and then, potentially, autonomous agents. And we believe that we can become this platform, by being the most open platform out there, by being independent from cloud providers, etc. Five years from now, I have literally no idea what this is going to look like: if you looked at the field in 2019, I don't think you could have bet on where we would be today. But we are evolving toward more and more autonomous agents, we can do more and more tasks, and I think the way we work is going to change profoundly. Making such agents and assistants is going to get easier and easier. Right now we're focusing on the developer world, but I expect that AI technology is in itself so easily controllable through human language that, potentially, at some point the developer becomes the user. So we're evolving toward any user being able to create their own assistant or their own autonomous agent. I'm pretty sure that five years from now, this will be something that you learn to do at school.

[15:02] Matt Miller: Awesome. Well, we have about five minutes left; I just want to open it up in case there are any questions from the audience. Don't be shy. Sonya's got a question.

[15:11] Audience member: How do you see the future of open source versus commercial models playing out for your company? You made a huge splash with open source at first, and, as you mentioned, some of the commercial models are even better now. How do you imagine that plays out over the next couple of years?

[15:24] Arthur Mensch: Well, I guess the one thing we optimize for is to be able to continuously produce open models with a sustainable business model that actually fuels the development of the next generation. As I've said, this is going to evolve with time, but in order to stay relevant we need to stay the best at producing open source models, at least on some part of the spectrum: that can be the small models, that can be the very big models. And so that very much sets the constraints on whatever we can do. Staying relevant in the open source world, staying the best solution for developers, is really our mission, and we'll keep doing it.

[16:06] Matt Miller: David, there have got to be questions from more than just the Sequoia partners, guys. Come on.

[16:10] Audience member: Can you talk to us a little bit about Llama 3 and Facebook, and how you think about competition with them?

[16:17] Arthur Mensch: Well, Meta is working on, I guess, making models. I'm not sure they will be open source; I have no idea what's going on there. So far I think we've been delivering faster and smaller models, and we expect to continue doing that. But generally, the good thing about open source is that it's never too much of a competition, because if you have several actors, that should actually benefit everybody. So if they turn out to be very strong, there will be some cross-pollination, and we'll welcome it.

[16:50] Audience member: One thing that's made you different from other proprietary model providers is the partnerships with Snowflake and Databricks, for example, and running natively in their clouds as opposed to just having API connectivity. I'm curious if you can talk about why you did those deals, and also what you see as the future of, say, Databricks or Snowflake in the brave new LLM world.

[17:10] Arthur Mensch: I guess you should ask them, but generally speaking, AI models become very strong if they are connected to data and grounding information. As it turns out, enterprise data is oftentimes either on Snowflake or on Databricks, or sometimes on AWS, and so being able, for customers, to deploy the technology exactly where their data is, is, I think, quite important. I expect that this will continue to be the case, especially as I believe we'll move on to more stateful AI deployments. Today we deploy serverless APIs without much state; it's really like Lambda functions. But as we go forward, as we make models more and more specialized, more tuned to use cases, and self-improving, you will have to manage state, and that state could actually be part of the data cloud. So there's an open question of where you put the AI state, and my understanding is that Snowflake and Databricks would like it to be on their data clouds.

[18:19] Matt Miller: And I think there's a question right behind him there.

[18:23] Audience member: I'm curious where you draw the line between openness and proprietary. You're releasing the weights; would you also be comfortable sharing more about how you train the models, the recipe for how you collect the data, how you do mixture-of-experts training? Or do you draw the line at "we release the weights and the rest is proprietary"?

[18:41] Arthur Mensch: So that's where we draw the line, and I think the reason for that is that it's a very competitive landscape. It's similar to the tension between having some form of revenue to sustain the next generation; there's also a tension between what you actually disclose and staying ahead of the curve, not giving your recipe to your competitors. So again, this is a moving line, and there's also some game theory at stake: if everybody starts doing it, then we could do it. But for now, we are not taking this risk, indeed.

[19:22] Audience member: I'm curious: when another company releases weights for a model, like Grok, for example, and you only see the weights, what kinds of practices do you follow internally to see what you can learn from it?

[19:35] Arthur Mensch: You can't learn a lot of things from weights. We don't even look at them; it's actually too big for us to deploy. Grok is quite big.

[19:44] Audience member: Was there any architecture learning?

[19:47] Arthur Mensch: I guess they are using a mixture of experts, a pretty standard setting, with a couple of tricks that I knew about, actually. There are not a lot of things to learn about the recipes themselves. By looking at the weights you can try to infer things, but reverse engineering is not that easy: the model is basically compressing information, and it compresses information sufficiently highly that you can't really find out what's going on.

[20:23] Matt Miller: It's coming, the cube is coming. Okay.

[20:26] Audience member: I'm just curious what you're going to focus on in terms of model sizes. What are your opinions on that? Are you going to keep going small, or go to the larger ones?

[20:37] Arthur Mensch: So model sizes are kind of set by scaling laws. It depends on the compute you have: based on the compute you have, and based on the inference infrastructure you want to target, you make some choices, and you optimize for training cost and for inference cost. And then there's a weighting between the two, depending on the weight that you put on training cost amortization: the more you amortize it, the more you can compress models. But basically our goal is to be low latency and to be relevant on the reasoning front, so that means having a family of models that goes from the small ones to the very large ones.

[21:26] Audience member: Hi. Are there any plans for Mistral to expand into the application stack? For example, OpenAI released custom GPTs and the Assistants API. Is that a direction that you think Mistral will take in the future?

[21:39] Arthur Mensch: Yeah, so as I've said, we're really focusing on the developer first, but the frontier is pretty thin between developers and users for this technology. That's the reason why we released an assistant demonstrator called le Chat, which is "the cat" in English. The point is to expose it to enterprises as well, and to make them able to connect their data, connect their context. I think that answers a need from our customers: many of the people we've been talking to are willing to adopt the technology, but they need an entry point, and if you just give them APIs, they're going to say, "Okay, but I need an integrator." And if you don't have an integrator at hand, and oftentimes this is the case, it's good to have an off-the-shelf solution; at least you get them into the technology and show them what they could build for their core business. So that's the reason why we now have two product offerings: the first one is the platform, and then we have le Chat, which should evolve into an enterprise off-the-shelf solution.

[22:44] Matt Miller: One more, over there.

[22:47] Audience member: I'm just wondering where you would draw the line between stopping prompt engineering and starting fine-tuning, because a lot of my friends, and our customers, struggle with when they should stop doing more prompt engineering.

[23:00] Arthur Mensch: I think that's the number one pain point, and it is hard to solve from a product standpoint. Normally your workflow should be: decide what you should evaluate on, and based on that, have your model kind of find a way of solving your task. Right now this is still a bit manual; you go and try several versions of prompting. But this is something that AI can actually help solve, and I expect that it is going to become more and more automatic over time. And this is something that, yeah, we would love to try and enable.

[23:37] Audience member: I wanted to ask a bit more of a personal question. As a founder at the cutting edge of AI, how do you balance your time between explore and exploit? How do you yourself stay on top of a field that's rapidly evolving and becoming larger and deeper every day?

[23:53] Arthur Mensch: So we explore on the science part, on the product part, and on the business part, and the way you balance it is effectively hard for a startup. You do have to exploit a lot, because you need to ship fast. But on the science part, for instance, we have two or three people working on the next generation of models, and sometimes they lose time; but if you don't do that, you're at risk of becoming irrelevant. And this is very true for the product side as well: right now we have a fairly simple product, but being able to try out new features and see how they pick up is something that we need to do. And on the business part, you never know who is actually mature enough to use your technology. So the balance between exploitation and exploration is something that we master well at the science level, because we've been doing it for years, and somehow it transcribes into the product and the business; but I guess we're currently still learning to do it properly.

[24:52] Matt Miller: So, one more question from me, and then I think we'll be done; we're out of time. In the scope of two years: models big and small that have taken the world by storm, killer go-to-market partnerships, just tremendous momentum at the center of the AI ecosystem. What advice would you give to the founders here? What you have achieved, and the pace at which you have achieved it, is truly extraordinary. What advice would you give to the people here, who are at different stages of starting, running, and building their own businesses in and around the AI opportunity?

[25:24] Arthur Mensch: I would say it's always day one. We got some mindshare, but there are still many proof points that we need to establish. Being a founder is basically waking up every day and figuring out that you need to build everything from scratch, every time, all the time. It's, I guess, a bit exhausting, but it's also exhilarating. So I would recommend being quite ambitious; ambition can get you very far, and so, yeah, you should dream big. That would be my advice.

[26:06] Matt Miller: Awesome. Thank you, Arthur. Thanks for being with us today.

[Applause]
