Don’t Build AI Products The Way Everyone Else Is Doing It
Summary
TLDR: This video highlights the challenges of building AI products as thin wrappers over pre-trained models and proposes a more differentiated approach. It outlines the cost, latency, and customization problems of large language models (LLMs). The speaker recommends combining traditional programming techniques with specialized AI models to build products that are faster, more reliable, cheaper, and harder to copy. He shares the experience of developing a tool that automatically turns visual designs into code, showing how AI can be applied narrowly and effectively to dramatically improve the user experience.
Takeaways
- 🚀 Don't just wrap existing AI models; build differentiated technology.
- 💰 Large language models (LLMs) are expensive to run because of their size and complexity.
- 🛠️ Build your own toolchain by combining a fine-tuned LLM, other technologies, and a model trained specifically for your use case.
- 📈 Costs can become prohibitive for users when a large language model is not actually needed for the task.
- 🚫 LLMs are slow and hard to customize to an application's specific needs.
- 🔍 First explore the problem with normal programming practices to determine which areas need a specialized AI model.
- 🧠 Advanced AI products are usually built from several specialized models plus normal code.
- 🤖 Use AI as little as possible, only for the specific problems that can only be handled by well-established AI models.
- 🌐 Use online resources to generate training data, and tools like Google's Vertex AI to train your own models.
- 🔒 Keep full control of your AI models so you can improve them continuously and meet privacy requirements.
- 📈 Successful AI products combine normal code with specialized AI models to create a unique, differentiated value proposition.
Q & A
What is the main problem with AI products built as wrappers over pre-trained models?
-The main problem is that these products are not technologically differentiated: they use a simple technique with a pre-trained model that anyone can copy quickly, leaving them vulnerable to easy competition.
How can the costs of large language models (LLMs) become a problem for businesses?
-LLMs are very large and complex, which makes them expensive to run. For example, Copilot loses money per user, averaging $20 in API costs against a $10 monthly subscription.
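The arithmetic behind that example is worth making explicit. The figures ($10 subscription, $20 average API cost, up to $80 for heavy users) come from the talk; the helper below is purely illustrative:

```javascript
// Illustrative unit-economics check using the figures cited in the talk:
// Copilot reportedly charged $10/month while averaging $20/user in API
// costs, with some heavy users costing up to $80.
function monthlyMargin(subscriptionPrice, apiCostPerUser) {
  return subscriptionPrice - apiCostPerUser;
}

const averageMargin = monthlyMargin(10, 20); // -10: loses $10 on the average user
const worstMargin = monthlyMargin(10, 80);   // -70: loses $70 on a heavy user

console.log({ averageMargin, worstMargin });
```

When the per-user inference cost of a giant general-purpose model exceeds what users will pay, no amount of scale fixes the business.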
Why are LLMs slow for certain applications?
-LLMs are slow because, for some applications, you must wait for the complete response before the next step can proceed; passing an entire design spec in and receiving a full representation back token by token can take minutes.
How are LLMs limited in terms of customization?
-Although LLMs support fine-tuning, it only incrementally nudges the model toward specific needs. In some cases, such as converting Figma designs to code, fine-tuning did not make the model noticeably smarter.
What is a 'toolchain' and why does it matter for advanced AI products?
-A 'toolchain' is a series of specialized tools and models combined with normal code to solve specific problems. It matters because it lets you build products that are faster, more reliable, cheaper, and more differentiated, without worrying about clones or open-source copies appearing overnight.
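The shape of such a toolchain can be sketched in a few lines: ordinary code orchestrating narrow, purpose-built steps. This is a hypothetical illustration, not Builder.io's actual pipeline; the function names `detectImages`, `inferLayout`, and `emitCode` are stand-ins:

```javascript
// Hypothetical toolchain: plain code gluing together specialized steps.
// Each step here is a stub standing in for either a purpose-built model
// (e.g. object detection) or a deterministic algorithm.

// Step 1: a specialized vision model would find regions to flatten into images.
function detectImages(designNodes) {
  return designNodes.filter((n) => n.type === 'VECTOR');
}

// Step 2: deterministic layout heuristics (no AI needed).
function inferLayout(designNodes) {
  return designNodes.length > 1 ? 'flex' : 'block';
}

// Step 3: ordinary code generation from the analyzed structure.
function emitCode(layout, imageCount) {
  return `<div style="display:${layout}"><!-- ${imageCount} image(s) --></div>`;
}

// Normal code connects the pieces; AI fills only the gaps it must.
function designToCode(designNodes) {
  const images = detectImages(designNodes);
  const layout = inferLayout(designNodes);
  return emitCode(layout, images.length);
}

console.log(designToCode([{ type: 'VECTOR' }, { type: 'TEXT' }]));
// <div style="display:flex"><!-- 1 image(s) --></div>
```

The value is in the orchestration: each stub can later be swapped for a trained model without the rest of the pipeline changing.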
How can developers train their own AI model without being a data scientist or a PhD in machine learning?
-With products like Google's Vertex AI, any moderately experienced developer can train a model by choosing a commonly available model type and preparing the required data, which makes training accessible to a much wider audience.
Why is it important not to use AI from the start when building a solution?
-It is important to first explore the problem space with normal programming practices to determine which areas need a specialized model. Trying to solve complex problems with one 'super model' can produce something incredibly complex, slow, and expensive that is not viable.
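The transcript gives a concrete instance of this kind of normal-code exploration: hand-coded logic that maps a vertical stack of design nodes to a flex column and side-by-side groups to a flex row. A minimal sketch of that heuristic follows; the bounding-box shape `{x, y, width, height}` is a simplifying assumption, not Figma's actual node format:

```javascript
// Heuristic from the talk: vertically stacked nodes -> flex column,
// side-by-side nodes -> flex row. Boxes are {x, y, width, height};
// this simplified shape is assumed for illustration.
function inferFlexDirection(boxes) {
  if (boxes.length < 2) return null;
  const [a, b] = boxes;
  const verticalGap = b.y - (a.y + a.height);
  const horizontalGap = b.x - (a.x + a.width);
  // If the second box starts below the first, treat it as a vertical stack.
  if (verticalGap >= 0) return 'column';
  // If it starts to the right instead, treat the group as side by side.
  if (horizontalGap >= 0) return 'row';
  return null; // overlapping boxes: no confident answer
}

// Two boxes stacked vertically:
inferFlexDirection([
  { x: 0, y: 0, width: 100, height: 40 },
  { x: 0, y: 50, width: 100, height: 40 },
]); // -> 'column'
```

Rules like this cover a surprising share of real designs before any model is needed, and the cases where they fail are exactly the gaps worth filling with a specialized model.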
How can developers generate data to train their object detection models?
-They can use tools like Puppeteer to automatically open websites, take screenshots, and traverse the HTML to find image tags. Using the image coordinates as the output data and the screenshots as the input data gives them exactly what they need to train the model.
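The Puppeteer side of this (launching a browser, `page.screenshot()`, and a `page.evaluate()` that collects `getBoundingClientRect()` for each `img` tag) needs a live browser, so the sketch below shows only the pure transformation step: turning collected bounding boxes into training annotations. The JSON shape is illustrative, not Vertex AI's exact import schema:

```javascript
// Sketch of turning image bounding boxes (as Puppeteer's page.evaluate
// could collect via document.querySelectorAll('img') and
// getBoundingClientRect()) into object-detection training annotations.
// The output shape below is illustrative, not Vertex AI's exact schema.
function toAnnotations(screenshotPath, imageRects, pageWidth, pageHeight) {
  return {
    image: screenshotPath,
    boxes: imageRects.map((r) => ({
      label: 'image',
      // Normalize coordinates to [0, 1] so they are resolution-independent.
      xMin: r.x / pageWidth,
      yMin: r.y / pageHeight,
      xMax: (r.x + r.width) / pageWidth,
      yMax: (r.y + r.height) / pageHeight,
    })),
  };
}

const annotation = toAnnotations(
  'screenshots/example.png',
  [{ x: 100, y: 200, width: 300, height: 150 }],
  1000,
  1000
);
// annotation.boxes[0] -> { label: 'image', xMin: 0.1, yMin: 0.2, xMax: 0.4, yMax: 0.35 }
```

Run across thousands of pages, this kind of scraper yields input screenshots paired with output coordinates: exactly the supervised training pairs an object detection model needs.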
What are the advantages of the 'visual copilot' solution presented in the video?
-The 'visual copilot' solution converts designs into high-quality code quickly and efficiently, supports a wide variety of frameworks and options, and allows full customization of the generated code. It is also low cost and very fast, because the models are purpose-built for this use case.
How can companies benefit from owning their own AI models?
-Owning their own models lets companies improve them constantly, add features, and adapt them to their specific needs. They can also guarantee a high level of privacy and, where required, plug in fully in-house models or enterprise AI instances to meet their privacy requirements.
What is the main advice given for building AI products?
-The main advice is to avoid AI for as long as possible and focus on the specific problems that only well-established AI models can solve. Developers should fill the gaps with specialized models only where they are needed and keep using normal code for the rest of the solution.
Outlines
🚀 Building unique, high-performance AI products
This section covers the challenges of, and solutions for, building AI products that stand out from the competition. It highlights the major problems with relying on pre-trained models like GPT: lack of differentiation, high costs, and slowness. The speaker proposes an alternative: building a custom toolchain that combines other technologies with fine-tuned machine learning models to create products that are faster, more reliable, cheaper, and more differentiated.
🤖 Common mistakes in building AI products
This section discusses frequent mistakes made when building AI products, such as reducing AI to a single giant model and conflating large language models with complete toolchains. The speaker explains that self-driving cars, for example, are built from several specialized models plus normal code, showing that complexity is built up in layers.
🛠️ Recommended approach for AI solutions
Here the speaker recommends building advanced AI solutions from a combination of normal code and specialized AI models. He stresses the importance of first understanding the problem without AI, then creating data and training your own models to solve specific problems. He also notes that owning your models lets you improve them continuously and adapt to companies' privacy needs.
Keywords
💡Technological differentiation
💡Cost
💡Large language models (LLMs)
💡Customization
💡Custom model development
💡Advanced AI solutions
💡Toolchain
💡Performance
💡Reliability
💡Data security
💡Product design
💡Code
Highlights
Building unique and valuable AI products requires a differentiated approach from the norm.
Many AI products are simply wrappers over existing models, which lacks differentiation.
Using pre-trained models like GPT is easy but risky due to their replicability.
Large language models (LLMs) are costly to run due to their size and complexity.
LLMs may not be necessary for specific use cases, as they cover a vast range of topics.
Cost economics of using LLMs might not align with what users are willing to pay.
LLMs are slow, which can be a problem for applications requiring quick responses.
Customization of LLMs through fine-tuning has its limitations.
Building a custom toolchain with a combination of technologies can lead to faster, cheaper, and more reliable products.
Advanced AI products are often built with a combination of specialized models and normal code.
Self-driving cars are an example of a complex toolchain, not a single AI brain.
Starting with normal programming practices is crucial before integrating specialized AI models.
Breaking down complex problems to solve them without AI is a recommended approach.
Creating your own AI models allows for constant improvements and control over the technology.
Privacy-focused companies can benefit from models that are not reliant on external AI services.
Using AI models minimally and focusing on robust, normal code leads to more efficient products.
The magic of AI products comes from using AI in small, critical areas where normal coding falls short.
For more information, refer to the latest blog post on the builder.blog.
Transcripts
if you want to build AI products that
are unique valuable and fast don't do
what everybody else is doing I'll show
you what to do instead the vast majority
of AI products being built right now are
just wrappers over other models for
instance basically just calling chat GPT
over an API and while that's incredibly
easy you send natural language in and
get natural language out and it can do
some really cool things there are some
major problems with this approach that
people are running into and there's a
solution for them that I'll show you the
first major issue is this is not
differentiated technology if you've
noticed that one person creates a chat
with a PDF app and then another dozen
people do too and then OpenAI builds
that into chat GPT directly That's
because nobody there actually built
something differentiated they use a
simple technique with a pre-trained
model which anyone can copy in a very
short period of time when building a
product whose unique value proposition
is some type of advanced AI technology
it's a very risky position to be so easy
to copy now of course there's a whole
spectrum here if you're on the
right side of the spectrum where all you
made was a button that sends something
to chat GPT and gets a response back
that you showed to your end users where
chat GPT basically did all the work
you're at the highest risk here on the
other end if you actually built some
substantial technology and LLMs like open
AI's only assisted with a small but
crucial piece then you may be in a
better position but you're still going
to run into two other major issues the
first major issue you'll run into is
cost the best part of a large language
model is their broad versatility but
they achieve this by being incredibly
large and complex which makes them
incredibly costly to run as an example
co-pilot is losing money per user
charging $10 but on average costing $20
just on API calls and some users cost
GitHub up to $80 and the worst part is
you probably don't need such a large
model your use case probably doesn't
need a model trained on the entirety of
the internet of which 99.9% will be
covering topics that have nothing to do
with your use case so while the ease of
this approach might be tempting you
could run into this common issue where
what your users want to pay is less than
what it costs to run your service on top
of large language models but even if
you're the rare case where the cost
economics might work out okay for you
you're still going to hit one more major
issue llms are painfully slow now this
isn't a huge problem for all
applications for instance for use cases
like chat GPT where you can read one
word at a time anyway this isn't the
worst thing but for applications that
are not about streaming text where
nobody is going to be reading it word
for word but instead waiting on the
entire response before the next step in
the flow can be taken this can be a big
problem for instance when we started
building our visual co-pilot product
where we wanted one button click to turn
any design into high-quality code one of
the approaches we explored was using an
llm for the conversion but one of the
key issues was it took forever because
if you need to pass an entire design
spec into an llm and get an entire new
representation out token by token it was
taking literally minutes to give us a
reply which was just not viable and
because the representation returned by
the llm is not what a human would see
the loading state was just a spinner and
it was horrific but if for some reason
performance is still not even an issue
to you and for some reason your users do
not care about having a slow and
expensive product that's easy for your
competitors to copy you'll still likely
hit at some point one other major issue
which is llms cannot be customized that
much yes they all support fine-tuning
and fine-tuning can incrementally help
the model get closer to what you need
but in our case we tried using fine
tuning to provide figma designs and get
code out the other side but no matter
how many examples we gave the model it
did not seem to get hardly any smarter
at all what we were left with was
something slow expensive and Incredibly
poor quality and that's where we
realized we had to take a different
approach what did we find we had to do
instead we had to create our own tool
chain in this case we combined a
fine-tuned llm a whole lot of other
technology and a custom trained model
and this is not necessarily as hard as
you might think these days you don't
have to be a data scientist or a PhD in
machine learning to train your own model
any moderately experienced developer Now
can do it what this can allow you to
build is something that is way faster
way more reliable far cheaper and far
more differentiated so you won't have to
worry about copycat products or open
source clones spawning overnight either
and this isn't just a theory most if not
all advanced AI products are built in a
way like this a lot of people have a
major misconception about how AI
products are built I've seen that they
often think that all the core Tech is
handled by one super smart model where
they trained it with tons of inputs to
give exactly the right output for
instance for self-driving cars I've seen
a lot of people have the impression that
there's this giant model that takes in
all these different inputs like cameras
sensors GPS Etc it crunches it through
the smart Ai and then out comes the
action on the other side such as turn
right but this could not be farther from
the truth that car driving itself is not
one big AI brain but instead a whole
tool chain of several specialized models
all connected with normal code such as
models for computer vision to find and
identify objects and predictive
decision-making to anticipate the
actions of others or natural language
processing for understanding voice
commands all of these specialized models
combined with tons of just normal code
and logic creates the end result that
you see now keep in mind autonomous
vehicles is a highly complex example
that includes many more models than I'm
even showing here but for building your
own product you won't need something
nearly this complex especially to start
remember self-driving cars didn't
spawn overnight my 2018 Prius is capable
of parking itself stopping automatically
when too close to an object and many
other things using little to no AI over
time more and more layers were added to
do more and more advanced things like
correcting lane departure or eventually
making entire decisions to drive from
one place to another but like all
software these things are built in
layers one on top of the next the way we
build visual co-pilot is a way I would
highly recommend you explore for your
own AI Solutions it's a very simple but
counterintuitive approach the most
important thing is don't use AI to start
you need to explore the problem space
using normal programming practices first
to even determine what areas need a
specialized model because remember
making super models is generally not the
right approach we don't want to just
send tons of figma data into a model and
get finished code out the other side
that would be an insanely complex
problem to solve with just one model
and when you factor in all the
different Frameworks we support and
styling options and customizations this
would just get insane to retrain this
model with all this different data and
it would likely become so complex slow
and expensive that our product probably
would have never shipped in the first
place instead what we did is we looked
at the problem and said well how can we
solve this without Ai and how far can we
get before it just gets impossible
without the types of specialized
decision-making AI is best at so we broke
the problem down and said okay we need
to convert each of these nodes to things
we can represent in code like HTML nodes
for the web we need to understand what
is an image what is a background what is
a foreground and most importantly how to
make this responsive because this only
works if what we import becomes fully
responsive for all screen sizes
automatically then we started looking at
more complex examples and realize there
are many cases where many many layers
need to be turned into one image we
started writing hand-coded logic to say
if a set of items is in a vertical stack
that should probably be a flex column
and if groups are side by side they
should probably be a flex row and we got
as far as we could creating all these
different types of sophisticated
algorithms to automatically transform
designs to responsive code before we
started hitting limits and in my
experience wherever you think the limit
is it's probably actually a lot further
at a certain point you'll find some
things are just near impossible to do
with normal code for example
automatically detecting which of these
layers should turn into one image is
something that our eyes are really good
at understanding but not necessarily
normal imperative code in our case we
wrote all this in JavaScript now lucky
for us training your own object
detection model is not that hard for
example products like Google's Vertex AI
have a range of common types of models
that you can easily train yourself one
of which is object detection I can
choose that with a GUI and then prepare
data and just upload it as a file for a
well-established type of model like this
all it comes down to is creating the
data now where things get interesting is
finding creative ways of generating the
data you need one awesome massive free
resource for generating data is simply
the internet and so one way we explored
approaching this is using Puppeteer to
automate opening websites in a web
browser we can then take a screenshot of
the site and we can Traverse the HTML to
find the image tags we can then use the
location of the images as the output
data and the screenshot of the web page
as the input data and now we have
exactly what we need a source image and
coordinates of where all the sub images
are to train this AI model so while in
Figma this which should be one image is
many layers our object detection model
can take the pixels identify that this
rectangle should be one image we can
compress it into one and use it as part
of our code gen using these techniques
where we fill in the unknowns with
specialized AI models and piecing multiple
together is how we're able to produce
end results like this where I can just
select this hit generate code launch
into Builder and get a completely
responsive website out the other side
with high-quality code that you can
customize yourself completely supporting
a wide variety of Frameworks and options
and it's all incredibly fast because all
of our models are specially built just
for this purpose incredibly low cost to
provide we provide a generous free tier
and ultimately really valuable for our
customers to save them lots of time and
the best part is this is only the
beginning because one of the best parts
of this approach as opposed to just
wrapping somebody else's model is we
completely own the models so we can
constantly improve them if you're fully
dependent only on someone else's model
like open AI there's no guarantee it's
going to get smarter faster or cheaper
for your use case and your ability to
control that with prompt engineering and
fine-tuning is severely limited but when
we own our own model we're making
drastic improvements every day when new
designs come in that don't import well
which still happens as we're in beta we
look at user feedback we find areas to
improve and we improve at a rapid
Cadence shipping improvements every
single day and we never have to worry
about a lack of control for instance we
started talking to some very large and
very privacy focused companies to be
early beta customers and one of the
first pieces of feedback was they're not
able to use OpenAI or any products
using open AI because of their privacy
requirements and the need to make sure
their data never goes into systems that
they don't allow in our case because we
control the entire technology we can
hold our models to an extremely high
privacy bar and for the llm step it can
either be disabled because it's purely a
nice to have or we're allowing companies
to plug in their own llm which might be
a completely in-house built model a fork
of llama 2 their own Enterprise instance
of open AI or something else entirely so
if you want to build AI products I would
highly recommend taking a similar
approach as strange as it sounds don't
use AI as long as possible when you
start finding extremely specific
problems that normal coding doesn't
solve well but well-established AI models
can start generating your own data and
training your own models using a wide
variety of tools that you can find off
the shelf connect your model or multiple
models to your code at only the small
points that they're needed and I want to
emphasize this use AI for as little as
possible because at the end of the day
normal plain code is some of the fastest
most reliable most deterministic most
easy to debug easy to fix easy to manage
and easy to test code you will ever have
but the magic will come from the small
small but critical areas you use AI
models for if you'd like to learn more
about this topic you can see more on my
latest blog post on the builder.blog
thanks for watching and I can't wait to
see what you build