Build a RAG app in minutes using Langflow OpenAI and Azure | StudioFP101
Summary
TLDR: Carter Rabasa, head of developer relations at DataStax, spoke about how developers can build RAG (Retrieval-Augmented Generation) applications using Langflow, Astra DB, and Azure. Using Langflow, he walked through building an intelligent chatbot from Wikipedia data, integrating it into a Python Flask application, and deploying it to Azure. He also demonstrated observability for AI applications with a tool called LangSmith, and emphasized that developers no longer need to be machine learning engineers to build AI applications.
Takeaways
- 🚀 Carter Rabasa of DataStax shows developers how to build RAG (Retrieval-Augmented Generation) applications.
- 🛠️ Builds a RAG application with the Langflow tool and then embeds it in a Python Flask application.
- 🔄 Deploys the application to Microsoft Azure and demonstrates observability features with LangSmith.
- 📚 RAG is a technique for building intelligent software that also works on private data and data behind a firewall.
- 🤖 LLMs (Large Language Models) are trained only on public data; for private data, RAG is the answer.
- 📈 Langflow's visual interface makes AI flows easy to build and run.
- 📝 Data such as articles and Wikipedia pages is ingested into a vector database, converted into vectors by an embedding model.
- 🔍 Vector search retrieves the documents most relevant to a question, which are then handed to the LLM to produce an answer.
- 💰 Passing less data to hosted LLMs such as OpenAI improves cost efficiency.
- 🔧 Langflow is currently Python-only and sits on top of the LangChain framework, while LangSmith improves observability of AI applications.
- 🌐 Containerizing and deploying the application with Azure is straightforward, underscoring how easy AI application deployment has become.
Q & A
What position does Carter Rabasa hold at DataStax?
-Carter Rabasa is the head of developer relations at DataStax.
What is a RAG application?
-RAG stands for Retrieval Augmented Generation, a technique used to build intelligent software on top of private data or data that sits behind a firewall.
What kind of tool is Langflow?
-Langflow is a tool for building AI flows in a visual editor; the resulting flows can be embedded in a Python Flask application.
What kind of database is Astra DB?
-Astra DB is a vector database from DataStax that can store both non-vector and vector data.
What is a vector embedding model?
-A vector embedding model converts text into very long arrays of numbers, i.e., points in a vector space. As a result, similar concepts and data end up close together.
What kind of tool is LangSmith?
-LangSmith is a tool that provides visibility and observability for AI applications; it tracks information about LLM calls and makes debugging easier.
How did Carter build a chatbot that knows about Dune 2?
-Carter ingested the Wikipedia page for Dune 2 into a vector database, built a RAG application with Langflow, and then embedded it in a Python Flask application.
On what basis does OpenAI charge for API usage?
-OpenAI's API pricing is based on how much data you pass it, i.e., the number of tokens sent to the LLM.
Which programming languages is Langflow compatible with today?
-Langflow currently supports only Python, but it is built on top of the LangChain framework.
How does LangSmith help when building AI applications?
-LangSmith provides detailed information about an application's LLM calls, including the model provider, the model used, token counts, and cost.
How did Carter deploy the application to Azure?
-Carter created a Dockerfile, configured Gunicorn, built the image locally, and deployed it to Azure using the Azure CLI.
Outlines
😀 How to build a RAG application
Carter Rabasa, head of developer relations at DataStax, explains how developers can build RAG (Retrieval-Augmented Generation) applications using Langflow, Astra DB, and Azure. He builds a RAG application with Langflow, integrates it into a Python Flask application, and deploys it to Microsoft Azure. He also demonstrates observability features with LangSmith and, time permitting, wraps up with Q&A. Carter is a developer relations professional at DataStax who only recently started AI programming.
📚 RAG fundamentals and using the database
Carter explains the concept of RAG: LLMs (Large Language Models) are trained on public data, and RAG fills the gap for private data. He then demonstrates how to ingest a Wikipedia page into Astra DB, a vector database, and use it to build a chatbot. A vector database stores the vectors produced by an embedding model, so that similar concepts end up grouped close together.
💻 Building the application with Langflow
Carter shows how to create an AI flow in Langflow's visual editor and then integrate it into a Python application. Langflow is built on the LangChain framework, and flows can be exported as Python code. He demonstrates pasting the Langflow code into a Flask application and running the flow from a JSON file. Langflow is currently Python-only.
🚀 Deploying to Azure and LangSmith observability
Carter explains how to containerize the application with Docker and deploy it to Azure. He also introduces LangSmith, a tool that improves observability and debugging of AI applications. LangSmith integrates easily via an API key and reports the model provider, the model used, token counts, cost, and more. Carter concludes that these tools make building AI applications accessible to everyone.
🎉 Wrap-up and message to the audience
Carter closes the presentation by thanking the audience and pointing them to a QR code for further resources. You can also learn more about Carter's work at the DataStax booth or in the featured partner directory.
Keywords
💡DataStax
💡Langflow
💡RAG (Retrieval-Augmented Generation)
💡Astro DB
💡LLM (Large Language Models)
💡Embedding Model
💡Vector Search
💡LangChain
💡LangSmith
💡Flask
Highlights
Carter Rabasa, head of developer relations at DataStax, discusses building RAG applications with Langflow, Astra DB, and Azure.
RAG stands for retrieval augmented generation, a method to enhance AI with private data.
Langflow is introduced as a tool for building AI flows with a visual interface.
Astra DB from DataStax is highlighted as a vector database that can store both non-vector and vector data.
Vector embeddings are explained as points in space representing concepts, with similar concepts grouped together.
Langflow's visual editor is demonstrated for creating an AI chatbot with data from a Wikipedia page.
The process of ingesting data into a vector database and using it to enhance AI responses is detailed.
Langflow's interactive playground is shown to test the chatbot's functionality.
Efficiency in AI interactions is emphasized by passing less data to the LLM for cost and accuracy benefits.
LangFlow's capability to export flows as Python code and JSON files for integration into applications is discussed.
A Python Flask application is demonstrated to integrate the chatbot flow into a web app.
LangChain is mentioned as the underlying framework for building AI applications, on top of which Langflow is built.
LangSmith is introduced as an observability tool for monitoring and debugging AI applications.
The deployment process of the AI application to Microsoft Azure using containers is outlined.
LangSmith's ability to provide insights into LLM calls, including model usage and token costs, is highlighted.
Carter emphasizes that building AI applications is accessible to developers without a machine learning background.
A QR code is provided for accessing more resources on the tools and techniques presented in the discussion.
Transcripts
SPEAKER 1: I am joined right
now by Carter Rabasa,
who's the head of developer relations at DataStax.
DataStax. Data, Data.
Tomato-tomato. How are you doing?
CARTER RABASA: That's right.
I'm doing awesome.
SPEAKER 1: Cool.
What are we talking about today?
CARTER RABASA: We're going to talk about
how developers can build
RAG applications using Langflow,
Astra DB, and Azure.
I'm going to be flying through this, so buckle up.
SPEAKER 1: I got my seatbelt on.
CARTER RABASA: First I'm
going to show you how you can build
RAG applications using this awesome tool called Langflow.
After we use Langflow,
I'm going to show you how you can take
what you built and bake
it into just a regular Python flask application.
Then we're going to deploy this to Microsoft Azure.
I'm going to show you
some cool observability capabilities
using something called LangSmith.
Then if we have time, we'll do some wrap-up in Q&A.
First, my name is Carter Rabasa.
SPEAKER 1: Hi, Carter.
CARTER RABASA: Hello.
My pronouns are He-Him.
As mentioned, I work in developer relations at DataStax,
and I'm just a huge gen AI nerd.
But I only started doing this about six months ago.
So I'm really new to AI programming,
and after talking to a lot of people here at Build,
I think you are all pretty new to it too, right?
SPEAKER 1: Yes.
CARTER RABASA: So we're
all learning this together.
I want to start by saying, what is RAG?
SPEAKER 1: Thank you.
CARTER RABASA: That acronym has
been popping up, it's everywhere.
It stands for retrieval augmented generation.
Still not helpful.
The way that I like to think about
RAG is with this Iceberg metaphor.
LLMs are trained on public data,
they scrape the Internet, and that's what they know.
But if you want to build an intelligent piece of
software that operates on data that isn't public,
it's behind your firewall,
it's in your database. Well, how do you do that?
Because the LLMs don't know
about the stuff that's in your SQL server.
That's where RAG comes in.
It's time to buckle up. We're going to do some coding.
First, I'm going to be mean.
I'm going to show you where LLMs
fail by asking about one of my favorite movies.
I'm going to ask when Dune 2 was released in theaters.
SPEAKER 1: March 2024?
CARTER RABASA: Pretty close.
SPEAKER 1: Is that close?
CARTER RABASA: Well, guess what?
Guess who doesn't know.
SPEAKER 1: ChatGPT.
CARTER RABASA: ChatGPT doesn't know.
In fact, he doesn't
even know that there was a Dune 2 movie.
SPEAKER 1: No.
CARTER RABASA: Now, look,
you guys think I'm cheating.
You're like, "Carter, use a newer LLM,
it'll know about Dune 2."
Sure. But that doesn't matter,
because those new LLMs still
don't know about your private data,
the stuff behind the firewall,
and that's where RAG comes in.
What I want to do is, I'm going to use,
I'm a Dune 2 fanboy.
I'm a Dune fanboy. I'm a Denis Villeneuve fanboy.
SPEAKER 1: Oh, he's so talented.
CARTER RABASA: We're going to use
GPT-3.5,
with data from this Wikipedia page,
to build an intelligent chatbot.
How are we going to do that? First, I'm
going to use resources for Microsoft.
This is just a quick start that I found on
your documentation that shows
me how to build a simple Flask app,
and how to deploy it to Azure using containers.
I'm just getting started by cloning this repo,
and I'm using this as a basis
for building this chat application.
I'm doing this because I want to prove
to you that you can use
these tools I'm about to show you
with the toolchains that you already use.
I've gone ahead and cloned this,
and here's Visual Studio Code.
What I'm going to do is I'm going to add
a tool called Langflow.
That's as easy as pip install Langflow.
I'm going to show you my ReadMe,
you're just going to pip install Langflow.
Then once you pip install it,
you can run it from the command line,
and it's going to spawn
this web-based visual editor for building these AI flows.
It looks exactly like this.
I've gone ahead and created a Dune 2 chatbot.
SPEAKER 1: Oh, I love these.
I'm a visual person, so I like [inaudible]
CARTER RABASA: Well I'm going to zoom out.
I'm going to zoom out for a minute.
Building RAG applications is a two-step process.
First you have to take the data that the LLMs don't know,
and you have to ingest it into a vector database.
Let's talk about that.
I'm going to zoom in. That's what's happening right here.
I am taking the URL for the Wikipedia page,
and I'm ingesting it into my application.
I'm then running it through something called
a recursive character text splitter.
It's taking that whole Wikipedia page and
breaking it into chunks of 1,000 characters.
Then I'm taking all of those chunks,
and I'm storing them in my vector database.
But before I do it, I'm going to run it
through something called an embedding model.
I'm going to take that text,
and I'm going to convert it into
something called vector embeddings,
and I'm going to store the text and
the embeddings in the vector database.
Astra DB is the vector database from DataStax.
It allows you to store non-vector data.
Here on the right are
all the chunks from the Wikipedia page.
But it also allows you to store vector data,
which is what we got from the embedding model.
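The ingest step Carter describes — split the page into 1,000-character chunks, embed each chunk, store text plus vector — can be sketched in plain Python. In the actual flow, Langflow wires a recursive character text splitter and an OpenAI embedding model into Astra DB; here a simple character splitter and a stub embedder stand in, so only the shape of the pipeline is shown.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Break a document into ~chunk_size-character pieces with a little overlap."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text: str) -> list[float]:
    """Stand-in for a real embedding model (e.g. OpenAI's) - NOT meaningful vectors."""
    return [ord(c) / 255 for c in text[:8]]

def ingest(document: str) -> list[dict]:
    """Store each chunk together with its vector, as the vector database does."""
    return [{"text": chunk, "vector": embed(chunk)} for chunk in split_into_chunks(document)]

# A long document stands in for the Wikipedia page here.
rows = ingest("Dune: Part Two is a 2024 epic science fiction film. " * 200)
print(len(rows))  # number of chunks stored
```

In the real flow the splitting is done by LangChain's recursive character splitter, which prefers to break on paragraph and sentence boundaries rather than at a fixed character count.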
SPEAKER 1: We say vector data, that's
another term that gets floated around.
CARTER RABASA: I know, sorry. Vectors are
these really long arrays of numbers.
The way that I would ask you to visualize it,
if you give an LLM the word Dog,
it'll create a vector that's like a point in space.
If you give it the word Cat,
it'll give it another point in space that's near Dog.
But if you give it the word Truck,
it's a point the way over here.
These vectors represent points in space,
and similar concepts and
similar data get grouped together.
I'm going to show you how that works with vector search.
SPEAKER 1: So the closer they
are, the more similar?
CARTER RABASA: Absolutely.
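The grouping the speakers describe — "the closer they are, the more similar" — is usually measured with cosine similarity. A minimal sketch, using made-up 2-D vectors (real embedding models emit hundreds or thousands of dimensions):

```python
import math

# Toy 2-D "embeddings", invented for illustration only:
vectors = {
    "dog":   [0.90, 0.80],
    "cat":   [0.85, 0.75],
    "truck": [0.10, -0.60],
}

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """1.0 means pointing the same way; values near 0 or below mean unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# "dog" and "cat" sit close together; "truck" is far away.
print(cosine_similarity(vectors["dog"], vectors["cat"]))    # close to 1.0
print(cosine_similarity(vectors["dog"], vectors["truck"]))  # much smaller
```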
Now we've ingested our data.
There are 105 chunks of data in my database.
Now we're going to build the chat experience.
That is what we have right here.
I'm going to go ahead and run this using
the interactive playground that's built into Langflow.
Let's go ahead and ask the question that failed.
When was the movie released?
Hopefully it will say March 1st,
because that is the truth.
Indeed very soon it will.
Awesome, March 1st, 2024.
But I want to show you how that happened.
Let's zoom in and walk through the steps.
We asked the question.
The question itself is converted into a vector.
It's converted into that array of numbers.
That vector is then used to execute a vector search.
Remember, I can show you.
We have 105 documents in my vector store,
but we don't need all 105 to answer the question.
I'm going to execute
a vector search on the vector created from the question.
I'm going to get the top four documents back,
and I'm going to pass those four documents
to something called a prompt.
This is my prompt.
All of the content from the database is
being injected into this variable called context.
Then I'm telling the LLM,
given the context,
answer the question as best as you can.
Then I remind the LLM what the question was.
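The retrieve-then-prompt step just described can be sketched as follows. The vectors are toy stand-ins for real embeddings, and the prompt only mirrors the shape of the Langflow prompt (a context slot plus the question), not its exact wording:

```python
PROMPT = """Given the following context:

{context}

Answer the question as best as you can.

Question: {question}"""

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def top_k(question_vec: list[float], rows: list[dict], k: int = 4) -> list[dict]:
    """Rank stored chunks by similarity to the question vector, keep the best k."""
    return sorted(rows, key=lambda row: dot(question_vec, row["vector"]), reverse=True)[:k]

# Three chunks stand in for the 105 documents in the vector store.
rows = [
    {"text": "Dune: Part Two was released on March 1, 2024.", "vector": [0.9, 0.1]},
    {"text": "The film's popcorn bucket became a meme.",      "vector": [0.2, 0.9]},
    {"text": "Filming took place in Budapest and Jordan.",    "vector": [0.5, 0.4]},
]

question = "When was the movie released?"
question_vec = [1.0, 0.0]  # in reality, the embedding of the question itself
context = "\n\n".join(row["text"] for row in top_k(question_vec, rows, k=2))
print(PROMPT.format(context=context, question=question))
```

Only the retrieved chunks are injected into the `context` variable; everything else in the store never reaches the LLM.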
Now, one thing that's great about LangFlow is that
you can see the data as it flows through the system.
I can hover over this and I can see
the results from my vector search and then over here,
I can hover over this,
and I can actually see the entire prompt that I
construct all the way
down to the question that I'm asking.
SPEAKER 1: It's such
a long prompt, too.
CARTER RABASA: Yeah, but here's the deal.
Here's what I want everyone to remember.
When you use OpenAI to ask these questions,
OpenAI is charging you
based on how much data you pass it.
If I hover my mouse over here,
I can see that I passed it 1,096 tokens of data.
OpenAI and all these hosted LLMs
charge you based on the number of tokens.
It's important to know that not only are you saving
money by passing less information to the LLM,
you're also getting more accurate answers.
If you pass information to the LLM,
like the entire Wikipedia page where most of
the data on that page is
completely irrelevant to the question that I asked,
there's a chance that the LLM is going to
hallucinate and maybe give you a wrong answer.
So using a vector database and using
vector search gives you more accurate answers,
and it saves you money with your LLM.
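A rough back-of-the-envelope illustration of why "less is more": hosted LLMs bill per token, and a common rule of thumb for English text is about four characters per token (exact counts require the model's tokenizer, e.g. the tiktoken library). The price below is a placeholder, not a current OpenAI rate:

```python
def approx_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def approx_cost(text: str, usd_per_1k_tokens: float = 0.0005) -> float:
    """usd_per_1k_tokens is a placeholder rate, not a real price."""
    return approx_tokens(text) / 1000 * usd_per_1k_tokens

whole_page = "x" * 100_000  # stand-in for an entire Wikipedia page
top_chunks = "x" * 4_000    # four 1,000-character chunks from vector search

# Passing only the relevant chunks cuts the token count - and the bill -
# by ~25x in this made-up example.
print(approx_tokens(whole_page), approx_tokens(top_chunks))
```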
SPEAKER 1: So less is more.
CARTER RABASA: Yeah, less is more.
Anyway,
so we've built this flow.
We're happy with it, but now it's not an app yet.
We want to turn this into an app that people can
use. Let's do that.
We're going to click on API,
and we're going to go over to Python
code and all you have to
do is copy this code
and paste it into your Python application.
You also are going to export this flow as a JSON file,
download that, and then
you're going to reference that in your application.
Let's switch over to Visual Studio
code and see what that looks like.
SPEAKER 1: Also, is LangFlow
just Python compatible,
or can you use other languages with it?
CARTER RABASA: No.
That's a great question,
and thanks for asking.
Right now, LangFlow is only for Python,
but this is actually really useful for y'all to know.
Everything that you're seeing is Python code,
and everything that you've seen is actually
built on top of a tool called LangChain.
LangChain is the most popular framework
for building AI applications and
LangFlow is this fantastic UI built on top of LangChain.
I'm going to go back to my Python application.
This is what it looks like. Remember, this is
just what I cloned from the QuickStart.
All I did was paste in that code that you saw,
and then down here,
I'm going to create a route called chat,
and I'm going to invoke LangFlow
using this run flow from JSON method.
I'm going to give it the name of my JSON file,
and I'm going to pass in
the input value that I get from my web form.
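The chat route described here can be sketched as a minimal Flask app. In the real demo the chatbot call is Langflow's `run_flow_from_json` (from `langflow.load`), pointed at the exported JSON file; here that call is stubbed out, and the file name `dune2_chatbot.json` is hypothetical, so only the routing shape is shown:

```python
from flask import Flask, request

app = Flask(__name__)

def run_dune_flow(message: str) -> str:
    """Stand-in for run_flow_from_json("dune2_chatbot.json", input_value=message)."""
    return f"(flow output for: {message})"

@app.route("/chat", methods=["POST"])
def chat():
    # The input value comes from the web form, as in the demo.
    message = request.form.get("message", "")
    return {"reply": run_dune_flow(message)}

# app.run(port=5000) would serve this locally, as shown in the session.
```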
Let's see what that looks like.
I'm going to go ahead and fire up
flask and show you exactly what this web app looks like.
Boom. This is going to be running locally on port 5000.
Let's go over to my browser
and reload this page and press Store.
You've got my web app so let's go
ahead and ask it a new question.
I'm going to ask it, what was the meme about?
Do you know what meme I'm talking about? Do you remember?
SPEAKER 1: Actually? I don't remember.
CARTER RABASA: Well, let's see what
the chatbot says.
There was a meme about a Dune-themed popcorn bucket,
the famous sandworm popcorn bucket.
SPEAKER 1: I wanted one
of those so bad.
CARTER RABASA: Oh, I've got two. Anyway,
so the chatbot now knows everything about Dune 2,
including the hilarious meme about the popcorn bucket.
Now, all this left for us
is getting this deployed to production.
SPEAKER 1: The world needs to know.
CARTER RABASA: The world needs to know
about popcorn buckets.
Doing that is just super easy.
You can use Azure to
provision a container application and then,
once again, like I'm new to this.
All I did was follow the instructions on the QuickStart.
I created this Dockerfile.
I configured Gunicorn.
I built it locally,
and then I deployed it to Azure using the Azure CLI.
It was insanely easy
and then next thing you know, here it is.
Deployed to azurecontainerapps.io.
But of course, you can build
your own custom URL or domain for this app.
SPEAKER 1: I'm doing mine for
Dune Messiah when that comes out.
CARTER RABASA: There you go.
Awesome. I'm going
to build the Dune 2 fanbot,
you build the Dune Messiah chatbot.
We'll team up on this.
The last thing I want to show you before I step away
is an amazing observability tool called LangSmith.
As I mentioned, LangFlow is built on top of LangChain.
LangChain is this super popular framework
for building AI apps.
Without adding a single line of code,
all I had to do was create
a LangSmith account and get an API key.
All I had to do was put that API key in
an environment variable when I deployed
it to the container and then automatically,
LangSmith gives me
observability into all my LLM calls, so check it out.
I can look at the most recent call that I made.
This is the call where I asked about the meme.
This is all the content that it got back from
my vector database. I can go back.
I can go all the way to the bottom and I
can find the question that I asked,
and I can see down here in the output,
what the output from the LLM was.
But also, I have information about
which model provider I used.
Exactly what model I used,
and even how many tokens
I used and what the cost of this was.
When you're building these AI applications
and you're deploying them,
sometimes they can be hard to debug because there's a lot
of chattiness between your app
and the LLM and LangSmith is
this fantastic tool for giving you
observability and visibility into
what's happening with the LLM.
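The "without adding a single line of code" setup Carter describes works because LangChain's tracing is switched on by environment variables. A sketch, assuming the standard LangChain tracing variable names; the key and project name below are placeholders:

```python
import os

# Set these in the container's environment (as Carter did on Azure) rather
# than in application code - no source changes are needed.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-api-key"  # placeholder
os.environ["LANGCHAIN_PROJECT"] = "dune2-chatbot"           # optional grouping

# With these set, every LLM call made through LangChain/Langflow is traced:
# model provider, model name, token counts, and cost appear in LangSmith.
print(os.environ["LANGCHAIN_TRACING_V2"])
```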
Anyway, going back to my presentation,
I hope that in this time,
y'all feel that you do not need
to be a machine learning engineer to do these things.
You do not need to have
a giant data science team to do these things.
These tools have gotten so good and so
easy that even a very mediocre web developer like
me can build a pretty reasonable AI application
and so can you.
I'll just wrap up real fast.
Thank you so much for having me.
If you have any questions that I didn't answer,
scan this QR code.
It'll send you to a web page that has
more resources on all of the things that I showed you,
and thanks so much for having me.
SPEAKER 1: Sweet.
You did it with one minute to spare.
Thanks. That was great
and what you said about AI
being for everyone with tools like that,
it really does feel that way.
I just really enjoy
seeing that there's stuff for visual people like me,
with LangFlow to be able to break them all down.
Be able to see the cost of things I
think is very important at the end of the day
so having all those tools to make it easy
to manage the LLMs that you're working with,
as well as being able to put out
your general flow of events for the AI chat.
CARTER RABASA: No doubt.
Then also, it's really important.
There's a lot of conversations about responsible AI.
We need more people and more people building AI apps.
SPEAKER 1: Absolutely.
CARTER RABASA: I just think
these tools are making that possible.
SPEAKER 1: Sweet.
Well, thank you so much, Carter.
Again, if you want to learn more about
DataStax or the work that Carter is doing,
please go check out their booth,
or you can check them out on
the feature partner directory.
Thanks, Carter, and thanks, y'all.
CARTER RABASA: Awesome. Thank you.
SPEAKER 1: Bye.