Chat With Documents Using ChainLit, LangChain, Ollama & Mistral 🧠
Summary
TL;DR: This video tutorial, part of an Ollama series, guides viewers through creating a simple Retrieval-Augmented Generation (RAG) application using open-source models. The host demonstrates how to embed PDF documents into a knowledge base with LangChain and Ollama, and then interact with them through a user-friendly Chainlit chat interface. The video covers setting up the environment, installing the necessary packages, and deploying the application, which lets users ask questions and receive answers based on the embedded documents. It also emphasizes the importance of tweaking parameters for optimal results and suggests referring to the official repositories for updates and troubleshooting.
Takeaways
- 😀 The video is part of an Ollama series, focused on creating a simple Retrieval-Augmented Generation (RAG) application using open-source models.
- 📚 The presenter has previously created videos on building a chat UI with Ollama and integrating it with LangChain.
- 🔗 The video uses Chainlit for deploying the application and LangSmith for application tracing.
- 📝 The GitHub repository mentioned in the video contains the code for the RAG application and instructions for setting it up.
- 💻 Viewers are guided to clone the repository, set up a virtual environment, and install the necessary packages from a `requirements.txt` file.
- 🔍 The video covers two examples: ingesting documents from a data folder, and a Chainlit application for uploading PDFs directly through the UI.
- 📈 It demonstrates how to use Ollama embeddings with the Mistral model to create a vector database, together with the `rag-prompt-mistral` template from the LangChain Hub.
- 🔧 The video provides troubleshooting tips for issues with code or functions, suggesting checking the official GitHub repositories for updates or creating an issue.
- 🗣️ The presenter shows how to interact with the chat UI, asking questions related to the PDF content and receiving answers based on the embedded knowledge base.
- 🛠️ The video emphasizes the flexibility of swapping models with Ollama and the importance of precise prompts for better question-answering results.
- 🔄 The process includes creating a vector database, using a RAG prompt, and deploying a chat application that can answer questions based on uploaded PDFs.
Q & A
What is the purpose of the video?
-The video aims to guide viewers on how to create a simple Retrieval-Augmented Generation (RAG) application using open-source models and tools like Ollama, LangChain, and Chainlit.
What is Ollama in the context of this video?
-Ollama is a tool for running open-source models locally; paired with a simple chat UI it provides a ChatGPT-like experience, and it supplies the models used to build the RAG application.
What is LangChain used for in this tutorial?
-LangChain is used to build the RAG pipeline, while Chainlit deploys the chat application and LangSmith manages its traces.
What is the significance of the 'rag-prompt-mistral' template in the video?
-'rag-prompt-mistral' is a pre-existing prompt template available in the LangChain Hub that simplifies building the RAG application by providing a structured prompt format.
How many different ways are shown in the video to create a RAG application?
-Two different methods are demonstrated: one where documents are ingested from a data folder and another where PDFs are uploaded through a UI for conversation.
What is the role of the 'create_vector_database' function in the script?
-The 'create_vector_database' function initializes the loaders for different file formats (here, PDFs) and embeds the documents into a vector database for retrieval.
What is the importance of splitting documents into chunks in the script?
-Splitting documents into chunks with the RecursiveCharacterTextSplitter lets the model handle large documents effectively, while a chunk overlap preserves context across chunk boundaries.
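The splitting logic can be illustrated with a small sketch. This is pure Python standing in for LangChain's `RecursiveCharacterTextSplitter`; the function name and the simple fixed-window strategy are illustrative assumptions, not the library's actual (separator-aware) algorithm:

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 40) -> list[str]:
    """Illustrative fixed-window splitter: each chunk starts `chunk_size - overlap`
    characters after the previous one, so adjacent chunks share `overlap` characters."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the video's settings (chunk size 500, overlap 40), the last 40 characters of each chunk reappear at the start of the next one, which is what keeps context from being cut mid-thought.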
Why is it recommended to use a virtual environment in the video?
-A virtual environment is recommended to isolate the project's packages from existing ones on the system, preventing conflicts and ensuring the correct versions of dependencies are used.
What is the process of running the RAG application as described in the video?
-The process involves installing the necessary packages, creating a vector database of embedded documents with the ingest script, setting up the chat interface in 'main.py', and finally running the application with the `chainlit run main.py` command.
How can viewers get help if they encounter issues with the code in the video?
-Viewers are advised to go to the official GitHub repository of LangChain or Chainlit and create an issue, where the community or the creators of the tools can help.
What is the purpose of the 'on_message' function in the 'main.py' file?
-The 'on_message' function handles user input in the chat interface, processes it through the RAG application, and retrieves information from the knowledge base to provide answers.
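Conceptually, the `on_message` handler runs a similarity search over the embedded chunks and feeds the best matches to the model. A toy illustration of that retrieval step — the bag-of-words "embedding" below is a stand-in assumption, not the Ollama/Mistral embeddings the video actually uses:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # toy stand-in for a real embedding model: bag-of-words term counts
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # cosine similarity between two sparse count vectors
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    # rank stored chunks by similarity to the question, highest first
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

A real vector store like Chroma does the same ranking, only with dense neural embeddings and an index instead of a linear scan.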
Outlines
📹 Introduction to the Ollama Series and Chat UI Application
This paragraph introduces the fourth video in an Ollama series; the creator has previously made videos about a simple chat UI using Ollama and LangChain. The video aims to create a simple RAG (Retrieval-Augmented Generation) application. The creator has also made videos on using open-source models for RAG applications, but emphasizes that this tutorial will focus on a simpler prototype built with Ollama and LangChain, using Chainlit to deploy the application and LangSmith for tracing. The GitHub repository for the project is mentioned, along with the addition of new files for ingesting documents and for a Chainlit application.
🛠 Setting Up the Environment and Ingesting Documents
This paragraph explains the process of setting up the development environment for the RAG application: cloning the repository, setting up a virtual environment, and installing the necessary packages. The focus is on ingesting documents by initializing loaders for PDF files and splitting them into chunks with a RecursiveCharacterTextSplitter. Ollama is introduced for creating the embeddings behind a persisted Chroma vector database, with instructions on how to install and use models from the Ollama website. The paragraph also details the steps to run the ingest file and create a persistent database for the application.
🔍 Developing a RetrievalQA Chainlit Application
This section delves into the development of a RetrievalQA (question answering) application using Chainlit and Ollama. It explains how to use the persisted database and the LangChain Hub, from which prompts can be pulled for the application. The creator shows how to use the `rag-prompt-mistral` prompt from the LangChain Hub and load the Mistral model through Ollama. The paragraph outlines the creation of a QA bot that connects the persisted database to the user through a chat interface, demonstrating how to run the main.py file to start the application and ask questions about the ingested PDF document.
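The chain's core step — stuffing retrieved chunks into a prompt template before calling the LLM — can be sketched like this. The template text here is illustrative; the actual `rag-prompt-mistral` template pulled from the LangChain Hub is worded differently:

```python
# illustrative RAG prompt, mimicking what a hub template provides
RAG_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    # join the retrieved chunks into one context block and fill the template
    context = "\n\n".join(retrieved_chunks)
    return RAG_TEMPLATE.format(context=context, question=question)
```

The filled-in string is what the RetrievalQA chain ultimately sends to the Mistral model, which is why the quality of retrieval and the wording of the template both affect answers.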
📚 Enhancing the Application with User-Uploaded PDFs
The final paragraph discusses enhancing the application so users can upload PDFs directly through the UI. It covers adjusting the chunk size and overlap for text splitting, and handling file uploads with a maximum size and timeout. The paragraph also addresses breaking changes in LangChain and Chainlit, advising viewers to check the official GitHub repositories for solutions. The video concludes by demonstrating uploading and processing a PDF file through the UI, asking questions, and receiving answers based on the uploaded document's content.
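Attaching metadata to each chunk is what lets the app cite sources for an uploaded file. A minimal sketch, assuming a pre-split list of page texts and a hypothetical dict-based document shape (the real code builds LangChain `Document` objects, and its splitter also uses overlap, omitted here for brevity):

```python
def make_documents(pages: list[str], source: str, chunk_size: int = 1000) -> list[dict]:
    """Pair every chunk with metadata recording the file and page it came from."""
    docs = []
    for page_no, page_text in enumerate(pages, start=1):
        # simple non-overlapping windows; chunk_size=1000 matches the video's setting
        for start in range(0, len(page_text), chunk_size):
            docs.append({
                "text": page_text[start:start + chunk_size],
                "metadata": {"source": source, "page": page_no},
            })
    return docs
```

When an answer comes back, the stored `source` and `page` fields are what the UI prints alongside it.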
Keywords
💡Ollama
💡LangChain
💡RAG application
💡GitHub repository
💡Virtual environment
💡requirements.txt
💡Embeddings
💡Chroma
💡Chainlit
💡PDF loader
💡RetrievalQA chain
Highlights
Introduction to the fourth video in the Ollama series, focusing on creating a simple RAG (Retrieval-Augmented Generation) application.
Recap of previous videos on using Ollama for a chat UI and LangChain integration.
Demonstration of using open-source models to create RAG applications with Ollama and LangChain.
Overview of using LangSmith for application traces and the `rag-prompt-mistral` template from the LangChain Hub.
Instructions on cloning the GitHub repository and setting up the environment for the project.
Details on creating a virtual environment and installing the necessary packages for the project.
Description of the process to ingest documents and create a Chainlit application for conversing with PDFs.
Guidance on using Ollama for embeddings and model selection in the RAG application.
Steps to create and persist a Chroma vector database as the knowledge base of the RAG application.
Discussion of pulling prompts for the RAG application from the LangChain Hub.
Explanation of the main.py file and its role in handling user queries in the RAG application.
Demonstration of running the RAG application and its ability to answer questions about a PDF document.
Illustration of how the RAG application retrieves information from the embedded PDF for user queries.
Insight into the limitations of local models and the importance of precise prompts for accurate answers.
Introduction to example three, which allows uploading PDFs directly through the UI for conversation.
Clarification on the use of the PyPDF2 reader and adjustments made to accommodate changes in LangChain and Chainlit.
Final demonstration of the RAG application with an updated UI, processing uploaded PDFs and answering questions.
Advice on exploring the Chainlit documentation and experimenting with different document formats for RAG applications.
Recommendation to use the official repositories for support and to report issues encountered during development.
Closing remarks encouraging viewers to practice with RAG applications and explore sophisticated solutions like PrivateGPT and Quivr.
Transcripts
Hello guys, welcome back. This is the fourth video in the Ollama series. I have already created three different videos; if you are new to Ollama, please refer to those first. I have previously built a simple chat UI where you get a ChatGPT-like interface using Ollama and its models, and I have also shown how to use it with LangChain. Now some of you asked in that video whether we can use this to create a simple RAG application, and that is what we are going to do. By the way, I have already created many videos on using open-source models to build RAG applications, but those rely on code provided by somebody else and involve quite a lot of code. If you want a simple prototype of a RAG application, this is it. As I said before, we are going to use Ollama to download and run the models, LangChain, and then Chainlit to deploy the application. We are also going to use LangSmith to trace the application, as you can see here, and the rag-prompt-mistral template that is already in the LangChain Hub. There are many pieces here and there, but you will have a clear understanding once we go through all of it. Let's get started.

Okay, so this is the GitHub repository. I actually added today's content to the existing one; I have already shown a simple application for running LLMs locally with Ollama and Chainlit. I've added two different files, so I'll show you two different ways: first, how to ingest files from a data folder and have a conversation with that PDF, and second, a simple Chainlit application where you can upload PDFs in the UI itself and chat with them. All the instructions are here. What I did first is clone this locally, so if you want to follow along, go here, choose either HTTPS or SSH, copy the URL, and clone it in your terminal, which I have already done. Then I open the cloned folder in VS Code, and everything you see on GitHub is here, including the readme. The readme says we'll be going through examples 1, 2, and 3, but example 1 was already covered in my previous video, so please refer to that. Examples 2 and 3 are what we'll do now: first ingest the documents, then create the Chainlit applications.

So the steps: clone the repository, go inside it, and open it in any IDE you want; I opened it in VS Code. Next, there is this .env file; it will appear as an example .env (I will rename it in the GitHub repo). We don't need any paid API keys, but we do need the LangSmith-related variables here; I'll show you mine because I'm going to delete it later. If you are new to LangSmith, I have already created three or four videos on getting started with it, so please refer to those. Back in the readme, the next step is to create a virtual environment. Copy the command and open a terminal. Of course, I hope you have Python installed on your system; if I run the Python version check I have 3.11.0, and that should work for you too. I paste the command, which creates and activates a virtual environment; you can see it here, and I have set up my terminal so that it is shown once the environment is active. If you are new to virtual environments, I have productivity videos explaining how to isolate project packages so they don't conflict with existing ones on your computer. Next, we install the packages. If you look at requirements.txt, it lists all the packages needed for this particular project; I paste the install command in the terminal and it installs everything. It says it's using the cached versions because I have already tested this before, but remember it is installing into this particular virtual environment. Once that is done, for example 1 you can just run the Chainlit command for the simple chat UI; that is the previous video's content, a simple chat interface, so please follow the previous video for that one — I will not go through it here.

Now let me show you example 2. Let me clear the screen. There is an ingest.py file and a main.py file. In ingest.py, all the necessary things are imported, along with the usual path handling I have set up, and there is a function called create_vector_database. The instructions are also mentioned here: the data must be inside the data folder, where I have the GPT4All paper PDF. What I'm doing is initializing the loaders for different file formats, but in this case it's just PDF: I'm using the directory loader over the data folder for all PDF files, with the PyPDF loader as the loader class. Then the loaded documents come from the loader's load call; you can uncomment these lines if you want to run it step by step. Next, we split the loaded documents into chunks. We'll be using the RecursiveCharacterTextSplitter with a chunk size of 500 — you can play around with these numbers — and a chunk overlap of 40, so that two adjacent chunks share some information from each other. Then we call split_documents on the loaded documents from the previous step.

Next, as I said, we'll be using Ollama, but before running this and providing the model, you need to have Ollama installed on your machine. If you are new, I have already created different videos on Ollama, so please refer to those. If I go to the terminal and run the Ollama list command, it shows me some models, as you see here: Mistral and LLaVA. I'm going to use Mistral. If this command runs, you know Ollama is installed. Once that is done, you can install the models you want — I'm not going to make this video longer, but you can follow my previous videos. If you have a model downloaded, you can provide it in the Ollama embeddings along with the model name. The good part of using Ollama embeddings (and Ollama in general) is that if you have, say, Llama 2 installed, you can just provide "llama2" here — by default Llama 2 is actually used, so you need to provide the one you want. Quickly switching between different models from the Ollama site is the beauty of Ollama. Now we create and persist a Chroma vector database; I'm going to use Chroma in this case. This is the usual pattern I have explained in many places, and after this I call persist so it is persisted, creating a new db folder, because I have configured it to create one. Once this is done, I run the file: just go to the terminal and run the ingest file with Python. It goes through all the necessary steps, and you will notice a db folder being created. Inside the db folder it currently just shows the Chroma SQLite file because the process is ongoing; once it completes, you will see some other files there as well.

While this is ongoing, the next step is to go through main.py. Inside, again, the normal imports are at the top, and I have also provided the link to the Chainlit documentation where I took this code from; you can go through that. I want to read from the persisted db, and as I said before, I'm going to use the LangChain Hub, from which you can pull prompts — that's what I'm doing here. If I go back and show you the hub, I'm searching for "mistral", and as you can see, rlm is the user ID and this is the rag-prompt-mistral template. If I go inside it, somebody has provided the prompt template for us, so we can just pull it like this and use it in our existing code, which makes the code cleaner. Back in VS Code, the ingest run is now completed; if I expand the db folder there are many files, meaning our embedding is done and our knowledge base is ready. We use this rag-prompt-mistral prompt and load the model: as you can see, I'm using Ollama with Mistral, verbose set to true, passing the callbacks, and returning the LLM. Then we have the RetrievalQA chain — there is not much to explain there — and there is the qa_bot function, which goes through the existing persisted db to get the information; here you need to provide the same embedding model that you used to embed the PDF. Then we have the normal Chainlit on_chat_start: it initializes the QA bot, says the bot is starting, and shows "Hello, welcome to chat with documents" — you can customize this; it is just the normal chat-start flow. And then there is on_message: when you ask something, it goes through the session, gets the chain stored there, and processes the question — just the normal things — and we also want the sources printed along with the answer. If you have followed my previous RAG videos, this is all standard; you can copy-paste it, and if you want to know more, you can go through it step by step.

That's all, so now let's start the application. I have already embedded everything; I clear the screen, and to run this you run Chainlit on main.py. You can pass the -w flag if you want the app to pick up changes in your code as you go; you can run it either way. It says it created a default config, and here is our simple chat UI. Now we can ask questions related to that PDF. Maybe I will ask, "What is the paper about?" As you can see, it goes through the RetrievalQA chain, and while it runs in the UI, the answer is already being printed in the terminal. And now we have the answer: the paper is about the introduction and release of open instruction-tuned large language models, with all the different details provided. If you want to verify that it really gets the information from the document, just go to the data folder, open the PDF file, and ask a question from it. Let me see if it gives me the total cost — something like "What is the cost to train the model?" I go to the application and ask, and here it says it is unable to determine the total cost. Okay, this is how LLMs work; also, because this is a local model, it depends on your hardware. Sometimes it gives the correct answer and sometimes it doesn't — maybe you need to modify the prompt or be more precise when asking the questions. For example, going back to the PDF, let me ask something else: on which model is the GPT4All model dependent? It says the GPT4All model is dependent on large-scale data, especially a dataset distilled with the GPT-3.5 model, and that it is important to know the GPT4All team does not train the model from scratch, something along those lines. It shows the sources too: the first source gives the GitHub of the model, and sources two, three, and four show the passages it pulled the information from. I'm not going to go in depth, because it is just getting the information from these chunks, but you get the idea. Right or wrong answers are not what I want to show you; the point is that you can ask questions locally without paying anything, which is really good for prototyping, and if you have good hardware resources and can tweak the parameters or the prompt, local models can be great too. Let me cancel this with Ctrl+C and clear the screen. That is the flow where we do the ingestion part separately and then ask the questions. Let me close this one as well.

Now let me go to example 3. What I'm doing here: I have this rag.py file, with the same things as before, but now we will upload the files in the UI itself. By the way, many of you also mentioned in previous videos that the code or some functions are not working. The reason is that LangChain as well as Chainlit are growing rapidly; the code gets updated and maybe some functions don't exist anymore. The best way to get answers is to go to the official GitHub repository and create an issue, so that if someone else has faced the same problem, they can provide the answer. Of course, I will go through there and try to answer where I can, but I may have created a video, say, five months earlier and won't be going back through it, while somebody else might have hit the same thing and can give you good information before me. Just keep that in mind. Here, again, are the normal imports as before, and again the recursive text splitter, but now I'm providing a chunk size of 1,000 and an overlap of 100 — as I said, these are parameters you can tweak. In on_chat_start I'm saying "Please upload a PDF file", and you can provide a max size of 20 MB and a timeout of 180 seconds, because some of you also asked how big the file can be — you can set it here. Some of this code also broke as the libraries changed — it worked before and then stopped, and some of you hit that issue — so I have fixed it to use PyPDF2's PDF reader, and we need to pass the file's path here; before there was a content attribute, and now it reports that content is not available anymore, something like that. So this is the normal flow: we read the PDF file, again split it into chunks, and create the metadata. Then we have the Chroma vector store, similar to before, and we also have chat message history here, so that we can ask follow-up questions as we go. Then we create a chain that uses the Chroma vector DB — I don't really need to explain it by now. By the way, I'm using ChatOllama here; you can try either ChatOllama or Ollama — since this is a chat-style application I think ChatOllama is the fit, but it depends on your use case too. Once the file is uploaded, meaning the embedding is done, it says "Processing done, you can ask questions"; the chain is stored in the session, and then we have the on_message handler — the same things as before but in a different format. It takes that particular chain, and when we ask questions, it goes through the knowledge base and gets the information out of it. Normal RAG stuff.

So now let's run this. Same as before, but now I run Chainlit on rag.py; as I said, you can pass -w or not — let me run without it, all in one go. As you can see, we have a different-looking UI now. It says "Please upload a PDF file to begin"; I can browse from here, so let me upload the same GPT4All paper. It says it is processing the GPT4All PDF, because we configured it to show the name of the uploaded file, and once the processing — the embedding part — is done, it shows the message that processing is complete and you can ask questions. It's still ongoing here. By the way, another good part of Chainlit is that when you ask questions, you can open the history and re-ask the same questions from there. Processing is done; you can now ask questions. Let me ask the same question I asked before: "What is the paper about?" It goes through the knowledge base and then provides the answer: the paper is about the original GPT4All model, with all the different details, and if you expand here it gives you the sources. It may also help that the chunk size is now 1,000, so there is more information in one chunk; maybe that's why I see it give a better answer. It's up to you — play around and see which numbers work best. Now let me ask again: "What is the total cost to train the model?" — the same PDF is being used here as before. "The text does not provide information" — okay, strange. I can say "Can you search it for me?", just random things, and it may or may not understand; if not, you can play around with the prompt as I said before. The main idea here is to show you how you can achieve these things. It says the text does not provide any information, so it doesn't answer in that case. I'm not going to go through everything again trying to get the right answer out of it, but you get the idea of how to run this.

And let's say that now this is for PDFs; you can go through the documentation in Chainlit or LangChain — there are many cookbooks there — and just try with CSV, normal text documents, Microsoft docs, or any other document types. So yeah, play around with it and practice. If you do that, you will have a simple-looking chat application, but if you want a sophisticated one, PrivateGPT and Quivr are the best ready-made solutions from people who have spent lots of hours on this. Okay, that's all for this video. If you enjoyed it or found some useful information in it, please give a thumbs up; you can share it or subscribe if you haven't already. Thank you for watching, and see you in the next one.