Easy 100% Local RAG Tutorial (Ollama) + Full Code
Summary
TLDR: The video walks through building a fully offline RAG (Retrieval-Augmented Generation) system using Ollama and a local language model. The creator shows how to convert a PDF to text, create embeddings, and run search queries to get relevant answers. A link to a GitHub repository with instructions and code for setting up the system is provided. The tutorial is easy to follow and stands out for its simplicity and efficiency; while not perfect, it is well suited to personal use cases.
Takeaways
- 📄 The video covers extracting information from a PDF using a local system called Ollama.
- 🔍 The process runs without an Internet connection, meaning everything executes locally.
- 📝 A document was converted to PDF, and its text was then extracted for use in creating embeddings.
- 📖 Keeping the text chunks on separate lines is highlighted as important for the system to work well.
- 🤖 A model called Mistral is used to run queries and get contextual answers from the documents.
- 💻 The setup of Ollama on a Windows machine is described, including installation and downloading the model.
- 🔗 A link to a GitHub repository is provided with the steps to configure the system.
- 🛠️ Tweaks to the code are suggested to adapt the system to different needs, such as changing the number of results shown.
- 🔧 Adjusting the size of the text chunks is mentioned as a way to get better results.
- 📈 The ease of setup and the brevity of the code, about 70 lines, are highlighted.
- 🌐 Although the system is local and needs no Internet connection, it is admittedly not perfect, but it is sufficient for personal use cases.
Outlines
😀 Setting up an offline RAG system
The first paragraph describes setting up a retrieval-augmented question-answering (RAG) system in an offline environment. The creator explains that he converted collected news into a PDF file and plans to extract information from it with a Python script called pdf.py. After loading the content into a text file, another script, local_rag.py, is run to generate embeddings and answer queries such as "What did Joe Biden say?". The system uses Ollama, and a tutorial is offered in a GitHub repository so others can replicate the setup. The script is simple, only about 70 lines of code, and the creator highlights its ease of use and the option to adjust the number of results shown (top_k).
😉 Tweaking and customizing the RAG system
The second paragraph focuses on how to customize and tune the RAG system for different needs. The creator shows how to change the number of results returned from three to five using the top_k parameter. The possibility of modifying the size of the text chunks to get better answers is also discussed. An example shows how the system can answer specific queries, such as "What does the paper say about sampling and voting?", and how it can be useful for personal use cases, though not necessarily for enterprise use. Viewers are invited to try the system, share it, and star the GitHub repository if they find it useful.
Transcripts
Okay, so here we have a PDF file. I just gathered some news from yesterday, put it in, and converted it to PDF. What we're going to do now is try to extract information from this PDF with our local LLM, so we are now offline, right? I'm just going to run python pdf.py and upload this to our text file before we create embeddings from it. Okay, so this was appended to the text file, each chunk on a separate line. Let's take a look at it: you can see this is the structure I want for my data. I don't know how big the chunks are, but we want them on separate lines, because I found out that works best.
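The splitting step can be sketched as follows. This is a minimal, hypothetical stand-in for pdf.py, not the repo's actual code (function names and the output filename are assumptions, and real PDF text extraction would need a library such as PyPDF2); it only shows the "one chunk per line, appended" structure described above:

```python
def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on whitespace."""
    words = text.split()
    chunks, current = [], ""
    for word in words:
        # Flush the current chunk if adding this word would overflow it.
        if current and len(current) + len(word) + 1 > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}".strip()
    if current:
        chunks.append(current)
    return chunks

def append_chunks(text: str, path: str = "vault.txt") -> None:
    # Hypothetical output file name. One chunk per line, appended so
    # multiple uploaded documents accumulate in the same text file.
    with open(path, "a", encoding="utf-8") as f:
        for chunk in chunk_text(text):
            f.write(chunk + "\n")
```

Writing one chunk per line keeps the later embedding step trivial: the RAG script can just read the file line by line and embed each line independently.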
So if we go back to the terminal now, close this, and run python local_rag.py, we can start asking questions about our data. The document's embeddings were created here, and we can now use them. If we run the search query "what did Joe Biden say", you can see the context pulled from documents: we have these three chunks here. I set my top_k to three, so we pulled three different chunks, and you can see President Biden mentioned in all of them. Then you can see Mistral's response: Joe Biden spoke with Prime Minister Benjamin Netanyahu. So we get the answer here from Mistral, and this is running on Ollama, 100% locally. If we go down here, you can see I'm not online, so yeah, it's working great. And the good thing is that it's very short, only about 70 lines of code, so this is all you need.
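The retrieval core of such a script might look like this. This is a sketch under assumptions, not the repo's exact code: it assumes the ollama Python package for embeddings, chunk vectors held in memory, and plain cosine similarity for ranking the top-k chunks:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity = dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_emb, chunk_embs, chunks, k=3):
    # Rank every chunk by cosine similarity to the query embedding
    # and keep the k best; this is the "top K" the video adjusts.
    scored = sorted(
        zip(chunks, chunk_embs),
        key=lambda pair: cosine_similarity(query_emb, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

def embed(text: str) -> list[float]:
    # Requires a running Ollama server; the model name is an assumption.
    import ollama
    return ollama.embeddings(model="mistral", prompt=text)["embedding"]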
It's been working great, so today I'm going to show you how you can set this up yourself and go through the full tutorial. It's going to be open on our GitHub repo, so you can download it and try it out for yourself. Okay, I'm just going to walk you through the steps you need to make this work. Head over to ollama.com/download and download Ollama for whatever OS you're using. Today I'm using Windows, so I'm just going to click on Windows and install Ollama — pretty straightforward. After you've done that, head over to the terminal and run the ollama pull command. I'm going to pull Mistral, but you can pull whatever model you want here. I've already done this; it's 4.1 GB, and that is pretty much it. You can check it now by running ollama run mistral: send the message "hello", and you can see Ollama is running now. Very easy to set up.
The next step is just following the link in the description. You will come to my GitHub repo, and you can just follow the instructions there. Start by cloning the repo — you can fork it too if you want. So go to your terminal, clone it, and cd into it. Then we're going to install our dependencies with pip install -r requirements.txt; I have these installed already. Then, if we have a PDF file, we can just run python pdf.py and upload our PDF like this — it has been appended. Let's close that, and for the next part we run python local_rag.py, and we should be good to go. Hopefully we will see our embeddings, and we can ask "what did Joe Biden say". We are pulling the chunks, and hopefully we will get an answer from Mistral now. Pretty good — a very easy setup.
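The final generation step presumably looks something like this. It is an assumed sketch, not the repo's exact code: the retrieved chunks are stuffed into the prompt alongside the question and sent to the model through the ollama package:

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    # Prepend the retrieved chunks so the model answers from them.
    context = "\n".join(context_chunks)
    return f"Context pulled from documents:\n{context}\n\nQuestion: {question}"

def answer(question: str, context_chunks: list[str]) -> str:
    # Requires a running Ollama server; the model name is an assumption.
    import ollama
    resp = ollama.chat(
        model="mistral",
        messages=[{"role": "user", "content": build_prompt(question, context_chunks)}],
    )
    return resp["message"]["content"]
```

Because the model only sees the chunks you hand it, the quality of the answer depends almost entirely on what the retrieval step pulled in — which is why the top_k and chunk-size adjustments below matter.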
Of course, I'm going to go through some quick adjustments you can make, so you know how to do that if you want to. Let's say you wanted to upload a PDF of a paper — "More Agents Is All You Need", a pretty big paper — and now we want to make some adjustments. I want to bring in the top five results instead of three, so I'm just going to change this top_k here to five. You can also change it down to one if you only want the single result that has the highest cosine similarity with the user input. But let's put it to five now.
Let's head over to our terminal, run python pdf.py, and upload the agent paper instead. Okay, let's close that, and if we open the text file now, you can see it's a bit bigger, but it's the same format. This paper mentions a lot of sampling and voting, so if we go here now and ask, let's say, "what does the paper say about sampling and voting", you can see we have our embeddings — that's good. If I run this now, you can see we bring in more chunks — two, three, or at least more information — and you can see: the paper introduces a method called sampling-and-voting for handling task queries using LLMs. I think that's a pretty good answer. So those are some adjustments you have to play around with yourself. You can also play around with changing how big the chunks are; I'm not going to go into detail on that in this simple setup, but that is something you can do yourself.
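Chunk size is just another parameter of the splitting step. A quick, self-contained way to see its effect is to split the same text at several sizes and compare chunk counts (the whitespace splitter here is a hypothetical stand-in, not the repo's actual code):

```python
def chunk_text(text: str, max_chars: int) -> list[str]:
    # Simple whitespace-based splitter: flush a chunk whenever adding
    # the next word would push it past max_chars.
    words = text.split()
    chunks, current = [], ""
    for word in words:
        if current and len(current) + len(word) + 1 > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}".strip()
    if current:
        chunks.append(current)
    return chunks

sample = "sampling and voting " * 200  # stand-in for extracted paper text
for size in (200, 500, 1000):
    chunks = chunk_text(sample, max_chars=size)
    print(f"max_chars={size}: {len(chunks)} chunks")
```

The trade-off is the usual one: smaller chunks give more precise matches but less context per retrieved chunk, while larger chunks carry more context but dilute the similarity score.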
Yeah, I think that's pretty much it — that's what I wanted to cover. I just found it neat that you can create a pretty okay offline RAG system in so few lines of code, and I really found it helpful: it's lightweight, easy to use, quick, and 100% local. Of course it's not perfect — that was not the point either — but it's good enough for my use case. So maybe don't use this at your company, but on your local PC, sure. Again, if you want to try this out, just head over to the repo; you can find the link in the description. I would really appreciate it if you gave this project a star, and if you want to share it with your friends, that's cool too. Thank you for tuning in, and have a great day.