Easy 100% Local RAG Tutorial (Ollama) + Full Code

All About AI
15 Apr 2024 · 06:50

Summary

TLDR: The video walks through building a fully offline RAG (Retrieval-Augmented Generation) system using Ollama and a local language model. The creator shows how to convert a PDF to text, create embeddings, and run search queries to get relevant answers. A link to a GitHub repository with instructions and code for setting up the system is provided. The tutorial is easy to follow and stands out for its simplicity and efficiency; it is not perfect, but it is well suited to personal use cases.

Takeaways

  • 📄 The video shows how to extract information from a PDF using a local setup built on Ollama.
  • 🔍 The whole process runs without an Internet connection; everything executes locally.
  • 📝 The gathered content was converted to a PDF, and the text was then extracted for use in creating embeddings.
  • 📖 Keeping the text chunks on separate lines is highlighted as important for the system to work well.
  • 🤖 A model called Mistral is used to run queries and get context-aware answers from the documents.
  • 💻 Setting up Ollama on a Windows machine is covered, including installation and downloading the model.
  • 🔗 A link to the GitHub repository is provided so viewers can follow the steps and set up the system.
  • 🛠️ Code tweaks are suggested to adapt the system to different needs, such as changing the number of results returned.
  • 🔧 The size of the text chunks can be adjusted to get better results.
  • 📈 The setup is easy and the code is short, about 70 lines.
  • 🌐 The system is local and needs no Internet connection; it is admittedly not perfect but good enough for personal use cases.

Outlines

00:00

😀 Setting up an offline RAG system

The first part describes setting up a retrieval-augmented question-answering (RAG) system in an offline environment. The creator explains that he compiled news articles into a PDF file and plans to extract information from it with a Python script called pdf.py. After loading the content into a text file, another script, local_rag.py, is run to generate embeddings and answer queries such as "What did Joe Biden say?". The system runs on the Ollama library, and a tutorial in a GitHub repository is offered so others can replicate the setup. The script is simple, only about 70 lines of code, and the creator highlights its ease of use and the option to adjust the number of results returned (top k).

05:01

😉 Tweaking and customizing the RAG system

The second part focuses on customizing and tuning the RAG system for different needs. The creator shows how to change the number of results returned from three to five using the top k parameter, and discusses adjusting the chunk size to get better answers. An example shows the system answering a specific query, "What does the paper say about sampling and voting?", and the system is presented as useful for personal use cases, though not necessarily for business use. Viewers are invited to try it, share it, and star the GitHub repository if they find it useful.

Transcripts

00:00

Okay, so here we have a PDF file. I just gathered some news from yesterday, put it in, and converted it to PDF. What we're going to do now is try to extract information from this PDF with our local LLM, so we are now offline. I'm just going to run python pdf.py and load this into our text file before we create embeddings from it. Okay, so this was appended to the text file, each chunk on a separate line. Let's take a look at it so you can see the structure I want for my data. I don't know how big the chunks are, but we want them on separate lines, because I found out that works best.
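The pdf.py step described above, extracting text and appending it one chunk per line, can be sketched roughly like this. This is a minimal sketch, not the actual script from the repo: the chunk size, function names, and output path here are my own assumptions.

```python
import re

def chunk_text(text, max_chars=1000):
    """Split text into chunks of up to max_chars, breaking on sentence
    boundaries where possible. max_chars is an assumed default."""
    text = re.sub(r"\s+", " ", text).strip()  # normalize whitespace
    sentences = re.split(r"(?<=[.!?]) ", text)
    chunks, current = [], ""
    for sentence in sentences:
        # start a new chunk when adding this sentence would overflow
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

def append_chunks(text, path):
    """Append each chunk on its own line, the layout the embedding step expects."""
    with open(path, "a", encoding="utf-8") as f:
        for chunk in chunk_text(text):
            f.write(chunk + "\n")
```

One chunk per line is what makes the later embedding step simple: each line of the text file becomes one embedding.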

00:43

If we go back to the terminal now, close this, and run python local_rag.py, then we can start asking questions about our data. The script created these embeddings here that we can now use. If we do the search query "What did Joe Biden say?", you can see the context pulled from documents: we have these three chunks here. I set my top k to three, so we pulled three different chunks, and in all of them you can see President Biden, US President Biden, and probably Biden here too. And you can see the Mistral response: Joe Biden spoke with Prime Minister Benjamin Netanyahu. We get the answer here from Mistral, and this is running on Ollama, 100% locally. If we go down here, you can see I'm not online, so it's working great. And the good thing is that it's very short, only about 70 lines of code, so this is all you need, and it's been working great. I'm going to show you today how you can set this up yourself and just go through the full tutorial. It's going to be open on our GitHub repo, so you can download it and try it out for yourself.
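The retrieval step just demonstrated (embed the query, compare it against the stored chunk embeddings, pull the top k chunks) can be sketched like this. It is a minimal sketch, not the repo's code: the function names are my own, and embed_fn is a stand-in for whatever embedding model Ollama serves locally.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, embed_fn, top_k=3):
    """Return the top_k chunks most similar to the query.

    embed_fn maps a string to a vector; in the real setup it would call
    the local embedding model, here it is passed in so the logic is clear.
    """
    q_vec = embed_fn(query)
    scored = [(cosine_similarity(q_vec, embed_fn(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```

The retrieved chunks are then placed into the prompt as context before the question is sent to Mistral, which is why the video shows three context chunks with top k set to three.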

02:01

Okay, so I'm just going to walk you through the steps you need to do to make this work. Just head over to ollama.com/download and download Ollama for whatever OS you're using. Today I'm using Windows, so I'm just going to click on Windows and install Ollama; pretty straightforward. After you've done that, head over to the terminal and run the ollama pull command. I'm going to pull Mistral, but you can pull whatever model you want here. I've already done this; it's 4.1 GB, and that is pretty much it. You can check this now by doing ollama run mistral, and that should be it. Send the message "hello", and you can see Ollama is running here now. Very easy to set up.
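As a command summary, the Ollama setup above looks like this (assuming the standard Ollama CLI, with Mistral as the model chosen in the video):

```shell
# install Ollama from ollama.com/download first, then:
ollama pull mistral   # download the model (about 4.1 GB)
ollama run mistral    # quick check: opens an interactive chat, type "hello"
```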

02:49

The next step is just going to be following the link in the description. You will come to my GitHub repo, and you can just follow the instructions there. Start by cloning this repo (you can fork it too if you want), so go to your terminal, clone it, and cd into it. Then we're just going to install our dependencies, so pip install -r requirements.txt; I have this installed already. Then we can just start: if we have a PDF file, we do python pdf.py and upload our PDF like this. This has been appended, so let's close that, and the next part is just going to be running python local_rag.py. We should be good to go here now, so hopefully we will see our embeddings, and we can ask "What did Joe Biden say?". We are pulling this, and hopefully we will get an answer from Mistral now. Pretty good, yeah, so a very easy setup.
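The repo steps above, as shell commands. The repo URL is only given in the video description, so it is left as a placeholder here, and the script names follow how they are spoken in the video:

```shell
git clone <repo-url>              # link in the video description
cd <repo-directory>
pip install -r requirements.txt   # install dependencies
python pdf.py                     # convert a PDF to chunked text, one chunk per line
python local_rag.py               # build embeddings and start asking questions
```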

04:05

Of course, I'm going to go through some quick adjustments you can make, so you know how to do that if you want to. Let's say you wanted to upload a PDF of a paper, "More Agents Is All You Need", a pretty big paper, and now we want to make some adjustments. I want to bring in the top five results instead of three, so I'm just going to change this top k here to five. You can also change it down to one if you only want the best result, the one whose cosine similarity with the user input is highest. But let's put it to five now.
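The matching referred to here is cosine similarity between the query embedding and each chunk embedding. For a query vector $q$ and a chunk vector $c$:

```latex
\mathrm{sim}(q, c) \;=\; \frac{q \cdot c}{\lVert q \rVert \, \lVert c \rVert}
\;=\; \frac{\sum_i q_i c_i}{\sqrt{\sum_i q_i^2}\,\sqrt{\sum_i c_i^2}}
```

The top k chunks by this score are the ones pulled into the prompt as context, so raising top k from three to five simply means more context (and a longer prompt) per question.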

04:44

Let's head over to our terminal, do python pdf.py, and upload the agent paper now instead. Okay, let's close that, and if we open this now, you can see it's a bit bigger, but it's the same format. This paper mentions a lot of sampling and voting, so if we go here now and ask, say, "What does the paper say about sampling and voting?", you can see we have our embeddings; that's good. So if I run this now, you can see we bring in more chunks, more information, and the response says the paper introduces a method called sampling-and-voting for handling task queries using LLMs. I think that's a pretty good answer. So those are some adjustments you can play around with yourself. You can also play around with changing how big the chunks are; I'm not going to go into detail on that in this simple setup, but that is something you can do yourself.
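Once the top chunks are retrieved, they are combined with the question into a single prompt for the model. A minimal sketch of that step (the prompt wording and function name here are my own, not taken from the repo):

```python
def build_prompt(context_chunks, question):
    """Join retrieved chunks into a context block followed by the question."""
    context = "\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

This string is what gets sent to Mistral via Ollama. A larger top k or larger chunks both grow the context block, which is why those two knobs trade off answer quality against prompt size.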

06:05

I think that's pretty much it; that's what I wanted to cover. I just found it neat that you can create a pretty decent offline RAG system in so few lines of code, and I really found it helpful: it's lightweight, easy to use, quick, and 100% local. Of course it's not perfect, but that was not the point either; it's good enough for my use case. So maybe don't use this at your company, but on your local PC, sure. Again, if you want to try this out, just head over to the repo; you can find the link in the description. I would really appreciate it if you gave this project a star if you want to try it out yourself, and just share it with your friends if you want to; that's cool. Thank you for tuning in, and have a great day.
