OpenAI Embeddings and Vector Databases Crash Course

Adrian Twarog
30 Jun 2023, 18:41

Summary

TLDR: This video tutorial explores the concept of embeddings and vector databases, essential for AI product development. It breaks down the process into three parts: theory, application, and integration. The host demonstrates how to create embeddings using OpenAI's API and store them in a vector database for semantic searches and recommendations. Step-by-step instructions are provided for generating embeddings with Postman and storing them in SingleStore, a cloud-based database. The video also includes a JavaScript function for interacting with embeddings and a teaser for a comprehensive guide on OpenAI and GPT.

Takeaways

  • 📚 Embeddings are a way to convert data like words into numerical vectors that capture patterns of relationships.
  • 📈 In a vector space, words with similar uses, like 'dog' and 'puppy', are represented by vectors that are close to each other.
  • 🌐 Vector databases store these embeddings and can be used for searching, clustering, recommendations, and classification based on similarity.
  • 🔍 OpenAI provides a model to create embeddings but does not offer a storage solution, necessitating the use of a cloud database.
  • 🛠 Postman is a GUI tool that simplifies the process of making API requests, including creating embeddings with OpenAI's API.
  • 🔑 To use OpenAI's API, an API key is required for authorization, which should be kept secure and private.
  • 📝 Embeddings can be created for single words, phrases, or large documents, with the latter being particularly useful for capturing complex information.
  • 🗄️ SingleStore is an example of a cloud database provider that supports vector databases, allowing for real-time, distributed SQL databases.
  • 📊 SQL queries can be used to create tables in a vector database to store text and corresponding embedding vectors as blobs.
  • 🔎 Searching a vector database involves creating an embedding for the search term and comparing it against stored embeddings to find the most similar results.
  • 💻 JavaScript and Node.js can be used to create functions that interact with embeddings, automating the process of fetching, creating, and storing them.

Q & A

  • What are embeddings and how do they relate to AI products?

    -Embeddings are arrays of numbers, also known as vectors, that represent data such as words in a way that captures patterns of relationships. They are essential for AI products as they allow for the measurement of similarity between different pieces of data, which is crucial for tasks like semantic search and natural language processing.

  • Can embeddings be used for images as well as text?

    -Yes, embeddings can be used for images too. Just like with text, images are broken down into arrays of numbers, which can then be used to find patterns of similarity, enabling features like Google's similar image search.

  • What is a vector database and how is it used?

    -A vector database is a database that stores embeddings. It can be used in various ways, including searching (where results are ranked by relevance), clustering (grouping text strings by similarity), and recommendations or classifications based on similarity to related items or labels.

  • How does the video guide the creation of an embedding using OpenAI's API?

    -The video provides a step-by-step guide on creating an embedding using OpenAI's API, starting from accessing the API documentation, setting up API requests in Postman, and making a POST request with the necessary model and input text to receive the embedding response.

  • Why is Postman used in the demonstration and what is its role?

    -Postman is used as an API platform and web app that simplifies the process of making API requests. It is used in the demonstration to create and send API requests to OpenAI for generating embeddings, due to its user-friendly interface and features that facilitate API testing.

  • What is the significance of the 'text-embedding-ada-002' model (called 'ada002' in the video) used in the OpenAI embedding example?

    -'text-embedding-ada-002' is the OpenAI text embedding model used in the demonstration. The video describes it as the cheapest version and uses it to create embeddings for the input text.

  • How can embeddings be used to create a long-term memory for a chatbot?

    -Embeddings can be used to create a long-term memory for a chatbot by storing embeddings of conversations or knowledge bases in a vector database. When a user's query comes in, the chatbot can search the database for the most similar embedding to retrieve relevant information or context.

  • What is SingleStore and how does it relate to storing embeddings?

    -SingleStore is a real-time, unified distributed SQL database provider that allows for the incorporation of vector databases. It is used in the video to create a database for storing embeddings and to perform vector searches based on similarity.

  • How does the video demonstrate storing embeddings in a SingleStore database?

    -The video demonstrates creating a table in SingleStore with columns for the original text and the embedding vector. It then shows how to insert data into this table using SQL queries, using embeddings created with OpenAI's API.

  • What is the process for searching a vector database for similar embeddings?

    -The process involves creating an embedding for the search term, and then performing a search in the database against the existing embeddings. The results are ranked by similarity, with the closest matches appearing first.

  • How is JavaScript used in the video to interact with embeddings?

    -JavaScript is used to create a function that makes a fetch request to the OpenAI API to create an embedding. The function takes text as input, sends it to the API, and then logs and returns the embedding data.

  • What is the purpose of the book 'Teach Me OpenAI and GPT' mentioned in the video?

    -The book 'Teach Me OpenAI and GPT' is a digital book that covers comprehensive information about using the OpenAI API, including fine-tuning and other advanced topics, presented in a visually interesting and engaging manner.

Outlines

00:00

🧠 Introduction to Embeddings and Vector Databases

The video script introduces the concept of embeddings and vector databases as crucial components in AI product development. It outlines a three-part approach to explain the theory, practical use, and integration of these technologies with OpenAI APIs. Embeddings are described as a transformation of data, such as words, into numerical vectors that capture relational patterns. The script provides an example of word embeddings in a 2D graph and explains how these vectors can be used for various applications like search, clustering, and recommendations. The video aims to teach viewers how to create a long-term memory for AI chatbots or perform semantic searches on large PDF databases.
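
The 2D picture described above can be sketched in a few lines of JavaScript. This is a toy illustration only: the coordinates are invented, real embeddings have hundreds of dimensions, and the helper names `dot` and `cosineSimilarity` are assumptions made for this example rather than anything shown in the video.

```javascript
// Toy 2D "embeddings"; the coordinates are invented for illustration only.
const toyVectors = {
  dog: [0.9, 0.8],
  puppy: [0.85, 0.9],
  car: [-0.7, 0.2],
};

// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Cosine similarity: 1 means same direction, 0 orthogonal, -1 opposite.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

const dogPuppy = cosineSimilarity(toyVectors.dog, toyVectors.puppy);
const dogCar = cosineSimilarity(toyVectors.dog, toyVectors.car);
console.log(dogPuppy > dogCar); // true: "dog" sits closer to "puppy" than to "car"
```

The same comparison works unchanged on longer vectors, which is why similarity between embeddings can be measured regardless of how many dimensions the model produces.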

05:01

🔍 Creating Embeddings with OpenAI's API

The script details the process of creating embeddings using OpenAI's API with the help of Postman, an API platform. It explains how to log in to OpenAI, navigate to the API documentation, and send POST requests to the embeddings endpoint. The example uses the 'text-embedding-ada-002' model (spoken as 'ada002' in the video) to create an embedding for the text 'hello world'. The script also discusses different types of embeddings, such as single-word and multi-word embeddings, and the potential for embedding large documents. It emphasizes the importance of storing these embeddings in a vector database and introduces the concept of using a cloud database for this purpose.
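
The request that Postman sends in this part of the video can be written out as plain data. The sketch below only builds the request and does not send it; `buildEmbeddingRequest` is a hypothetical helper name and the API key is a placeholder.

```javascript
// Endpoint and model name for the embeddings request used in the video.
const EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings";
const EMBEDDING_MODEL = "text-embedding-ada-002";

// Hypothetical helper: builds the arguments for fetch() without sending anything.
function buildEmbeddingRequest(apiKey, input) {
  return {
    url: EMBEDDINGS_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // OpenAI authenticates with a bearer token, as noted in the video.
        Authorization: `Bearer ${apiKey}`,
      },
      // The body needs only two fields: the model and the input text.
      body: JSON.stringify({ model: EMBEDDING_MODEL, input }),
    },
  };
}

const request = buildEmbeddingRequest("sk-your-key-here", "hello world");
console.log(request.options.body);
// {"model":"text-embedding-ada-002","input":"hello world"}
```

Keeping the key out of source code (for example in an environment variable) matches the video's advice to keep it private.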

10:02

🗄️ Setting Up a Vector Database with SingleStore

The script guides the viewer through setting up a vector database using SingleStore, a real-time unified distributed SQL database. It covers creating an account, setting up a workspace, and creating a database and table within SingleStore to store the embeddings. The table has two columns: a text column for the original text and a blob column for the embedding vector. The script demonstrates how to insert data into the vector database using SQL queries, copying the values out of Postman for a 'hello world' example and a larger document example, and how to view the stored data.
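
The two SQL statements from this part can be generated from JavaScript instead of being pasted by hand. This is a minimal sketch under stated assumptions: the table name `myvectortable` and the helper `buildInsertSql` are illustrative, and the quote escaping is deliberately simplistic. `JSON_ARRAY_PACK` is the SingleStore function the video uses to turn the JSON array into the blob format.

```javascript
// Table and column names are illustrative: a text column plus a BLOB column for the vector.
const TABLE = "myvectortable";

const createTableSql = `CREATE TABLE IF NOT EXISTS ${TABLE} (text TEXT, vector BLOB);`;

// Builds an INSERT statement that packs the embedding with JSON_ARRAY_PACK.
function buildInsertSql(text, embedding) {
  const escapedText = text.replace(/'/g, "''"); // naive escaping of single quotes only
  const vectorJson = JSON.stringify(embedding);
  return `INSERT INTO ${TABLE} (text, vector) VALUES ('${escapedText}', JSON_ARRAY_PACK('${vectorJson}'));`;
}

console.log(createTableSql);
console.log(buildInsertSql("hello world", [0.1, -0.2, 0.3]));
// INSERT INTO myvectortable (text, vector) VALUES ('hello world', JSON_ARRAY_PACK('[0.1,-0.2,0.3]'));
```

String interpolation keeps the example readable, but code that handles untrusted text should use the parameterized queries of whichever database client is in use.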

15:04

🔎 Searching Vector Databases and JavaScript Integration

The final part of the script explains how to perform searches within a vector database by creating an embedding for the search term and comparing it with the stored embeddings to find the most similar results. It also includes a demonstration of creating a JavaScript function using Node.js to interact with the OpenAI API and handle embeddings. The script provides a step-by-step guide to set up the function, make a fetch request to the OpenAI API, and process the response. Additionally, it suggests potential applications for embeddings, such as importing PDFs, websites, and other data sources into the database for searchability.
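
A function along the lines described above might look like the following. It is a sketch rather than the code from the video: the name `createEmbedding` and the optional `fetchImpl` parameter are assumptions, the latter added so the function can be exercised without a network call. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Sketch of a Node.js helper. fetchImpl defaults to the global fetch (Node 18+);
// passing it in is an assumption made here so the function can be tested offline.
async function createEmbedding(text, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: text }),
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }
  const json = await response.json();
  // The API responds with { data: [{ embedding: [...] }, ...] }; one input gives one entry.
  return json.data[0].embedding;
}
```

A call such as `createEmbedding("hello world", process.env.OPENAI_API_KEY)` resolves to the array of numbers, which can then be inserted into the database or used as a search vector.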

Keywords

💡Embeddings

Embeddings are a fundamental concept in the video, referring to the process of converting data, such as words, into numerical vectors that capture the relationships and patterns among the data. In the context of AI, embeddings allow for the measurement of similarity between different pieces of data. For instance, the script mentions that 'dog' and 'puppy' would be represented by vectors close to each other, indicating their semantic similarity. Embeddings are crucial for tasks like search, clustering, and recommendations.

💡Vector Databases

A vector database is a type of database designed to store and manage vector representations of data, such as embeddings. The script explains that these databases can be used for various applications, including searching for relevant results based on a query string, clustering text strings by similarity, and classifying text strings by their most similar labels. The video focuses on using vector databases for searching, which is a common use case in AI products.
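
The search use case can be sketched as a query builder. This assumes a table named `myvectortable` with `text` and `vector` columns; `buildSearchSql` is a hypothetical helper, while `DOT_PRODUCT` and `JSON_ARRAY_PACK` are the SingleStore functions named in the transcript.

```javascript
// Builds the similarity query described in the video: score every row against
// the query embedding, sort by score, and keep the top results.
function buildSearchSql(queryEmbedding, limit = 5) {
  const vectorJson = JSON.stringify(queryEmbedding);
  return [
    `SELECT text, DOT_PRODUCT(vector, JSON_ARRAY_PACK('${vectorJson}')) AS score`,
    "FROM myvectortable",
    "ORDER BY score DESC",
    `LIMIT ${Number(limit)};`,
  ].join("\n");
}

console.log(buildSearchSql([0.1, 0.2]));
```

The query embedding has to come from the same model as the stored vectors, otherwise the scores are not comparable.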

💡OpenAI

OpenAI is mentioned in the script as a provider of AI models specifically for creating embeddings. The video demonstrates how to access OpenAI's API to generate embeddings, which are then used for various applications. OpenAI is significant in the video's narrative as it provides the technology to convert text into embeddings, a key step in building AI products with capabilities like long-term memory for chatbots or semantic searches.

💡APIs

APIs, or Application Programming Interfaces, are sets of rules and protocols that allow different software applications to communicate with each other. In the video, the script discusses how to use OpenAI's APIs to create embeddings and how to interact with them programmatically. APIs are essential for integrating AI capabilities into various products and services.

💡Postman

Postman is an API platform and web application introduced in the script as a tool for making API requests. It is used to demonstrate how to create embeddings by sending POST requests to OpenAI's API. Postman simplifies the process of interacting with APIs and is highlighted in the video as the GUI software used for creating and testing API requests.

💡Semantic Searches

Semantic searches involve finding information based on the meaning and context of the search terms, rather than just their literal keywords. The script mentions that embeddings can be used to perform semantic searches based on a large database of PDFs connected to an AI. This type of search is important for AI products that require understanding the relationships and context of data.
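
To make the ranking step concrete, here is a small in-memory stand-in for a vector table. Everything in it is an assumption for illustration: the rows, the two-dimensional unit-length vectors, and the `search` helper. A real setup would store model-generated embeddings in a database, but the scoring idea is the same.

```javascript
// Tiny in-memory stand-in for a vector table; vectors are made up and unit length.
const rows = [
  { text: "hello world", vector: [1, 0] },
  { text: "OpenAI vectors and embeddings are easy", vector: [0, 1] },
  { text: "non-disclosure agreement", vector: [0.6, 0.8] },
];

function dotProduct(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Score every row against the query vector and keep the best `limit` matches.
function search(queryVector, limit = 5) {
  return rows
    .map((row) => ({ text: row.text, score: dotProduct(row.vector, queryVector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

console.log(search([0, 1], 2).map((result) => result.text));
// [ 'OpenAI vectors and embeddings are easy', 'non-disclosure agreement' ]
```

Because the vectors here have length 1, the dot product gives the same ranking as cosine similarity.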

💡ChatGPT

ChatGPT is OpenAI's chatbot, built on GPT (Generative Pre-trained Transformer) models. The script suggests that a chatbot of this kind can be given long-term memory by storing embeddings of past interactions and retrieving the most similar ones, which lets it provide more contextually relevant responses.

💡ADA Model

The ADA model, specifically text-embedding-ada-002 (spoken as 'ada002' in the video), is the OpenAI embedding model that the script uses as an example for creating text embeddings. The video picks it as the cheapest option and uses it to demonstrate the process of generating embeddings from text inputs.

💡SingleStore

SingleStore is a real-time, unified distributed SQL database provider mentioned in the script as the chosen platform for storing embeddings. It is used to create a vector database where embeddings can be stored and searched. The script demonstrates setting up a SingleStore database and inserting data into it.

💡SQL

SQL, or Structured Query Language, is used in the script to interact with the SingleStore database. It is the standard language for managing and manipulating relational databases. The video shows how to use SQL to create tables, insert data, and perform searches within a vector database.

💡Node.js

Node.js is a JavaScript runtime environment mentioned in the script for creating a function to interact with embeddings. The script provides an example of using Node.js to write a function that fetches embeddings from the OpenAI API, demonstrating how to programmatically work with embeddings in a JavaScript environment.

Highlights

Embeddings and vector databases are essential for building AI products.

Embeddings convert data like words into numerical vectors to measure similarity.

Vectors can represent complex relationships in a multi-dimensional space.

Images can also be turned into vectors for pattern recognition in searches.

Vector databases store embeddings and can be used for searching, clustering, and classification.

OpenAI provides an AI model to create embeddings but not a storage solution.

Postman is a GUI software for making API requests, useful for creating embeddings.

Creating an embedding involves sending a POST request with text input to the OpenAI API.

Embeddings can be created for single words, phrases, or large documents.

OpenAI's text-embedding-ada-002 model accepts inputs of up to roughly 8,000 tokens (about 30,000 characters), which makes it suitable for large documents.

SingleStore is a cloud database provider that supports vector databases.

A vector database can be queried using SQL to search for the most similar embeddings.

JavaScript and Node.js can be used to interact with embeddings and databases programmatically.

The video demonstrates creating a function in Node.js to fetch and use embeddings.

Embeddings can be used to create long-term memory for AI chatbots or perform semantic searches.

The tutorial covers integrating OpenAI's embeddings with a database for practical applications.

The speaker offers a digital book 'Teach Me OpenAI and GPT' for further learning on OpenAI's capabilities.

The video concludes with a summary of the basics and a suggestion to explore more advanced topics.
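
The input-size figures quoted in the video (about 8,000 tokens, about 30,000 characters, about 3,000 characters per contract page) can be turned into rough arithmetic and a naive chunker. All numbers are the video's approximations, and `chunkText` is a hypothetical helper that splits on a fixed character count without regard to word or sentence boundaries.

```javascript
// Figures quoted in the video, all approximate.
const MAX_TOKENS = 8000;
const MAX_CHARS = 30000;
const CHARS_PER_PAGE = 3000;

const charsPerToken = MAX_CHARS / MAX_TOKENS; // 3.75 characters per token
const pagesPerRequest = MAX_CHARS / CHARS_PER_PAGE; // 10 pages per request

// Naive chunker: fixed-size character slices, ignoring word and sentence boundaries.
function chunkText(text, maxChars = MAX_CHARS) {
  const chunks = [];
  for (let start = 0; start < text.length; start += maxChars) {
    chunks.push(text.slice(start, start + maxChars));
  }
  return chunks;
}

console.log(charsPerToken, pagesPerRequest, chunkText("a".repeat(70000)).length);
// 3.75 10 3
```

Each chunk would then get its own embedding and its own row, so a search can return the relevant section of a long document rather than the whole thing.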

Transcripts

play00:00

embeddings and Vector databases are

play00:02

essential if you're building any type of

play00:04

AI product in this video I'll go over

play00:06

what they are and how to use them with

play00:08

OpenAI and their APIs I'll cover this

play00:10

in three parts I'll explain the theory

play00:12

then the use and finally integration

play00:14

after watching this video you'll be able

play00:16

to create long-term memory for a

play00:19

ChatGPT or perform semantic searches based

play00:21

on a huge database of PDFs connected

play00:24

directly to an AI let's start there are

play00:27

two terms I'll cover embeddings and

play00:29

Vector databases they work together I'll

play00:31

begin with embeddings what are

play00:33

embeddings to put it simply an embedding

play00:35

is data like words that have been

play00:37

converted into an array of numbers known

play00:40

as a vector that contains patterns of

play00:41

relationships the combination of these

play00:43

numbers that make up the vector act as a

play00:46

multi-dimensional map to measure

play00:48

similarity for a simple example let me

play00:50

describe a 2d graph the words dog and

play00:53

puppy are often used in similar

play00:55

situations so in a word embedding they

play00:57

would be represented by vectors that are

play00:59

close together while this is a simple 2D

play01:02

example of a single dimension in reality

play01:04

the vector has hundreds of Dimensions

play01:06

that cover the rich multi-dimensional

play01:08

complex relationship between words

play01:10

images can also be turned into vectors

play01:13

too and it's how Google does similar

play01:15

image searches the image sections are

play01:17

broken down into arrays of numbers

play01:19

allowing you to find patterns of

play01:21

similarity for those with closely

play01:23

resembling vectors once an embedding is

play01:25

created it can be stored in a database a

play01:28

database full of these is considered a

play01:30

vector database and can be used in

play01:32

several ways including searching where

play01:34

results are ranked by relevance to a

play01:36

query string or clustering where text

play01:39

strings are grouped by similarity and

play01:41

recommendations where items with related

play01:43

text strings are recommended also

play01:45

classification where text strings are

play01:47

classified by their most similar label

play01:49

for the purpose of this video I'm going

play01:52

to just cover searching since it would

play01:54

be the most commonly used there are many

play01:56

practical ways to do this but OpenAI

play01:58

has provided a great AI model to

play02:01

specifically create embeddings it does

play02:03

not however provide a way to store them

play02:06

which we will be using a cloud database

play02:07

for later in the video now let's start

play02:10

making an embedding by accessing open AI

play02:12

on the Google page I'm going to browse

play02:15

open AI I'm going to head over to the

play02:18

open AI website where I can create a new

play02:21

account or log into an existing one

play02:23

which is what I'm gonna do it's free if

play02:25

you want to sign up I'm going to log in

play02:27

using my Google credentials and I'll be

play02:29

taken to a few options here between

play02:31

ChatGPT, DALL·E and other APIs I'm going to

play02:34

head over to their API page and this

play02:37

will take me to the dashboard here I

play02:39

want to start off having a look at the

play02:41

documentation on the embeddings which

play02:44

you can find just over here and I'll

play02:46

also link this in the description but

play02:48

what we're going to start off with is

play02:50

creating some API requests for

play02:52

embeddings themselves in order to do

play02:54

that we're going to head over to the API

play02:56

references page here under embeddings is

play02:58

all the information to create one and

play03:00

it's quite simple it's as easy as doing

play03:03

a single post request with some inputs

play03:05

and getting a response back we could

play03:07

write code to do this or do it inside of

play03:09

a terminal but the easiest way is inside

play03:11

of some GUI software and this is called

play03:13

Postman it's an API platform and it's

play03:17

also the sponsor for today's video

play03:19

Postman is a piece of software and also

play03:21

a web app that's entirely free that

play03:23

allows you to do all sorts of API

play03:25

requests it's quite easy to get running

play03:27

I'm using Windows so I'm going to

play03:29

download the windows version of the

play03:31

application here I've launched it the

play03:34

first thing I'm going to do is create a

play03:35

workspace for my API queries this I'm

play03:38

going to call open AI Vector database

play03:40

there are a few different types of

play03:42

workspaces between personal private but

play03:45

I'm going to select the team one because

play03:46

this allows me to share out my workspace

play03:49

to other people in the future if I

play03:51

wanted to for this new workspace I'm

play03:53

going to open up a new tab kind of like

play03:55

Chrome here I've got a few different

play03:57

options but the main one here is the

play03:59

type of request which can be a get post

play04:02

push I'm going to do a post request and

play04:05

here I'm going to get the URL that we're

play04:07

going to be using for the embeddings

play04:08

which is just down here and it's the

play04:11

api.openai.com/v1/embeddings

play04:13

URL I'm going

play04:16

to paste this here in the URL section

play04:18

and what's useful is that Postman is

play04:20

telling me that I do need authorization

play04:22

to use this endpoint for an API I also

play04:25

have the instructions on how to generate

play04:27

one so I'm going to head over to this

play04:29

open AI API Keys page and on this page I

play04:33

can select it to create a new secret key

play04:35

I'm going to label this embeddings and

play04:37

then I'm going to generate it make sure

play04:40

you keep this key private I'm going to

play04:42

copy and paste this into Postman and

play04:46

I'll place it here in the API key

play04:48

section be aware Postman allows any type

play04:51

of authorization we're using the bearer

play04:53

token which is how open AI authenticates

play04:56

with all this setup we can now create

play04:58

our first embedding which is the easiest

play05:00

part on the open AI website under create

play05:04

embeddings all the information to create

play05:06

a request is available it only really

play05:09

needs two things the model as well as

play05:12

the input for the model in this example

play05:14

we'll be using OpenAI's text embedding model

play05:17

text-embedding-ada-002 which is also the cheapest

play05:20

version and for the input it can be any

play05:22

type of text so here on Postman I'm

play05:25

gonna head over to create this body

play05:27

request under the body I'm going to

play05:29

select to pass in some raw information

play05:31

and this will be in the form of a Json

play05:34

object I'll add in some curly brackets

play05:36

and here I'm going to pass in the string

play05:39

for the model as well as the string for

play05:41

the input in true programming fashion

play05:43

I'll have this input say hello world and

play05:46

that's it all that's left is to send

play05:48

this post request to open Ai and its API

play05:51

and here's the response we've created

play05:53

our first embedding quite easy and this

play05:55

is also the vector for the embedding

play05:57

down here and if you preview it it's quite

play06:00

a lot of numbers let me now show a few

play06:03

different types of examples of

play06:04

embeddings that could be created there

play06:06

are single word embeddings these are

play06:09

things like dog or cat where the

play06:11

embedding is a single word and the

play06:13

embedding Vector is generated when it's

play06:15

sent to open AI these would be used in

play06:18

situations where you might want to

play06:19

perform a search you can do multi-word

play06:21

embeddings too so I could for example

play06:23

have a small sentence like open AI

play06:26

vectors and embeddings are easy and this

play06:29

creates a more nuanced embedding but for

play06:32

the most part it looks the same to us

play06:34

humans the strength of embeddings is

play06:36

where you chunk large bits of

play06:38

information together such as paragraphs

play06:40

or entire sections of documents to

play06:44

create an embedding that can be then

play06:46

later drawn upon when you search a

play06:48

database and for some useful information

play06:50

Ada version 2 allows for a maximum input

play06:53

of 8,000 tokens more or less which is

play06:55

around 30,000 letters or characters and

play06:58

to give you an idea of how much that is

play07:00

a page from a contract or a legal

play07:02

document can be up to about 3,000

play07:04

characters on a single page this means

play07:07

you could probably embed about 10 pages

play07:09

in a single request to give you an idea

play07:12

of what this looks like let me copy

play07:14

paste this entire page from this NDA

play07:16

contract then I'm going to jump into

play07:19

Postman and paste this here into the

play07:21

input removing any line breaks or

play07:23

additional spaces this gives you an idea

play07:26

of just how much is being currently

play07:27

embedded and this embed request takes

play07:30

just about as much time as a single word

play07:32

now that we can create embeddings we

play07:34

need to store them somewhere OpenAI

play07:36

doesn't provide databases so we'll need

play07:38

to create our own and a database full of

play07:40

embeddings is often referred to as a

play07:42

vector database I'm going to use a

play07:45

provider called SingleStore they

play07:47

provide a real-time unified distributed

play07:49

SQL database which also is quite easy to

play07:52

use since it's in the cloud and on top

play07:54

of that they allow for you to

play07:56

incorporate Vector databases straight in

play07:59

there so what we'll do is set up a

play08:00

database and we'll start storing our

play08:02

embeddings and then also start searching

play08:04

through them first you'll want to set up

play08:06

an account it's free and comes with

play08:08

additional credit as well as the ability

play08:10

to set up unlimited databases I already

play08:12

have an account so I'm just going to

play08:13

sign in via Google on the main dashboard

play08:16

the first thing you'll want to do is

play08:17

create a workspace and in here we'll

play08:20

soon create a database for your

play08:22

workspace you can call it anything in

play08:24

mind I'm going to call it open AI Vector

play08:26

database I'm going to select the cloud

play08:28

platform for AWS but you can also select

play08:32

a Google or Microsoft Azure platform and

play08:35

for the region I would recommend just

play08:37

picking the one that's closest to you

play08:38

but I'm just going to leave it to the

play08:40

default on us West next I'll select the

play08:42

button next on this page I can set how

play08:45

fast the workspace is how many CPUs how

play08:48

much RAM it has but for this tutorial

play08:50

I'm not going to do much I'm just going

play08:51

to keep it at the bare minimum I'll

play08:53

leave the advanced settings as the

play08:55

default and select next this workspace

play08:57

will set up in the background and once

play09:00

complete I'm going to create a database

play09:02

on it let me close these info panels

play09:04

here we'll have the workspace on the

play09:06

left hand side and on the right side I'm

play09:08

going to select create a database I'm

play09:10

going to name this database open AI

play09:13

database and then select to create it

play09:15

once it's created it'll show up here in

play09:18

the interface and I can select to view

play09:20

it it doesn't have any tables or data

play09:23

just now so what we'll do next is head

play09:25

over to the SQL editor where we can run

play09:28

some simple commands to create a table

play09:31

and then input some data make sure that

play09:34

on the top you have selected the

play09:36

database you're currently using so in

play09:38

this case I'm going to select the

play09:39

database open AI database and now I'm

play09:42

going to write a simple SQL query it'll

play09:44

be CREATE TABLE IF NOT EXISTS and I'll

play09:48

have the name of the table as my table

play09:49

actually I'll do my Vector table here in

play09:53

Brackets I'm then going to put in the

play09:55

different types of columns I'm going to

play09:57

have the first will be the text the

play09:59

original text as a text type the next

play10:02

will be the vector and the vector will

play10:04

be a blob type you can have different

play10:07

types such as integers and numbers and

play10:09

decimals and you can even have a larger

play10:11

table with things like an ID and other

play10:14

attributes like a URL but I'm keeping

play10:16

this nice and simple for this demo let's

play10:18

run this command and this has created

play10:21

the table here we can see it logged below

play10:23

I can head into the database and select

play10:26

it and view the table additionally I can

play10:28

view the different columns as well as

play10:30

the data type additionally I can head

play10:32

over to sample data to have a look at the rows

play10:35

which don't exist yet let's create our

play10:37

first row and insert a vector into this

play10:40

database what I'm going to do is open up

play10:42

the SQL editor in here I'm going to

play10:45

write some new syntax to insert a row

play10:47

this is insert into the table name which

play10:50

is my Vector table then in brackets

play10:53

text and Vector being the two attributes

play10:55

we want to fill and then the values for

play10:58

those which in this case will be the

play11:01

ones that we have in Postman so I'm

play11:03

going to open up a postman here I'm

play11:04

going to copy out of this input hello

play11:06

world and I'm going to paste this here

play11:08

into the values next I'll put in the

play11:11

embedding here I'm going to pass in

play11:13

JSON_ARRAY_PACK which

play11:16

is necessary to turn it into this blob

play11:18

structure and in here I'm going to pass

play11:20

in the string of the array this is in

play11:23

Postman here under embedding and it's

play11:25

this huge block of numbers I'll select

play11:28

to copy all of them and then here in the

play11:31

interface I'll paste them back in and

play11:33

that's it now I simply have to run the

play11:35

command and this should enter this

play11:37

embedding into the vector database which

play11:40

it just has here we can see the results

play11:42

one row added and if I head over to the

play11:45

open AI database under my Vector table

play11:48

I'll be able to see under sample data

play11:50

that a new row has been added now it's

play11:53

time to add some more data because we

play11:55

can't really have a database with just a

play11:57

single row what I'm going to do is

play11:59

head back to postman search through the

play12:02

history of all the posts we've done so

play12:04

far and pull out some of these examples

play12:08

to place in as rows inside of the

play12:11

database so I've got this one here for

play12:12

open AI vectors and embeddings are easy

play12:15

I'll copy the input as well as the

play12:18

embedding straight into SingleStore wow

play12:21

it always surprises me how many

play12:22

dimensions there are to this embedding

play12:24

now I'll enter this and insert this row

play12:27

heading back to the database I'll be

play12:29

able to view the sample data and here it

play12:31

is let me try this one more time with a

play12:33

larger example I've got this one here

play12:35

which was the document which had quite a

play12:38

few different characters in here and the

play12:40

process is more or less the same copying

play12:42

out the text then copying the embedding

play12:45

vector and then finally inserting the

play12:48

row into SingleStore let me show you

play12:50

the three rows that now exist that are

play12:53

quite simple but enough for us to start

play12:55

performing searches searching a vector

play12:57

database for embeddings is actually

play12:59

quite simple the first step is to

play13:02

identify what you want to search for so

play13:04

for example we might want to search for

play13:06

anything related to OpenAI next we have

play13:09

to create an embedding for our search

play13:11

term in this case we would create an

play13:13

embedding for the word OpenAI and then

play13:16

finally we would perform a search in the

play13:18

database against the existing embeddings

play13:21

this would return a list with the

play13:23

closest similarity being at the top
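As a rough illustration of the three steps just described, here is a small in-memory version of the ranking. It is only a sketch: the function names are hypothetical, and real embeddings would come from the OpenAI API and have far more dimensions than the toy vectors in the usage note. The dot product is used as the score, the same measure the SQL query in the video relies on.

```javascript
// Dot product of two equal-length numeric vectors.
function dotProduct(a, b) {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same number of dimensions');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i += 1) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Hypothetical helper: scores every stored row against the search
// embedding and returns the best matches first.
function rankBySimilarity(searchEmbedding, rows, limit = 5) {
  return rows
    .map((row) => ({
      text: row.text,
      score: dotProduct(row.vector, searchEmbedding),
    }))
    .sort((x, y) => y.score - x.score) // highest score at the top
    .slice(0, limit);
}
```

For example, with made-up three-dimensional rows `[{ text: 'a', vector: [0.9, 0.1, 0] }, { text: 'b', vector: [0, 1, 0] }]` and the search vector `[1, 0, 0]`, row `a` comes back first because its dot product with the search vector is larger.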

play13:25

heading over to the SQL editor I'm going

play13:28

to write this query by writing the

play13:29

following select text which is one of

play13:32

the columns then I'm going to pass in

play13:34

dot_product passing in the

play13:37

next column the vector column and adding

play13:40

in JSON_ARRAY_PACK

play13:42

I'll leave this empty for now but this

play13:45

is where I'll add the array and then

play13:46

convert this to a score that we can use

play13:48

for ranking results next I'll grab this

play13:52

from my vector table and I'll order it

play13:55

by the score in descending order this

play13:58

will be limited to just five results now
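The query described here could be assembled in code along these lines. This is a sketch under assumptions: the default table name `myvectortable` is a guess at what "my Vector table" refers to, `buildSearchQuery` is a hypothetical helper, and the `?` placeholder assumes a MySQL-style driver. The `text` and `vector` columns, `dot_product`, `JSON_ARRAY_PACK`, the descending order by score, and the limit of five come from the walkthrough.

```javascript
// Hypothetical helper: builds the similarity search statement for a
// given search embedding. The table name default is an assumption.
function buildSearchQuery(searchEmbedding, limit = 5, table = 'myvectortable') {
  const isVector =
    Array.isArray(searchEmbedding) &&
    searchEmbedding.length > 0 &&
    searchEmbedding.every((n) => Number.isFinite(n));
  if (!isVector) {
    throw new TypeError('searchEmbedding must be a non-empty array of finite numbers');
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
    throw new TypeError('table must be a plain identifier');
  }
  return {
    sql:
      'SELECT text, dot_product(vector, JSON_ARRAY_PACK(?)) AS score ' +
      `FROM ${table} ORDER BY score DESC LIMIT ${limit}`,
    // The search embedding fills the spot that was left empty in the editor.
    values: [JSON.stringify(searchEmbedding)],
  };
}
```

The limit and table name are validated and written into the statement directly, while the embedding itself travels as a parameter.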

play14:01

opening up Postman I'm going to need to

play14:03

create an embedding to search so the

play14:06

embedding term I want to use is OpenAI

play14:08

this will create the vector here which

play14:12

I'm going to copy out and then paste

play14:15

back into the SQL query the SQL query

play14:18

will use this as a reference and use it

play14:21

as part of the scoring system so let me

play14:23

run this and show you the results here's

play14:26

our successful search of the vector

play14:27

database against the term OpenAI and

play14:30

all the other vectors that are similar

play14:32

to it the scores are ratings essentially

play14:35

how highly ranked it is is how similar

play14:37

it is to the content already in the

play14:39

database I could create another search

play14:41

term embedding here called hello Earth

play14:43

and use the vector that I receive as a

play14:47

search and have a try to see if this

play14:49

changes up the results pasting this into

play14:53

the query here on the vector database

play14:55

I'll find that now hello world is the

play14:58

top ranking result with a much

play15:00

higher score than all the others and

play15:02

this is fundamentally how vector

play15:03

searching works in this part of the

play15:05

video I'm going to create an actual

play15:07

function using JavaScript on node.js to

play15:11

interact with embeddings first let me

play15:13

create a brand new folder called openai

play15:15

vectors and embeddings next I'm going to

play15:18

do a fetch request to the OpenAI API

play15:21

so here inside of the file index.js I'm

play15:25

going to write in a small header here to

play15:28

connect up to OpenAI in this header I'm

play15:31

going to make the content type as

play15:33

application/json and I'm also going to

play15:36

pass in the authorization which will

play15:38

have the key here as a bearer token

play15:40

similar to what I did inside of Postman

play15:43

the safe and secure way to do this is

play15:45

using an environment variable but for this

play15:48

example here I'm just going to pass in

play15:50

my token straight away to make life

play15:52

easier for myself this of course is

play15:54

something you want to keep secure and

play15:56

after this video I'll be deleting this

play15:58

key now to create the function I'm

play16:00

going to create an async function and I'm

play16:02

going to call this create embedding it's

play16:05

only going to pass in one item which

play16:07

will be the text and this is the text

play16:09

that we're going to embed and I'm going

play16:11

to use this to pass to an API now I'm

play16:14

not going to use axios or anything like

play16:15

that I'm just doing a regular fetch

play16:17

command here so I'm going to pass in let

play16:20

response equals await fetch I'm going to

play16:23

pass in the URL for OpenAI's version 1

play16:26

of embeddings which is just

play16:29

/v1/embeddings then

play16:31

I'm going to pass in a couple of

play16:33

parameters this will be a post request

play16:36

so I'll set the method here as a post

play16:38

I'll pass in the headers that we just

play16:40

defined earlier and then finally I'll

play16:44

pass in the body which I'm going to JSON

play16:46

stringify and this is the post data

play16:49

there's two items here just like before

play16:51

the model itself which is the text

play16:54

embedding Ada model and the second item

play16:57

is the input which is the text we want

play17:00

to embed next I'll listen to see if

play17:04

there is a response from the server and

play17:06

that response is okay and then I just

play17:08

want to console log it out and return it

play17:11

as part of this function here I'll write

play17:13

response.json() so it'll turn it into a

play17:17

JSON object and then I'll pull out the

play17:20

data from that I'll console log out the

play17:23

data and then I'll return the data this

play17:25

function is now complete so what I'm

play17:28

going to do is call it to test it out
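A sketch of how the function described above might look, not a copy of the code on screen. Several details are assumptions: the model name `text-embedding-ada-002` (the transcript only says "text embedding Ada model"), reading the key from an `OPENAI_API_KEY` environment variable instead of hardcoding it, the optional `fetchImpl` parameter added so the function can be tried without a network call, and returning only the embedding array rather than the whole response. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
const OPENAI_EMBEDDINGS_URL = 'https://api.openai.com/v1/embeddings';

// Sends one piece of text to the embeddings endpoint and returns its vector.
async function createEmbedding(text, options = {}) {
  const {
    apiKey = process.env.OPENAI_API_KEY, // assumed variable name
    model = 'text-embedding-ada-002',    // assumed model name
    fetchImpl = globalThis.fetch,        // injectable for testing
  } = options;

  if (!apiKey) {
    throw new Error('No API key provided');
  }

  const response = await fetchImpl(OPENAI_EMBEDDINGS_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, input: text }),
  });

  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }

  // The response body has the shape { data: [{ embedding: [...] }], ... }.
  const json = await response.json();
  return json.data[0].embedding;
}
```

Calling `createEmbedding('hello world').then(console.log)` with a valid key in the environment should print the array of numbers for that text.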

play17:30

here I'm going to call create embedding

play17:33

and just pass in hello world next I'll

play17:36

open up the terminal and run node

play17:39

index.js to run this function and just

play17:42

like on Postman this embedding gets

play17:44

returned quite quickly we can see it

play17:46

just over here with the array here on

play17:49

the right hand side and we can plug this

play17:50

into a database next what can we do with

play17:53

this well there's quite a few things you

play17:55

can import a PDF library to read PDFs

play17:58

chunk them and then store them in a

play18:00

database so that they could be retrieved

play18:02

later and searched through you could do

play18:04

the same for websites or you could do

play18:06

the same for all sorts of things if

play18:08

you've stayed until this point you

play18:10

definitely want to learn more about

play18:11

OpenAI check out my new book Teach Me

play18:13

OpenAI and GPT this is a digital book

play18:17

I've put on sale for 49 that covers

play18:19

everything from how to use the OpenAI

play18:21

API how to do fine tuning and much more

play18:24

in a really visually interesting and fun

play18:26

way it's over 48 pages and it's in the

play18:29

description below these were just the

play18:30

basics if you want to learn more have a

play18:32

look at my video here on integrating

play18:33

OpenAI, a database, and even a web app

play18:36

together and if you haven't already this

play18:38

is the end of the video so don't forget

play18:40

to hit


Related Tags
AI Embeddings, Vector Databases, Semantic Search, OpenAI API, Postman Tool, Database Integration, Text Embedding, Image Search, AI Product, Search Algorithms, SQL Database