LlamaIndex Webinar: Advanced RAG with Knowledge Graphs (with Tomaz from Neo4j)

LlamaIndex
18 Jun 2024 · 53:32

Summary

TLDR: In this webinar, LlamaIndex and Neo4j jointly held a workshop on the property graph index. It teaches how to build advanced knowledge graphs using the new property graph abstractions: creating a knowledge graph from documents through graph constructors, and fetching information based on user questions through graph retrievers. It also covers applied topics such as detecting and consolidating custom entities and generating Cypher queries from text.

Takeaways

  • 📚 This webinar is a workshop on the property graph index, jointly hosted by LlamaIndex and Neo4j, that teaches how to build advanced knowledge graphs.
  • 🔍 A property graph is a graph model in which both nodes and relationships carry properties, and it underpins the new GQL standard developed under the ISO committee.
  • 🛠️ In LlamaIndex, the process takes text from documents and uses graph constructors to create a knowledge graph.
  • 🌟 LlamaIndex provides out-of-the-box tools for beginners and a modular, customizable pipeline for advanced users.
  • 🔧 Three graph constructors are available in LlamaIndex: the implicit path extractor, the simple LLM path extractor, and the schema LLM path extractor.
  • 🔗 Entity consolidation (resolving duplicate entities) is emphasized as an important step for improving the structural integrity of a knowledge graph.
  • 🔎 Property graph retrievers hold the logic for fetching information from the knowledge graph based on the user's question, and LlamaIndex provides several of them.
  • 📝 The workshop demonstrates custom entity detection and how to define a custom retriever, enabling more flexible retrieval.
  • 🤖 Graph construction and retrieval with LLMs (Large Language Models) can produce different results depending on the model, so appropriate prompts and schema definitions matter.
  • 📈 The webinar also discusses building and applying knowledge graphs in specialized domains such as technical documents, patents, and scientific papers, with applications expected in real business and research settings.
  • 🔑 At the end of the workshop, attendees interested in blog posts and case studies for application and enterprise developers are invited to get in touch.

Q & A

  • What is the main theme of this webinar, offered through the LlamaIndex and Neo4j partnership?

    -This webinar teaches how to build advanced knowledge graphs by combining the property graph index with LlamaIndex.

  • What kind of data model is a property graph?

    -A property graph is a graph model in which both nodes and relationships carry properties, and it is the model behind the new GQL standard developed under the ISO committee.

  • How many graph constructors does LlamaIndex provide?

    -LlamaIndex provides three graph constructors: the implicit path extractor, the simple LLM path extractor, and the schema LLM path extractor.

  • How do graph constructors create a knowledge graph from documents?

    -Graph constructors take the text from the given documents, extract structured information, and store it in the knowledge graph.

  • What approaches can you take when implementing a custom graph index?

    -The property graph index is modular and customizable, so beginners can start with the out-of-the-box components, while advanced users can customize the pipeline to fit their own needs.

  • How does the webinar resolve duplicate entities?

    -The webinar demonstrates finding potential candidates using text embeddings and word-distance statistics, and then merging them to resolve duplicate entities.

  • What do graph retrievers do?

    -Graph retrievers hold the logic for fetching data from the knowledge graph based on the user's question, and there are four built-in retrievers.

  • What is the advantage of the Text2Cypher retriever, which generates Cypher statements from text?

    -The Text2Cypher retriever is highly flexible: if the user knows exactly what information is in the database, it can answer more complex questions.

  • Describe the flow of the property graph index integration used in LlamaIndex.

    -The integration starts with documents, creates a knowledge graph using graph constructors, and then uses graph retrievers to fetch information based on the user's question.

  • What is the main purpose of the custom retriever introduced in the webinar?

    -The custom retriever is designed to detect specific entities in the text and select information from the knowledge graph based on those entities.

Outlines

00:00

📚 Introduction to the property graph index by LlamaIndex and Neo4j

This segment introduces the webinar series jointly hosted by LlamaIndex and Neo4j, in which you can learn how to build advanced knowledge graphs. The concept of a property graph is explained, emphasizing that both nodes and relationships carry properties. The document-to-graph conversion process and the roles of graph constructors and graph retrievers are also touched on.

05:02

🛠️ Building property graphs and their customizability

This segment explains the property graph construction process in detail: the flow from documents through graph constructors to a knowledge graph is introduced, along with the types of graph constructors available. It also emphasizes the out-of-the-box options for beginners and the flexibility of custom pipelines for advanced users.

10:03

🔍 Graph constructors in detail and practical applications

Here the available graph constructors are described in detail: the implicit path extractor, the simple LLM-based extractor, and the more advanced schema LLM-based extractor. How each constructor works and how it can be customized is shown concretely.

15:03

🤖 Overview of graph retriever types and features

This segment explains the features and types of graph retrievers. It gives an overview of graph retrievers, which hold the logic for fetching information from the knowledge graph based on the user's question, and outlines the four available retrievers. The advantages and drawbacks of each method are compared, and the trade-off between flexibility and accuracy is discussed.

20:05

📝 Demonstration of converting text into a property graph

This segment demonstrates converting text into a property graph in practice. Using GPT-4, a graph is created according to a specific schema, and resolving duplicate entities along the way is also covered. As a practical example, the process of extracting business-related information from news articles is explained.

25:06

🔗 Entity consolidation and developing a custom retriever

Here the importance of and methods for entity consolidation are discussed: identifying duplicate entities and merging them to improve the structural integrity of the knowledge graph. The segment also outlines how to develop a custom retriever, combining entity extraction with the vector context retriever.

30:07

🚀 Implementing and applying a custom retriever

This final segment explains the implementation of a custom retriever and its applications in detail, starting from entity extraction and moving to retrieving information based on specific entities. It also emphasizes that this process yields more accurate and comprehensive retrieval.


Keywords

💡Property graph

A property graph is a graph model in which nodes and relationships carry properties. The concept is essential to the video's subject, knowledge graph construction and retrieval. For example, the video explains that "in the property graph model, nodes carry labels," which are used to sort nodes into categories.

💡Graph constructor

A graph constructor is the process that extracts structured information from text and stores it in a graph. The video emphasizes that graph constructors extract information from documents and create the knowledge graph.

💡Graph retriever

A graph retriever holds the logic for fetching data from the knowledge graph based on the user's question, and is the central concept in the video's discussion of retrieval. For example, the video explains that "graph retrievers are used to fetch data based on the user's question."

💡Label

A label is a special property in a property graph used to sort nodes into categories. The video describes "node labels" as indicating the category a particular node belongs to.

💡Document

In the video, a document acts as a wrapper around text and is the starting point from which graph constructors extract information. The video defines "a document is a wrapper around text" and emphasizes its role in the knowledge graph creation process.

💡Entity

An entity is a real-world thing or concept represented as a node in the knowledge graph. The video explains that consolidating duplicate entities is important for improving the structural integrity of the graph.

💡Relationship

A relationship connects nodes in a property graph and can itself carry properties. The video notes that "relationships have properties," which contributes to the richness of the knowledge graph.

💡Knowledge graph

A knowledge graph represents information as a graph structure, with data connected and visualized through nodes and relationships. Knowledge graphs play the central role in the video's subject of graph construction and retrieval.

💡Text chunk

A text chunk is a unit produced by splitting a document's text, and is the basic element processed by graph constructors. The video explains that text chunks are stored in the graph and describes their role in the knowledge graph construction process.

💡Custom retriever

A custom retriever lets users implement their own retrieval logic, an important concept in the video for advanced retrieval needs. The video discusses how to create a custom retriever, enabling flexible retrieval that covers entity detection as well as whole-text search.

Highlights

The webinar is a workshop on the property graph index, jointly hosted by LlamaIndex and Neo4j.

Introduction to the property graph concept and to GQL, the standard in which both nodes and relationships carry properties.

Overview of LlamaIndex's new property graph index integration and how structured information is extracted from documents into a graph.

The modularity and customizability of graph constructors and graph retrievers.

The concept of a "document" as a wrapper around text, and the process of converting text into a graph.

Introduction to the three graph constructors provided in LlamaIndex and their characteristics.

The importance of custom entity detection and of entity merging for improving the structural integrity of the graph.

A method for custom entity deduplication using text embeddings and word-distance statistics.

Introduction to the four types of property graph retrievers and how each of them retrieves.

How to build a custom retriever that runs entity extraction through graph retrieval as a single process.

Choosing LLMs (Large Language Models) and recommended models for knowledge graph construction.

The drawbacks of a human-in-the-loop approach to graph construction and the advantages of automated re-extraction passes.

Neo4j's suitability and real-world applications for specialized documents such as technical documentation, patents, and scientific papers.

Sharing of an involved Cypher query for entity deduplication, and its usefulness.

A code walkthrough of the custom retriever, from entity extraction in text to identifying relationships.

Wrap-up of the webinar, and the potential of knowledge graphs with LlamaIndex and Neo4j as a superset of existing RAG solutions.

Close of the workshop, upcoming plans, and the announcement that the webinar will be published on YouTube.

Transcripts

play00:00

Jerry: Hey everyone, welcome back to another episode of the LlamaIndex webinar series. Today is shaping up to be one of our most popular webinars and workshops ever: property graph indexes in LlamaIndex, in partnership with Neo4j. This is going to be a special workshop teaching you how to build advanced knowledge graph RAG, and you'll be able to learn how to use our brand new property graph abstractions, both to construct a graph and to query a graph. We're excited to host Tomaz from Neo4j as well as Logan from our side. Without further ado, feel free to kick it off.

play00:36

Tomaz: Okay, so like I said, I'm happy to be here and talk about graphs. As mentioned, today we're going to talk about the new property graph index integration in LlamaIndex. You might be wondering what a property graph actually is, because most of the time when dealing with graphs, especially in RAG-like frameworks, what we see is usually triples: subject, relationship type, and object. But property graphs have now got an actual standard, the new GQL standard, which is part of the ISO committee's work, and that's very exciting. Basically, "property graph" means that nodes have properties and relationships have properties as well. For example, here we have one node, and it has properties: a name (Amy), a date of birth, an employee ID. But it also has one special property that we call a label in property graph models. As you can see, the green node has an Employee node label, and node labels are used to put nodes into sets, or categories. In this example we have three node labels: Employee, Company, and City. And as mentioned before, relationships can also have properties. So it's a slightly different data representation from what you might be used to from, say, the previous knowledge graph implementation in LlamaIndex, where we only dealt with triples.

play03:04

So that's property graphs; now let's talk a little bit about the property graph index integration. The flow you will follow most of the time when using the property graph index integration: you start with a bunch of documents. LlamaIndex has great support for various types of documents, so I won't go into that, but basically you can think of documents as just wrappers around text. We take a bunch of documents and pass the text from those documents to graph constructors. In the integration that Logan did, you can use one or multiple graph constructors to create a knowledge graph; we'll talk a little bit more about them later, and I'll show you which are available out of the box. The graph constructors extract structured information from documents and store it as a graph in the knowledge graph. There are a couple of integrations that LlamaIndex already has; as I'm from Neo4j, I will focus on the Neo4j integration, but there are others as well, and probably more coming up.
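In code, that construction flow looks roughly like the following (a minimal sketch assuming the LlamaIndex property graph packages are installed and a Neo4j instance is running; the connection details and sample text are placeholders):

```python
from llama_index.core import Document, PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

# Documents are just wrappers around text.
documents = [Document(text="Former OpenAI employees founded new companies ...")]

# Neo4j acts as the underlying property graph store (placeholder credentials).
graph_store = Neo4jPropertyGraphStore(
    username="neo4j", password="password", url="bolt://localhost:7687"
)

# Graph constructors turn the document text into a knowledge graph.
index = PropertyGraphIndex.from_documents(
    documents,
    llm=OpenAI(model="gpt-4o"),
    embed_model=OpenAIEmbedding(),
    property_graph_store=graph_store,
    show_progress=True,
)
```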

play04:43

So once you've built your knowledge graph, on the other side we have the so-called graph retrievers. The graph retrievers' job is, based on the user question, to apply some sort of logic to retrieve data from the knowledge graph. Again, there are a couple available out of the box, and we'll also show how easy it is to define a custom graph retriever in the workshop later.

play05:30

So that's basically the flow you can think of. What LlamaIndex does is provide graph constructors and graph retrievers; obviously also the other parts, it's just that the graph constructors and graph retrievers are part of the new property graph index integration. And the idea is that they are very modular and customizable: if you're a beginner you can just use something out of the box, but if you're an advanced user and you need something custom, it's very easy to customize the pipeline for your needs.

play06:11

At a high level, property graph construction, as I said, takes a bunch of documents and constructs a knowledge graph. Here I have an example with documents where former OpenAI employees founded new companies. For example, here you would have four different documents, each containing some information. The nice thing about knowledge graphs is that you condense and unify that information, so that information previously spread across multiple documents is now easily accessible and nicely represented in a knowledge graph.

play07:17

Okay, and then I prepared a couple of slides on what is available out of the box in LlamaIndex. Out of the box we have three graph constructors. The first one is the so-called implicit path extractor. We have a new word for what it produces: a "lexical graph". The green node is the original document, and the extractor just chunks the document, so the gray nodes are text chunks. Text chunks are connected to the source document, and we also keep an ordered list of text chunks so that we know how they follow each other. This is the implicit path extractor graph constructor, and it doesn't require an LLM, because it's just chunking up text and creating a linked list of text chunks.
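A tiny sketch of wiring it in (the class name follows the LlamaIndex property graph module; `documents` and `graph_store` are the ones from the earlier sketch):

```python
from llama_index.core import PropertyGraphIndex
from llama_index.core.indices.property_graph import ImplicitPathExtractor

# No LLM needed: this extractor only links chunk nodes to their source
# document and to each other, forming the "lexical graph" described above.
index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[ImplicitPathExtractor()],
    property_graph_store=graph_store,
)
```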

play08:36

And then the next one is the simple LLM path extractor, and as the name suggests, you need an LLM for it. From what I've been digging around in the implementation, the simple LLM path extractor works basically through prompt engineering: in the prompt you define what the output should look like, and then you provide a parsing function that extracts that output from the LLM and creates a knowledge graph. So I would call it a prompt-based solution. By default, all nodes have the same label; in the default implementation all nodes have the same label. Again, the purple node is the text chunk; we always store the reference text in the graph as well, and the entities mentioned in a text chunk get MENTIONS relationships to those entities. Then obviously entities can have relationships between each other; for example, "Amelia Earhart was an American aviation pioneer." This is the simpler version of graph extraction; obviously you can customize it and make it more advanced, but by default all nodes have the same label.
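A sketch of using it (assuming an OpenAI model; both the prompt and the parse function can be overridden, but the defaults work out of the box):

```python
from llama_index.core.indices.property_graph import SimpleLLMPathExtractor
from llama_index.llms.openai import OpenAI

# Prompt-based extraction: the LLM is asked for (subject, relation, object)
# triples, which a parsing function turns into graph nodes and edges.
kg_extractor = SimpleLLMPathExtractor(
    llm=OpenAI(model="gpt-4o", temperature=0.0),
    max_paths_per_chunk=10,  # cap the number of triples per text chunk
    num_workers=4,           # extract from several chunks concurrently
)
```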

play10:39

And then the more advanced one is the schema LLM path extractor. Here you have the ability to define which node labels and relationship types can be extracted. We'll also use this one in the workshop, so you'll get to see in practice how we define the schema. Basically, as you can see, the different colors of nodes mean that they have different node labels; again, the purple ones are the text chunks, the text chunks mention the entities that appeared in them, and then obviously we have a bunch of relationships between those entities as well. This one works best with LLMs that provide native function calling, like the commercial ones: OpenAI, Gemini, Mistral, probably some others. Groq is a nice one as well; Groq is actually really nice because it's really fast, and that's great for knowledge graph construction. So it works best with LLMs with native function calling, but Logan told me it will also work with models that don't provide native function calling, just not as well, or maybe the schema should be simpler when using those LLMs. As I said, I haven't tested that out, but maybe you can test it and let us know how it works. So these are basically the out-of-the-box graph constructors that you can use.

play13:00

And then there's something that's not in LlamaIndex yet, but since LlamaIndex provides low-level connections to graph stores (Neo4j in this example), we can also come up with custom entity disambiguation, which we will do in this workshop as well. Entity disambiguation just means that if you have multiple nodes in the knowledge graph that reference the same real-world entity, you want to merge them together into a single node so that you have better structural integrity. For example, here one node ends with "Limited", this one has the abbreviation, and this one doesn't have "Limited" at all, but they all reference the same entity; that's why we want to merge them into a single node, for better structural integrity. In the workshop we'll use a combination of text embeddings and word-distance heuristics to find potential candidates and then merge them together. If you're eager, you can follow this link; this is basically the notebook we'll be using today.

play14:34

Okay. And then on the other side we have property graph retrievers. As you see, I just took the previous image and sliced it up, so here we have the remaining part of the arrow. The graph retrievers, as I said, have some logic for retrieving information from the knowledge graph based on the user input, and they then pass that information to an LLM so that the LLM can generate the final answer: basically a typical RAG pipeline. We have, I think, four out-of-the-box retrievers that you can use. I didn't have time to draw nice diagrams, so I'll just summarize them quickly.

play15:30

The first one is the LLM synonym retriever. It takes the user query, generates synonyms using an LLM, and then finds relevant nodes using exact keyword match. That's really important, because the LLM is not aware of any values in the database when it's generating the synonyms, so it's not a given that the LLM will know how to construct keywords that will match any nodes in the graph. Because it uses exact keyword match (at least the new Neo4j integration does), it's the least reliable: it needs an exact keyword match. We could optimize this to allow some misspellings and things like that, but at the moment it's using exact keyword match. Once it finds relevant nodes, it returns basically the direct neighborhood; you have an option to decide what neighborhood size of nodes you want to return, but by default we return just the direct neighbors of that node.
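A sketch of plugging it in (names follow the LlamaIndex property graph module; the synonym prompt and keyword count are configurable but omitted here):

```python
from llama_index.core.indices.property_graph import LLMSynonymRetriever
from llama_index.llms.openai import OpenAI

# Expands the query into synonyms/keywords with the LLM, matches graph
# nodes by exact keyword, and returns their direct neighborhoods.
synonym_retriever = LLMSynonymRetriever(
    index.property_graph_store,
    llm=OpenAI(model="gpt-4o"),
    include_text=False,  # return only the triples, not the source chunks
    path_depth=1,        # neighborhood size: direct neighbors by default
)
retriever = index.as_retriever(sub_retrievers=[synonym_retriever])
```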

play17:02

And then the second one is the vector context retriever. In the previous one we used exact keyword search to find relevant nodes, but here we're using vector search, which means it's more robust and less reliant on exact keyword match: with vector search you will always get some results from the database, because you take the top n. Hopefully some relevant nodes are identified using vector search, and then we just do the same thing as in the previous one: we return the direct neighborhood of the relevant nodes that were found using the vector search.
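A matching sketch (the retriever needs an embedding model; similarity_top_k is the top-n node matches mentioned above):

```python
from llama_index.core.indices.property_graph import VectorContextRetriever
from llama_index.embeddings.openai import OpenAIEmbedding

# Finds entity nodes by vector similarity instead of exact keywords,
# then returns their direct neighborhood (plus source text if asked).
vector_retriever = VectorContextRetriever(
    index.property_graph_store,
    embed_model=OpenAIEmbedding(),
    similarity_top_k=4,  # top-n nodes to match
    path_depth=1,        # how far to expand around each matched node
    include_text=True,   # also return the originating text chunks
)
```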

play17:56

And then another one is the text-to-Cypher retriever. As the name implies, we take the text and use an LLM to generate a Cypher statement. This is a very flexible approach, because the LLM can construct any sort of Cypher statement. For example (oops, how do I go back), with vector context, where you're just searching for relevant nodes using vector search and returning direct neighborhoods, it's very hard to answer questions like "how many nodes are in the graph?", because that's an aggregation query, and vector context is not suitable for aggregation queries, at least not at a global scale.

play19:04

But with text-to-Cypher you could ask questions like "how many nodes are in the graph?" or "how many people are in the graph?", and the LLM will generate an appropriate Cypher statement and return that information for you. So text-to-Cypher is much more flexible than the previous two, but there's obviously always a trade-off: it is less reliable, because we're using an LLM to generate Cypher statements, and at the moment it's mostly correct, but not always. So you're trading a bit of accuracy for flexibility; on the other side, text-to-Cypher also allows you to do aggregations and things like that, which the previous retrievers didn't allow.
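A minimal sketch (by default the retriever shows the LLM the graph schema and runs the generated Cypher against the store; the generation prompt can be customized):

```python
from llama_index.core.indices.property_graph import TextToCypherRetriever
from llama_index.llms.openai import OpenAI

# The LLM sees the graph schema plus the question, writes a Cypher query,
# and the query results come back as retrieval context.
cypher_retriever = TextToCypherRetriever(
    index.property_graph_store,
    llm=OpenAI(model="gpt-4o"),
)
```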

play20:14

And then the last one is the so-called Cypher template retriever. Here, instead of generating Cypher statements with an LLM, you define the Cypher statement you want to be executed, and you just provide a parameterized Cypher template. Basically, you have a Cypher statement with one or more parameters, and you provide instructions to an LLM on how to populate those parameters. Then at query time the LLM extracts the relevant parameter values to use with the Cypher template, populates the template, and executes the predefined Cypher statement. That's where the "template" comes from: it's predefined.
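A sketch of the idea (the parameters are usually described with a Pydantic model so a function-calling LLM can fill them in; the query below is an illustrative placeholder, not from the notebook):

```python
from pydantic import BaseModel, Field
from llama_index.core.indices.property_graph import CypherTemplateRetriever

# A fixed, parameterized Cypher statement; only $names is filled by the LLM.
cypher_query = """
MATCH (o:ORGANIZATION)-[:COMPETITOR]->(c:ORGANIZATION)
WHERE o.name IN $names
RETURN o.name AS organization, collect(c.name) AS competitors
"""

class TemplateParams(BaseModel):
    """Parameters the LLM must extract from the user question."""
    names: list[str] = Field(description="Organization names in the question")

template_retriever = CypherTemplateRetriever(
    index.property_graph_store, TemplateParams, cypher_query
)
```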

play21:27

Here I have some questions listed, but let's do a demo.

play21:33

Jerry: I'm just going to read through some questions in the chat, to make sure we cover some of them before the workshop. One question is about using LLMs: do you have a set of recommended LLMs that you think are better for, say, knowledge graph construction? And what about the cost of running LLMs across a large corpus of documents to construct a knowledge graph? I'm curious to get your initial take, as well as recommendations for some of these users; some of them are thinking about using Groq, for instance.

play22:04

Tomaz: What you will see is that LLMs and graph construction are very model dependent: different models will generate different graphs, and even different versions of GPT-4 will behave differently. I did some testing; I'm not an expert in all of the LLMs, but for example, what I can tell you is that when you're using a predefined schema, GPT-3.5 will try to fit all the information into the schema, so it kind of overfits information into the schema where it wouldn't really fit in reality, whereas GPT-4 Turbo and GPT-4o are much better at ignoring information that is not part of the schema. I would really recommend using LLMs that are fast; plain GPT-4, for example, just throw it out of the window, because it takes forever and it's costly. Groq is really nice because it's really fast, but the problem with Groq is that they don't want to take our money just yet; once they start taking our credit cards, that's something I would definitely look into. In general, though, the better the model, the better the results will be, and the more it will follow the prompt instructions.

play23:58

Jerry: Great. Just following up with one more quick question (this might not quite exist in some of our abstractions right now): one of the questions is about dealing with missing information from the graph, which sort of implies that you do an LLM construction pass, it's not exactly where you want it to be, and so you do some human-in-the-loop pass to modify and shape the graph to better reflect what you want out of that data. Have you seen that kind of human-in-the-loop approach to graph construction?

play24:32

Tomaz: It's not really human in the loop; it's more that they have some heuristics. If you want to take a look, the GraphRAG paper by Microsoft is really nice, and it deals with some of these questions. The first one is: what size of text chunks should you use? The funny thing is that the number of nodes and relationships extracted is tied to the chunk sizes: if you're using smaller text chunks, more information will be extracted, and if you're using bigger text chunks, in summary, less information will be extracted from the larger chunks. That's one thing they mention in the paper. The second thing they mention is that they have some heuristics to decide, okay, not enough information was extracted from the text, and then they do a second run. So instead of having a human in the loop, it's kind of automated: "you didn't do a good enough job, now let's do a second pass on the graph extraction."

play26:14

Jerry: Oh okay, yeah, super interesting. Any other questions? Feel free to carry on; there's a ton of questions, but I think we'll...

Tomaz: I mean, the extraction part will take a couple of minutes, so we can answer questions then.

play26:34

So here I define my graph store... okay, just a second... okay, that's fine. One thing I've also noticed is that people are sometimes confused by documents, because LlamaIndex mostly deals with documents; but a document is just a wrapper around the text, so it's very easy to go from text to a document: we just instantiate a Document with the text property, and that's about it. So here in this example we're going to create a bunch of documents based on news (we have a bunch of news articles), and we're going to use GPT-4o.
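That wrapping step is essentially a one-liner; a small sketch (the `articles` list is a hypothetical stand-in for the news texts used in the notebook):

```python
from llama_index.core import Document

# Hypothetical raw news texts; each one becomes a Document.
articles = ["UnitedHealth Group shares fell after ...", "..."]
documents = [Document(text=article) for article in articles]
```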

play27:24

One thing that's also interesting, with a lot of things popping up: one thing I noticed today, or yesterday, was that somebody did some benchmarks, and they said basically that if you're using a slightly higher temperature than zero, even for deterministic tasks, you get better results, and that was specifically for GPT-4o. So again, that's kind of interesting, and we're all learning as we go along.

play28:02

But as mentioned, we're going to use the schema LLM path extractor in this workshop. With the schema LLM path extractor you have to define the types of nodes you want to extract. Here I went for person, location, organization, product, and event, so it's mostly a very typical extraction; and then there's "event", which is more ambiguous and allows the LLM to extract a lot of information, because an event can be basically anything. Then we also have to define the types of relationships we want to extract. Here I focused more on the organization and business part, where we have suppliers, competitors, acquisitions, subsidiaries, CEOs, things like that, so hopefully we're going to extract some business-relevant and financial information from the knowledge graph.

play29:22

So this is the first part, where we define the schema. The second part is that we also have to define which relationships are allowed on each node label, because not all relationships can apply to all node labels. For example, a product only appears with the PROVIDES relationship, and PROVIDES only starts from an organization, so the LLM would generate PROVIDES relationships only between organizations and products; it doesn't really make sense to have a PROVIDES relationship from a location to, say, a product. So this is a slightly more granular schema definition that we need to provide. Then let's go with 100 articles, and we just pass the possible entities, the possible relationships, and the validation schema to the LLM.

play30:36

And here you have the strict mode. Even if you provide instructions to the LLM about which types of nodes and relationships it should use, it doesn't really mean it will follow them 100% correctly; LLMs just do what they want. So Logan implemented a strict mode: since we know the types of relationships and nodes we expect, we can filter everything else out in the code if we want to, or we can leave in any other nodes and relationships the LLM identified. In this case, let's allow any additional information the LLM decides to extract.

play31:44

Now, GPT-4o is quite good at following the provided schema, and this is also because, as I said, GPT-4o is a native function calling model, so when you're using functions or tools to extract information, it will have much better accuracy. Whereas Llama 3 (not via Groq, but Llama 3 via Ollama) doesn't have function calling; it's still a really good model, but it might not always follow the schema. That's why you have the option to filter in post-processing, if you want to, or not; here we'll go without filtering. And we're going to extract information from 100 articles. It's going to take two or three minutes, I think, so we have time for a couple of questions.
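For reference, the schema described above translates into code roughly as follows (a sketch following the SchemaLLMPathExtractor API; the exact relation names in the notebook may differ, and strict=False keeps whatever extra nodes the LLM invents, matching the choice made here):

```python
from typing import Literal
from llama_index.core import PropertyGraphIndex
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
from llama_index.llms.openai import OpenAI

entities = Literal["PERSON", "LOCATION", "ORGANIZATION", "PRODUCT", "EVENT"]
relations = Literal[
    "SUPPLIER", "COMPETITOR", "ACQUISITION", "SUBSIDIARY", "CEO", "PROVIDES"
]

# Which relationships are allowed on which node label.
validation_schema = {
    "ORGANIZATION": ["SUPPLIER", "COMPETITOR", "ACQUISITION", "SUBSIDIARY", "CEO", "PROVIDES"],
    "PERSON": ["CEO"],
    "PRODUCT": ["PROVIDES"],
    "LOCATION": [],
    "EVENT": [],
}

kg_extractor = SchemaLLMPathExtractor(
    llm=OpenAI(model="gpt-4o", temperature=0.0),
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=False,  # keep nodes/relations outside the schema instead of filtering
)

index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    property_graph_store=graph_store,
    show_progress=True,
)
```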

play32:45

Jerry: Yeah, for sure. I'm trying to figure out which questions to ask. Maybe one thing, actually going back to the retrieval side: there is vector search with the vector context retriever, and then there's also text-to-Cypher, and you mentioned some limitations of text-to-Cypher. In your mind, what are some of the tips and tricks for getting text-to-Cypher working a little better for users, in terms of making sure it generates more reliable Cypher queries and actually retrieves valid context?

play33:24

Tomaz: I mean, this is kind of a hard question. Text-to-Cypher works well when the user knows what's in the database and knows how to ask questions that fit the schema. So how do you achieve that? What you could do is have some query-rewriting steps that take the user input and rewrite it into a question that fits the graph schema and is a little more verbose, or explicit, about how it wants the information to be retrieved. That's one thing. Obviously, providing it with few-shot examples is very helpful, because by default it uses zero-shot generation: we just give it the graph schema and then hope for the best. But you can also provide few-shot examples and hope that it follows those examples. And obviously, with more complex graph schemas, just describing those schemas takes a lot of tokens, and, maybe not linearly, but the bigger the schema, the lower the accuracy will be. So what you can also do is provide parts of the schema: instead of having one text-to-Cypher that deals with the whole graph schema, you can have, for example, an agent with several tools, where each of those tools focuses on a different part of the schema. That way you lower the complexity of the task.

play35:40

task

play35:42

yeah yeah that makes a lot of sense um I

play35:45

know it's about to finish up um but that

play35:49

maybe just another question and and we

play35:51

can also carry this over after things

play35:53

are done but is ne4 designed to uh work

play35:57

with like uh like technical document use

play35:59

cases like patents and scientific papers

play36:02

um like will help in identifying and

play36:04

building relationships between you know

play36:07

science technical Concepts uh that's one

play36:09

of the questions from the

play36:11

audience yeah so um yeah how you say Neo

play36:15

is domain agnostic so you can store any

play36:18

information you want in it that being

play36:22

said it's quite quite funny that you

play36:24

mentioned patents and uh

play36:27

technical document documentation because

play36:30

that's really relevant or at least what

play36:33

we see a lot of pharmaceutical or

play36:37

biomedical companies right this is such

play36:41

there's a lot of money in patents and

play36:44

for me it was also

play36:46

interesting for example biomedical

play36:48

companies when they have this great idea

play36:51

what we should do or what we should

play36:53

research you know what they first do

play36:55

they check if there's already a patent

play36:57

right and then if it's already a patent

play37:00

they don't research it because it won't

play37:02

make money you can't patent it so and

play37:06

I've seen uh basically like big

play37:08

pharmaceutical

play37:10

companies they all have their patent

play37:12

graph they all like uh scrape like PM

play37:17

you don't actually have to scrape it

play37:19

because it has apis right but it's like

play37:23

you can think of it like biomedical

play37:25

technical documentation like with all

play37:27

the latest research and they generate

play37:30

knowledge graphs uh from those and then

play37:33

use it to inform or recommend so one

play37:37

thing that they do is basically they

play37:40

generate graph from all the latest

play37:42

research then they use

play37:45

recommendations uh to to recommend to

play37:48

doctors based on their specialization

play37:51

which articles they should read right so

play37:55

yes definitely NE can be used for um

play37:59

patents and is actually used by existing

play38:03

customers for patents and Technical

play38:12

documentation okay yeah so now that

play38:14

we've uh imported the graph we can also

play38:18

take a look at

play38:22

it not right graph is visualizations are

play38:27

quite

play38:29

nice so let's see we

play38:35

have okay so we can see that

play38:40

uh uh for

play38:44

example let's see why why we have an

play38:46

award we have two awards no LEL so

play38:49

that's kind of funny but it wasn't in

play38:53

our

play38:55

uh description right so even GPT 40 can

play39:00

decide oh a about really nice of the FA

play39:04

Cup and the English legal title so let's

play39:07

see basic probably will be who won so it

play39:12

was gold

play39:14

McQueen what the cup so probably there

play39:18

should be a football team in there but

play39:21

it's interesting you can see that even

play39:23

uh and we have a disease as well one

play39:27

this so even GPT

play39:29

photo can

play39:31

decide uh to add some information uh

play39:35

that wasn't in the uh schema so that's

play39:38

why we have the

play39:41

um the strict mode

play39:44

right if we use strict mode through we

play39:47

wouldn't see those these uh nodes in the

play39:51

graph because

play39:54

um obviously we would filter them out PR

play39:57

automatically right and then let's try

play40:04

to see if we

play40:06

have I'm trying to find if there's

play40:09

anything more connected but basically

play40:15

unfortunately okay

play40:18

cool and let's

play40:22

uh so we have for example United he

play40:25

group is a note and now we can see a

play40:28

bunch of competitors

play40:31

right and we can also see probably it's

play40:34

not doing so well because it had a stock

play40:37

sell off and stock prise decline right

play40:40

and this is as I mentioned event is kind

play40:43

of ambiguous and it can be a lot of

play40:45

things so in this case it was stock has

play40:49

stock

play40:50

price declined and JN X works at United

play40:54

H group that so like all over all uh the

play41:00

GPT for all uh followed

play41:04

uh the uh uh schema right quite nicely

play41:09

and we can see like a nice graph uh over

play41:13

here and let's

play41:16

go forward and then as I mentioned

play41:20

before entity duplication is kind

play41:24

of a must I think it's

play41:27

often Overlook but you kind of want to

play41:30

uh find uh

play41:32

entities like notes in the graph that

play41:35

reference the same uh real world entity

play41:39

and merge them and here we have a kind

play41:42

of involved Cipher query which took like

play41:45

eight hours to come off

play41:47

it by multiple people but in the end uh

play41:51

we found like a nice way of using um

play41:55

text and beding so here we have the code

play41:57

and similarity threshold and then weary

play41:59

distance so how many characters can you

play42:03

change in

play42:05

uh in the string to have it the same and

play42:08

you can see it works quite well like so

play42:10

for example Bank of America and Bank of

play42:12

America

play42:14

Corporation music music

play42:16

group like new newcast United coinbase

play42:21

so overall it works really nicely to

play42:25

find uh

play42:27

uh like this duplicates but obviously

play42:31

it's not perfect because nothing in life

play42:33

is perfect so for example this one is

play42:37

kind of I mean it's the same it's still

play42:42

Jeff vual space suit that but one is

play42:45

fire side chat which is what yeah maybe

play42:49

not really but fine and for example

play42:53

Baltimore this one also right okay I can

play42:57

understand that these two should be

play42:59

merged but maybe this is a city and

play43:02

shouldn't be merged together

play43:04

that so as always uh you have the option

play43:11

uh to uh tweak these two parameters and

play43:15

you also have the option to uh then do

play43:18

some manually like human in theop here

play43:21

human in the loop is kind of important

play43:24

to know what entities are you emerging

play43:27

in uh but I think like just having like

play43:31

some sort of Baseline to start with uh

play43:34

is really nice and I think this Cipher

play43:37

credit not really nice because you can

play43:38

see a lot of um entities that should be

play43:42

merged together that and let's just

play43:45

merge them

play43:48

together and then for the last part as

play43:51

we

play43:52

said we're going to implement a custom

play43:55

the

play43:57

and we have the four uh existing ones um

play44:01

but here we're going to implement um a

play44:05

ret that first

play44:07

identifies

play44:09

um all the relevant entities in the text

play44:13

because for example the vector

play44:15

context just takes the whole

play44:19

string uh embeds it and then finds

play44:22

relevant not that but what if multiple

play44:24

entities are mentioned in the text

play44:28

then like vector index might not be the

play44:32

greatest because if

play44:36

you if uh it will embed the both

play44:41

entities into a single uh embedding and

play44:45

then who I don't really know who really

play44:46

knows what happens with those numbers

play44:49

there's a bunch of zeros and ones and

play44:52

what do they actually represent who

play44:53

knows so what we'll do really quick

play44:57

before the retrieval piece um actually

play45:00

quick question on the entity uh disin

play45:03

viation um that Cipher query I mean

play45:06

given how involved it is but given the

play45:09

fact that I imagine like a lot of people

play45:11

probably need to do some sort of DD is

play45:13

this like a template that's just like

play45:14

shared publicly because it seems like it

play45:16

would be generally useful for a lot of

play45:18

people yeah yeah this is uh part of the

play45:21

blog is this is all available

play45:25

uh over

play45:28

I mean we can add a link in the webinar

play45:32

if you know how to but it's this one so

play45:39

if

play45:40

I I know how to do

play45:44

chat let me spam it a little bit

play45:47

uhhuh to everyone yeah no problem I

play45:50

think we shared the notebook yeah we

play45:51

Shar the notebook in the chat but

play45:52

basically like that basically your the

play45:54

to the audience it's like if you want

play45:56

just a nice Cipher query to do n d dup

play46:00

obviously there's some limitations you

play46:01

probably need to tweak the the word

play46:03

similarity and those types of things a

play46:04

little bit but like if you want an

play46:08

existing template to go off of you can

play46:09

just like copy and paste from this

play46:10

notebook right because it's a pretty

play46:12

long Cipher string so I would imagine a

play46:14

lot of people are gonna be able to write

play46:15

this it's also I I would I would make it

play46:19

a model in llama index it's just that

play46:23

then it's like read NE forj specific and

play46:25

then like it's it doesn't fit the best

play46:29

into llama index like because you guys

play46:30

want to have things that are I say

play46:33

integration agnostic so uh but maybe we

play46:36

can figure out that in the coming months

play46:40

how to add that because it would be nice

play46:43

to have those out of the box right uh

play46:45

you just

play46:47

expose these two parameters and uh let

play46:51

it do the magic right but maybe this is

play46:55

something yeah is I think even the raw

play46:58

Cipher is useful for for the audience um

play47:00

and then I just doing a quick check on

play47:02

time I know we have you know technically

play47:04

5 to 10 minutes left um but you know I I

play47:06

know the last section is just like the

play47:08

custom retrieval section but maybe we

play47:10

can can just like walk through the high

play47:11

level Concepts maybe just like go

play47:13

through the overall class and then and

play47:15

then that should be a good conclusion to

play47:17

to this Workshop yeah we can do this

play47:19

actually quite fast so as I said we

play47:22

extract entities uh from um the user

play47:26

input and we use a

play47:28

identic uh open identic Pro program so

play47:33

basically

play47:34

again I would imagine we we kind of use

play47:38

function calling behind the scenes right

play47:40

we

play47:41

say this is your input parameter and

play47:43

it's a list of named entities in the

play47:46

text and we we ask GPT for o to um

play47:51

select it uh so basically so then okay

play47:57

I'm rambling a bit but uh so how do you

play48:00

define your custom retri uh so your

play48:03

custom retri just needs two methods or

play48:06

actually just one but the in it is also

play48:10

quite nice if you want to uh instantiate

play48:13

for example some other functions or

play48:16

classes and in in in the in it here we

play48:20

uh instantiate entity extraction which

play48:23

is the open identic program right the

play48:27

to Define to extract relevant entities

play48:30

from

play48:31

text and then we also extract Define or

play48:36

instantiate um existing Vector Contex

play48:40

retriever uh we can use

play48:42

it and then u in the custom retriever

play48:46

it's actually the code is very simple

play48:48

right we just uh exct or

play48:53

find detect it maybe the best word uh if

play48:57

there are entities in the text so if

play49:00

there are entities in the text we just

play49:03

uh run a vector retriever for every

play49:06

entity in the text and if the llm

play49:10

doesn't find any uh specific entities we

play49:14

just use the vector to div on the whole

play49:16

text sline and that's basically it and

play49:20

then you you have a couple of

play49:24

options how you do you um on the

play49:28

structure or format of the results that

play49:31

you can pass back to dat L and here in

play49:35

this example we just pass back the text

play49:39

we can remove this because we don't need

play49:41

to change anything uh yeah

play49:46

and then we just basically instantiate

play49:49

the whole thing and let's see what

play49:54

happens so if you ask what do you know

play49:56

about Mal or

play49:58

data basically the Ln detects two

play50:01

entities right and then for each of the

play50:05

those two entities it R Vector retriever

play50:09

separately so that it kind of ensures

play50:13

that we will get both information for

play50:16

both for both entities right because if

play50:18

you just

play50:20

use Vector on the hor string or text

play50:24

anding of the on the hor string you

play50:27

might just get it for one entity but not

play50:30

the other right because if you use topk

play50:32

for maybe one is more significant uh in

play50:36

the text and bearings but with this

play50:39

approach we make sure to cover all the

play50:42

entities so we get a nice answers for

play50:46

both

play50:47

entities so yeah

play50:50

that's like a high level overview of the

play50:52

D and now we

play50:55

can uh answer a couple of questions

play50:59

again yeah and and maybe just to kind of

play51:02

like say a few words to um to help wrap

play51:05

this up I think you know what Tomas

play51:07

really showed you was an end to-end

play51:08

process of both like constructing a

play51:10

Knowledge Graph um and then also

play51:12

retrieving from it and not just that

play51:15

like showing both the high level API as

play51:18

well as the lower level API so whether

play51:19

you're a beginner user for nfts and llms

play51:22

and L index and neo4j you know you can

play51:24

basically get do all this stuff in about

play51:26

like five lines of code or if you really

play51:28

want to go in you're an advanced user

play51:29

you're pretty familiar with M crafts we

play51:31

offer a lot of opportunities for you to

play51:33

Define your own custom extractors right

play51:35

uh with our core abstractions um like a

play51:38

robust like property graph store like

play51:40

the underlying lowlevel like storage

play51:41

system to

play51:59

um I think a lot of people are

play52:01

interested in knowledge crafts we

play52:02

basically see it as like a superet a

play52:05

potential superet of existing rag

play52:07

Solutions especially if you're able to

play52:09

leverage these like properties and

play52:11

relations to help augment your retriable

play52:13

um and there's a lot of very

play52:20

interesting like an Enterprise developer

play52:23

You're Building all crafts within uh the

play52:25

company um feel free to you know reach

play52:27

out to one of us for any sort of like

play52:29

blog posts case study we're always happy

play52:31

to feature like really interesting use

play52:32

cases of like knowledge graphs llms like

play52:35

w index and and neo4j right um and so

play52:39

always happy to Showcase like very

play52:40

interesting applications um but

play52:42

hopefully this Workshop was was useful

play52:44

um to all of you today uh and you know

play52:47

we'll have this on our YouTube um

play52:49

Channel and then basically hopefully

play52:51

we'll do you know maybe even like a

play52:52

series covering like other types of

play52:54

topics um as we go forth but we're

play52:56

definitely looking forward to to new

play52:58

types of applications um built with like

play53:01

knowledge crafts kgs and and L so I

play53:04

think I think with that said it's

play53:05

probably a good time to generally wrap

play53:07

up and really sorry I think a lot of you

play53:09

had a lot of questions in the in the

play53:10

chat um we uh weren't able to get

play53:13

through all them but we'll have this

play53:15

YouTube uh video out and basically feel

play53:17

free to comment there as well so thank

play53:20

you everyone thank you to and thanks

play53:22

Logan for for hopping in

