A basic introduction to LLMs | Ideas behind ChatGPT
Summary
TLDR: The video discusses language models and large language models like GPT and ChatGPT. It explains how LMs work by predicting the next word in a sequence to model patterns in human language. As more training data and parameters are added, LMs become LLMs like GPT and can be used in solutions for tasks like question answering. The video also introduces concepts like prompt engineering, model security, giving LMs access to tools through APIs, reasoning in LMs, retrieval augmented generation, and model fine-tuning.
Takeaways
- 😀 Language models (LMs) predict the next word in a sequence based on patterns in training data
- 📚 LMs can be used to build solutions like question answering systems
- 🔬 Researchers use more data and parameters to create large LMs (LLMs)
- 💰 LLMs require lots of compute and are expensive to train
- 🤗 Some LLMs are open source and can run locally without APIs
- ✏️ Prompt engineering involves carefully crafting inputs to get desired LLM outputs
- 🔒 There are security concerns around malicious use of powerful LLMs
- ⚙️ LLMs can be given access to tools through APIs to take actions
- 🧠 Making LLMs exhibit reasoning is an area of research
- 📝 Fine-tuning trains parts of a model for specialized tasks
Q & A
What is a language model and how does it work?
-A language model (LM) takes a sequence of words as input and predicts the next word. It tries to model the patterns in human language based on the data it has been trained on.
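The next-word idea can be sketched with the simplest possible LM, a bigram model that just counts which word tends to follow which. This is a toy illustration with a made-up corpus, not how modern Transformer LMs are built:

```python
from collections import Counter, defaultdict

# Toy corpus; a real LM is trained on vastly more text.
corpus = "she is watching tv . she is sleeping . she is watching a movie".split()

# Count which word follows which (a bigram model, the simplest LM).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen in the training data."""
    return follows[word].most_common(1)[0][0]

print(predict_next("is"))  # "watching": seen twice, vs "sleeping" once
```

The prediction depends entirely on the training counts, which is exactly the point made above: the model's outputs reflect the patterns in its data.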
How can language models be useful?
-Instead of giving an LM random sentences, we can give it questions and instructions to get useful outputs like answers. With enough data and model capacity, LMs can be used to build solutions.
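One way to picture this framing trick is a small prompt builder; the `llm.generate` call shown in the comment is hypothetical, standing in for whatever completion API you use:

```python
def make_qa_prompt(question: str) -> str:
    # End the string at "A:" so the model's most likely
    # continuation of the text is the answer itself.
    return f"Q: {question}\nA:"

prompt = make_qa_prompt("What is the capital of India?")
print(prompt)
# A completion model would then continue this string, e.g.:
# answer = llm.generate(prompt)  # hypothetical generate() call
```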
Why do large language models require so much data and compute?
-To model the complexity of human language, LMs need to be trained on internet-scale data (tens of terabytes). Bigger models with hundreds of billions of parameters also require specialized GPUs training for months.
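A rough back-of-the-envelope calculation shows why: just holding the weights of a model this size takes hundreds of gigabytes (assuming 2 bytes per parameter, i.e. fp16; training needs several times more for gradients and optimizer state):

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to store the weights, in GB (fp16 by default)."""
    return n_params * bytes_per_param / 1e9

print(weight_memory_gb(7e9))    # 14.0 -> a 7B model needs ~14 GB
print(weight_memory_gb(175e9))  # 350.0 -> a 175B model needs ~350 GB
```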
What are some popular large language models?
-GPT by OpenAI, LLaMA by Meta, Falcon by TII, BLOOM by BigScience, and more. Many are now open source so you can run them locally.
What is prompt engineering for large language models?
-The way inputs are formatted and fed to LMs can greatly impact outputs. Prompt engineering studies how to frame prompts to get desired and accurate responses from LMs.
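As a small illustration of framing, here are two ways to ask for sentiment. The video suggests that putting the sentence first and the instruction after often works better; which framing wins for a given model is an empirical question:

```python
def instruction_first(sentence: str) -> str:
    return f"What is the sentiment of this sentence? {sentence}"

def sentence_first(sentence: str) -> str:
    # Sentence first, instruction after: the framing the video recommends.
    return f'"{sentence}"\nWhat is the sentiment of the above sentence?'

print(sentence_first("The movie was a delight."))
```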
How can LMs access tools through APIs?
-LMs can be instructed to output API calls instead of just text. These payloads can then be used to actually invoke those APIs and take actions.
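A minimal sketch of that orchestration step, assuming the model has been prompted to emit a tagged JSON payload when it wants to act (the `API_CALL` tag and the endpoint name here are made up for illustration):

```python
import json

def extract_api_call(text: str, tag: str = "API_CALL"):
    """If the model tagged its output as an API call, parse the JSON
    payload; otherwise treat it as a plain text answer and return None."""
    if text.startswith(tag):
        return json.loads(text[len(tag):].strip())
    return None

# Hypothetical model output for the prompt "book a flight from BLR to DEL":
model_output = 'API_CALL {"endpoint": "/book_flight", "source": "BLR", "destination": "DEL"}'
call = extract_api_call(model_output)
# An orchestrator would now send this payload to call["endpoint"]
# and feed the API's response back to the model.
print(call)
```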
What security concerns exist around large language models?
-Potential issues include generating harmful text, prompt hacking to force unsafe outputs, and more. Work is being done to make LMs secure.
What does retrieval augmented generation mean?
-When an LM needs extra context documents to answer questions, relevant chunks can be retrieved and added to the prompt for better responses.
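The chunk-and-retrieve step can be sketched as follows. The word-overlap scoring is a deliberately naive stand-in: real systems rank chunks by embedding similarity, and the document text here is made up:

```python
def chunk(text: str, size: int = 7) -> list[str]:
    """Split a long document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(chunks: list[str], question: str) -> str:
    """Naive retrieval: the chunk sharing the most words with the question."""
    q = set(question.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

doc = ("the office opens at nine . lunch is served at noon "
       "in the cafeteria . parking is free on weekends")
best = retrieve(chunk(doc), "when is lunch served")
prompt = f"Context: {best}\n\nQ: when is lunch served\nA:"
print(best)  # the chunk mentioning "served at noon"
```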
How does fine-tuning a large language model work?
-Task-specific layers can be added and trained on top of a pre-trained LM for customized performance on specialized datasets.
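Conceptually (no real training framework here, just a toy dict with made-up layer sizes), freezing the pre-trained base and training only a small added head looks like this:

```python
# A pre-trained model as named layers; freeze all but a new task head.
layers = {
    "embedding":   {"params": 50_000_000,  "trainable": False},
    "transformer": {"params": 500_000_000, "trainable": False},
    "task_head":   {"params": 1_000_000,   "trainable": True},  # newly added
}

trainable = sum(l["params"] for l in layers.values() if l["trainable"])
total = sum(l["params"] for l in layers.values())
print(f"training {trainable / total:.2%} of parameters")  # a tiny fraction
```

This is why fine-tuning is far cheaper than pre-training: the gradient computation and optimizer state only cover the small trainable slice.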
What other focus areas exist for improving large language models?
-Giving LMs reasoning abilities, tools access through APIs, prompt engineering for better responses, and security.
Outlines
🤓 What is a language model
A language model (LM) takes a sequence of words as input and predicts the next word based on patterns it has learned from training data. LMs are useful for building solutions like question answering by formatting prompts in certain ways. The more data used to train LMs, the better they get.
🌟Scaling up language models into LLMs
To improve language models, researchers use internet-scale data and increase model sizes into the billions of parameters. This requires a massive amount of compute and funding, resulting in large language models (LLMs) that only big organizations can train over months. Some LLMs are now open source.
🎯 Using and customizing LLMs
Pre-trained LLMs like LLaMA can be downloaded and run locally. Prompt engineering refers to formatting prompts to LLMs in ways that produce better, more accurate responses. Fine-tuning allows customizing LLMs for specific tasks by re-training only certain model layers.
🔎 Other areas around LLMs
Some other active areas around LLMs include retrieval augmented generation (RAG) for providing documents as context, tool use ("acting"), which lets LLMs emit API calls to take actions, security to prevent illicit content generation and jailbreaking, and trying to add reasoning and thinking capabilities.
Keywords
💡Language Model (LM)
💡Large Language Models (LLMs)
💡Transformers
💡Prompt Engineering
💡Fine-tuning
💡API
💡Retrieval Augmented Generation (RAG)
💡Security in LLMs
💡Open Source LLMs
💡Compute Resources
Highlights
Language models predict the next word in a sequence based on patterns in training data
LMs can be used to build solutions by prompting them with questions and getting back answers
Researchers increase training data and model parameters to improve LMs
Large LMs require massive compute and are expensive to train
Some large LMs are open sourced for anyone to use
Prompt engineering tunes inputs to get better LM outputs
Giving LMs access to tools enables them to take actions beyond just answering
Security is needed to prevent harmful LM responses
LMs currently lack reasoning and a thought process
Retrieval augmented generation (RAG) retrieves relevant context chunks to help answer questions
Fine-tuning trains only parts of a pre-trained model, making it an efficient way to specialize for a task
Fine-tuning can also guide the style and format in which LMs generate text
Pre-trained LMs can be fine-tuned for specific projects
Transcripts
Hello everyone, this is Yash, and I'll be talking about language models and large language models: some of the ideas behind ChatGPT and similar products. These products are very hot these days, and I just want to discuss the ideas, the storyline, that comes along with them. We won't be implementing or coding anything in this video, but maybe in the subsequent videos we'll actually get our hands dirty and build something as well.

The central idea behind all of this is a language model, or an LM. So what exactly is an LM? An LM takes a string as input, say "she is", and it predicts the next word, for example "watching". There can be many words that could come instead of "watching", but the model predicts the most probable one, and that depends heavily on the kind of data the LM has been trained on.

Human language is not very random: the order of words follows some pattern, and the LM tries to model exactly that pattern. After "she is" you might see verbs like "running" or "sleeping", maybe nouns like "president", or adjectives like "beautiful". But something like "she is is" is extremely unlikely, almost impossible as a prediction. A word like "January" is possible, as in "she is January born", where January acts as a time expression, but it is still much less likely than "watching" or some other verb.

So this is how LMs work, and it all depends on the training data. The data needed to train an LM is just plain text: a book, or anything written in proper, readable human language. Give that text to an LM and it will try to model and learn the patterns in it, and the more data, the better the model. If you want to see an example of how an LM actually works, with a little bit of the math behind it, I've made a video on the n-gram language model, a very simple model that's easy to build. There are more sophisticated models after that which use neural networks and deep learning, and now there is a whole family of models built on Transformers. Transformers have been shown to perform the best, and they are what drives the current state of the art, ChatGPT and all of those things.

Now if you really think about it, you might ask: how are these LMs even useful? We give some input and we get the completion, just the next words. How is that useful? The way to think about it is that we can start using these LMs to build solutions. Rather than giving the model a random sentence, we can give it a question. Say I write "Q: capital of India" and end the string with "A:". Hopefully the model will complete it with the answer. That's the idea: we can start getting answers as well. But again, it depends on the training data. If the LM has never seen "Delhi" in its text, answering this might be difficult.

So this is how the story continues, and this is what researchers do: they increase the data used to train these language models, and as the data goes up, they increase the model size as well. Increasing the model size means increasing the learnable parameters, the weights, in the model. Increasing the data means internet-scale data: a web crawl, a chunk of the internet, whatever is publicly available and can be crawled or scraped. For instance, the transcript of this very video could be taken as training text. Posts you may have made on Facebook or Reddit can also be included. It's all on the internet; people like you and me have contributed all this data over the years, and it can be used to train language models.

At internet scale we are looking at around tens of terabytes of data, and the model parameters go into the billions: tens or even hundreds of billions of parameters. If the model gets this big and we are training on this much data, we also need a lot of compute, and by compute I mean GPUs. That's where money enters the story: a lot of GPUs cost a lot of money, which is why only big organizations can train these big models. When you train a language model on data this huge, it's called an LLM, a large language model. The big organizations can afford it because it costs millions and takes months of training on many GPUs. The rest of us mostly just use the results, but some organizations have been kind enough to open source their models. These pre-trained LLMs are available, so you don't have to do any training: you can just pick a model and run it, even on your local machine.

So there are these LLMs on the market. Everyone knows GPT: GPT is the LM, and ChatGPT is the product, you could say a fine-tuned version of it. The model itself is proprietary; you can access it only through an API or the web interface, but you can't run GPT on your local machine unless you work for OpenAI. Some organizations, though, have released open models: Meta has launched the LLaMA series of models (LLaMA, Llama 2), and there are others like Falcon and BLOOM, lots of them. Depending on when you're watching this video, there will hopefully be many more, newer, better models out. For now this is the state of things, and these models you can actually download and run locally. Once downloaded, you don't need any internet, API, or web interface; you can use them out of the box. We'll see exactly how in the next video.

That's the whole idea of how the LLM story comes into the picture. There are lots of topics around LLMs, and I just want to touch on a few. First, prompt engineering, a field that grew up around these models. When we give an input sentence or instructions to an LLM, there are specific ways to phrase those inputs so that we get the desired output. Take sentiment classification: if I'm asking the LLM what sentiment a sentence carries, rather than asking first and then pasting the sentence, it may make more sense to put the sentence first and then ask "what is the sentiment of the above sentence?". The way you structure this input is called prompt engineering, and researchers have found that for certain questions, prompting in certain ways gives more accurate answers. We'll see all that; it's very interesting.

The other part is acting, which comes up when we talk about LLMs: giving LLMs access to tools, that is, APIs. What do I mean by that? Say we prompt something like "book a flight". We don't just want the LLM to tell us something, we want it to do something. So rather than just printing an answer, it can print an API call, with, say, a source and a destination, addressed to some endpoint. As soon as we see this "API call" text appear, we can take that payload and actually make the API call. That is what giving tool access to LLMs means, and that's the whole field of acting.

Besides acting, there is also the field of security around LLMs: how can we make LLMs not generate profane or harmful things? If someone asks how to destroy the planet, we probably don't want the LLM to answer. How do we stop it? And even if we do, there is a whole field of jailbreaking and prompt hacking: trying to get the answer out anyway, even when the model refuses. Those are the security-related topics.

And there is one more part, the thinking part. Right now an LLM is just generating the completion, the answer; it doesn't really have any reasoning as such. How can we make LMs build something like a thought process: "it is because of this word in the sentence that the sentiment should be like this"? It should be able to reason through things step by step, with a whole thought process going on inside the LM's "mind". This is another active field around LLMs.

So those are some of the things around LLMs: prompt engineering, security, acting, thinking. There can be many more; these are the ones I know, and if you know others, you can put them in the comments. In the next video we'll actually get started, maybe with LLaMA or some existing open source LLMs, and see how to get these answers. Thanks for watching; if you have any queries, put them in the comments, or we can also connect on LinkedIn. I'll give the link in the description.

Okay, I'm sorry, there are two more things I wanted to mention. I thought I'd save them for the next videos, but this seems like the right place, so I'll say them here: RAG and fine-tuning. They also come along when we talk about LLMs, alongside prompt engineering, security, thinking, and acting.

RAG is retrieval augmented generation. Say you have a document, an internal document the LLM doesn't know about, and you want to do question answering over it: the answer is contained within this doc, but the LLM can't answer purely on its own, without context. So we augment our question with the document and say: "can you find the answer in this document?", attaching all the context. Now, LLMs have a fixed input window; they can't take infinite input, so very long documents can't be processed in one go. What we do is break the document into chunks, retrieve the chunk where the answer is likely present, and add that chunk as context when prompting the LLM. We retrieve the most relevant chunk and then ask the question, so the answer is present in the context itself and the whole document need not be provided. That is retrieval augmented generation: we augment the answer-generation process with a retrieval step. I'm planning more detailed videos on this, but I wanted to at least introduce the concept here.

Then we have fine-tuning. Say you have a particular dataset with inputs X and outputs Y: proper input-output pairs for a very specific task. Maybe you're working on a project, or your organization has one, where you want to do a very specific classification, with two labels or intents particular to your use case. Out of the box, an LLM might not be able to classify them, but if we have the dataset, we can fine-tune the LLM and then use it by giving X as input and hopefully getting Y as output. There is a lot to fine-tuning; it's not just about X and Y pairs, it's also about the way the LLM speaks. We can guide the LLM to generate in a certain style or format, and that is where fine-tuning becomes very useful. The models we spoke about, LLaMA and Falcon, are not fine-tuned models; they are pre-trained models. We use them as base models: on top of the base LLM we add a small layer, and we train only that layer, not the whole model. Only specific parts of the model are trained, so it's not very computationally expensive. We'll see how this is done in later videos; I'm planning that for further in the future, not the very next video, but it's also very interesting, and we'll see how the LLM generates after fine-tuning.

Yeah, that's all, thanks.