Run ANY Open-Source Model LOCALLY (LM Studio Tutorial)
Summary
TLDR This video shows the easiest way to run open-source large language models on your local computer. Using software called LM Studio, you can get started even if you have never experimented with AI: just download it from the website and install it. LM Studio provides an interface for searching Hugging Face models and pulls in the details from each model card for easy reading. Choosing a model and a quantized version is simple too, and LM Studio recommends the best fit for your computer's specs. For developers, it can also start a local server with an OpenAI-style API, making it easy to build AI applications.
Takeaways
- 🌐 LM Studio is software for running open-source large language models on your local computer, simple enough for people with no AI experience.
- 💻 LM Studio is available on all platforms: Mac, Windows, and Linux.
- 🔍 The LM Studio homepage has a search box for finding Hugging Face models and also showcases new and noteworthy models.
- 📚 Detailed information about each model, including the model card, is shown in an easy-to-browse interface.
- 🎁 The video's sponsor, UPDF, is a free alternative to Adobe Acrobat offering OCR, PDF editing, password protection, and AI features.
- 🔍 In LM Studio you can search models by keyword and see the same model card information as on Hugging Face.
- 💡 LM Studio checks your computer's specs and automatically indicates whether a selected model should run on your machine.
- 📏 Multiple quantized versions of each model are offered, so you can pick the most suitable one.
- 🛠️ LM Studio provides an interface for easily configuring and changing model parameters and presets.
- 🗨️ You can talk to models through a chat interface and customize the system message and user messages.
- 🛠️ For developers, LM Studio can start a local HTTP server that behaves like OpenAI's API, so you can embed it in your applications.
- 🗂️ The "My Models" tab lets you manage downloaded models and easily delete the ones you no longer need.
Q & A
What kind of software is LM Studio?
-LM Studio is software for running open-source large language models on your local computer, designed to be easy to use even if you have never experimented with AI.
Which platforms is LM Studio available on?
-LM Studio is available on all platforms: Apple (Mac), Windows, and Linux.
What can you do on the LM Studio homepage?
-On the LM Studio homepage you can search for models, browse the models available on Hugging Face, and view model card information.
What is UPDF?
-UPDF is a free alternative to Adobe Acrobat that offers OCR, PDF editing, password protection, adding stamps, and many other features.
What can you do with UPDF's AI feature?
-UPDF's AI feature lets you ask questions about a document and interact with it like a chat.
Which version should you choose when downloading a model in LM Studio?
-When choosing a model, it is recommended to pick the largest version that will actually run on your computer's specs, which is usually a function of RAM (or video RAM if you have a GPU).
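As a rough illustration of that sizing advice, a quantized model's footprint can be estimated from its parameter count and bits per weight. This is a back-of-the-envelope sketch, not LM Studio's actual compatibility check; the function names and the 2 GB headroom figure are illustrative assumptions:

```python
# Rule of thumb (illustrative, not LM Studio's real logic): a quantized model
# takes roughly (parameter count) x (bits per weight) / 8 bytes, and it should
# fit in RAM (or VRAM) with some headroom left for the context and the OS.

def approx_model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of a quantized model in GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

def should_fit(params_billion: float, bits_per_weight: float,
               ram_gb: float, headroom_gb: float = 2.0) -> bool:
    """Best-guess check: does the model leave enough headroom in RAM?"""
    return approx_model_size_gb(params_billion, bits_per_weight) + headroom_gb <= ram_gb

# A 7B model at ~4.5 bits/weight (a Q4_K-class quant) is roughly 4 GB,
# which fits comfortably in 16 GB of RAM; a 33B model does not fit in 8 GB.
print(round(approx_model_size_gb(7, 4.5), 1))  # ~3.9 GB
print(should_fit(7, 4.5, 16))
print(should_fit(33, 4.5, 8))
```

This mirrors the video's examples: a 7B quant runs easily on most machines, while a 33B model that "requires 30+ GB of RAM" is borderline on a 32 GB machine.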
What is LM Studio's "compatibility best guess" feature?
-The "compatibility best guess" feature checks your computer's specs and filters search results to show only the models it thinks your machine can run.
What can you do in the LM Studio chat interface?
-In the chat interface you can have conversations with the models you have downloaded.
What is LM Studio's local server feature?
-The local server feature lets you use LM Studio as the backend for AI applications: it starts a local HTTP server that behaves like OpenAI's API.
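A minimal sketch of what that looks like from Python, assuming LM Studio's default port 1234 and the `openai` client library; the `build_chat_request` helper and the placeholder model name are illustrative, since the local server simply uses whichever model you have loaded:

```python
# A sketch of using LM Studio's local server as a drop-in OpenAI replacement.
# Assumes the server is running on the default port 1234; the API key can be
# any placeholder string because the server runs locally.

def build_chat_request(user_message: str,
                       system_message: str = "You are a helpful assistant.",
                       temperature: float = 0.7) -> dict:
    """Build the kwargs for a chat completion call (testable without a server)."""
    return {
        "model": "local-model",  # placeholder: LM Studio uses the loaded model
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
    }

if __name__ == "__main__":
    # Requires `pip install openai` and a running LM Studio local server.
    from openai import OpenAI
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
    response = client.chat.completions.create(**build_chat_request("Tell me a joke"))
    print(response.choices[0].message.content)
```

Because the request shape matches OpenAI's chat completions API, swapping between the hosted API and the local server is just a change of `base_url`.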
What can you do in LM Studio's "My Models" tab?
-The "My Models" tab lets you manage the models downloaded to your computer, check their sizes, and delete them.
What information do you get when configuring model parameters in LM Studio?
-When configuring parameters, hover hints explain what settings such as output randomness (Temp) and repeat penalty mean and how to set them.
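Those friendly labels correspond to the usual sampling parameters. The mapping below is an illustrative sketch; the internal key names are assumptions in the style of llama.cpp-based tools, not LM Studio's actual configuration schema:

```python
# LM Studio labels sampling parameters in plain language; under the hood they
# correspond to conventional sampling settings. Illustrative mapping only.

FRIENDLY_TO_API = {
    "Output Randomness": "temperature",  # 0 = deterministic, higher = more varied
    "Words to Generate": "max_tokens",   # cap on generated tokens
    "Repeat Penalty": "repeat_penalty",  # >1 discourages repeating tokens
}

def to_api_params(friendly: dict) -> dict:
    """Translate the UI's friendly labels into API-style parameter names."""
    return {FRIENDLY_TO_API[k]: v for k, v in friendly.items()}

params = to_api_params({
    "Output Randomness": 0.0,  # temperature 0 always picks the most likely token
    "Words to Generate": 128,
    "Repeat Penalty": 1.1,
})
print(params)
```

As the hover hint in the video notes, a temperature of zero always picks the most likely next token, producing identical outputs on each run.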
How can you check download progress in LM Studio?
-While a model is downloading, a blue stripe appears at the bottom of the screen; clicking it shows the download progress.
How do you view chat history in LM Studio?
-Chat history is shown on the left side of the screen, where you can start a new chat or continue an existing one.
How do you export a model's response in LM Studio?
-You can use the options at the bottom of the screen to export the response as a screenshot or to regenerate it.
Outlines
😀 Introducing LM Studio and how to use it
This section introduces LM Studio, software for running open-source large language models on your local computer. It emphasizes that LM Studio works on Mac, Windows, and Linux and is easy to download and install. It also touches on convenient features such as searching Hugging Face models, viewing model information, and browsing the latest models.
😉 Selecting and downloading models in LM Studio
This section explains how to choose and download a language model in LM Studio: searching by keyword, browsing model cards, and downloading. It highlights that LM Studio automatically judges which models can run given your computer's specs, and that the download process is very simple.
🎮 The LM Studio interface and developer tools
The final section covers the chat interface and developer-oriented features: chatting with a model, configuring model parameters, customizing prompts, and starting a local HTTP server. It also notes that LM Studio's server behaves like OpenAI's API and that developer documentation, including a Python example, is provided.
Keywords
💡Open source
💡LM Studio
💡Hugging Face
💡Quantized model versions
💡Zephyr 7B beta
💡OCR
💡PDF editing
💡AI features
💡Model card
💡Local server
💡Model management
Highlights
Introduces LM Studio, available on all platforms including Apple, Windows, and Linux, and easy to use even for people with no AI experience.
LM Studio provides a search box for finding different models; everything available on Hugging Face can be found through LM Studio.
Shows the LM Studio homepage, including new and noteworthy models such as Zephyr 7B beta and Mistral 7B.
Introduces UPDF, a free alternative to Adobe Acrobat with OCR, PDF editing, and AI chat features.
UPDF offers search, highlighting, annotation, password protection, and stamping for PDF documents.
LM Studio provides a beautiful interface on top of Hugging Face, letting users easily browse and select models.
LM Studio automatically recommends suitable model versions based on the user's computer specs, simplifying the choice.
Users can adjust model parameters such as output randomness and repeat penalty to suit their hardware.
LM Studio offers model initialization and hardware settings, including keeping the entire model in RAM and using Apple Metal.
Introduces the chat interface, where users can interact with models, for example asking for a joke or continuing a conversation.
LM Studio lets users manage downloaded models, including checking model sizes and deleting unneeded ones.
Shows how to use LM Studio's local server feature as a drop-in replacement for the OpenAI API when building AI applications.
Provides Python example code showing how to use the OpenAI library with the local server.
LM Studio's interface is clean and easy to use, suitable for users unfamiliar with technical jargon.
Users can customize the system prompt or pre-prompt for role play or other custom interactions.
LM Studio keeps a detailed chat history, making it easy to review and continue previous conversations.
The video closes with an overall assessment of LM Studio, emphasizing its ease of use and support for open-source models.
Transcripts
this is the easiest way to get
open-source large language models
running on your local computer it
doesn't matter if you've never
experimented with AI before you can get
this working the software is called LM
Studio something that I've used in
previous videos and today I'm going to
show you how to use it let's go this is
the LM Studio website the LM Studio
software is available on all platforms
Apple Windows and Linux now today I'm
going to show you how to get it running
on a Mac but I've gotten this working on
Windows as well and it is dead simple so
really you just download the software
and install it there's nothing to it and
once you do that this is the actual LM
studio so first let's explore the
homepage here you're going to get a nice
little search box where you can search
for different models that you want to
try out basically anything available on
hugging face is going to be available in
LM studio if you scroll down a little
bit you get the new and noteworthy model
so obviously here's Zephyr 7B beta
here's Mistral 7B Instruct CodeLlama
OpenOrca these are the top models for
various reasons and not only that it
tells you a bunch about every single
model it pulls in all the information
from the model card so it's easily
readable from here thank you to the
sponsor of this video UPDF UPDF is an
awesome free alternative to Adobe
Acrobat but let me just show it to you
so after a few clicks I got it
downloaded and installed I loaded up an
important PDF and you can do a lot of
awesome things with it so let's start
with OCR so I clicked this little button
in the top right I select searchable PDF
and then perform OCR so that allow me to
search through it and do other things
with it now that it's a text document
very easy and there we go after a few
seconds I have the OCR version right
here now I can highlight all the text
easily switching back to the PDF we can
do a bunch of cool stuff so we can
easily highlight we can add notes I can
easily protect it using a password by
clicking this button right here I can
easily add stamps so I could say
confidential right there and you can
easily edit PDFs check this out and best
of all it has a really cool AI feature
where you can actually ask questions to
this document so it's basically chat
with your doc all you have to do is
click this little UPDF AI in the bottom
right it loads up the document I click
get started and it's going to give me a
summary first and then I can ask it any
question I want all right so let's ask
it something who are the authors of this
paper so be sure to check out UPDF and
they're giving a special offer to my
viewers 61% off their premium version
which gives you a lot of other features
link and code will be down in the
description below thank you to UPDF so
let's try it out if I just search for
Mistral and hit enter I go to the search
page and we have every model that has
Mistral in the keywords and just like
hugging face you get the author and then
you get the model card information and
you get everything else involved too so
you can really think of this as a
beautiful interface on top of hugging
face so here's TheBloke's version from 4
days ago let's take a look at that so if
I click on it here I can see the date
that it was uploaded again 4 days
ago I can see it was authored by the
TheBloke and then I have the model name
Dolphin 2.2.1 AshhLimaRP Mistral 7B GGUF
lot of information in that title on the
right side we can see all the different
quantized versions of the model so
everything from the smallest Q2 version
all the way up to the Q8 version which
is the largest now if you're thinking
about which model to choose and even
within a model which quantized version
to use you want to fit the biggest
version that can actually work on your
machine and it's usually a function of
RAM or video RAM so if you're on a Mac
it's usually just RAM but if you have a
video card on a PC you're going to look
at your video RAM from your video card
so I'm on a Mac today so let's take a
look and one incredible thing that LM
Studio does for you out of the box is
that it actually looks at your specs of
your computer and right here it has this
green check and should work which means
the model that I have selected right now
should work on my computer given my
specs so you no longer have to think
about well how much RAM do I have how
much video RAM do I have what's the
model size which quantization method
should I use it'll just tell you it
should work now here's another example I
just searched for llama this is the
Samantha 1.1 version of llama and it is
a 33 billion parameter version and right
here it says requires 30 plus GB of RAM
now my machine has 32 GB so it should be
enough and it's not saying it won't work
but it's giving me a little warning that
says hey it might not work and back to
the search page for Mistral let's look at
a few other things that we're going to
find in here so it tells us the number
of results it tells us it's from hugging
face Hub we can sort by the most recent
we can sort by the most likes we can
sort by the most downloads usually likes
and downloads are pretty in line with
each other I usually like to sort by
most recent because I like to play
around with whatever the most recent
models are and you can also switch this
to least so you click on that and you
can find least recent but I don't know
why you would want to do that then we
also filter by a compatibility guess so
it won't even show me models that it
doesn't think I can run and if I click
that again now it's showing all models
so I like to leave that on filtered by
compatibility best guess now again
within the list of quantized versions of
a specific model we can actually see the
specific Quant levels here so this is
Q2_K Q3_K and so on all the way up to Q8
and the largest one down here is going
to be also the largest file size if we
hover over this little information icon
right here we get a little description
of what each of the quantization methods
give us so here Q2 lowest Fidelity
extreme loss of quality not
recommended and up here we can see what
the recommended version is which is the
Q5 km or KS and it says recommended
right there so these are just a little
bit of a loss of quality Q5 is usually
what I go with here it gives us some
tags about the base model the parameter
size and the format of the model we can
click here to go to the model card if we
want but then then we just download so
we download it right here so I'm going
to download one of the smaller ones
let's give it a try we just click and
then you can see on the bottom this blue
stripe lit up and if we click it we can
actually see the download progress and
it really is that easy and you can see
right here I've already downloaded the
Phind CodeLlama 34B model and I'm
actually going to be doing a video about
that and also another coding model
called DeepSeek Coder and what makes LM
studio so awesome is that it is just so
so easy to use and the interface is
gorgeous it's just super clear how to
use this for anybody and it makes it
really easy to manage the models manage
the different flavors of the models it's
a really nice platform to use all right
while that's downloading I'm going to
load up another model and show it to you
so in this tab right here this little
chat bubble tab this is essentially a
full interface for chatting with a model
so up at the top here if we click it you
find all the models that you've
downloaded and I've gone ahead and
selected this Mistral model which is
relatively small 3.82 GB so I select
that and it loads it up and then I'm
really done it's ready to go I'm going
to talk about all the settings on the
right side though and over here on the
right side the first thing we're going
to see is the preset which basically
sets up all the different parameters
pre-done for whatever model you're
selecting so for us for this Mistral
model of course I'm going to select the
Mistral Instruct preset and that's going
to set everything here's the model
configuration and you can save a preset
and you can also export it and then
right here we have a bunch of different
model parameters so we have the output
Randomness and again what I really like
about LM studio is that it can be used
even if you're not familiar with all of
this terminology so typically you see
Temp and n_predict and repeat penalty
but a lot of people don't know what that
stuff actually means so it just tells
you output Randomness words to generate
repeat penalty and if you hover over it
it gives you even more information about
it so here output Randomness also known
as Temp and it says provides a balance
between randomness and determinism
at the extreme a temperature of zero
will always pick the most likely next
token leading to identical outputs each
run but again as soon as you select the
preset it'll set all of these values for
you so you can play around with it as
you want here's the actual prompt format
so we have the system message user
message and the assistant message and
you can edit all of that right here here
you can customize your system prompt or
a pre-prompt so if you want to do role
playing this would be a great place to
do it so you could say you are Mario
from Super Mario Brothers respond as
Mario and then here we have model
initialization and this gets into more
complex settings some things are keep
the entire model in RAM and a lot of
these settings you'll probably never
have to touch and here we go we have
Hardware settings too so I actually do
have apple metal I'm going to turn that
on and I'll click reload and apply and
there we go next we have the context
overflow policy and that means when the
response is going to be too long for the
context window what does it do so the
first option is just stop the second
option is keep the system prompt and the
first user message truncate the middle
and then we also have maintain a rolling
window and truncate past messages so
I'll just keep it at stop at limit for
now and then we have the chat appearance
if we want plain text or markdown I do
want markdown and then at the bottom we
have notes and now that we got all those
settings ironed out let's give it a try
all right and I said tell me a joke
Mario knock knock who's there jokes
jokes who just kidding I'm not really
good at jokes but here's one for you why
did the Scarecrow win an award because
he was outstanding in his field and so
you can export it as a screenshot you
can regenerate it or you can just
continue and continue is good if you're
getting a long response and it gets cut
off over on the left side we have all of
our chat history so if you've used chat
GPT at all this should feel very
familiar if you want to do a new chat
you just click right here if you want to
continue on the existing chat you just
keep typing so for example I can just
say tell me another one and it should
know that I'm talking about a joke
because it's continuing from the history
that I previously had in here so why did
the tomato turn red because it saw the
salad dressing great now if I wanted to
say new chat and I said tell me another
one it wouldn't know what I'm talking
about there we go and it's just typing
out random stuff now so I'm going to
click stop generating and then if we
look at the bottom we have all the
information about the previous inference
that just ran so time to First token
generation time tokens per second the
reason stopped GPU layers etc etc so it
really gives you everything but it keeps
it super simple the next thing I want to
show you is for developers so if you
want to build an AI application using LM
Studio to power the large language model
you click this little Double Arrow icon
right here which is local server so I
click that and all you have to do is
click Start server you set the port that
you want you can set whether you want
CORS on and you have a bunch of other
settings that you can play with so once
I click Start server now I can actually
hit the server just like I would OpenAI
and this is a drop-in replacement for
OpenAI so it says right here start a
local HTTP server that behaves like
OpenAI's API so this is such an easy
way to use large language models in
your application that you're building
and it also gives you an example client
request right here and so this is curl
so we curl to the Local Host endpoint
chat completions and we provide
everything we need the messages the
temperature the max tokens stream and
then it also gives us a python example
right here so if we wanted to use this
python example we could do that and
what's awesome is you can just import
the OpenAI Python library and use that
but instead replace the base with your
local host and it will operate just the
same so you get all the benefits of
using the open AI library but you can
use an open source model and of course
on the right side you get all the same
settings as before so you can adjust all
the different settings for the model and
then the last tab over here looks like a
little folder so we click it it's the my
models tab which allows you to manage
all the different models that you have
on your computer so right now it says I
have two models taking up 27 GB of space
I don't want this fine model anymore
it's taking up too much space so let's
go ahead and delete it so I just click
delete and it's gone just like that it
is so easy to manage all of this and I
think I covered everything for LM studio
if you want to see me cover any other
topic related to LM Studio let me know
in the comments below if you liked this
video please consider giving a like and
subscribe and I'll see you in the next
one