End of local AI Apps?? HuggingFace Spaces + Docker + FastAPI
Summary
TL;DR: In this tutorial, host Saruk demonstrates how to build a FastAPI application around the Google Flan-T5 base model, packaged in Docker. The video guides viewers through setting up a Docker environment, coding the API with FastAPI, integrating the model via Hugging Face's Transformers pipeline, and deploying the app on Hugging Face Spaces. It concludes with testing the application's functionality and resolving a permission error, resulting in a fully operational AI model accessible through a direct URL.
Takeaways
- 😀 The video demonstrates how to combine Hugging Face's Spaces, Google's Flan-T5 base LLM model, and FastAPI within Docker.
- 🛠️ The tutorial shows the creation of a FastAPI application with OpenAPI documentation hosted on Hugging Face Spaces.
- 🌐 The application includes a GET route for translation purposes, which utilizes the Flan-T5 model to translate text to German.
- 🔑 The process involves logging into Hugging Face, creating a new Space, and setting up a Docker environment from scratch.
- 💻 The video provides a step-by-step guide on setting up the Dockerfile, requirements.txt, and the app.py file for the FastAPI application.
- 📝 The script explains the importance of setting the correct app port as specified by Hugging Face to ensure proper hosting.
- 🔄 The tutorial covers rebuilding the Docker image and container each time changes are made to the requirements or code.
- 🔗 The video mentions the need to install additional packages like PyTorch for the model to function correctly.
- 👀 The script highlights the process of pushing the local changes to the Hugging Face repository and building the application there.
- 🚀 The video concludes with troubleshooting a permission error by updating the Dockerfile to include proper user permissions.
- 📢 The host encourages viewers to subscribe, like, and comment for more content and assistance.
Q & A
What is the main focus of the video by Bitfumes?
-The video focuses on combining Hugging Face's Spaces, Google's Flan-T5 base model, and FastAPI within Docker to create a hosted application.
What is the purpose of using FastAPI in the video?
-FastAPI is used to create an application with OpenAPI documentation that can be hosted on Hugging Face Spaces, allowing for easy deployment and access.
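For reference, here is a minimal sketch of the kind of app.py the tutorial starts with. The route name "home" and the "hello Bitfumes" response come from the video; everything else is standard FastAPI:

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def home():
    # Basic route used in the video to confirm the app is serving requests
    return "hello Bitfumes"
```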
How does the video demonstrate the use of Hugging Face Spaces?
-The video shows the process of logging into Hugging Face, creating a new space, and deploying a FastAPI application using Docker.
What is the significance of Docker in the context of this tutorial?
-Docker is used to containerize the FastAPI application, making it portable and easy to deploy on Hugging Face Spaces.
What is the role of the 'requirements.txt' file in the video?
-The 'requirements.txt' file lists the Python packages needed for the application, such as FastAPI and Transformers, which are installed when building the Docker image.
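Putting together the packages named in the video, the final requirements.txt looks roughly like this (the video adds the PyTorch packages later, copied from the pip command on pytorch.org's Get Started page, which may list more than just torch):

```text
fastapi
uvicorn
transformers
# added later in the video, after the pipeline reports TensorFlow/PyTorch missing
torch
```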
How does the video handle the creation of a Dockerfile for the application?
-The video outlines the steps to create a Dockerfile, specifying the base Python image, copying files into the container, installing dependencies, and defining the command to run the FastAPI application.
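A sketch of the initial Dockerfile as described; the video's exact run command is not fully legible in the transcript, but uvicorn with --reload, host 0.0.0.0, and port 7860 matches what it specifies:

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY . /app

RUN pip install -r requirements.txt

# Hugging Face Spaces requires the app to listen on port 7860
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860", "--reload"]
```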
What is the importance of setting the correct port in the Dockerfile?
-The correct port is crucial because Hugging Face Spaces requires the application to listen on a specific port (7860 in this case) for proper hosting and accessibility.
How does the video address the integration of the Google Flan-T5 model?
-The video demonstrates adding the Transformers library to the 'requirements.txt' and using the Hugging Face pipeline in the FastAPI application to integrate the Google Flan-T5 model.
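A sketch of the integration, assuming the standard text2text-generation pipeline shown on the google/flan-t5-base model card (the video copies its snippet from the model card):

```python
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

# Load the Flan-T5 base model once at startup via the Transformers pipeline
pipe = pipeline("text2text-generation", model="google/flan-t5-base")

@app.get("/ask")
def ask(prompt: str):
    # Run the prompt through the model and return the first generated item
    result = pipe(prompt)
    return result[0]
```

A request like GET /ask?prompt=translate to German: How old are you? then returns a JSON object whose generated_text field holds the translation, matching what the video shows in the OpenAPI docs.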
What is the process for testing the FastAPI application locally before pushing to Hugging Face Spaces?
-The video describes building and running a Docker container locally, then accessing the FastAPI application through a web browser to test its functionality.
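The local build-and-test loop looks roughly like this; the image and container names approximate the ones used in the video ("fast API hugging face" and "fast API"):

```bash
# Build the image from the Dockerfile in the current directory
docker build -t fastapi-huggingface .

# Run a container, publishing the port Hugging Face Spaces expects
docker run --name fastapi -p 7860:7860 fastapi-huggingface

# After editing requirements.txt: rebuild, remove the old container, rerun
docker build -t fastapi-huggingface .
docker rm -f fastapi
docker run --name fastapi -p 7860:7860 fastapi-huggingface
```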
What issues were encountered when trying to deploy the application to Hugging Face Spaces, and how were they resolved?
-The video encountered a permission denied error, which was resolved by updating the Dockerfile to include the correct user permissions and environment variables.
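The fix follows the pattern Hugging Face documents for Docker Spaces: create a non-root user, set HOME, and copy files with the right ownership. A sketch (the uid and the PATH line come from that documented pattern rather than being spelled out in the video):

```dockerfile
FROM python:3.10-slim

# Run as a non-root user so the app can write inside its own directories
RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

WORKDIR $HOME/app

# --chown avoids the permission-denied error seen on the first deploy
COPY --chown=user . $HOME/app

RUN pip install -r requirements.txt

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```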
How can viewers access the deployed application on Hugging Face Spaces?
-After the application is successfully deployed and running, viewers can access it through a direct URL, found under the "Embed this Space" option, which is only available for public Spaces.
Outlines
🚀 Introduction to Combining Hugging Face Spaces with FastAPI and Docker
The script introduces a tutorial on integrating Hugging Face Spaces with Google's Flan-T5 base language model and FastAPI, all within a Docker container. The aim is to create an application hosted on Hugging Face Spaces that exposes an OpenAPI-documented endpoint for translation tasks. The host, Saruk, demonstrates the setup process, beginning with logging into Hugging Face, creating a new Space, and setting up a Docker environment. The initial steps include naming the Space, selecting a license, choosing the Docker option, and configuring the hardware settings.
📝 Setting Up the FastAPI Application and Dockerfile
The second paragraph details the creation of a FastAPI application and Dockerfile. It begins by outlining the Dockerfile setup with a Python 3.10 slim image and the necessary commands to copy files and install dependencies. The script then proceeds to create a 'requirements.txt' file for FastAPI and uvicorn, and an 'app.py' file that initializes the FastAPI instance and sets up a basic 'home' route for testing. The paragraph concludes with instructions on building and running the Docker container, and accessing the FastAPI documentation.
🔧 Integrating Google Flan-T5 Model and Debugging Docker Permissions
This paragraph focuses on integrating the Google Flan-T5 model using the Hugging Face Transformers library. The script explains updating the 'requirements.txt' file to include Transformers and modifying 'app.py' to add a new route for translation prompts. It also addresses the pipeline's requirement for TensorFlow or PyTorch, with PyTorch chosen in this case. The paragraph highlights the need to rebuild the Docker image after each dependency update, and notes the permission issues encountered later, which are resolved by creating a new user within the Dockerfile.
🌐 Finalizing the Application and Pushing to Hugging Face Spaces
The final paragraph wraps up the tutorial by demonstrating how to push the completed application to Hugging Face Spaces. It covers committing changes, pushing to the repository, and verifying that the Dockerfile builds the container correctly. It also walks through troubleshooting the permission error by updating the Dockerfile with the correct user permissions. Once deployed, the application is tested for functionality, and viewers are encouraged to subscribe and provide feedback. The host invites questions and promises to respond, signaling the end of the tutorial.
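Since a Space behaves like an ordinary Git repository, the deployment itself is a plain commit and push (the commit message here is the one used in the video):

```bash
git add .
git commit -m "use hugging face models"
git push   # triggers the Docker build on Hugging Face Spaces
```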
Keywords
💡Hugging Face
💡Google Flan-T5 base LLM model
💡FastAPI
💡Docker
💡API
💡OpenAPI
💡Hugging Face Spaces
💡Transformers
💡requirements.txt
💡Dockerfile
💡GitHub
Highlights
Introduction of combining Hugging Face's Spaces, Google's Flan-T5 base LLM model, FastAPI, and Docker to create a hosted application.
Demonstration of creating a FastAPI application with OpenAPI documentation.
Tutorial on translating text to German using the application hosted on Hugging Face Spaces.
Explanation of logging into Hugging Face and navigating to Spaces for application creation.
Step-by-step guide to creating a new space on Hugging Face, including naming and choosing a license.
Emphasis on using Docker for the application, starting from the blank Docker template.
Decision to use the free version of Hugging Face for the tutorial with an option to upgrade for faster performance.
Instructions on cloning the repository using git and setting up SSH for secure access.
Overview of creating a Dockerfile for the FastAPI application using a Python 3.10 slim image.
Description of setting up the work directory and installing dependencies from a requirements.txt file.
Details on running the FastAPI application with auto-reload and specifying the host and port as per Hugging Face's requirements.
Creation of the app.py file and initial setup for a FastAPI instance with a basic 'home' route.
Building and running the Docker container for the FastAPI application locally.
Integration of the Google Flan-T5 base model using the Hugging Face Transformers library.
Addition of a new route 'ask' to utilize the model for generating text based on prompts.
Process of updating the Docker image and container to include the new model and dependencies.
Explanation of installing PyTorch as a dependency for the Hugging Face pipeline.
Demonstration of the application's functionality with a live translation example.
Troubleshooting guide for permission issues and updating the Dockerfile with the correct user permissions.
Final steps to push the application files to the Hugging Face repository and monitor the build process.
Accessing the application through a direct URL provided by Hugging Face Spaces after setting the space to public.
Encouragement for viewers to subscribe and like the video for more content, and an invitation for questions and feedback.
Transcripts
Hello and welcome to Bitfumes. I'm your host Saruk, and this video is amazing because we are going to combine Hugging Face Spaces, the Google Flan-T5 base LLM model, and FastAPI with Docker. Basically, a FastAPI app inside Docker, using this LLM model, to create something hosted on Hugging Face Spaces. Let's see what we are going to create by the end of this video: a FastAPI application with OpenAPI docs, where "ask" is a GET route I can try out. I'll ask it to translate "how old are you" to German and click Execute, and this runs on my Hugging Face Space (you can see Hugging Face Spaces hosting the application right there), and finally we get the result, which we can verify with Google Translate. Really powerful, and yes, all the source code for this Space will be provided; check the description. Let's now begin.

The very first thing we need to do is log in to Hugging Face. I'm already logged in (you can see my profile), and I'll go to Spaces. Click on Spaces and you'll land on a page listing all the public Spaces, basically the apps people have built and shared. Now it's your turn, because you can also create a new application and host it on Hugging Face Spaces. Click on "Create new Space" and give your application a name; I'm combining Google Flan and FastAPI for mine. For the license I'll choose MIT, but you can choose any. The special thing is that we are going to use Docker, so click on Docker and choose the blank template, because we are going to start from scratch. Now, which hardware are we going to use? Obviously the free tier, but if you can afford the paid options, try them out; they are really fast and will save you a lot of time. For this tutorial I'm going with the free tier, and I'm making the Space public so that you can try it out. Click on "Create Space".

A Space is just like a GitHub repository. If you are familiar with GitHub repositories, it's exactly the same: we can clone this repository using git. I'm going to use SSH; if you do too, make sure you've added your SSH public key to your account. I already have mine set up, so I can just copy the clone command, go to my terminal, and run "git clone" with that path. This clones the repository to my local machine. Once done, cd into the project directory and open it with VS Code. It has just two files, a .gitignore and a README, and you can see them in the Space's Files section. At the top you'll see App, Files, Community, and Settings; we need to go into Files. You can upload files there directly by dragging and dropping, or create new files from the browser, but it's really better to start from scratch locally.

Since this is a Docker project, I'll start with the Dockerfile. Because I'm going to use FastAPI, I need a Python image, so I'll use the python:3.10-slim version. Then I'm going to copy everything into /app and set the work directory to /app. Once everything is in place, I'll run "pip install -r requirements.txt". You'll ask, where is requirements.txt? We're going to create it very soon. Finally we need a CMD to run the FastAPI app, and we want it to reload every time we change anything in app.py, which we'll also create very soon. We also need to provide a host, which is 0.0.0.0, and the port is extremely important. You know why? Because if you go to the Hugging Face page, at the bottom it says the app must listen on port 7860. Great, we've provided that.

Now it's time to create requirements.txt and app.py. This is a FastAPI project, so the first things we need in requirements.txt are fastapi and uvicorn, and that's it for now. Next, inside app.py, import FastAPI, create a new FastAPI instance, and create a GET route to prove it's working. You can name it anything; I've named it "home", and it returns a simple hello world, or rather "hello Bitfumes".

Our first API is ready, so let's run it in Docker. The image doesn't exist yet, so first we build it: "docker buildx build" with a tag like fastapi-huggingface, building from the dot, meaning the current directory, where it finds the Dockerfile. You can see it pulling python:3.10-slim and installing the requirements, which is just FastAPI for now. Once that's done, we create a container from the image with "docker run", naming it "fastapi" and mapping port 7860 to 7860, using the fastapi-huggingface image. It's done, so let's open the server in the browser, and yes, FastAPI is working. Since it's working, we can go to /docs, because FastAPI provides OpenAPI documentation out of the box.

This is the default app, but we are interested in using the Google Flan-T5 base model, and to use this model we can use Transformers; it's very, very easy. We need Transformers installed, so go to requirements.txt and add transformers. Then copy the pipeline snippet from the model card, go to app.py, and create the pipeline. Once that's done, let's create another route: an app.get on "ask" (you can name it anything; once again, I'm going with "ask") that takes a prompt as a string. We already have the pipe, so I'll say result = pipe(prompt), since everything else is already defined, and return the first item of the result.

This is done, but since we changed requirements.txt, we need to rebuild the image so it picks up the new package. Kill the server and run "docker build" once again; it reinstalls from requirements.txt because that file changed. Once done, let's run the container, and it says the container already exists; the container is already running. So where is it? Here is the "fastapi" container, so remove it with "docker rm -f fastapi" and run it again. Now, since we're using the Hugging Face pipeline, it says we need TensorFlow or PyTorch installed. Let's install PyTorch. How? It's given right there: either TensorFlow or PyTorch. Click on PyTorch, click "Get Started", and it shows the pip command with the packages we need to install; add those to requirements.txt.

One more time, we need to rebuild the image, because every time we add a package to requirements.txt we have to build a new image; that's how it gets the fresh packages. Once this is done, we create a new container. After 96 seconds it's done, so let's run the container one more time. It says the container already exists; remember, we need to remove the container and run it again, and yes, the container has started. The server is running, so let's open it, and you can see we now have another route for "ask". Let's try it out; I'm going to use the translate prompt, which, once again, I copied from the model card. Click on Execute. This takes a little bit of time, but it's quite fast, and you can see it returns the translation of "how old are you". Let's check it on Google Translate: "how old are you", which is the absolutely perfect answer. So we have successfully created the container and the application; everything is there.

Great, now it's time to push these things to the Space we are working on. Right now the Files section just has the .gitignore and README (actually, I think it doesn't have a .gitignore, but no worries, we don't want to ignore anything). So I'll stage everything, commit with a message like "use hugging face models", and run git push. Since the Dockerfile is in the root of our application, the build starts by default as soon as we push to the Hugging Face repository; it uses the Dockerfile to build the container. We can check the logs by clicking there, and you can see it's doing exactly what we specified: setting the work directory and downloading whatever it requires. Click on the App tab to see the final result. Right now it's building; once done, it will say "Running" instead of "Building", and then we'll be able to interact with the application directly from Hugging Face Spaces.

At this point we get an error: permission denied. The Dockerfile needs to be updated with the correct permissions. First of all, I'm going to create a new user, so RUN useradd with a username (you can name it anything; let's call it "user"), then switch to that user with USER. I also need to define some environment variables, so I'll set HOME to /home/user. Now the work directory needs to be $HOME/app, and the COPY destination is also $HOME/app, but when copying we need to set ownership, so we simply add chown for the user. That's all; these three or four steps provide the correct permissions for the app directory. Let's commit once again, with a message like "defining the permissions", and push; this will do all the magic we want.

Once again we wait for the build to complete, and congratulations, you can see our application is in the Running state, showing "hello Bitfumes" because that is the homepage. In the container log we can see it's running on port 7860; everything is there. That's good, but how can I use the "ask" route, or even just the docs, when there's no way to type a URL? For that, go to the top; next to Settings there are three dots. Click on them and you'll see "Embed this Space". This option is only visible when your Space is public; a private Space won't have it. Now you have the direct URL, so click on it, and boom, we can open the docs. That means we can simply try it out by asking it to translate "how are you" to German. Click Execute once again; it's taking some time, which proves it's really working (sometimes taking time is a good thing), and finally we get the result: the generated text is right there.

So this is how you can host your models, your AI and LLM projects, on Hugging Face Spaces, which is really great. If you learned anything from this video, please click the Subscribe button and hit that Like button, so that I get the motivation to create more videos just like this. If you have any question related to this video or any other video, just comment below; I'll surely reply. See you in the next video. Till then, goodbye.