End of Local AI Apps? HuggingFace Spaces + Docker + FastAPI

Bitfumes - AI & LLMs
25 Jul 2024 · 17:28

Summary

TL;DR: In this tutorial, host Saruk demonstrates how to build a FastAPI application around the Google Flan-T5 base model and deploy it to Hugging Face Spaces inside a Docker container. The video guides viewers through setting up the Docker environment, coding the API with FastAPI, integrating the model via Hugging Face's pipeline, and deploying the app to a Space. It concludes with testing the application's functionality and resolving a permission issue, resulting in a fully operational AI model accessible through a direct URL.

Takeaways

  • 😀 The video demonstrates how to combine Hugging Face Spaces, Google's Flan-T5 base LLM, and FastAPI within Docker.
  • 🛠️ The tutorial shows the creation of a FastAPI application with OpenAPI documentation hosted on Hugging Face Spaces.
  • 🌐 The application includes a GET route for translation purposes, which utilizes the Flan-T5 model to translate text to German.
  • 🔑 The process involves logging into Hugging Face, creating a new Space, and setting up a Docker environment from scratch.
  • 💻 The video provides a step-by-step guide on setting up the Dockerfile, requirements.txt, and the app.py file for the FastAPI application.
  • 📝 The script explains the importance of setting the correct app port as specified by Hugging Face to ensure proper hosting.
  • 🔄 The tutorial covers rebuilding the Docker image and container each time changes are made to the requirements or code.
  • 🔗 The video mentions the need to install additional packages like PyTorch for the model to function correctly.
  • 👀 The script highlights the process of pushing the local changes to the Hugging Face repository and building the application there.
  • 🚀 The video concludes with troubleshooting a permission error by updating the Dockerfile to include proper user permissions.
  • 📢 The host encourages viewers to subscribe, like, and comment for more content and assistance.

Q & A

  • What is the main focus of the video by Bitfumes?

    -The video focuses on combining Hugging Face's Spaces, Google's Flan-T5 base model, and FastAPI within Docker to create a hosted application.

  • What is the purpose of using FastAPI in the video?

    -FastAPI is used to create an application with an open API that can be hosted on Hugging Face Spaces, allowing for easy deployment and access.

  • How does the video demonstrate the use of Hugging Face Spaces?

    -The video shows the process of logging into Hugging Face, creating a new space, and deploying a FastAPI application using Docker.

  • What is the significance of Docker in the context of this tutorial?

    -Docker is used to containerize the FastAPI application, making it portable and easy to deploy on Hugging Face Spaces.

  • What is the role of the 'requirements.txt' file in the video?

    -The 'requirements.txt' file lists the Python packages needed for the application, such as FastAPI and Transformers, which are installed when building the Docker image.
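
    As a reference, a minimal requirements.txt matching what the video ends up with might look like this (package names only, unpinned; the video adds the packages suggested by PyTorch's Get Started page, for which plain torch is typically sufficient for the pipeline):

        fastapi
        uvicorn
        transformers
        torch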

  • How does the video handle the creation of a Dockerfile for the application?

    -The video outlines the steps to create a Dockerfile, specifying the base Python image, copying files into the container, installing dependencies, and defining the command to run the FastAPI application.
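
    A sketch of the initial Dockerfile assembled in the video. The video launches the app with the FastAPI CLI; the uvicorn command below is an equivalent that relies only on packages already listed in requirements.txt:

        FROM python:3.10-slim

        WORKDIR /app
        COPY . /app
        RUN pip install -r requirements.txt

        # Hugging Face Docker Spaces expect the app to listen on port 7860
        CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860", "--reload"]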

  • What is the importance of setting the correct port in the Dockerfile?

    -The correct port is crucial because Hugging Face Spaces requires the application to listen on a specific port (7860 in this case) for proper hosting and accessibility.
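
    For Docker Spaces, 7860 is the default port; it can also be declared explicitly in the YAML front matter of the Space's README.md (the title below is illustrative):

        ---
        title: Fastapi Google Flan
        sdk: docker
        app_port: 7860
        ---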

  • How does the video address the integration of the Google Flan-T5 model?

    -The video demonstrates adding the Transformers library to the 'requirements.txt' and using the Hugging Face pipeline in the FastAPI application to integrate the Google Flan-T5 model.
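
    A sketch of the integration in app.py, assuming the standard pipeline snippet from the model card (google/flan-t5-base is a text2text-generation model, so the pipeline returns a list of dicts with a generated_text key):

        from fastapi import FastAPI
        from transformers import pipeline

        app = FastAPI()

        # load the model once at startup; weights are downloaded on first use
        pipe = pipeline("text2text-generation", model="google/flan-t5-base")

        @app.get("/ask")
        def ask(prompt: str):
            # e.g. prompt = "translate English to German: How old are you?"
            result = pipe(prompt)
            return result[0]  # {"generated_text": "..."}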

  • What is the process for testing the FastAPI application locally before pushing to Hugging Face Spaces?

    -The video describes building and running a Docker container locally, then accessing the FastAPI application through a web browser to test its functionality.
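
    The local loop looks roughly like this; the image and container names are spelled out from the names spoken in the video, so treat them as assumptions:

        docker buildx build -t fastapi-hugging-face .
        docker run --name fastapi -p 7860:7860 fastapi-hugging-face
        # then open http://localhost:7860/docs, or hit the route directly:
        curl "http://localhost:7860/ask?prompt=translate%20to%20German:%20How%20old%20are%20you"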

  • What issues were encountered when trying to deploy the application to Hugging Face Spaces, and how were they resolved?

    -The video encountered a permission denied error, which was resolved by updating the Dockerfile to include the correct user permissions and environment variables.
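
    A sketch of the corrected Dockerfile, following the non-root-user pattern Hugging Face documents for Docker Spaces (the -u 1000 uid and the PATH addition are the commonly recommended form rather than a verbatim copy of the video's file):

        FROM python:3.10-slim

        # run as a non-root user so the app can write inside its own home
        RUN useradd -m -u 1000 user
        USER user
        ENV HOME=/home/user \
            PATH=/home/user/.local/bin:$PATH

        WORKDIR $HOME/app
        COPY --chown=user . $HOME/app
        RUN pip install --no-cache-dir -r requirements.txt

        CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]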

  • How can viewers access the deployed application on Hugging Face Spaces?

    -After the application is successfully deployed and running, viewers can access it through a direct URL provided by Hugging Face Spaces, which is obtained by embedding the space.

Outlines

00:00

🚀 Introduction to Combining Hugging Face Spaces with FastAPI and Docker

The script introduces a tutorial on integrating Hugging Face Spaces with Google's Flan-T5 base language model and FastAPI, all within a Docker container. The aim is to create an application hosted on Hugging Face Spaces that exposes an API for translation tasks. The host, Saruk, demonstrates the setup process, beginning with logging into Hugging Face, creating a new Space, and setting up a Docker environment. The initial steps include selecting a license, choosing the Docker option, and configuring the hardware settings.

05:03

📝 Setting Up the FastAPI Application and Dockerfile

The second paragraph details the creation of a FastAPI application and Dockerfile. It begins by outlining the Dockerfile setup with a Python 3.10 slim image and the necessary commands to copy files and install dependencies. The script then proceeds to create a 'requirements.txt' file for FastAPI and uvicorn, and an 'app.py' file that initializes the FastAPI instance and sets up a basic 'home' route for testing. The paragraph concludes with instructions on building and running the Docker container, and accessing the FastAPI documentation.
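
A minimal version of that first app.py, assuming the route lives at the root path:

    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/")
    def home():
        # simple response proving the API is up
        return {"message": "hello bitfumes"}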

10:04

🔧 Integrating Google Flan-T5 Model and Debugging Docker Permissions

This paragraph focuses on integrating the Google Flan-T5 model using the Hugging Face Transformers library. The script explains updating the 'requirements.txt' file to include the Transformers library and modifying 'app.py' with a new route that handles translation prompts. It also addresses the model's need for TensorFlow or PyTorch (PyTorch is chosen here), stresses that the Docker image must be rebuilt whenever dependencies change, and previews the permission issues that are later resolved by creating a new user in the Dockerfile. The rebuild loop is sketched below.
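
In practice the rebuild cycle looks roughly like this, reusing the image and container names assumed earlier:

    # rebuild the image whenever requirements.txt changes
    docker buildx build -t fastapi-hugging-face .
    # remove the old container so its name can be reused, then start fresh
    docker rm -f fastapi
    docker run --name fastapi -p 7860:7860 fastapi-hugging-face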

15:07

🌐 Finalizing the Application and Pushing to Hugging Face Spaces

The final paragraph wraps up the tutorial by demonstrating how to push the completed application to Hugging Face spaces. It includes steps for committing changes, pushing to the repository, and ensuring the Dockerfile builds the container correctly. The script also discusses troubleshooting permission issues by updating the Dockerfile with the correct user permissions. Upon successful deployment, the application is tested for functionality, and viewers are encouraged to subscribe and provide feedback. The host also invites questions and promises to respond, signaling the end of the tutorial.
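
The push itself is plain git against the Space repository; the commit messages below are the ones used in the video:

    git add .
    git commit -m "use hugging face models"
    git push
    # after the permission fix:
    git add Dockerfile
    git commit -m "defining the permissions"
    git push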

Keywords

💡Hugging Face

Hugging Face is a company known for its open-source tools that help developers build, train, and deploy machine learning models. In the context of this video, Hugging Face is used as a platform to host applications built with machine learning models. The script mentions combining Hugging Face Spaces with other technologies to create a hosted application.

💡Google Flan-T5 Base LLM

Google Flan-T5 Base is a large language model (LLM) developed by Google. It is part of the T5 (Text-to-Text Transfer Transformer) family of models, which are designed for natural language understanding and generation tasks. The video discusses using this model within a FastAPI application hosted on Hugging Face Spaces for translation purposes.

💡FastAPI

FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints. In the video, FastAPI is used to create an application that interfaces with the Google Flan T5 Base model to perform translations, demonstrating how to set up a route and execute the model's functionality.

💡Docker

Docker is a platform that uses containerization technology to make it easier to create, deploy, and run applications by using containers. The video script details using Docker to containerize a FastAPI application, which allows for easier deployment and scalability on Hugging Face Spaces.

💡API

API stands for Application Programming Interface, which is a set of rules and protocols for building and interacting with software applications. The video demonstrates creating a FastAPI application with endpoints (or routes) that can be used to perform actions such as translation, showcasing the use of APIs in web development.

💡OpenAPI

OpenAPI (formerly Swagger) is a specification for describing HTTP APIs in a machine-readable way, which enables interactive and visual documentation of an API's capabilities. The script notes that the FastAPI application automatically exposes OpenAPI documentation, a key feature that helps developers understand and use the API effectively.

💡Hugging Face Spaces

Hugging Face Spaces is a platform for hosting machine learning models and applications. The video script describes the process of creating a new space on Hugging Face to host a FastAPI application that uses the Google Flan T5 Base model, demonstrating the integration of various technologies on this platform.

💡Transformers

Transformers is a library developed by Hugging Face that provides a wide range of state-of-the-art pre-trained models for natural language processing tasks. In the context of the video, the Transformers library is used to integrate the Google Flan T5 Base model into the FastAPI application.

💡requirements.txt

A 'requirements.txt' file is a standard for specifying dependencies in Python projects. The video script includes creating a 'requirements.txt' file to list the necessary packages like FastAPI and Transformers, which are needed to build and run the application.

💡Dockerfile

A 'Dockerfile' is a text document that contains all the commands a user could call on the command line to assemble an image. In the video, a Dockerfile is created to define the environment for the FastAPI application, including the base Python image, dependencies, and startup commands.

💡GitHub

GitHub is a platform for version control and collaboration that allows developers to work on projects and contribute to each other's work. The script compares the process of creating and managing a space on Hugging Face to that of a GitHub repository, highlighting the version control aspect of the platform.

Highlights

Introduction of combining Hugging Face's Spaces, Google's Flan-T5 base LLM model, FastAPI, and Docker to create a hosted application.

Demonstration of creating a FastAPI application with OpenAPI documentation.

Tutorial on translating text to German using the application hosted on Hugging Face Spaces.

Explanation of logging into Hugging Face and navigating to Spaces for application creation.

Step-by-step guide to creating a new space on Hugging Face, including naming and choosing a license.

Emphasis on using Docker for the application, starting from the blank Docker template.

Decision to use the free version of Hugging Face for the tutorial with an option to upgrade for faster performance.

Instructions on cloning the repository using git and setting up SSH for secure access.

Overview of creating a Dockerfile for the FastAPI application using a Python 3.10 slim image.

Description of setting up the work directory and installing dependencies from a requirements.txt file.

Details on running the FastAPI application with auto-reload and specifying the host and port as per Hugging Face's requirements.

Creation of the app.py file and initial setup for a FastAPI instance with a basic 'home' route.

Building and running the Docker container for the FastAPI application locally.

Integration of the Google Flan-T5 base model using the Hugging Face Transformers library.

Addition of a new route 'ask' to utilize the model for generating text based on prompts.

Process of updating the Docker image and container to include the new model and dependencies.

Explanation of installing PyTorch as a dependency for the Hugging Face pipeline.

Demonstration of the application's functionality with a live translation example.

Troubleshooting guide for permission issues and updating the Dockerfile with the correct user permissions.

Final steps to push the application files to the Hugging Face repository and monitor the build process.

Accessing the application through a direct URL provided by Hugging Face Spaces after setting the space to public.

Encouragement for viewers to subscribe and like the video for more content, and an invitation for questions and feedback.

Transcripts

00:00

Hello and welcome to Bitfumes, I'm your host Saruk, and this video is amazing because we are going to combine Hugging Face Spaces, the Google Flan-T5 base LLM, and FastAPI with Docker: basically a FastAPI app inside Docker, using this LLM to create something that will be hosted on Hugging Face Spaces. Let's see what we will have built by the end of this video. It's this FastAPI application with its OpenAPI docs; "ask" is a GET route I can try out, so I'll ask it to translate "how old are you" to German and click Execute, and the request runs on my Hugging Face Space. You can see Hugging Face Spaces hosting the application right there, and finally we get the result. We can verify it with Google Translate, and it confirms the meaning: how old are you. Great, exciting, really powerful. And yes, all the source code for this Space will be provided; check the description.

01:22

Let's begin. The very first thing to do is log into Hugging Face. I'm already logged in (you can see my profile) and I'll go to Spaces. Click on Spaces and you land on a page listing all the public Spaces, basically the apps people have built and shared. Now it's your turn, because you can also create a new application and host it on Hugging Face Spaces. Click "Create new Space" and give your application a name; I'll call it "fastapi-google-flan". For the license I'll choose MIT, though you can choose any. The special part is that we are using Docker, so select Docker and choose the blank template, because we are starting from scratch. As for hardware, we will obviously use the free tier, but if you can afford the paid options they are really fast and will save you a lot of time. For this tutorial I'm going with the free tier, and I'm making the Space public so you can try it out. Click "Create Space".

02:38

This is just like a GitHub repository; if you are familiar with GitHub repositories, it's exactly the same. We have a way to clone this repository using git, and I'm going to use SSH. If you use SSH, make sure to add your SSH public key to your account; I already have mine. So I copy the clone command, go to my terminal, and run git clone with that path. This clones the repository to my local machine. Once done, cd into the project folder and open it with VS Code. It has just two files, .gitignore and the README, and you can see these files in the Files section: at the top there are App, Files, Community, and Settings tabs, and we need Files. You can upload files directly by dragging and dropping, or create new files right there, but it's really better to start from scratch locally.

04:03

Since this is going to be a Docker project, I'll start with the Dockerfile. Inside it, since I'm going to use FastAPI, I need a Python image, so I'll use the python 3.10 slim version. Then I copy everything from the current directory to /app, which is also the working directory (WORKDIR /app). Once we have everything, I run pip install -r requirements.txt. You'll say, where is requirements.txt? We are going to create it very soon. Finally we need the CMD: run FastAPI with reload enabled, so it restarts every time we change anything in app.py (which we are also going to create very soon), and with host 0.0.0.0. The port is extremely important, and you know why: if you go to the bottom of the Hugging Face page, it says the app must listen on port 7860. Great, so we have provided it.

05:25

Now it's time to create requirements.txt and the app.py file. This is going to be a FastAPI project, so the very first packages we need are fastapi and uvicorn; that's it for now. Next, inside app.py, import FastAPI, create a new instance of FastAPI, and create a GET route to prove that it's working fine. You can name it anything; I have named it "home", and it returns a simple hello world, or rather "hello bitfumes".

06:15

Okay, our first API is working, so let's run it in a Docker container using docker run. I'm going to name the container "fastapi" and map port 7860 to 7860, but we haven't created the image yet, so first we need docker buildx build, tagged "fastapi-hugging-face", building from the current directory so it finds the Dockerfile. You can see it pulling python 3.10 slim and installing the requirements, which is just FastAPI for now. Once this is done, we create a new container using this image: I copy the docker run command, pass the image name, and it's done. Let's open it in the browser, and yes, FastAPI is working. Since FastAPI is working, we can go to /docs, because it provides the OpenAPI documentation.

08:00

Great, that was the default route, but we are interested in using this google/flan-t5-base model, and to use it we can use Transformers. It's very, very easy: we need transformers to be installed, so go to requirements.txt and add it, then copy the pipeline snippet from the model card into app.py. Once this is done, let's create another route: app.get on "ask" (you can name it anything, once again I'm going with "ask"), which takes the prompt as a string. We already have the pipeline, pipe, defined above, so I call it with the prompt, result = pipe(prompt), and return the first item from the result.

09:13

This is done, but as we have made changes to requirements.txt, we need to create our image once again so that it picks up the new package. So let's kill the container and run docker build once more; it starts installing from requirements.txt again because that file changed, and it finishes quite quickly. Now let's run the container, and it says a container with that name already exists. So we find the fastapi container, remove it with docker rm (we can force it), and run it again.

10:18

And now, since we are using this Hugging Face pipeline, it says we need TensorFlow or PyTorch to be installed. Let's install PyTorch; it's actually given right there, either TensorFlow or PyTorch. Click on PyTorch, click Get Started, and it shows the pip command and the packages we need to install. So add those too, and now we need one more build of the image. We have to build the image every time we add a package to requirements.txt, because that's how it gets the fresh packages. Once this is done, we create the new container. After 96 seconds the build is done, so let's run the container one more time; it says the container already exists, so remember, we remove the container and run it again, and yes, the container has started.

11:48

The server is running, so let's open it, and you can see we now have another route for "ask". Let's try it out with the translate prompt, which once again I copied from the model card. Click Execute; this takes a little bit of time, but it's quite fast, and it returns the German translation of "how old are you". Let's check it on Google Translate: "how old are you", which is an absolutely perfect answer. So we have successfully created the container and the application; everything is there.

12:34

Great, now it's time to push these things to the Space we are working on. Right now the Files tab just has the README (actually I don't think it even has a .gitignore, but no worries, we don't want to ignore anything). So I commit with a message like "use hugging face models" and run git push. Since the Dockerfile is in the root of our application, it's important to see that the Space starts building by default as soon as we push: the repository uses this Dockerfile to create the container, and we can check the log by clicking on it. You can see it's doing exactly what we specified, setting the work directory and then downloading everything it requires. You can click on the App tab to see the final result; right now it's building, and once done it will say Running instead of Building, and then we will be able to interact with the application directly from Hugging Face Spaces.

14:01

At this point we are getting an error: permission denied. The Dockerfile needs to be updated with the correct permissions. First of all, I create a new user, so RUN useradd (you can name it anything; let's call it "user" to stay consistent), then change to that user with USER. I also need to define some environment variables, so I define HOME as /home/user. The work directory becomes $HOME/app, the COPY destination also becomes $HOME/app, and when we copy we need to add the ownership flag, --chown, for the user. That's all we need; these three or four steps provide the correct permissions for the app directory. So let's commit once again, "defining the permissions", and push. This is going to do all the magic we want.

15:29

Once again we wait for the build to complete, and congratulations: you can see our application is in the Running state, and we have "hello bitfumes" because that's the homepage. In the container log we can see it running on port 7860; everything is there. That's good, but how can I use "ask", or even the docs, when there's no way to type a URL? For that, go to the top, to the three dots after Settings, click it, and you'll see "Embed this space". This is only visible when your Space is public; a private one won't have the embed option. Now you have a direct URL, so click on it, and boom, we have it. Now we can open the docs and simply try it out by asking it to translate "how are you" to German. Click Execute once again; this takes time, which proves that yes, it's working (sometimes taking time is a good thing), and finally we get the result: the generated text is here.

16:52

So this is how you can actually get all your models and these AI and LLM things onto Hugging Face Spaces, which is really great. If you have learned anything from this video, please go and click the Subscribe button and hit that like button, so that I get the motivation to create more videos just like this. If you have any question related to this video, or any other video, just comment below; I'll surely reply. See you in the next video, till then, goodbye.
