How to Fine Tune GPT3 | Beginner's Guide to Building Businesses w/ GPT-3

Liam Ottley
25 Jan 202314:42

Summary

TLDR: In this instructional video, Liam Ottley guides entrepreneurs through the process of fine-tuning GPT-3 using NBA player performance data. He demonstrates how to prepare data, create prompt-completion pairs, and execute the fine-tuning process to build a customized AI model. The tutorial covers the technical steps, including Python scripts and the OpenAI API, and emphasizes the importance of understanding model fine-tuning for business opportunities in the AI industry.

Takeaways

  • 📚 The video provides a step-by-step guide on how to fine-tune GPT-3 using specific data sets, such as NBA player performance data.
  • 🛠️ Fine-tuning GPT-3 is a significant opportunity for entrepreneurs in the AI industry, allowing them to build businesses and applications tailored to their needs.
  • 💡 Understanding the fine-tuning process is crucial for entrepreneurs looking to leverage AI models for business advantages.
  • 📈 The script walks through the process of downloading and preparing data, creating prompt and completion pairs, and using them to fine-tune a model.
  • 🔍 Data is manipulated in Google Sheets and then converted into a CSV format for use in the fine-tuning process.
  • 📝 The script demonstrates how to use Python scripting to automate the creation of prompt and completion pairs from a CSV file.
  • 🔑 An API key from OpenAI is required to access the fine-tuning functionality and to interact with the GPT-3 model.
  • 📝 ChatGPT itself is used to write the Python script that generates the prompt and completion pairs, making the process scalable to large datasets.
  • 🖥️ The script details the use of Visual Studio Code for editing and running the Python script that formats the data for fine-tuning.
  • 🔧 The video mentions the use of a GUI for interacting with the fine-tuned model, allowing users to input prompts and receive responses.
  • 🚀 The presenter emphasizes the importance of having a deep understanding of the fine-tuning process to stay ahead in the competitive AI industry.

Q & A

  • What is the main topic of the video?

    -The video is about the step-by-step process of fine-tuning GPT3 using a dataset, specifically NBA players' performance data, to create a fine-tuned AI model for entrepreneurial purposes.

  • Why is fine-tuning GPT3 considered a significant opportunity in business?

    -Fine-tuning GPT3 is seen as a significant opportunity because it allows entrepreneurs to leverage the power of advanced AI models to build on top of them and create valuable businesses.

  • What is the first step in the fine-tuning process as described in the video?

    -The first step is to find a set of data, in this case, NBA players' performance data, which will be used to train the GPT3 model.

  • How is the data manipulated before being used for fine-tuning?

    -The data is imported into Google Sheets, where unnecessary rows are removed, filters are applied to remove blanks, and the data is formatted into a CSV file ready for processing.

  • What tool does the video suggest using to visualize CSV files more easily?

    -The video suggests using a tool called 'Rainbow CSV' to visualize and understand the structure of CSV files more easily.

  • What programming language is used in the script to generate prompt and completion pairs?

    -Python is used in the script to generate prompt and completion pairs from the CSV data.

  • What is the purpose of the script provided by the video?

    -The script automates the creation of prompt and completion pairs from the CSV data, which is necessary for the fine-tuning process of GPT3.
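    A minimal sketch of what such a script might look like. The column names (`Player`, `GP`, `GS`, `PTS`) and sentence wording are illustrative assumptions; the actual Kaggle dataset and the script ChatGPT produced in the video differ in detail.

    ```python
    import csv
    import json

    def build_pair(row):
        """Build one prompt-completion pair from a CSV row using f-strings.
        Column names (Player, GP, GS, PTS) are assumed for illustration."""
        prompt = f"Write a summary of {row['Player']}'s statistics."
        completion = (
            f" {row['Player']} played {row['GP']} games, starting {row['GS']} "
            f"of them, and scored {row['PTS']} points."
        )
        return {"prompt": prompt, "completion": completion}

    def csv_to_pairs(csv_path, out_path):
        """Read every row of the CSV and dump all pairs to a JSON file."""
        with open(csv_path, newline="") as f:
            pairs = [build_pair(row) for row in csv.DictReader(f)]
        with open(out_path, "w") as f:
            json.dump(pairs, f, indent=2)
    ```

    Because the pairs are generated in a loop rather than pasted by hand, the same script scales from a few rows to hundreds of thousands.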

  • What is the significance of the 'Max tokens' parameter in the completion creation process?

    -The 'Max tokens' parameter is a safety feature to limit the amount of text generated in a response, which in turn helps to control the usage and cost associated with the API.
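    In the legacy (pre-1.0) openai Python SDK that was current when the video was made, `max_tokens` is one of the parameters passed to the completion call. A sketch of the request, with a placeholder model name:

    ```python
    # Request parameters for querying a fine-tuned model.
    # The model name is a placeholder for the name OpenAI returns after
    # fine-tuning; max_tokens caps the length of the generated text,
    # which also caps the cost of each call.
    request = {
        "model": "curie:ft-personal-2023-01-25-00-00-00",  # placeholder
        "prompt": "Write a summary of Luka Doncic's statistics.",
        "max_tokens": 256,   # raise this if completions come back truncated
        "temperature": 0,    # deterministic output suits factual summaries
    }

    # With the legacy SDK, this dict would be passed as keyword arguments:
    # import openai
    # response = openai.Completion.create(**request)
    ```

    If the GUI shows only the start of a completion, as happens in the video, raising `max_tokens` is the first thing to try.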

  • Why is it necessary to have a unique suffix in the completions during the fine-tuning process?

    -A unique suffix, or stop sequence, at the end of each completion tells the model where a completion ends, so the fine-tuned model's output can be cut off reliably instead of running on.
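    OpenAI's data-preparation tool suggests conventions along these lines: a fixed separator at the end of every prompt, a leading whitespace character on every completion, and a fixed stop sequence at the end of every completion. A sketch of applying them (the exact marker strings here are a common convention, not taken from the video):

    ```python
    PROMPT_SEPARATOR = "\n\n###\n\n"  # marks the end of every prompt
    COMPLETION_STOP = " END"          # unique suffix ending every completion

    def add_markers(pair):
        """Apply the formatting conventions to one prompt-completion pair."""
        return {
            "prompt": pair["prompt"] + PROMPT_SEPARATOR,
            # completions should begin with a whitespace character
            "completion": " " + pair["completion"].lstrip() + COMPLETION_STOP,
        }
    ```

    At inference time the same stop string would be passed as the `stop` parameter so generation halts where the training completions ended.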

  • What platform is used to interact with the fine-tuned GPT3 model after the process is complete?

    -A basic graphical user interface (GUI), run as a Python script on the local machine, is used to interact with the fine-tuned GPT3 model, allowing users to input prompts and receive responses.

  • What is the recommended model to use for fine-tuning according to the video?

    -The video suggests retraining with the 'DaVinci' model for better text recognition capabilities, after initially training with the 'Curie' model.

  • What advice does the video give for entrepreneurs looking to understand AI fine-tuning processes?

    -The video advises entrepreneurs to invest time in understanding the fine-tuning process to gain a competitive edge, identify data sources, and integrate them into AI models like GPT3 or the upcoming GPT4.

Outlines

00:00

📚 Introduction to Fine-Tuning GPT3 for Entrepreneurs

This paragraph introduces a video tutorial focused on fine-tuning GPT3 for beginners, especially entrepreneurs interested in AI. The speaker plans to demonstrate the step-by-step process of fine-tuning a model using NBA player performance data. The tutorial aims to provide a basic understanding of the fine-tuning process, enabling viewers to apply this knowledge to their AI ventures. The speaker emphasizes the importance of understanding these processes to identify opportunities and build valuable businesses in the 'AI Gold Rush'.

05:02

🔍 Step-by-Step Guide to Fine-Tuning with NBA Data

The speaker provides a detailed walkthrough of the fine-tuning process using NBA players' performance data. The process begins with downloading the dataset and manipulating it in Google Sheets to prepare it for fine-tuning. The data is then formatted into prompt and completion pairs, which are necessary for training GPT3. The speaker also discusses the creation of a Python script to automate the generation of these pairs, ensuring scalability. The paragraph concludes with the execution of the script to produce a JSON file containing the prompt and completion pairs.

10:02

🤖 Fine-Tuning Execution and Model Interaction

This paragraph details the execution of the fine-tuning process, starting with obtaining an API key from OpenAI and installing necessary libraries. The speaker explains how to prepare the data in JSONL format and fine-tune the model using a specific base model. After the fine-tuning process, a unique model name is provided, which can be used to interact with the model through a graphical user interface (GUI). The speaker encounters an issue with the completion output but identifies a fix by adjusting the 'Max tokens' parameter. The paragraph concludes with the speaker retraining the model with a different model for improved results and demonstrating the interaction with the fine-tuned model.

🚀 Conclusion and Encouragement for Entrepreneurs

The final paragraph wraps up the video by summarizing the process of fine-tuning a GPT3 model with a dataset and emphasizes the importance of understanding this process for entrepreneurs. The speaker encourages viewers to explore the provided resources, such as the Google doc with prompts and code, and to engage with the community for further learning. The speaker also highlights the potential of fine-tuned models to answer specific questions about the dataset and suggests that with more data, the model's understanding can become more flexible. The video concludes with an invitation to subscribe for more content on AI entrepreneurship and a reminder of the ongoing opportunities in the AI industry.

Keywords

💡Fine-tuning

Fine-tuning refers to the process of adapting a pre-trained machine learning model to a specific task by retraining it on a smaller, more focused dataset. In the context of the video, fine-tuning is applied to GPT-3, an advanced language model, to specialize it in understanding and generating content related to NBA player performance data. The script illustrates this by showing how to take a dataset and use it to train the model to produce specific outputs.

💡GPT-3

GPT-3 is a state-of-the-art language model developed by OpenAI, capable of generating human-like text based on the input it receives. The video's theme revolves around customizing GPT-3 through fine-tuning so that it can generate specific insights about NBA players. The script demonstrates the process of interacting with GPT-3 to prepare data and generate prompt-completion pairs for fine-tuning.

💡Prompt-Completion Pairs

Prompt-completion pairs are input-output examples used to train a language model. In the video, these pairs are essential for teaching the fine-tuned GPT-3 model how to respond to specific queries about NBA players. The script includes a method to programmatically generate these pairs from the NBA dataset, showing how to format them for training the model.

💡NBA Players Performance Data

NBA Players Performance Data refers to a dataset containing statistics and information about the performance of basketball players in the National Basketball Association. The video uses this dataset as the basis for fine-tuning GPT-3, aiming to create a model that can provide detailed statistics and summaries about NBA players. The script demonstrates how to manipulate this data to create prompt-completion pairs.

💡CSV Format

CSV stands for Comma-Separated Values, a widely used format for spreadsheets and databases that stores data in plain text with commas separating each field. In the video, the NBA performance data is initially in CSV format, which is then manipulated and used to create prompt-completion pairs for the fine-tuning process. The script shows how to read and process this CSV data.
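Python's standard `csv` module reads this format directly; with `csv.DictReader`, each row becomes a dictionary keyed by the header row, which is what makes the f-string lookups in the generation script possible. A tiny self-contained example with made-up values:

```python
import csv
import io

# A two-line CSV with a header row, inline for illustration
# (the real data would come from the downloaded file).
data = io.StringIO("Player,GP,PTS\nLuka Doncic,66,2138\n")

rows = list(csv.DictReader(data))
# DictReader pairs each value with its column heading
print(rows[0]["Player"], rows[0]["PTS"])  # → Luka Doncic 2138
```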

💡JSON

JSON, or JavaScript Object Notation, is a lightweight data interchange format that is easy for humans to read and write and for machines to parse and generate. In the context of the video, JSON is the format used for the prompt-completion pairs after they are generated from the CSV data. The script includes a step to convert the data into JSONL (JSON Lines), which is the format the fine-tuning process requires.
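The difference is small but matters to the fine-tuning tool: JSON holds all the pairs in one array, while JSONL puts one self-contained JSON object per line. A sketch of the conversion that OpenAI's data-preparation tool performs, with placeholder pairs:

```python
import json

pairs = [
    {"prompt": "Write a summary of Player A's statistics.", "completion": " ..."},
    {"prompt": "Write a summary of Player B's statistics.", "completion": " ..."},
]

# JSON: one array containing every pair
as_json = json.dumps(pairs)

# JSONL: one JSON object per line, no enclosing array
as_jsonl = "\n".join(json.dumps(p) for p in pairs)

print(as_jsonl.count("\n") + 1)  # → 2 (one line per pair)
```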

💡API Key

An API key is a unique code used to authenticate requests to an API (Application Programming Interface). In the video, the presenter mentions obtaining an API key from OpenAI to access their services for fine-tuning the GPT-3 model. The script includes steps to securely use this key within the program to interact with the OpenAI API.
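Rather than pasting the key into source code, as the video does for speed, a common pattern is to read it from an environment variable; `OPENAI_API_KEY` is the conventional name. A sketch, using the legacy pre-1.0 openai SDK's module-level attribute:

```python
import os

# Read the key from the environment instead of hard-coding it,
# so it never ends up in version control or on screen.
api_key = os.environ.get("OPENAI_API_KEY", "")

if not api_key:
    print("Set OPENAI_API_KEY before running, e.g.  export OPENAI_API_KEY=sk-...")

# With the legacy (pre-1.0) openai SDK, the key is set module-wide:
# import openai
# openai.api_key = api_key
```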

💡Entrepreneur

An entrepreneur is an individual who creates a new business, bearing most of the risks and enjoying most of the rewards. The video is aimed at entrepreneurs interested in AI and how to leverage fine-tuning processes to build businesses in the 'AI Gold Rush'. The script emphasizes the importance of understanding the fine-tuning process for entrepreneurial opportunities in AI.

💡AI Gold Rush

The term 'AI Gold Rush' is used in the video to describe the current era of rapid development and commercial opportunity in artificial intelligence. The presenter suggests that understanding and utilizing AI technologies like fine-tuning GPT-3 can provide significant advantages to entrepreneurs in this competitive landscape.

💡Data Manipulation

Data manipulation involves altering or transforming data to fit a specific purpose or format. In the video, data manipulation is shown through the process of cleaning and formatting the NBA player performance data in Google Sheets before converting it into prompt-completion pairs. The script details the steps taken to filter out blank entries and prepare the data for fine-tuning.

💡Python Scripting

Python scripting refers to writing code in the Python programming language to automate tasks or process data. The video demonstrates the use of Python scripting to generate prompt-completion pairs from the CSV data and to interact with the OpenAI API for fine-tuning the GPT-3 model. The script provided is a practical example of Python's capabilities for data manipulation and API interaction.
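The f-string interpolation the video's script relies on is a one-line language feature: prefixing a string with `f` makes anything inside curly brackets evaluate to a value and splice into the string.

```python
row = {"Player": "Luka Doncic", "GP": 66}

# The f prefix makes the {...} placeholders evaluate against the dict.
summary = f"{row['Player']} played {row['GP']} games."

print(summary)  # → Luka Doncic played 66 games.
```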

Highlights

Introduction to a step-by-step video on fine-tuning GPT3 and building an AI startup.

Explanation of the necessity for entrepreneurs to understand fine-tuning processes in the AI gold rush.

Starting with NBA players' performance data set for the fine-tuning example.

Importing and manipulating data in Google Sheets to prepare for fine-tuning.

Downloading the data in CSV format for further processing in Visual Studio Code.

Using ChatGPT to generate a script for creating prompt and completion pairs.

Using string interpolation and f-strings in Python to automate data processing.

Installing and using the Rainbow CSV extension for better CSV visualization.

Running the Python script to generate prompt and completion pairs in JSON format.

Using OpenAI's API key for accessing fine-tuning capabilities on beta.openai.com.

Preparing data in JSONL format for the fine-tuning process.

Initiating the fine-tuning process with a specified base model and data file.

Creating a GUI for interacting with the fine-tuned GPT3 model.

Troubleshooting the GUI for displaying the entire completion from the fine-tuned model.

The importance of retraining with the DaVinci model for better text recognition.

The potential of fine-tuned models to answer specific and complex questions with enough training data.

Final thoughts on the importance of understanding fine-tuning for entrepreneurs in the AI industry.

Resources provided in the video description for further learning and experimentation.

Encouragement for viewers to subscribe and engage with the channel for AI entrepreneurship content.

Transcripts

play00:00

my recent video about how to fine-tune

play00:01

gpt3 and build an AI startup in a few

play00:03

minutes got a lot of questions about the

play00:05

Nitty Gritty details on how you can

play00:06

actually do this fine tuning process

play00:07

rather than try to answer you all in the

play00:09

comments I thought I'd pop back on here

play00:10

and make a quick video showing you the

play00:11

step-by-step process of how to go from

play00:13

data to prompting completions to a fine

play00:14

tune model that you can interact with

play00:16

I'm going to be breaking it down a lot

play00:17

more granularly in this video and going

play00:18

A to Z on how any beginner can come in

play00:20

and use data to fine tune it and get a

play00:22

fine-tuned version of gpt3 to use for

play00:23

their own purposes if you're an

play00:25

entrepreneur and are looking to start

play00:26

making money and building businesses in

play00:27

the AI Gold Rush you need to be aware of

play00:29

how these fine-tuning processes work so

play00:31

this video is exactly what you're going

play00:32

to need to learn to get to a basic level

play00:34

of understanding so that you can move

play00:35

forward and start hiring people and know

play00:36

what the process is and be alert and

play00:38

aware of what's going on so we're going

play00:39

to jump back in here where we left off

play00:40

with the NBA players performance data

play00:42

set I'm going to do this all again so

play00:43

we're starting off by downloading the

play00:44

data set links to this will be in the

play00:45

description below fine-tuning these

play00:47

models is one of the biggest

play00:47

opportunities in business of 2023 and

play00:49

I'd say for the next few years because

play00:51

of the power of these models and being

play00:52

able to build on top of them one of the

play00:54

most powerful skills that you can have

play00:55

as an entrepreneur right now is

play00:56

understanding how this process works so

play00:58

that you can start to see opportunities

play00:59

way you can apply data to these models

play01:01

and build a valuable business so don't

play01:03

go anywhere I'm going to be going step

play01:04

by step explaining this whole process

play01:05

and you're not going to want to miss any

play01:07

of it enough talk let's get stuck into

play01:08

it a quick overview of this process from

play01:10

start to finish to start off we need to

play01:11

find a set of data which in this case

play01:12

we're using the NBA players performance

play01:14

started from my previous video then

play01:15

we're going to take that data and create

play01:16

prompt and completion pairs which is the

play01:18

format that we need to provide to gpt3

play01:20

in order to fine tune it and finally

play01:21

we're going to be able to run a basic

play01:23

app on our computer that is a fine-tuned

play01:25

version of gpt3 that understands the

play01:27

starter that we fit to it to start off

play01:28

we're going to download the NBA player

play01:29

performance data from kaggle the dataset

play01:31

link will be in the description next

play01:32

we're going to import the data to Google

play01:33

Sheets so that we can manipulate it a

play01:35

bit we have an index row here on the

play01:36

left that we can remove

play01:38

now I'm just going to put a filter on

play01:39

this so that we can remove the blanks as

play01:41

I did in the previous video

play01:43

copy this

play01:44

create a new sheet and paste it in now

play01:46

we have it all formatted nicely and

play01:48

Compact and ready to go we're just going

play01:49

to download this data in a CSV format

play01:51

and then open it in Visual Studio code

play01:53

so we can have a look at it now what you

play01:54

need to do is you don't have to visual

play01:55

studio code or your favorite code editor

play01:56

what I'm going to do here it's already

play01:58

opened up but just to help you guys out

play01:59

I'm going to actually close that and

play02:01

then show how you open it so I'm going

play02:02

to go open folder

play02:03

and here is the folder that I've created

play02:05

you need to create a new folder anywhere

play02:07

on your desktop will work and then you

play02:09

need to open that folder by clicking

play02:10

open here and then inside it I've got

play02:12

the I've dragged in the CSV file that we

play02:13

just downloaded and in here's also some

play02:15

of the uh all the data that we download

play02:17

from the kaggle set which we don't

play02:18

really need at the moment because we've

play02:19

got the one that we need which is the

play02:20

scoring then you're going to want to

play02:21

copy the header row and take it over to

play02:23

chat GPT

play02:25

and then we're going to give it just an

play02:26

example row of data so it kind of knows

play02:28

what format works I'll give it a couple

play02:29

actually let's give it four rows of data

play02:31

to have a look at so this is the prompt

play02:33

I'm ready to get started I have a

play02:34

spreadsheet of basketball data these are

play02:35

the column headings in CSV format and

play02:36

I've pasted in the column headings and

play02:38

I've gone here are the rows of data in

play02:39

CSV format and pasted in a few example

play02:40

rows for it to use finally I'm asking it

play02:42

do you understand and I'm hoping after

play02:44

this prompt is sent it's going to say

play02:45

okay I understand what you're trying to

play02:47

do here I've read the CSV and I

play02:48

understand the format of the file great

play02:49

tattoo BT is understood what we've put

play02:51

into it and now we need to ask it for

play02:52

some prompting completion pairs the

play02:54

method I'm going to show you here is

play02:55

actually different to the one I did in

play02:56

my previous video in my last video the

play02:57

way I showed you how to generate these

play02:59

promptly completion pairs using chat GPT

play03:00

is actually not that scalable so I've

play03:02

done a little bit of research and

play03:03

figured out how to get chat GPT to write

play03:04

us a script to generate these pairs and

play03:06

so we can do that with like hundreds of

play03:08

thousands of different pairs so we're

play03:10

going to get stuck into that now I'm

play03:11

telling chat GPT that I want to

play03:12

fine-tune gpt3 using the starter write

play03:15

me a script to create prompted

play03:17

completion pairs within this format and

play03:18

then pasting some examples of the format

play03:19

that I want it back in I'm actually

play03:21

going to insert python in here just to

play03:22

be extra clear okay now after a bunch of

play03:23

missing around and tweaking this just to

play03:25

get it just right so you only have to do

play03:27

it once this is the exact prop that you

play03:29

guys will need to put in I had to really

play03:30

coach it in order to give the right

play03:31

script out the first time so that you

play03:32

guys didn't have to mess around with it

play03:33

so if you're following along head down

play03:35

below to the description I'm going to be

play03:36

pasting these prompts into a Google doc

play03:37

and leaving the share link for you guys

play03:39

to check out down below so if you want

play03:40

to get this entire prompt here that I've

play03:41

had to tweak quite a lot and you can

play03:43

follow along then just kid down below

play03:44

and you can get it off Google doc after

play03:46

submitting that prompt it gave me back

play03:47

the script that I need to get this data

play03:49

into the promptly completion pairs so

play03:50

once you have this code all you need to

play03:52

do is copy it and head back over to vs

play03:53

code right click in here and create a

play03:55

new file and we're going to call it

play03:56

main.pi

play03:58

paste this in save the file and then

play04:01

you're going to want to make sure up

play04:02

here this basketball data.csv is the

play04:04

same name as this so you can just click

play04:06

on here press enter paste it in there

play04:07

and save the file so now we have the

play04:09

name thing all ready to go and then we

play04:11

have it the script all ready to go as

play04:13

well if you're not too familiar with

play04:14

programming and and this basic sort of

play04:16

python scripting I'm going to give you a

play04:17

quick run through of what's Happening

play04:18

Here so that you're not completely blind

play04:19

here we have our CSV data and this is

play04:21

comma separated value so each one of

play04:23

these at the top you have the header and

play04:25

a comma separates it to the next column

play04:26

so this is basically a condensed version

play04:28

of a spreadsheet that a computer can

play04:29

read we have all of the headings up here

play04:30

and they're separated by commas so the

play04:32

computer can read along and say okay

play04:33

this is a column and I've got a uh

play04:36

extension installed on this computer

play04:38

that allows me to see these a lot more

play04:39

easily I'm actually just going to show

play04:40

you that now

play04:42

and here it is it's called rainbow CSV

play04:44

if you just install this quickly then

play04:45

it's going to help you visualize the CSV

play04:47

a lot more easily like I am here so it's

play04:49

pretty straightforward we have the

play04:50

header row and then underneath it is

play04:51

each of the data points for that same as

play04:53

a spreadsheet but it's just not

play04:54

formatted as nicely so by using a method

play04:56

called string interpolation and F

play04:57

strings as you can see here this f means

play04:59

that anything that you put inside of

play05:01

these curly brackets here is going to be

play05:03

uh the value of this player here is

play05:06

index player so this is referencing the

play05:08

player column here right at the end so

play05:09

this player one when you see that key

play05:11

player it's referencing that in the

play05:13

script here so for every row of data

play05:15

that we have coming along here it's

play05:17

going to take this player column which

play05:19

is the green thing here and it's going

play05:21

to say okay I'm gonna because we're on

play05:22

this row and it's going to work down

play05:24

every single row through the whole sheet

play05:25

it's going to take the player value

play05:27

there

play05:28

and write the prompt with it write a

play05:30

summary of player values statistics and

play05:32

then it's going to start building the

play05:33

completion for that row and once again

play05:35

it inserts the player's name and it says

play05:36

player name played games played games

play05:39

starting game started of them so it

play05:42

writes a big long sentence and summary

play05:44

of what the player's data is and every

play05:46

time it comes to one of these blocks

play05:47

here which has got the curly brackets

play05:49

it's reaching into this file and

play05:50

plucking out the right value as the

play05:52

script has made it then all it needs to

play05:53

do is append these prompting completion

play05:55

pairs together into the format that we

play05:56

asked it for back in chat GPT here which

play05:59

is this format

play06:00

and then it just saves it to a Json and

play06:02

dumps it out for us to look at if you're

play06:03

following along I'm assuming you've got

play06:04

python installed on your machine if not

play06:06

head to their website and you can

play06:08

download the latest release of python

play06:09

now that you understand the script let's

play06:10

actually run it and then see what we get

play06:12

the command to run a script and platform

play06:13

is Python and then the name of the

play06:15

script so this case is main.pi and hit

play06:17

enter and then just like that we have a

play06:19

prompt and completion peers.json which

play06:21

looks like a complete mess but it is a

play06:23

ton of data all formatted exactly how we

play06:24

wanted it if you'd like to see it a

play06:25

little bit more pretty what we can do

play06:27

here is type our pritia

play06:30

and this one here if you just install

play06:32

this quickly and then you head back to

play06:34

your uh prompt and completion pairs you

play06:36

can press option shift and if and then

play06:39

it will format it all up nicely like

play06:41

this it looks really good and actually

play06:42

color codes and understand it as Json so

play06:44

you're actually probably going to need

play06:45

to do this so head over grab that

play06:46

extension and come back and press option

play06:48

shift and F now we can see the result of

play06:50

our hard work so now if every player on

play06:51

that spreadsheet that we started with we

play06:53

have the prompt which is write a summary

play06:54

of Luca don't shoot your statistics and

play06:55

in the completion which uses check TPT

play06:57

summary structure and then it's simply

play07:00

done it programmatically and use string

play07:01

installation which is a python feature

play07:03

to pluck data out of that CSV file and

play07:06

put it into the correct place to create

play07:08

this completion I've just taken all of

play07:09

the code out of the main.pi and put it

play07:10

into a new file called generate.pi and

play07:12

saved it so that we can play around in

play07:13

the main.pi file for the next step

play07:15

another fun part of taking all of these

play07:17

pairs and funneling them into gpt3 and

play07:19

fine-tuning it begins to get started

play07:21

you're going to want to head to

play07:22

beta.openai.com and you're going to want

play07:24

to head over to your personal section

play07:25

and view your API key to create an

play07:27

account if you haven't already it's free

play07:28

and then you're going to want to create

play07:30

a new API key here I am just creating a

play07:32

new one now you can see all that on

play07:34

screen but I'm going to delete it you

play07:35

cheeky guys in the in the comments I'm

play07:37

just going to put that there to save it

play07:38

for later now we're on the documentation

play07:39

page for fine tuning bio open AI so

play07:41

we're going to head down to the

play07:42

installation we're going to copy this

play07:44

and head over to our terminal and paste

play07:46

it in

play07:47

I've already got everything downloaded

play07:49

of course so the requirements are

play07:50

already satisfied for me but it should

play07:52

start installing and show a progress bar

play07:53

for you now you need to copy this export

play07:54

string and just bring it back over to

play07:56

your main.pi file to copy this entire

play07:58

API key and then paste it within these

play08:01

quotations

play08:03

now we need to copy this entire thing

play08:04

hit back and paste this in and this

play08:07

means that it's worked we've exported

play08:09

the open AI key so now it's ready to use

play08:10

for later next we're going to have to

play08:11

prepare our data so you can copy this

play08:13

head over to your terminal and delete

play08:15

all this and this local file means we

play08:18

need to reference our content completion

play08:19

file and what we can do is come over

play08:21

here copy all of this head back paste it

play08:25

in there and hit enter I've just taught

play08:27

it to prepare my data and it says here

play08:28

your file appears to be in Json will be

play08:31

converted to Json L which is the format

play08:32

it needs your file contains 250 prompted

play08:34

completion pairs which is a pretty good

play08:36

starting little batch of data we have

play08:38

here now it gives you me a whole bunch

play08:39

of tips and tricks on how it can make

play08:42

the data better and get better results

play08:43

out of it so you probably want to have a

play08:44

read through this whenever you do this

play08:45

again and uh follow all of the

play08:47

instructions here because it's going to

play08:48

make your model a whole lot better for

play08:50

all of these pairs we should really have

play08:51

a suffix on it that is really unique and

play08:53

like a bunch of slashes and hashtags and

play08:55

stuff too there's a few things here like

play08:57

starting all of your completions with

play08:58

the white space character using a unique

play09:00

ending like pound signs on the end of

play09:02

your completions all of these are really

play09:04

important to do we don't have time in

play09:05

the scope of this video if you'd like a

play09:06

little bit more on that I can shoot a

play09:08

quick loom for you guys and put it in

play09:09

the in the comment section below but for

play09:11

all purposes of this video we're good to

play09:13

go and we can hit in on this and it's

play09:14

going to convert it from Json into Json

play09:16

l

play09:18

and again add a white space character to

play09:19

the beginning of the completion it's

play09:20

going to do it for us which is great

play09:22

and yes

play09:25

and just like that we have it all made

play09:26

up into Json L format ready to put into

play09:29

the fine tuning process now we can

play09:30

actually fine tune our model uh you need

play09:32

to head over here and copy this and note

play09:34

that you can change the name of the base

play09:35

model that you're starting from so I've

play09:38

got this put in here open AI fine

play09:39

tunes.create I've put Curie at the end

play09:41

to specify the model that we want to use

play09:42

and now I also need to put in the path

play09:44

to the file which is all the data that

play09:46

it's going to use to fine-tune so I've

play09:48

got to go over here copy this entire

play09:50

thing including the suffix and then

play09:53

paste this in here and then hit enter so
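Put together, the two CLI steps look roughly like this. This is a sketch based on the legacy `openai` command-line tool (pre-1.0 `openai` Python package); the filenames are placeholders, and the `OPENAI_API_KEY` environment variable must be set for the create step:

```shell
# Analyze and reformat the raw data, producing players_prepared.jsonl
# along with the formatting tips discussed above
openai tools fine_tunes.prepare_data -f players.json

# Start a fine-tune from the Curie base model using the prepared file
openai api fine_tunes.create -t players_prepared.jsonl -m curie
```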

play09:55

What this does is upload all of that data, put you in the queue to fine-tune, and then run the data through OpenAI's fine-tuning method; the result is a fine-tuned version of GPT-3 that is familiar with all of this basketball information. In just a few minutes our fine-tuned model is complete, and down at the bottom of the screen you can see the model's name; you'll want to copy that and save it for later.

play10:14

Now we need to head back to ChatGPT to get a graphical user interface, or GUI, so we can interact with the model. To make things super easy, I'm just going to grab the GUI script from the previous video I did. It will be available in the Google Doc linked in the description, so head over there and grab it; it has all the code you need to run a basic GUI for interacting with your fine-tuned model. Heading back to the code, we take the name of our model, cut it out of there, and paste it between these quotation marks. Before you try to start the app, make sure you save your main.py file; then run `python main.py`, and just like that we have a fine-tuned GPT-3 window up and we can start giving it prompts.

play10:50

If I paste in one of these prompts here, Jayson Tatum's statistics, it gives me the beginning of the completion. It seems to be having an issue, though, where it's not writing out the entire completion. I'm not sure whether that's an issue with my API key, a limit on the API request, or an issue within the GUI itself, but I'm going to dig into it over the next couple of hours and get back to you in a bit. Hey guys, I just took a little look into why it's not completing the rest, and it turns out to be a pretty simple fix in the completion line here.

play11:19

In the `openai.Completion.create` call there's a parameter, `max_tokens`, that is usually set to 10 or so, just to limit how much you're charged through the API; it's a built-in safety feature to stop you spending too much money. All you need to do is add `max_tokens=150` to the call; 150 is about right for what we need. Then I just run the Python program again.
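The fix can be sketched like this. The model name and prompt below are placeholders, and the actual API call (commented out) requires the pre-1.0 `openai` package and a valid `OPENAI_API_KEY`:

```python
# Parameters for the completion request made by the GUI script
params = {
    "model": "curie:ft-personal-2023-01-25",  # placeholder; use your fine-tuned model's name
    "prompt": "Jayson Tatum statistics ->",
    "max_tokens": 150,  # raise the small default cap so the full completion is returned
    "stop": ["END"],    # stop at the unique suffix added during data preparation
}

# With the openai package installed and an API key set, the call would be:
# import openai
# response = openai.Completion.create(**params)
# print(response["choices"][0]["text"])
```

The `stop` value here assumes the completions were suffixed with `END` during preparation; use whatever stop sequence your prepared data actually ends with.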

play11:42

Bring it over. Another thing to note: in the preparation process, I didn't notice it at the time, but the tool trimmed out the "Write a summary of" part. As you can see here, the prompt says "Write a summary of Giannis Antetokounmpo's statistics", but because that phrase was shared across all of the different prompts, the tool cut it out during preparation, so all our prompts really consist of is the player's name with "statistics" after it. So I take that over to our app, put it in, and let it load for a bit.
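What the prepare tool did there, stripping a prefix shared by every prompt, can be demonstrated in a couple of lines. This is a simplified sketch of the idea, not the tool's actual implementation:

```python
import os

prompts = [
    "Write a summary of Jayson Tatum statistics",
    "Write a summary of Giannis Antetokounmpo statistics",
    "Write a summary of Karl-Anthony Towns statistics",
]

# A prefix shared by every prompt carries no information for the model,
# so the preparation tool removes it
shared = os.path.commonprefix(prompts)
trimmed = [p[len(shared):] for p in prompts]
print(shared)   # "Write a summary of "
print(trimmed)  # ["Jayson Tatum statistics", ...]
```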

play12:08

Off camera, I've actually retrained this on a Davinci model, just because Davinci is a lot better at recognizing what text you want; so if you trained with Curie, make sure you go back and retrain with Davinci. It cost me about three bucks to get it retrained, but it was definitely worth it. As you can see here, we've got the entire completion we expected, and it's actually started giving us information on Karl-Anthony Towns and his statistics too. I'm not 100% sure why it's continuing down the list there, but we got our result: the entire printout of Giannis's statistics. That's the end of this little detour; thanks for sticking around, and now back to the video.

play12:37

For the purposes of this video, I've shown you from start to finish how to take your data, prepare it, turn it into prompt-completion pairs, and finally fine-tune your own version of GPT-3 so you can start interacting with it. Of course, this is an extremely basic example, and the understanding this fine-tuned GPT-3 model has of the basketball topic is very limited; you'd probably need to give it thousands and thousands more variations of these prompts. Once you've given it enough data, its understanding of the topic becomes flexible, and you'd be able to ask it basically any question, even fairly specific ones like "Which three players on the XYZ team have the highest field goal percentage?" That's the kind of thing you'd eventually be able to get to, given enough training data.
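One simple way to multiply the training data is to generate several phrasings of the same question for each player. The templates below are invented for illustration, not taken from the video:

```python
# Generate several phrasings of the same question for each player,
# so the fine-tuned model learns the topic rather than one exact wording
templates = [
    "{name} statistics",
    "How did {name} perform this season?",
    "Give me {name}'s season averages",
    "Summarize {name}'s stats",
]

players = ["Jayson Tatum", "Giannis Antetokounmpo"]

# 2 players x 4 templates = 8 prompt variations
variations = [t.format(name=p) for p in players for t in templates]
```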

play13:16

So that's all for the video, guys. I've shown you how to go from start to finish in fine-tuning a model with a bit of data. If you have any questions about this, or you got stuck, or something's not working on your computer, drop it down below in the comments; either I'll help you or someone else in the community will. The important thing about learning this process is that, as an entrepreneur, you need to understand that this is what's going on behind the scenes at a lot of the startups you're seeing. Just spending half an hour or an hour trying to understand this process is going to put you leagues ahead of other entrepreneurs and other people trying to make money in this AI gold rush, because you'll understand the underlying technicalities of how these models are being created and fine-tuned. With an understanding of this process, you'll be keeping a close eye out for data sources and understanding how you can get a data source and integrate it into a GPT-3 model, or a GPT-4 model, which is coming very soon.

play13:56

I hope you got something out of this. Remember that down in the description I'm going to have a Google Doc with all of the prompts I sent to ChatGPT and also the code, so it should be pretty straightforward for you guys to have a play around with it. I really hope you enjoyed it and got something out of it. If you like content like this: my name is Liam Ottley, and I'm a self-made serial entrepreneur from New Zealand, now living in Dubai. I make AI-entrepreneurship-focused content at least three times a week for aspiring and established entrepreneurs looking to get into the AI industry and make money in this AI gold rush that's happening right in front of us. If that kind of stuff sounds interesting to you, hit subscribe down below and hit the bell so you don't miss my next one. If you got something out of this video, please drop a like, it really helps my channel a lot, and of course leave your comments down below; I'll be answering as many as I can. That's all for today, thank you so much for watching, and the best of luck to you as you navigate this AI gold rush. I'll see you next time.
