Finetuning Flux Dev on a 3090! (Local LoRA Training)

promptcrafted
11 Aug 2024 · 19:24

Summary

TLDR: In this video, the speaker explains how to train the Flux Dev model using Ostris's AI Toolkit. He stresses the importance of having 24 GB of VRAM and shares detailed steps for setting up the working folder, preparing the YAML file, and installing the dependencies. Viewers learn how to configure hyperparameters and datasets and how to generate test images. The presenter also explains the model's learning cycles and gives advice on avoiding overfitting. He concludes by noting the importance of patience and experimentation when training models.

Takeaways

  • 💻 Make sure you have at least 24 GB of VRAM, ideally on a graphics card such as the 3090 Ti, to run Ostris's AI Toolkit.
  • 📂 Organize your files into dedicated directories, such as a 'work' directory with a 'train' subfolder that gathers everything the training needs.
  • ⌨️ Follow the setup steps line by line to install the dependencies from the command prompt.
  • 📑 The YAML file is crucial for hyperparameter configuration; copy the example 'train_lora_flux_24gb.yaml' and modify it for your run (see the config sketch after this list).
  • 🖼️ Prepare your dataset properly, with a matching text caption file for each image, and configure regular safetensors checkpoints during training.
  • 🎨 The model can be trained on different styles (here, an anime style), with regular checkpoints to evaluate the generated images and adjust the parameters.
  • ⚙️ The tool accepts images of different sizes thanks to its bucketing feature, which makes the training process more efficient.
  • 🚀 Enable 'shuffle tokens' and 'gradient checkpointing' to optimize training within limited VRAM.
  • 🔑 Set up a `.env` file with your Hugging Face token so the tool can download the models it needs for training.
  • ⏳ Training can take around 4 hours, and you will see cycles of learning and regression along the way, which may call for several adjustments.
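
For reference, here is a minimal, abridged sketch of the edited config file, using the field names from ai-toolkit's 'train_lora_flux_24gb.yaml' example as described in the video. The exact keys and values shown are illustrative assumptions, not a verbatim copy of the author's file:

```yaml
# Abridged sketch of config/fantasma_anime.yaml; values are illustrative.
config:
  name: "fantasma_anime"            # name of the final .safetensors file
  process:
    - type: "sd_trainer"
      training_folder: "output/fantasma_anime"  # hypothetical output subfolder
      performance_log_every: 100    # uncommented in the video to get stats
      device: "cuda:0"              # left unchanged
      trigger_word: "fantasma"      # hypothetical; its effect is still being tested
      network:
        type: "lora"
        linear: 32                  # rank
        linear_alpha: 32            # kept equal to the rank
      save:
        dtype: "float16"
        save_every: 200             # write a checkpoint every 200 steps
        max_step_saves_to_keep: 4   # keep only the four newest files
      train:
        steps: 3000                 # reduced from the suggested 4000
        train_text_encoder: false   # the text encoder is not trained for Flux
      sample:
        sample_every: 200           # kept in step with save_every
```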

Q & A

  • What hardware is required to train Flux Dev with Ostris's AI Toolkit?

    -A graphics card with 24 GB of VRAM, such as the 3090 Ti, is recommended for the training process to run properly on a local machine.

  • Why does the user need to configure a YAML file?

    -The YAML file contains the hyperparameters needed for training, such as the model name, the output folders, and the performance settings. It is essential for customizing and optimizing the training process.

  • What are the main hyperparameters to adjust in the YAML file?

    -The main hyperparameters include the name of the final file, the output folder, the network rank and alpha (the 'linear' and 'linear_alpha' values), how often safetensors checkpoints are saved, and the dataset configuration, among others.

  • What is the benefit of using images of different sizes during training?

    -Using images of different sizes exposes the model to varied aspect ratios, which improves its flexibility and its ability to generate images at several resolutions, making the training more effective.
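
As a concrete illustration of this multi-resolution setup, the dataset section of the YAML might look like the following sketch; the folder path is hypothetical, and the key names follow ai-toolkit's example config:

```yaml
datasets:
  - folder_path: "train/dataset"    # images plus same-named .txt caption files
    caption_ext: "txt"
    shuffle_tokens: true            # the option the author enables in the video
    cache_latents_to_disk: true     # kept on, as recommended
    resolution: [512, 768, 1024]    # images are bucketed by size; no cropping needed
```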

  • Why does the user mention enabling 'gradient checkpointing'?

    -Enabling 'gradient checkpointing' works around VRAM limits by saving memory during training, which is useful for users with limited resources.
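
A minimal sketch of the two memory-related switches discussed, assuming ai-toolkit's key names and placement; 'low_vram' is the separate mode the author found unnecessary at 24 GB:

```yaml
train:
  gradient_checkpointing: true  # recompute activations during backprop to save VRAM
model:
  low_vram: false               # not needed on a 24 GB card in this run
```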

  • Why does the user skip some sampling steps at the start of training?

    -He skips the initial samples to avoid wasting time and resources, since those samples are not essential in the earliest stages of training. He prefers to generate samples at regular intervals later on.

  • How does the user manage disk space during training?

    -He configures the system to keep a limited number of safetensors files (four in this example), deleting the oldest ones to save disk space.
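
    For example, with a checkpoint written every 200 steps and a limit of four files, the files on disk at step 1,000 would be those from steps 400, 600, 800, and 1,000; the step-200 file is deleted when the fifth checkpoint is written.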

  • Why is it important to configure a Hugging Face token in this process?

    -The Hugging Face token is required to access the models and external resources hosted on Hugging Face, which is essential for downloading the pretrained models the training relies on.
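
    Concretely, this is a one-line plain-text file named `.env` in the ai-toolkit root folder containing `HF_TOKEN=<your read token>`; the placeholder here is illustrative, and the actual token comes from your Hugging Face account settings.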

  • What steps should be followed to resume an interrupted training run?

    -An interrupted run can be resumed automatically from the last saved checkpoint; this option can be configured in the YAML file.

  • What challenges are mentioned about the model's learning cycles?

    -The user observes that the model goes through cycles of learning and regression, picking up and then forgetting certain characteristics. He recommends not stopping the training too early, to avoid losing nuance.

Outlines

00:00

🖥️ Introduction to training Flux Dev with Ostris's AI Toolkit

This section introduces the process of training Flux Dev with Ostris's AI Toolkit. It stresses the need for 24 GB of VRAM, noting that a 3090 Ti is used for this project. The author shares his preparation, including a working directory with a 'train' folder, then walks through the setup line by line and mentions the other tools available. He pauses the slower steps to keep the walkthrough smoother for viewers.

05:02

📝 Configuring and editing the YAML file for training

This section explains how to configure the YAML file for training. The author covers the important settings, such as changing the file name and the output folders, and adjusting hyperparameters such as the rank and alpha for better results. Other adjustments include how often the safetensors checkpoints are saved and how the dataset is prepared for training. The section closes with enabling features such as token shuffling and multiple image resolutions.

10:03

🔐 Setting up the Hugging Face key for training

This section explains how to set up a Hugging Face key so the toolkit can download the models required for training. The author shows how to create a '.env' file in the root folder of the AI Toolkit and paste in the Hugging Face secret token, which lets the environment download and use the pretrained models. He adds a note about the adjustment needed on Windows for the training command to run correctly.

15:04

🚀 Starting the training run and tracking its progress

This section describes starting the training run once the environment is configured. The author explains that he had already downloaded the required models, avoiding a long initial wait. He highlights the value of training on multiple image sizes and resolutions to improve the model's flexibility. Finally, he shows the progress after 200 steps, noting that although the model is beginning to adopt the visual style, it still has a long way to go.

🎨 Reviewing intermediate results and training advice

In this final section, the author shows the results after 600 training steps, highlighting the marked evolution of the visual style. He recommends letting training continue for several thousand steps to refine the results, while watching for overfitting. He closes by promising a more detailed follow-up video analyzing the learning cycles with side-by-side comparisons, and notes that the model will be released in a few days, once final testing is done.

Keywords

💡VRAM

VRAM (Video Random Access Memory) is the memory on a graphics card that holds temporary graphics data, and it is essential for compute-heavy tasks such as training AI models. In the video, the author explains that he uses a graphics card with 24 GB of VRAM (a 3090 Ti), since that much memory is needed to get through the training steps without running out of resources.

💡Command prompt

The command prompt is a command-line tool for interacting directly with the operating system by typing instructions. In the video, the author opens a command prompt window to navigate to the right working directory and run the setup commands step by step for training the AI model.

💡Dependencies

Dependencies are software libraries or modules that a program needs in order to run. The author stresses the importance of installing all the dependencies before starting training, since they keep the whole process running smoothly. He also notes that some dependencies take longer to install than others.

💡YAML file

A YAML file ('YAML Ain't Markup Language') is a configuration file commonly used to define parameters in software projects, including AI training. In this video, the author shows how to customize a YAML file to adjust key hyperparameters such as the network rank, the output directory, and how often results are saved during training.
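
As a minimal illustration of the key/value syntax involved, using field names mentioned in the video (values are examples, not prescriptions):

```yaml
linear: 32        # LoRA rank
linear_alpha: 32  # usually kept equal to the rank
save_every: 200   # checkpoint interval in steps
```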

💡Hyperparameters

Hyperparameters are settings chosen before training an AI model rather than learned from the data. They control aspects of the training process such as batch size, learning rate, and how often results are saved. The author adjusts hyperparameters such as the rank and alpha to optimize his 'fantasma anime' style model.

💡SDXL model

SDXL is an image-generation AI model that can be fine-tuned on specific datasets. In the video, the author mentions that he previously trained a LoRA for SDXL and draws on that experience to train a new model based on a fantasma anime style. He also explains that this toolkit overcomes some of SDXL's limitations, such as the ability to handle multiple image resolutions.

💡Training a model

Training a model refers to the process of teaching an AI model from example data. In the video, the author goes through several steps to train a model on anime-styled image data, adjusting parameters and reviewing results at regular intervals. He mentions that training takes about 4 hours and that he checks intermediate results regularly so he can adjust settings if needed.

💡Trigger word

A 'trigger word' is a keyword used to activate a particular behavior or steer content generation in an AI model. The author sets a trigger word in his YAML file but is not sure of its precise impact on the resulting model; he continues to test the feature to see what difference it makes.

💡Shuffle tokens

'Shuffle tokens' is an option that shuffles the tokens, or pieces of the input captions, in order to improve training quality. In the video, the author decides to enable this option, curious to see its impact on his model; he hopes it may improve the diversity of the generated outputs.

💡Hugging Face

Hugging Face is a popular platform for sharing AI models and pretrained weights. The author uses Hugging Face to download the pretrained models required to train his own, and explains how to set up a '.env' file with a Hugging Face access token so the models can be downloaded directly.

Highlights

Ensure you have 24GB of VRAM, preferably using a 3090 Ti for optimal training performance.

Set up a working directory and populate it with necessary files, using a dedicated 'train' folder.

Open command prompt and navigate to the right directory to start setting up the training environment.

Edit the YAML file located in the 'config' folder to set hyperparameters and customize training settings.

Change key YAML settings such as the output folder, the rank and alpha (linear and linear_alpha), and the sample-image generation interval.

Set up the environment by adding your Hugging Face token for model access, saved in the '.env' file.

Adjust the Python command for compatibility with your system, such as changing the 'python3' call to 'python' on Windows.
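
Based on the video, the adjusted launch command on Windows looks something like `python run.py config/fantasma_anime.yaml`; the script and config names here are illustrative, and the stock instructions use `python3`, which some Windows setups do not recognize.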

Ensure bucket resizing is enabled to accommodate images of different aspect ratios during training.

Monitor initial steps closely, as the model starts to adapt to the anime-style visuals within 200 steps.

At 600 steps, the model demonstrates significant progress in learning style and coloration.

Flux models can learn and regress in cycles, so don't stop the training too early or you may lose nuance.

For style models, note that training can extend to 3,400 steps or more without overfitting; under-training is the more common mistake.

Use the checkpoint-saving feature wisely, setting up sample generations every 200 steps to monitor progress.

Flexibility with different sizes and aspect ratios improves the overall quality of the training outputs.

The entire training process may take about 4 hours, and results should be carefully tested afterward.

Transcripts

play00:00
Okay, today we're going to talk about how to train Flux Dev using Ostris's AI Toolkit. Just make sure you can meet the requirements here. It's really important that you have 24 gigs of VRAM; I'm running this on a 3090 Ti, so obviously it can be handled pretty well on a local machine, but it can't be handled on a local machine that doesn't have enough VRAM. There are a lot of cool tools here too, but we're just going to focus on this one for today.

play00:35
Okay, so next we're going to open up Command Prompt, as you can see here, and then we just have to point it at the right directory, so we're going to go through that step by step. First, I have everything in a "work" directory, which is sort of my overarching one, and then I've set up a folder for this called "train" that I'm going to populate and pull everything into, as you can see here. Then we're just going to go through line by line. There might be a better way to do this; I'm not really a coder, so I'm just going to do it in the way that's most intuitive for me. I'm just working through each of these commands, and some of them might take a little longer, in which case I'll pause and come back to it. It won't look like anything happened, but that's maybe a little easier on you guys, so I'm probably going to do that for this one. Let's see; yeah, this one's taking a little more time, but all in all it goes pretty fast. These last ones are going to take some time, so I'll definitely be pausing during those. I'm going to paste them in anyway; again, there might be a better way to do that, sorry if there is.

play02:05
All right, so here we go. We're going to let these dependencies load, and yep, I'm just speeding forward to the future and running this as well, letting it all come together. Now we have everything pulling in, so that is good. It's obviously important to make sure all these different dependencies are set up the correct way. Now it should be ready to get started, but there are a couple more things we have to do to set up the training, so we're going to do that next. In particular, we're going to focus on setting up the YAML file with the settings.

play02:52
Okay, now we're going to set up that YAML file, which has really all of the detail we need to give in terms of hyperparameters. You're going to find it in the config folder; there's an example called train_lora_flux_24gb. You want to copy that one and make a new version of it, which is great, and I'm just going to rename it. I really wanted it to reflect what I'm training: I'm training a fantasma anime LoRA, which is based on a LoRA I've trained previously for SDXL, so we're going to call it that. Then we're actually going to put it into the config folder, so let me just move these (oops, a little too far) and drop it right in. Great. We're going to go ahead and edit it, going through line by line.

play04:00
First and foremost, I'm probably going to get rid of some of this extra comment text, but it might be useful for you. You do need to change the name to the file name you want, so go ahead and do that first. I could call it the same thing, but in this case I'm calling it fantasma anime, and that's going to be the name of the final safetensors file. We're going to delete some more of this text; some things we don't really want to change either, so let's go through here and delete that.

play04:31
I want to change my output folder: "output" is a subfolder, and I'm just adding a subfolder inside it, which it will populate. I'm going to uncomment this line just so we get the performance stats later; I like them, but it's up to you whether you keep them or not. We're keeping the CUDA device the same, and we're going to change the trigger word. In this case I don't know whether the trigger word makes a difference or not; I haven't actually seen a massive difference, but the model definitely does recognize the trigger word, so it's worth investigating.

play05:12
I'm going to change these: linear and linear_alpha are your rank and dimension, so I'm changing them both to 32; it's good to keep them the same number. float16 is fine. I'm going to do every 200 steps here: basically I want the safetensors file to save, and some sample images to generate, every 200 steps. I'm fine with four here; that basically means that every time there are more than four safetensors files, it's going to delete the oldest.

play05:53
Scrolling down a little: yep, you want to make sure that you prep your dataset correctly, so this is really good information to have. Basically, you just need the text prompt captions to be in a text file with the same file name as the image for them to be recognized; that's really standard, the same as most training programs. I set up a folder for this that I want it to pull from, and that's fine.

play06:25
A lot of this stuff we don't really need to change, because it's kind of inherent to the training, unless you're super advanced or just trying to figure some things out. I am going to turn on shuffle tokens; I actually didn't try this last time, but I'm really interested in trying it this time, so I'm going to do that. Yep, that is useful, and something you want to keep the same: cache latents to disk. Multiple resolutions: great. It actually solves a lot of the issues they had with SDXL, which could only train at one size; one thing I like about this trainer is that you can train at different sizes. As for the suggested step count, I don't think 4,000 is needed for this one; I'm going to go for 3,000, but I do think people under-train this model quite a bit. No text encoder training for this; I haven't really seen any negative effects from that, as you would have with SDXL. Let's see, I'm going to keep it balanced there. Yeah, clean that up. Great.

play07:42
Gradient checkpointing: yeah, I do not have a ton of VRAM, so I'm going to turn that on. Noise scheduler: yep, keep all of that the same. I'm not messing with the learning rate right now, but you could try out different learning rates. I do want to skip the first samples, because they take up a lot of time at the beginning and I don't necessarily need them; they're good, though, if you're trying to gauge how far your training has come. So I'm going to do that, and we'll keep the rest the same. Let's see here: all of this looks fine, and it's all important to keep. I actually found that I didn't need the low-VRAM mode; even though I'm right at 24 gigs, it was working just fine, so I'm not turning that on. You can, though, if you're concerned. Also, you can restart from your last training stop if you need to.

play08:50
This is the one I want at every 200 steps, just like my save; I don't want my sampling to be off balance with my other settings. Then I'm going to keep these as the prompts; I do want some prompts. Again, I just don't know whether the trigger makes a difference at this stage. It very well might, but I haven't seen a significant difference one way or another; I've seen a little difference, but nothing super significant. I'll keep the trigger in a couple of these, because it'll be nice to see the difference and compare them. Everything else is just about the same. I'll add my name to the metadata, but otherwise this is very straightforward. I might change the prompts for the sample images if I were training a character or something like that; for this, I'll keep them the same. So go ahead here; yeah, this looks good. We'll go ahead and save this, and this one's good to go.

play09:50
Okay, there is one more really important step we need to take before we really dive in: we need to set things up so we can pull from Hugging Face. We're going to do these next steps in a second, but you need to be able to call from Hugging Face, so I'm going to show you how to set this up. Obviously I'm not going to show you my secret key, but you just want to create a new file, put it into the root, which is the ai-toolkit folder, and save it as ".env" under "All files"; that's all you need to call it. Then you take this right here, "HF_TOKEN equals your key", open that file up, drop it in, and replace the end bit with a read token from Hugging Face; I'm not going to show you mine, obviously. That will do it. Then you just save it, and since it's already in the root, it will be able to pull the models for you.

play11:11
Okay, now we're going to set up our model to start training. I did have to change a couple of things here, so I'm just going to show you what I did. This is all set up and we have an environment set, so we're in good shape, but I'm just going to open a notepad so we can look at this and make some small adjustments, because I noticed (maybe it's just a Windows thing) that I had to change the command a little for it to run. Let's open an empty one; I've got a lot of these open right now. We do want this config name, by the way, for the call we're going to make, so that's good to know. Drop this in: we're going to get rid of the "3" after "python", because for some reason it doesn't get recognized, and in this case I need to change the extension to "yaml". Then you just put in the name of the YAML file you made before, and you're good to go. It's pretty straightforward.

play12:23
Now, this is where you would hit run. I actually have it set up separately as an instance, because I had been training earlier and I just don't want to run through all of the model loading and downloading again; it takes a while. When you run it for the first time, it's going to take a little bit of time, but don't worry about it. I'm going to show you my model setup right from the start, just explaining a little so you're not confused.

play12:52
All righty then. You can see my fresh environment here: I set it up, I activated the virtual environment, I dropped in that command we had just put together, and now it is starting everything up. Normally, if you did this, it would be a very long initial process, because it's going to be pulling in the Flux model and all of the other models it needs to be able to train. In this case I'm saving myself a bit of a headache, because I've already done that once before, so I'm really just starting from there; it does download some shard files and such, but generally most of what I need is already set up. So it's loading the checkpoint pieces and starting to put everything together. We'll watch this for a moment, but obviously it's just generally running through the start-up.

play13:47
Oh, one nice thing is that it does use buckets of different sizes, so you can use images of different sizes, which I think is super awesome, and I love that Ostris has included it; definitely take advantage of that. I'm so happy that I don't have to crop every image, and this is how some of these newer models are trained to begin with, so being able to show images of different aspect ratios makes a really big difference in the overall training process.

play14:15
Other than that, I'm going to give it a little pause, and here we go: it has started, and these are all going to save as well. Here you can see the actual dataset it's pulling from. I should really show you what we're looking at in terms of images, so if you take a look here, I'll just adjust this and make these extra large. This is the style I'm training, just to give you a little bit of a point of reference: kind of an anime style. I really like how anime is done on Flux, with a lot of color-explosive elements. So we're going to jump ahead here and take a look at 200 steps in, because I think this is a great visualization of how quickly things train and also of where we're at.

play15:09
As you can see, we are generating some example images here, so I'll pop into those in a second to show you; they're in that folder we were talking about earlier. Let me pull those up. We did put them right into the output folder we looked at a little earlier, and it actually populated the folders we need, so now we can see them in "samples". This is our first stop: you can see it's starting to pick up a little bit of that cartoony style, some of the coloration, certainly the clouds, and the explosions as well, but it hasn't really learned much yet. It's just a good example of how quickly Flux does learn. With that in mind, it still has quite a way to go, and I do find that people stop a little too early at times, so I'm going to show you one more stop, and then I probably won't go all the way to the end; I'll do a second video for that part.

play16:18
Okay, now we're jumping further into the future: this is at 600 steps, so you can see it's already generated a few different sets of samples. What I really like to highlight here is how much we've adopted the style, but I typically let this run to 3,000, because it does go through cycles, so I definitely recommend taking your time with it. Don't get too ahead of yourself, because you can stop earlier than you should and find that you've actually lost a lot of nuance. I'm going to do a second, and probably nicer, video, because I was just trying to get this one out quickly, but basically what I've observed is that it goes through these cycles of learning and regressing. I've done a few trainings now that stop at different steps, and I've watched how it seems to pick up a lot of superficial details early on, then starts to look a little bit overfit, then kind of forgets a bunch of details, and then does the same thing over and over again. And actually, with the help of Ostris, I was able to get a training all the way to 3,400 steps, and it was still looking really nice; it wasn't overfit. That was for a character, so I think a style would actually be even more flexible in some ways. Just keep in mind that you can go pretty far with it.

play17:51
Anyway, you'll find the files here, and they do get saved at however many steps you've set in your YAML; mine is 200. Once it gets to five, because I set four as the total limit, it'll delete one of the older ones, which is a nice way to save on space as well. I had thought about taking this video all the way to the end, but I do think that observations on training deserve an extra, focused video with some side-by-side examples, so I'm going to work on that. It'll probably be better than this one, because I don't usually edit the videos; it's usually Timothy, who's on the promptcrafted team, who does a lot of the video editing. But I really wanted to get this out quickly, because I know people are trying to pick up training as quickly as possible, and I thought it would be nice to just have a little guide. You should be able to adapt this guide in most ways.

play18:45
Anyway, I appreciate you guys following along, and I'm excited to train more. As you can see, we're only just above 600 steps here, but I will release this model in a couple of days. It's quite a bit of time and energy; I'm guessing this will take about 4 hours to train in total, and then I'll want to test it out. You'd set it up just as you'd set up any model, really. So yeah, thank you guys so much, and have a great one.
