Stable Diffusion for Flawless Portraits

Vladimir Chopine [GeekatPlay]
20 Mar 2023 · 13:23

Summary

TLDR: This video shows how to create a flawless portrait using a previous image as the base. RPGverse 4 is used at a resolution of 512x768, adjusting the size and the denoising to get better results. It covers advanced techniques such as ControlNets and scripts for fine-tuning details and preserving facial expressions, and recommends experimenting with the settings to reach a result similar to the original but with creative changes to the environment. The video also mentions the possibility of creating animations with this technique.

Takeaways

  • 🎨 The goal is to create a flawless portrait layered over a previous photo using RPGverse 4.
  • 📏 The recommended resolution for the best results with this model is 512 wide by 768 tall.
  • 🔍 You can preview the image and resize it to fit the recommended dimensions.
  • ✂️ Use the crop-and-resize option to trim the side areas as needed.
  • 🔄 The denoising and the CFG (classifier-free guidance) scale matter; the safe range is 5 to 9, with 7 being a safe choice.
  • 🖼️ When generating, you can experiment with different denoising levels to balance creativity against similarity to the original image.
  • 🤖 ControlNet is used to fix the pose and improve accuracy, with at least two models enabled for better results.
  • 📐 The annotation resolution should match the image resolution so the pose is detected correctly.
  • 🔍 Portrait quality can be improved with another control model, such as Canny, to add more detail.
  • 🧩 When using several models, adjust their weights to balance each one's influence on the final image.
  • 🛠️ The 'img2img alternative test' script converts the image to noise and then denoises from that noise, which helps produce an image very close to the original.
  • 🖌️ To preserve facial expressions and details, inpainting can be used to replace specific elements of the image.
  • 🔍 The mask blur can be adjusted to blend elements into the image better, although in some cases this can introduce artifacts.
  • 🎞️ Animations can be created with similar techniques by processing images in batches to get a coherent sequence.

Q & A

  • What kind of image is being created in the video?

    -A flawless portrait layered over a previous photo, using AI image-generation techniques to produce detailed, personalized images.

  • Which AI model is used in the video?

    -The model used is RPG Verse 4, which was trained specifically at a resolution of 512 wide by 768 tall.

  • Why is it important to match the image resolution to the model's specifications?

    -Matching the resolution to the model's specifications ensures the best possible results, because those dimensions are what the model was optimized for.

  • What is 'noise' in the context of AI image generation?

    -'Noise' refers to the random variability added during the generation process, which influences how creative the final result is versus how faithful it stays to the original.

  • What is a ControlNet and how is it used in the process?

    -A ControlNet is a control network used to adjust and refine specific aspects of the generated image, such as the pose or the level of detail, giving more options and control over the final output.

  • How is the annotation resolution used to make sure the poses in the image are correct?

    -The annotation resolution should equal the image resolution so that the pose is detected correctly. This helps avoid distortions or errors in the pose representation.

  • What is the preprocessor and how does it affect image generation?

    -The preprocessor processes the reference image before the final generation, making sure the pose and other features are accurate and consistent with the reference image.

  • What is the 'img2img alternative test' script and how does it improve the image?

    -The 'img2img alternative test' script takes the original image, converts it into a noise signal, and then denoises from that noise, bringing the generated image much closer to the original and improving accuracy and quality.

  • How is inpainting used in the generation process to preserve certain features?

    -Inpainting is used to replace or adjust specific elements of the image, such as the expression or the face, while preserving the rest of the generated image, allowing more customization and control over the final output.

  • What techniques can be used to create animations from the generated image?

    -Batch processing can be used to generate a series of frames from the same setup, producing animation sequences with subtle, coherent changes (see the sketch after this Q&A section).

  • How can flickering problems in the generated animations be solved?

    -To avoid flickering in animations, you can use dedicated flicker-free generation techniques, such as those covered in another video mentioned here, which offers an approach to creating animations without flickering.
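
Referenced from the animation answer above: a minimal sketch of the batch-processing idea using the Hugging Face diffusers library, which the video does not use itself (it works in the AUTOMATIC1111 web UI). The checkpoint id, folder names, prompt, and seed below are placeholders, not settings from the video.

```python
# Batch img2img over a folder of frames with a fixed seed per frame,
# so consecutive frames stay as consistent as possible.
import glob
import os

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "path/or/hub-id-of-rpg-v4-checkpoint",  # hypothetical: point this at your checkpoint
    torch_dtype=torch.float16,
).to("cuda")

os.makedirs("out", exist_ok=True)

for i, path in enumerate(sorted(glob.glob("frames/*.png"))):
    frame = Image.open(path).convert("RGB").resize((512, 768))
    # Re-seed every frame so each one starts from the same noise pattern.
    generator = torch.Generator("cuda").manual_seed(1234)
    out = pipe(
        prompt="portrait of a warrior, detailed outfit, fantasy environment",
        image=frame,
        strength=0.5,
        guidance_scale=7.0,
        num_inference_steps=35,
        generator=generator,
    ).images[0]
    out.save(f"out/frame_{i:04d}.png")
```

Even with a fixed seed, frame-to-frame flicker is expected with this naive approach, which matches the limitation the video points out at the end.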

Outlines

00:00

🎨 Creating a flawless portrait with RPGverse 4

This first section discusses how to create a flawless portrait applied to a previous photo using the RPGverse 4 model. It goes over the model's specifications, such as the recommended resolution and width, and suggests resizing the image to 512x768 for the best results. It covers the importance of the denoising strength and the CFG scale and offers advice on tuning these values for a high-quality image. It also introduces ControlNet with at least two models to improve the accuracy of the portrait.
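
As a rough illustration of these settings outside the web UI, here is a minimal img2img sketch using the Hugging Face diffusers library. The video itself works in the AUTOMATIC1111 interface, so the checkpoint id, file path, and prompt below are placeholders rather than values taken from the video.

```python
# Minimal img2img sketch: 512x768 input, ~35 steps, CFG around 7,
# denoising strength controlling how far the result drifts from the source.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "path/or/hub-id-of-rpg-v4-checkpoint",  # hypothetical checkpoint id
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("portrait.png").convert("RGB").resize((512, 768))

result = pipe(
    prompt="portrait of a warrior, detailed outfit, fantasy environment",
    image=init_image,
    strength=0.6,            # higher = more creative, less like the source
    guidance_scale=7.0,      # CFG scale; the video suggests staying between 5 and 9
    num_inference_steps=35,  # sampling steps suggested for this model
).images[0]

result.save("portrait_img2img.png")
```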

05:00

🖼️ Adjusting the resolution and using advanced controls

This section focuses on adjusting the resolution and using advanced controls to improve the quality of the portrait. It suggests keeping the canvas resolution at 512 and setting the annotation resolution to 510, although 512 would probably also work. It discusses how the annotation resolution affects the amount of detail captured and how changing it can affect quality. It also covers adjusting the weights of the two control models and introduces the 'img2img alternative test' technique, which reconstructs the noise of the original image so the generation stays close to it.
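
A hedged sketch of the two-ControlNet setup (OpenPose for the pose at weight 0.7, Canny for detail at 0.5) using diffusers; the checkpoint id, paths, and prompt are assumptions, and the A1111-only 'img2img alternative test' script has no direct equivalent here.

```python
# img2img guided by two ControlNets: an OpenPose skeleton and a Canny edge map.
import cv2
import numpy as np
import torch
from PIL import Image
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

source = Image.open("portrait.png").convert("RGB").resize((512, 768))

# Control images extracted from the source photo.
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_map = openpose(source, detect_resolution=512, image_resolution=512)
edges = cv2.Canny(np.array(source), 100, 200)
edge_map = Image.fromarray(edges).convert("RGB")

controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "path/or/hub-id-of-rpg-v4-checkpoint",  # hypothetical checkpoint id
    controlnet=controlnets,
    torch_dtype=torch.float16,
).to("cuda")

result = pipe(
    prompt="portrait of a warrior, detailed outfit, fantasy environment",
    image=source,
    control_image=[pose_map, edge_map],
    controlnet_conditioning_scale=[0.7, 0.5],  # per-model weights, as suggested in the video
    strength=0.75,
    guidance_scale=7.0,
    num_inference_steps=35,
).images[0]
result.save("portrait_controlnet.png")
```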

10:03

🌟 Preserving facial expressions and creating animations

The third section covers preserving facial expressions when creating a portrait and how to use inpainting to replace specific elements of the image. It discusses how noise is applied, how facial features are preserved, and experimental options such as adjusting the mask blur to better blend the portrait into its surroundings. It also mentions creating animations using batch processing and gives a perspective on how flickering problems in AI creations can be overcome.
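
The video uses A1111's "inpaint not masked" mode to keep the face untouched. A similar effect can be approximated in diffusers by inverting the face mask so everything except the face is regenerated; this is a sketch under that assumption, with placeholder paths, prompt, and checkpoint id.

```python
# Regenerate everything except the masked face: mask white = repainted,
# so inverting a "white = face" mask preserves the face.
import torch
from PIL import Image, ImageOps
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "path/or/hub-id-of-an-inpainting-checkpoint",  # hypothetical inpainting checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("generated_portrait.png").convert("RGB").resize((512, 768))
face_mask = Image.open("face_mask.png").convert("L").resize((512, 768))  # white = face
mask = ImageOps.invert(face_mask)  # white = everything except the face

result = pipe(
    prompt="portrait of a warrior, detailed outfit, fantasy environment",
    image=image,
    mask_image=mask,
    guidance_scale=7.0,
    num_inference_steps=35,
).images[0]
result.save("portrait_face_preserved.png")
```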

Keywords

💡Flawless portrait

A 'flawless portrait' is a high-quality, realistic image that accurately captures a person's details. This concept is central to the video, since the goal is to create a portrait that resembles an existing image using advanced editing and image-processing techniques.

💡ControlNet

A ControlNet is a tool used in image generation to control and adjust specific aspects of the image, such as the pose or the facial expression. In the video, two ControlNet models are used to fix the pose and preserve specific details, such as the hands in a pose.

💡RPGverse 4

RPGverse 4 is the AI model used in the video to generate the portrait. It works best at a specific resolution and delivers strong results there; the video notes that the model was trained at 512 pixels wide by 768 pixels tall.

💡Resolution

'Resolution' is a measure of the number of pixels in an image, which determines its level of detail and clarity. In the video, resolution is crucial for making the portrait fit and resemble the original image, and the annotation resolution should match the reference image's resolution.

💡Noise

'Noise' in image generation refers to the random variability that is deliberately added to influence the final result. The video explores how the amount of noise (the denoising strength) can be controlled to trade off creativity against similarity to the original image.
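
To make that trade-off concrete, here is a small sketch, assuming a diffusers img2img pipeline with placeholder checkpoint id, path, and prompt, that renders the same source and seed at several denoising strengths for side-by-side comparison:

```python
# Render the same image and seed at several denoising strengths.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "path/or/hub-id-of-rpg-v4-checkpoint", torch_dtype=torch.float16  # hypothetical id
).to("cuda")
init = Image.open("portrait.png").convert("RGB").resize((512, 768))

for strength in (0.3, 0.5, 0.7, 1.0):
    gen = torch.Generator("cuda").manual_seed(42)  # same seed for a fair comparison
    img = pipe(
        prompt="portrait of a warrior, fantasy environment",
        image=init,
        strength=strength,  # low = close to the source, 1.0 = almost unrelated
        guidance_scale=7.0,
        num_inference_steps=35,
        generator=gen,
    ).images[0]
    img.save(f"strength_{strength:.1f}.png")
```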

💡Resizing

'Resizing' refers to changing the size of an image or part of it. The video discusses how to crop and resize the image to fit the model's requirements and improve the quality of the final portrait.
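
For example, a short Pillow sketch (with a placeholder file name) that crops and resizes an arbitrary photo to the model's 512x768 training size without distorting it:

```python
# Center-crop and resize to 512x768, trimming the sides instead of stretching.
from PIL import Image, ImageOps

src = Image.open("portrait.png").convert("RGB")  # placeholder path
fitted = ImageOps.fit(src, (512, 768), method=Image.LANCZOS, centering=(0.5, 0.5))
fitted.save("portrait_512x768.png")
```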

💡Preprocessor

A 'preprocessor' is a component in the image-generation software that prepares the reference image before the main processing. In the video, OpenPose is used as a preprocessor to extract the pose from the image.
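
A sketch of that preprocessing step using the controlnet_aux helpers, with the detection (annotation) resolution matched to the image as the video recommends; the repo id and image path are assumptions outside the video.

```python
# Extract an OpenPose skeleton map with the detection resolution matched to the image.
from PIL import Image
from controlnet_aux import OpenposeDetector

source = Image.open("portrait.png").convert("RGB").resize((512, 768))  # placeholder path

openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_map = openpose(
    source,
    detect_resolution=512,  # match the image width so hands/arms are read correctly
    image_resolution=512,   # size of the map fed to the ControlNet
)
pose_map.save("pose_map.png")
```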

💡Canny

The model the transcript calls 'Canon' is the Canny edge-detection ControlNet, used as an additional control model to improve the image's detail. It adds more definition and sharpness to elements of the image, such as the outfit.
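
For illustration, a small OpenCV sketch (placeholder path) that produces the Canny edge map such a control model consumes; lower thresholds keep more edges, which mirrors the "more detail vs. too much detail" adjustment shown in the video:

```python
# Build a Canny edge map from the source photo for use as a control image.
import cv2
import numpy as np
from PIL import Image

source = Image.open("portrait.png").convert("RGB").resize((512, 768))  # placeholder path

edges = cv2.Canny(np.array(source), 100, 200)  # lower thresholds -> more edges/detail
edge_map = Image.fromarray(edges).convert("RGB")
edge_map.save("canny_map.png")
```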

💡Script

A 'script' in this context is a set of instructions or a program that automates certain tasks in the image-generation process. In the video, a script called 'img2img alternative test' is used to reconstruct the noise of the original image and improve how closely the result matches the original.

💡Inpainting

'Inpainting' in the video refers to the editing phase where specific details of the image, such as the facial expression, are adjusted or preserved. It is done in steps so that specific areas can be identified and corrected before the final changes are applied.

💡Denoising

'Denoising' is the process of reducing or removing noise in an image to improve its quality. In the video, a noise-based denoising process is used to steer the generated image closer to the original while preserving important details such as the texture of the hair.

Highlights

Creating a flawless portrait based on an existing photo using RPG verse 4.

Setting the best size resolution for the model to 512x768 for optimal results.

Adjusting the CFG scale and noise level for the highest quality image generation.

Using control net with two models to enhance the portrait generation process.

Importance of setting the annotation resolution equal to the image for accurate pose detection.

Utilizing the open pose preprocessor for better pose accuracy.

Adjusting the control model weights to balance details and pose accuracy.

Using the 'img2img alternative test' script to reconstruct the source noise and preserve detail.

Experimenting with different settings to achieve a close resemblance to the original image.

Preserving facial expressions and details using in-painting techniques.

Adjusting blur levels for better blending and avoiding artifacts.

Creating animations by processing images in batches for a sequence.

Addressing flickering issues in animations with specific techniques.

Comparing original and generated images to showcase the effectiveness of the process.

The potential for creating special compositing effects with AI-generated images.

Encouraging experimentation with different prompts for creative image generation.

Providing a link to a video on flicker-free AI image creation for interested viewers.

Transcripts

[00:00] Hello there. In this video we're going to create a flawless portrait based on a previous photo, and right here you can see the image we're going to use. But before we jump in and adjust sizes, let's first look at the model we'll use: RPG verse 4. If we look at its specification, there is a best size, resolution and CFG this specific model was trained for: 512 wide and 768 tall. So I want to put those numbers in right here, because that gives us the best results we can get when working with this model. We'll set the width to 512 and the height to 768. We can also preview it, and you can see that at 768 it cuts the sides around this area; because of that we want to switch to crop-and-resize, since we want to cut some of those areas on the sides.

[01:03] It also depends on the CFG scale and other things, and on the size as well as the noise. For the noise, try to go as high as possible. As for the sampling steps, based on this model they should be around 35, and the CFG scale can go up to 9. I don't recommend going higher; most models work anywhere between 5 and 9, so the safer area, I'd say, is 7, and between 7 and 9 it will work on almost all models. Okay, now we go ahead, fill in all our information, and click generate.

[01:43] As we look at it, the pose is somewhat similar, but not all the way. So what's happening? If I set the denoising to one, it's all generated from noise, and when we click generate we can predict what will happen: it's totally different and doesn't resemble our image at all. So what we're going to do is play with some settings to make sure the result resembles the original as closely as possible while pushing creativity to the max.

[02:16] Okay, first we're going to use ControlNet, with two ControlNet models at the same time. If you need to, you can go into the settings, go to ControlNet, and right here set how many models you want; set at least three for now. Usually most people use two, and two is enough; I like three because it sometimes gives me a few more options, but it's up to you. Set at least two, at least for this video. When you're done, click apply and reload, and after it loads again you should have at least two tabs for control models.

[02:54] In the first one, what we're going to do is set the pose properly, so let's see what's happening with the poses. First I'll show you an example. Right here we have a pose, and if we enable the preprocessor, go with OpenPose, which is what we want, and click on preview result, you'll notice how the hands come out wrong right here. The main reason is that this is not the correct size, which is why it's very important for us to use an annotation resolution equal to our image. For example, if I set this annotation resolution to 720 and try the preview again, this time you can see right here how the arms show up correctly. This is just one of the better examples to show that the annotation resolution should be close to the image you're going to use.

[03:52] Okay, so in our case we're using this image and I'm looking at 512 for the width, so that's what we'll set: 512. Let's click preview with that annotation resolution and make sure our pose is all readable, and this one is actually set very well; I like how it looks in this case. We can remove the preview at this point and just leave it like this. Now let's set up OpenPose: open pose right here is our preprocessor, and this is our model.

[04:25] If at this point we click generate, you'll notice we get a model with the pose set properly. However, we do lose some of the detail from the outfit that you might want to carry over with some parameters. It still looks very nice and the pose is close enough, but we can do even better than that. For this we'll go to another control model, and we're going to use Canny. Let's enable it, copy and paste our image so we can preview it, and remember, same as we did before, it was 510 — that's what we want to put here, although 512 will probably work; we'll just leave it.

[05:09] On the canvas — sorry, on the guidance right here — our guidance image should pop up right there, and we'll just leave 512 here for our resolution as well, then click preview. Overall, what I'm looking at here is this line: how much detail there is. If there isn't enough detail, you can always lower this value and see how much more detail gets added if you need it. You can use it with higher detail, and higher will add more, even including my own signature that I put on my photos, but that is maybe a little too much, so we'll bring it back and preview again. This one is actually a very good resolution in this case, so let's remove our preview and select Canny here.

[06:04] The next thing: because we're using two models, we want to adjust the weights a little. This one gets a weight of about 0.7, and the Canny one, for the details, maybe 0.5. Okay, let's generate and look again, and here we have even better detail now.

[06:27] But here's another very interesting thing. What's happening is that it takes noise — it creates noise and puts it over the image. But what if it took the noise of this image instead: it would take the image, convert it to noise, and from that noise we can get very, very close to the original. For this we're going to use a script: the 'img2img alternative test' script. Let's enable this test.

[06:54] Also notice I'm using Euler. For RPG I think it's recommended to use a Karras sampler; however, if we're going to use img2img with this script, it actually asks you to use Euler. Because we set Euler at the top, here we'll just uncheck this box and check that one. We uncheck the sampling steps override because we're going to use our 35; we can switch it if we need to. For the encode/decode steps we need the same number of steps so they match, and if you want you can check this box so it matches your sampling steps at the top — actually the bottom value will override the top one. And right here it overrides the denoising to one; let's uncheck this because we already control it with the one up here. Now let's try to generate.

[07:50] When we do this, you'll notice it will first take our image and add noise to it, going back to the base using the Euler noising and denoising system, and then denoise back from that toward our original image. You'll notice how the face, the position, everything will be extremely close to what we had before, down to the point of how the hair flows. This is what's nice about this approach: if you like details exactly as they are — the hair, for example — you can create an image very, very close to your original image.

[08:30] One thing you can also verify: if you click generate a second time, we should get something very close and similar to what we have, with just small adjustments in the details. So we can take our image and almost constrain it to that point. By the way, that is how the techniques we're using here work with the re-noising and generation overlay.

[08:56] Okay, at this point it looks very nice, but if you notice, the face is different. What if I take somebody else's portrait and apply everything the same, but I want to preserve the expression, the face, everything? For this we actually need to use inpainting, and I like to do this in steps, because it helps me identify one thing, fix it, and keep going. When I'm satisfied with how everything is going, I can go into inpainting, copy and paste what I have, and now I'm just going to mask over the face that I want to preserve. Down here I want to select 'inpaint not masked', so it inpaints everything except the masked area. You can try 'fill' or 'latent noise', either one, and for the padding I'd say 100 pixels — it sometimes goes a little too low. Again, this is experimental, depending on the size of the image you created and other things, but everything else right here stays the same. When we're done, let's click generate.

[10:04] Notice that as it generates, noise is applied to our face but not that much, so some of the detail is preserved as we go, and as it builds the image back you'll notice we have our model created, and most importantly, we preserve the face of our model. Let's compare: here's the before, you can see the face, and it's the same one, embedded very well. This is because we used inpainting to replace only the specific elements that we wanted.

[10:40] You can also play around a little with the blur, because if we apply a bit more blur, maybe 10 pixels, it will blend a little more aggressively around here. Again, this is experimental, based on the size and resolution; you can always go a bit crazier with a higher value and a bigger edge. Let's preview, and I'll show you what I mean when we go closer to these edges. You can see we get a little better blending, but right there, because we have a blur, you may get some small artifacts. Generally, though, it's actually blending very well, so we have the same face and the same pose, but we've changed the whole environment quite a bit on this one.

[11:29] And here we can compare with the original photo that we put in: we changed the whole environment and outfit, but we still have the same shape of the figure, the same pose, the same face. It's almost like special-effects compositing, done in a much easier, more creative way. Of course, everything can be changed depending on your prompt, so in this case you should experiment.

[11:54] Overall, what we've done: we used img2img, we adjusted the denoising strength, we used two control models — one is OpenPose and the other is Canny for more detail — and we also used the 'img2img alternative test' script to work from our original image. All of this provides a very accurate but at the same time very creative reference for our new image.

[12:20] You can actually create very interesting animations based on this as well, by going to the batch tab and processing a whole set of images as an image batch. Let me show you some examples. Right here you can see the render, and you can see the original one. Notice that we have some flickering; that is a problem with this specific technique. If you're interested in flicker-free creation, I have another video on completely flicker-free creation with AI Stable Diffusion, and I'll post the link to that video below. But this gives you an idea — you can do fun stuff with this.

[13:12] Thank you for watching this video. If you liked it, please subscribe, give us a thumbs up, and share this video to help this channel grow. I appreciate all your support. Thanks, and have a great day.

Related Tags
AI Generation · Portraits · Creative Design · Pose Control · Denoising · Resizing · Image Detail · AI Tutorial · Image Modeling · Digital Compositing