Stable Diffusion for Flawless Portraits
Summary
TLDR This video shows how to create a flawless portrait using a previous image as a base. It uses RPGverse 4 at a resolution of 512x768, adjusting the size and noise for better results. Advanced techniques are covered, such as using ControlNet models and scripts to fine-tune details and preserve facial expressions. Experimenting with the settings is recommended to achieve a result similar to the original but with creative changes to the environment. The video also mentions the possibility of creating animations with this technique.
Takeaways
- 🎨 We'll create a flawless portrait pasted over a previous photo using RPGverse 4.
- 📏 The recommended resolution for the best results with this model is 512 wide by 768 tall.
- 🔍 You can preview and resize the image to fit the recommended dimensions.
- ✂️ Use the crop-and-resize tool to trim the side areas as needed.
- 🔄 Noise and the CFG (classifier-free guidance) scale matter; the safe range is 5 to 9, with 7 a safe default.
- 🖼️ When generating, you can experiment with different noise levels to trade off creativity against similarity to the original image.
- 🤖 ControlNet is used to fix the pose and improve accuracy, with at least two models configured for better results.
- 📐 The annotation resolution should match the image resolution so that poses render correctly.
- 🔍 Portrait quality can be improved with a second control model, such as Canny, to add more detail.
- 🧩 When using several models, their weights must be adjusted to balance each one's influence on the final image.
- 🛠️ The 'img2img alternative test' script adds noise and then denoises from that noise, which helps produce an image very close to the original.
- 🖌️ To preserve facial expressions and details, the inpainting tool can replace specific elements of the image.
- 🔍 The mask blur can be adjusted to blend elements better, though in some cases this introduces artifacts.
- 🎞️ Similar techniques can create animations by processing images in batches into a coherent sequence.
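The crop-and-resize step above amounts to simple aspect-ratio math. A minimal sketch in Python (the function name `center_crop_box` is ours, not from any tool shown in the video):

```python
def center_crop_box(src_w, src_h, dst_w=512, dst_h=768):
    """Compute the centered crop box matching the target aspect ratio.

    Returns (left, top, right, bottom); the cropped region can then be
    scaled to (dst_w, dst_h) without distortion.
    """
    src_aspect = src_w / src_h
    dst_aspect = dst_w / dst_h
    if src_aspect > dst_aspect:
        # Source is too wide: trim the sides, keep the full height.
        crop_w = round(src_h * dst_aspect)
        left = (src_w - crop_w) // 2
        return (left, 0, left + crop_w, src_h)
    # Source is too tall: trim top and bottom, keep the full width.
    crop_h = round(src_w / dst_aspect)
    top = (src_h - crop_h) // 2
    return (0, top, src_w, top + crop_h)
```

For example, a 1024x1024 photo cropped for a 512x768 portrait keeps the full height and trims the sides, which is exactly what the "crop and resize" option in the video does.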
Q & A
What kind of image is being created in the video?
-A flawless portrait pasted onto a previous photo, using AI techniques to generate detailed, customized images.
What AI model is used in the video?
-The model is RPG Verse 4, which was trained specifically at a resolution of 512 wide by 768 tall.
Why is it important to match the image resolution to the model's specifications?
-Matching the resolution to the model's specifications ensures the best possible results, since those dimensions are what the model's performance is optimized for.
What is 'noise' in the context of AI image generation?
-'Noise' refers to adding random variability to the image-generation process, which can influence the creativity and originality of the final result.
What is a ControlNet and how is it used in the process?
-A ControlNet is a control network used to adjust and refine specific aspects of the generated image, such as poses or detail quality, providing more options and control over the final output.
How is the annotation resolution used to make sure the poses in the image are correct?
-The annotation resolution must equal the image resolution for the poses to render correctly; this helps avoid distortions or errors in the pose representation.
What is the 'preprocessor' and how does it affect image generation?
-The preprocessor is a component that processes the image before final generation, ensuring that poses and other features are accurate and consistent with the reference image.
What is the 'img2img alternative test' script and how does it help improve the image?
-The 'img2img alternative test' script converts the original image into a noise signal and then runs a denoising pass from it, bringing the generated image closer to the original and improving precision and quality.
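The principle behind that script can be shown with a toy NumPy sketch: mix the image with known noise, then invert the mix. In the real script an Euler sampler *estimates* the noise instead of knowing it, so the result lands close to, not exactly on, the source image (the function names here are illustrative):

```python
import numpy as np

def noise_image(x0, eps, alpha_bar):
    """Forward step: blend the clean image x0 with Gaussian noise eps."""
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def recover_image(x_t, eps, alpha_bar):
    """Reverse step: with the exact noise known, x0 comes back perfectly.
    A sampler only estimates eps, which is why the script's output lands
    near, not exactly on, the original portrait."""
    return (x_t - np.sqrt(1.0 - alpha_bar) * eps) / np.sqrt(alpha_bar)
```

This is also why, as the video shows, clicking Generate a second time with the same settings gives a near-identical result.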
How is inpainting used to preserve certain features?
-Inpainting replaces or adjusts specific elements of the image, such as the expression or the face, while preserving the rest of the generated image, allowing finer customization and control over the final output.
What techniques can be used to create animations from the generated image?
-Batch processing can generate a series of frames using the image as a base, producing animation sequences with subtle, coherent changes.
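What "processing in batches" means in practice is one img2img job per extracted frame, sharing a fixed seed and settings so consecutive frames stay coherent. A sketch (the job-dict keys are illustrative, not a real API):

```python
def build_batch_jobs(frame_paths, seed=1234, denoising_strength=0.4):
    """Sketch of img2img batch processing for animation: every frame gets
    the same seed and settings so consecutive frames stay coherent.
    frame_paths would be the extracted video frames, in order."""
    return [
        {
            "input": path,
            "seed": seed,                      # fixed seed reduces drift
            "denoising_strength": denoising_strength,
            "width": 512,
            "height": 768,
        }
        for path in frame_paths
    ]
```

A shared seed does not eliminate flicker entirely, as the video notes, but it keeps frame-to-frame drift down.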
How can flickering problems in the generated animations be fixed?
-To avoid flickering, specific flicker-free generation techniques can be used, such as those described in another video mentioned in the transcript, which offers an approach to creating animations without flickering.
Outlines
🎨 Creating a flawless portrait with RPGverse 4
The first paragraph discusses how to create a flawless portrait applied to a previous photo using the RPGverse 4 model. It covers the model's specifications, such as the recommended resolution, and suggests resizing the image to 512x768 for the best results. It discusses the importance of the denoising strength and the CFG scale, offering advice on how to tune those values for a high-quality image. It also mentions using ControlNet with at least two models to improve the accuracy of the portrait.
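The CFG advice in this section (a safe range of 5 to 9, with 7 as a default) can be captured in a small helper; `safe_cfg` is our name, not a webui setting:

```python
def safe_cfg(requested=None, lo=5.0, hi=9.0, default=7.0):
    """Clamp the CFG (classifier-free guidance) scale into the range the
    video calls safe for most models; fall back to 7 when unspecified."""
    if requested is None:
        return default
    return max(lo, min(hi, float(requested)))
```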
🖼️ Tuning the resolution and using advanced controls
This paragraph focuses on tuning the resolution and using advanced controls to improve the quality of the portrait. It suggests keeping the canvas resolution at 512 and setting the annotation resolution to 510, though it notes that 512 would probably also work. It discusses how resolution affects the level of detail and how lowering it can hurt quality. It also covers adjusting the weights of the models and introduces the 'img2img alternative test' technique, which inverts the image to noise and denoises it back toward the original.
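The weight balancing described here — 0.7 for the pose model, 0.5 for Canny — can be pictured as a weighted sum of each control model's contribution. This NumPy sketch only illustrates the arithmetic; real ControlNet residuals are injected per UNet layer, not summed on one array:

```python
import numpy as np

def combine_control_residuals(base, residuals, weights):
    """Add weighted ControlNet residuals to base features. In the webui,
    each model's 'weight' slider plays the role of these multipliers:
    0.7 for OpenPose and 0.5 for Canny lets the pose dominate while the
    edges contribute extra detail."""
    out = base.copy()
    for r, w in zip(residuals, weights):
        out += w * r
    return out
```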
🌟 Preserving facial expressions and creating animations
The third paragraph covers preserving facial expressions when creating a portrait and using inpainting to replace specific elements of the image. It discusses applying noise while preserving facial features, as well as experimental techniques such as mask blur to better integrate the portrait into its surroundings. It also mentions creating animations with batch processing and offers a perspective on how flicker problems in AI creations can be overcome.
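The 'inpaint not masked' plus mask-blur step boils down to compositing with a feathered mask. A NumPy sketch, where a separable box blur stands in for the webui's mask blur and the function names are ours:

```python
import numpy as np

def feather_mask(mask, radius=10):
    """Soften a binary mask edge with a box blur (a stand-in for the
    'mask blur' slider); a larger radius blends more aggressively but can
    introduce halo artifacts, as noted in the video."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    # Blur horizontally, then vertically (separable box blur).
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, mask.astype(float))
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, blurred)

def composite(original, generated, mask):
    """'Inpaint not masked': keep the masked region (the face) from the
    original, take everything else from the newly generated image."""
    if original.ndim == 3 and mask.ndim == 2:
        mask = mask[..., None]
    return mask * original + (1.0 - mask) * generated
```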
Keywords
💡Flawless portrait
💡ControlNet
💡RPGverse 4
💡Resolution
💡Noise
💡Scaling
💡Preprocessor
💡Canny
💡Script
💡Inpainting
💡Denoising
Highlights
Creating a perfect portrait pasted onto a previous photo using RPG verse 4.
Setting the best size resolution for the model to 512x768 for optimal results.
Adjusting the CFG scale and noise level for the highest quality image generation.
Using control net with two models to enhance the portrait generation process.
Importance of setting the annotation resolution equal to the image for accurate pose detection.
Utilizing the open pose preprocessor for better pose accuracy.
Adjusting the control model weights to balance details and pose accuracy.
Using the Script image to image alternative test for noise reduction and detail preservation.
Experimenting with different settings to achieve a close resemblance to the original image.
Preserving facial expressions and details using in-painting techniques.
Adjusting blur levels for better blending and avoiding artifacts.
Creating animations by processing images in batches for a sequence.
Addressing flickering issues in animations with specific techniques.
Comparing original and generated images to showcase the effectiveness of the process.
The potential for creating special compositing effects with AI-generated images.
Encouraging experimentation with different prompts for creative image generation.
Providing a link to a video on flicker-free AI image creation for interested viewers.
Transcripts
Hello there. So in this video we're going to create a perfect portrait pasted onto a previous photo, and right here you can see the photo we're going to use.

But before we jump in and adjust sizes, let's look at what we're going to use as a model first, and this is RPG verse 4. If we look at the specification, there is a best resolution and CFG this specific model was trained for: 512 wide and 768 tall. So what I want to do is go right here and put in those numbers, because this gives us the best result we can get when working with this model. So we'll have 512, and our height will be 768. We can also preview, and you can see that at 768 it just cuts the sides around this area. Because of that, we want to switch to 'crop and resize', since we want to cut some of those areas on the sides anyway.

It also depends on the CFG scale and other things, and on the size as well. For the noise, we'll maybe try to go as high as possible. As for the sampling steps, for this model they should be somewhere around 35, and the CFG scale we can push up to 9. I don't recommend going higher; most models sit anywhere between 5 and 9, so the safer area, I would say, is 7, and between 7 and 9 it will work a hundred percent on almost all models.

Okay, now let's go ahead and fill in all our information and click Generate. Looking at the result, it kind of looks the same, but not all the way. So what's happening? If I push the denoising all the way to 1, everything is created from noise, and when we click Generate, most likely, as you can see here, it comes out totally different: it does not resemble our image at all. So what we're going to do is play with some settings to make sure we stay as close as possible to the original while pushing creativity to the max.

First, we're going to use ControlNet, and we're going to use two ControlNet models at a time. If you need to, you can go into Settings, open the ControlNet section, and where it asks how many models you want, set at least three for now. Usually most people use two, and two is enough; I like three because it sometimes gives me a few more options, but that's up to you. At least for this video, set at least two of them. When you're done, click Apply and reload. After it reloads, you should have at least two tabs for control models.
For the first one, what we're going to do is set the pose properly, so let's see what's happening with the poses. First, I'll show you an example. Right here we have a pose, and if we enable the preprocessor, go with OpenPose, and just click to preview the result, you'll notice how the hands come out wrong right here. The main reason is that this is not the correct size, which is why it's very important for us to use an annotation resolution equal to our image. For example, if I set this annotation resolution to 720 and try the preview again, this time you can see the arms showing correctly. This is just one good example to show that the annotation resolution should be close to the image you're going to use. So in our case, we're using this image, and I'm looking at 512 here, so that's what we'll set: 512. Let's click preview with that annotation resolution and make sure our pose is fully readable — and this one is actually set very well; I like how it came out, so we can remove the preview at this point.

We'll just leave it like this, with OpenPose set here: this is our preprocessor, and this is our model. If we click Generate at this point, you'll notice we get a model in the proper pose. However, we do lose some of the detail from the outfit that you may have wanted to carry over. It still looks very nice and the pose is close enough, but we can do even better than that.

For this, we'll go to another control model, and we're going to use Canny. So let's enable it and copy-paste our image so we can preview. Remember, like we did before, it was 510 — that's what we'll put here, 510, though 512 would probably work too. And on the guidance here — not the canvas, sorry — our guidance should pop up right there, and we'll just leave the resolution at 512 as well. Then we click to preview. Overall, what I'm looking at right here is this line: how much detail there is. If there isn't enough detail, you can always go ahead and lower the low threshold. See how much more detail we added, if you need it — it even picked up my own signature that I put on my photos, but that is maybe a little too much, so we'll bring it back. Let's preview again — and this is actually a very good level for this case. So let's remove our preview and set the Canny model here.

The next thing: because we're using two models, we want to adjust the weights a little. This one will get a weight of about 0.7, and Canny, which is there for the details, will maybe get 0.5.

Okay, let's generate and look again — and here we have it, with even better detail now. But here's another very interesting thing. What's happening is that it takes noise — it creates noise and puts it over the image. But what if it took the noise of this image instead: it would take the image, convert it to noise, and from that noise we could get very, very close to the original. For this, we're going to use a script: the 'img2img alternative test' script. So let's enable this test. Also notice I'm using Euler. For RPG I think it's recommended to use a Karras sampler; however, if we're going to use img2img this way, the script actually asks you to use Euler, because we set Euler at the top. And here we'll just uncheck this and check that box.
Sampling steps: we uncheck this because we're going to use our 35, and we can switch it if we need to. You can also use the same number of steps for decode; we need them matching, and if you want, you can check the box so it's matched with your sampling steps on top — actually, the bottom one will override the top. And right here we override the denoising to 1, so let's uncheck this, because we already control it with the 1 here. Now let's try to generate. When we do this, you'll notice it first takes our image and adds noise to it, going down to the base using our Euler denoising system, and then denoises back from that to our original image. You'll notice how the face, the positions, everything comes out extremely close to what we had before, down to the way the hair falls. That's what's nice about creating this way: if there's a specific look you like — the hair, all the details — you can take it and create an image very, very close to your original. One thing you can also verify: if you click Generate a second time, we should get something very close and similar to what we have, with just small detail adjustments. So we can take our image and constrain it almost exactly to that point — and by the way, that's the technique we're using here, this noise-generation overlay.

Okay, so at this point it looks very nice, but notice the face is different. What if I take somebody else's portrait, apply everything the same, but want to preserve the expression and the face and everything? For this we need to use inpainting, and I like to do this in steps: it helps me identify one thing, fix it, and keep going. So when I'm satisfied with how everything looks, I can go into Inpaint, copy-paste what I have, and mask over the face that I want to preserve. Then, down here, I want to select 'inpaint not masked', so it inpaints everything except the masked area. You can try 'fill' or 'latent noise', either one, and for the pixel padding I would say 100 — sometimes it goes a little too low. Again, this is experimental, depending on the size of the image you created and other things. Everything else right here stays the same. When we're done, let's click Generate. Notice that as it generates, our face gets noise applied, but not that much, so it preserves some of the detail as it goes. And when it creates the image back, you'll notice we have our model created, and most importantly, we preserved the face of our model.

So let's compare. Here's the before: you can see the face, and it's the same one, embedded very well. And this is because we used inpainting to replace just the specific elements we wanted. You can also play around a little with the blur: if we apply a bit more blur, maybe 10 pixels, it will blend a little more aggressively around here. Again, this is experimental, based on the size and resolution; you can always go a little crazier with a higher-quality, bigger edge on this. Let's just preview, and I'll show you what I mean when we go closer to these edges. You can see we get slightly better blending, but right there, because we have a blur, you may get a few artifacts; generally, though, it's actually blending very well. So we have the same face and the same pose, but we changed the whole environment quite a bit on this one. And here we can compare with the original photo we put in: we changed the whole environment and outfit, but we still have the same shape of the figure, the same pose, the same face.

It's almost like special compositing, done in a much easier, more creative way. Of course, everything can change depending on your prompt, so in this case you should experiment. Overall, here's what we've done: we used img2img, we adjusted the denoising strength, with two control models — one is OpenPose and the other is Canny, for more detail — and we also used the 'img2img alternative test' script to take our original image and, based on all of this, provide a very accurate but at the same time very creative reference for our new image.

You can actually create very interesting animations based on this too: go into Batch, process the images as a batch, and create a set of them. Let me show you some examples. Right here you can see the render, and you can see the original one. Notice that we have some flickering; that is a problem with these specific techniques. If you're interested in flicker-free creation, I have another video on completely flicker-free creation with AI Stable Diffusion, and I'll post the link to that video below. But here, roughly, is the kind of fun stuff you can do with this.

Thank you for watching this video. If you liked it, please subscribe, give us a thumbs up, and share this video to help this channel grow. I appreciate all your support. Thanks, and have a great day!