AI Runner 3.0.0 Development Preview: Draw and generate

Capsize Games
29 Feb 2024 · 03:44

Summary

TL;DR: In this demonstration, the presenter introduces a new feature in an upcoming version of a creative application that uses two canvases: one for drawing and one for rendering images. Using the Stable Diffusion model, the presenter shows how to generate a realistic image of a majestic mountain with a river running from it simply by typing a prompt, with the generation steps lowered for speed. The demonstration highlights the interactive process of drawing on one canvas to enhance the generated image on the other, outlines a planned improvement for handling multiple queued requests more efficiently, and shows how users can refine their creations through iterative drawing and prompt adjustments.

Takeaways

  • 🖌️ The software features two canvases: one for drawing (left) and one for rendering images (right).
  • 📷 The right canvas renders the AI-generated image; in the demo, the prompt asks for a professional photograph of a majestic mountain on a sunny day with a river running from it.
  • 👩‍🔬 The software uses a Stable Diffusion model known for generating realistic photos of people as well as other subjects.
  • ⚙️ The demonstration involves lowering the generation steps to 15 to expedite the rendering process.
  • 🔁 Upon releasing the mouse button after drawing, the software submits a new image-to-image request using the left canvas as the base image (a minimal sketch of this step follows the list).
  • 🎨 Users can draw different elements (e.g., fields, sky, mountains) which then get integrated into the rendered image on the right canvas.
  • 📈 Each mouse-button release sends a new request, which can build up a queue of requests that must be processed on the backend.
  • 💡 A planned enhancement is to process only the latest request from the drawing canvas to improve efficiency and reduce load.
  • 🚀 Performance may vary based on hardware, with faster video cards providing quicker results.
  • 🎨🖼️ The tool allows iterative refinement: elements added to the drawing, such as snow or rivers, are then reflected in the rendered image.
  • ☁️ Users can add additional features like clouds to improve the composition and realism of the rendered image.
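
As a rough illustration of the image-to-image step described above (it is not AI Runner's actual code), the sketch below uses the Hugging Face diffusers library with a stand-in Stable Diffusion checkpoint, placeholder file names for the two canvases, and an arbitrary strength value; only the prompt and the 15-step setting come from the demo.

```python
# Minimal sketch, not AI Runner's implementation: regenerate the right-hand
# canvas from the left-hand drawing with Stable Diffusion image-to-image.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Placeholder checkpoint; the video uses a photorealistic "next photo" model,
# but any Stable Diffusion 1.5-compatible checkpoint would do here.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

drawing = Image.open("drawing_canvas.png").convert("RGB")  # export of the left canvas

result = pipe(
    prompt="a professional photograph of a majestic mountain, sunny day, "
           "a river runs from the mountain",
    image=drawing,            # the drawing serves as the base image
    strength=0.75,            # how far the model may deviate from the drawing
    num_inference_steps=15,   # lowered to 15 in the demo for speed
).images[0]

result.save("rendered_canvas.png")  # displayed on the right-hand canvas
```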

Q & A

  • What are the two canvases shown in the video for?

    -The left canvas is for drawing and the right canvas is for rendering the AI-generated image.

  • What AI model is being used to generate the images?

    -The video shows a Stable Diffusion model being used to generate the images.

  • How does the presenter update the generated image?

    -By drawing on the left canvas and releasing the mouse button, a new request is made to update the image using the drawing as a base.

  • What enhancement is planned for multiple requests in queue?

    -The presenter plans to process only the latest request from the drawing canvas, rather than every request in the queue, to speed things up (see the sketch after this Q&A section).

  • How could the generated images be improved?

    -By working on the prompt wording or adding more details to the drawing on the left canvas.

  • Why does the presenter lower the steps to 15 initially?

    -To make the image generation process quicker for the demo.

  • What elements does the presenter add to the drawing?

    -A field, blue background for the sky, details on the mountain, snow, a river, and some clouds.

  • Why does the presenter say the generated image looks terrible?

    -Because the result is still rough; refining the prompt wording and the drawing would improve it.

  • How are multiple requests handled when drawing?

    -A new request is made each time a drawing change is made. This can result in a queue of requests.

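The "only use the latest request" enhancement can be pictured as a coalescing queue: the worker that feeds the model drains everything that has piled up from the brush canvas and keeps just the newest entry. This is a hedged sketch of the idea, not AI Runner's code; the `GenerationRequest` class and the `generate` callback are hypothetical.

```python
# Sketch of the planned enhancement: process only the newest queued request.
import queue
import threading
from dataclasses import dataclass

@dataclass
class GenerationRequest:      # hypothetical request object
    prompt: str
    drawing_path: str

requests: "queue.Queue[GenerationRequest]" = queue.Queue()

def worker(generate):
    while True:
        req = requests.get()          # block until at least one request arrives
        while True:                   # discard stale requests from the brush canvas
            try:
                req = requests.get_nowait()
            except queue.Empty:
                break
        generate(req)                 # run a single image-to-image generation

threading.Thread(target=worker, args=(print,), daemon=True).start()
```

Every mouse release can still enqueue a request cheaply; the worker simply skips anything already superseded by a newer drawing.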

Outlines

00:00

😀 Demoing New Drawing Feature for Image Generation

This segment introduces an upcoming feature that lets the user draw on one canvas and have the drawing rendered into a generated image on a second canvas, guided by a text prompt. It demonstrates generating an initial mountain image from the prompt, then drawing additions such as a field, a sky, extra mountain detail, snow, and a river, which are incorporated into a newly generated image on each mouse release.

Keywords

💡canvas

The script refers to two canvases - one on the left for drawing and one on the right for rendering the AI-generated image. The canvases allow the user to draw an image that then serves as the basis for the AI to generate a new similar image. This demonstrates an interactive image creation process.

💡mountain

The initial prompt describes generating an image of a majestic mountain with a river, which is then rendered by the AI. The mountain is a core element that the user later adds more detail to through drawing.

💡drawing

The key innovation in the video is using drawing to incrementally improve and modify the AI-generated image. As the user draws, new requests are sent to generate updated images that match the drawings.

💡queue

Due to multiple drawing requests building up, the speaker notes that requests get queued on the backend before an updated image is rendered. This queue could slow things down.

💡model

The script mentions using the Stable Diffusion machine learning model to generate the images. The model name provides context on the AI capability being demonstrated.

💡prompt

The textual prompt that describes the desired image is crucial for guiding the AI model on what to generate. The script shows iteratively improving the prompt.

💡request

Each time the user modifies the drawing and releases the mouse, a new API request is sent to generate an updated image. Managing these requests is important.
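
As a hedged illustration of that trigger (the video does not show AI Runner's UI code; the widget, the request queue, and the prompt accessor below are hypothetical), a Qt-style canvas, using PySide6 here, might submit a request from its mouse-release handler:

```python
# Illustrative only: queue a generation request when the mouse button is released.
from PySide6.QtWidgets import QWidget

class BrushCanvas(QWidget):
    def __init__(self, request_queue, parent=None):
        super().__init__(parent)
        self.request_queue = request_queue

    def mouseReleaseEvent(self, event):
        super().mouseReleaseEvent(event)
        # Export the current drawing and hand it to the backend; the worker
        # decides when (and whether) to actually run the generation.
        self.grab().save("drawing_canvas.png")
        self.request_queue.put({
            "prompt": self.window().prompt_text(),  # hypothetical accessor
            "drawing_path": "drawing_canvas.png",
        })
```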

💡layer

The process shows layering user drawing on top of the AI-generated image to incrementally improve it. Each layer adds more direction.
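
One way to picture this layering (a sketch under the assumption that the brush strokes live on a transparent layer; file names are illustrative) is to composite the strokes over the last generated image and feed the result back as the next image-to-image base:

```python
# Sketch: merge the transparent brush layer onto the last render and reuse it.
from PIL import Image

generated = Image.open("rendered_canvas.png").convert("RGBA")
strokes = Image.open("brush_layer.png").convert("RGBA")   # transparent except where drawn

next_base = Image.alpha_composite(generated, strokes).convert("RGB")
next_base.save("drawing_canvas.png")                      # base for the next request
```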

💡interactive

A key highlight is the interactive nature of the tool - with the user directly manipulating the image through drawing to guide the AI generator.

💡render

The right canvas renders each image generated by the AI model based on the drawings and prompts. This output is what the user sees.

Highlights

Demo of new feature with two canvases - one for drawing input and one for AI image generation output

Using Stable Diffusion model to generate images based on text prompts

Drawing input is used to continually update and refine the AI generated image

Releasing mouse after drawing triggers a new AI image generation request

Queue of requests is built up each time drawing input changes

Only the latest queued request will be processed (a planned enhancement) to avoid lag from working through the whole queue

Drawing simple shapes and colors to indicate sky, mountains, river, etc.

Output image matches and incorporates the hand-drawn input

Can iteratively add more drawing to refine and improve the image

Drawing snow on mountain tops and seeing it reflected in output

Adding a river by drawing blue shape

Output still needs more refinement but demonstrates general capabilities

Can continue drawing clouds, trees, etc. to improve quality

Drawing interface allows intuitive input for image generation

Further additions are planned to speed up the process and improve accuracy

Transcripts

00:02
Okay, I'm just going to show a quick feature that I've added to this upcoming version. I'm going to change this design slightly, but for now what we have are two different canvases, one on the left and one on the right. The left canvas is for drawing and the right canvas is for rendering an image. Over here in the prompt I've typed "a professional photograph of a majestic mountain, sunny day, a river runs from the mountain." Under Stable Diffusion we're using this model, "next photo"; it generates realistic photos of people, but also of everything. I've just lowered the steps down to 15 for now, just to make it generate a little quicker.

00:52
So the first thing I'm going to do is just hit generate, and I'll generate this image of this mountain. That's going to load up the model for the first time. And there we have kind of a mountain in the background and a river in the foreground.

01:11
Now what I'm going to do is start drawing on this side, and when I release the mouse button it will make another request using this as the base image for image-to-image, and control that. So we're going to draw here a little bit more to kind of make a field. Then we'll go ahead and do blue for the background; we're just going to fill the background in so we have a sky here. Okay, great, you can see how it basically matches up with our drawing. And then we're going to choose this for the mountain.

02:06
Now, one thing to note is that each time you release the button it's going to do another request, and it throws it into a queue, so you could end up with tons of requests, and each one of them has to be filled on the back end. When the image generates, it will get passed to this other canvas. So what I would like to do instead is, when multiple requests are in the queue and they've come from this brush canvas, only use the latest request. That's an enhancement I'll be adding later; it should speed this up a bit. If you have a fast video card it should run fairly quickly regardless.

02:51
Here I'm going to add a little bit of snow to the top of the mountain, and then we will add, let's go ahead and add that river, see what we get. It's kind of cool. Of course this looks terrible, but we could work on our prompt or add more to our drawing here on the left, and it will slowly start to look a little bit better. We could add some clouds, maybe a couple of clouds in the sky.