GEN-3: The Ultimate Prompting Guide

Theoretically Media
1 Jul 2024 11:54

Summary

TLDR: In this video, Tim explores the advancements of Runway ML's Gen 3, a significant upgrade from its popular predecessor. He provides an in-depth guide to effective prompting for Gen 3, sharing his research, testing, and insights. Tim demonstrates the AI's capabilities through various examples, highlighting improvements in video generation and the importance of descriptive prompts. He also discusses the model's adherence to prompts, its ability to handle text, and the potential for future features like image-to-video. The video concludes with an invitation for viewers to share their findings and favorite prompts.

Takeaways

  • 🚀 Runway ML's Gen 3 is a significant upgrade from the popular Gen 2 model, marking a step forward in the 2.0 era of AI video generation.
  • 🔍 The presenter has extensively researched, tested, and studied Gen 3 to provide an in-depth guide on how to use it effectively.
  • 🎥 A comparison between Gen 2 and Gen 3 showcases the advancements in video quality and AI's ability to create more realistic and coherent scenes.
  • 📝 Gen 3 allows for more descriptive prompting, moving away from spamming keywords to a style that is more narrative and detailed.
  • 🌟 The importance of structuring prompts effectively is highlighted, with examples showing how additional details can vastly improve the outcome.
  • 🎨 The script emphasizes the value of borrowing successful prompt structures from others in the community during the learning phase of the new model.
  • 🔑 Keywords associated with subjects, actions, settings, and shots are identified as essential elements to include in prompts for optimal results.
  • 🌈 The inclusion of style keywords, such as 'cinematic' or 'IMAX', can enhance the overall look of the generated video, as demonstrated with examples.
  • 🔄 Gen 3's adherence to prompts is so strong that it may introduce cuts or dissolves to fulfill the user's request, even if it results in odd outcomes.
  • 🔄 The presenter suggests reusing seeds from successful generations and making minor adjustments to maintain a consistent style while exploring variations.
  • 🤖 Gen 3 can also render text within video, as shown by community examples that mimic popular culture visuals like the MCU opening sequence.
  • 🚫 The script mentions encountering content system restrictions when using certain keywords, suggesting the need for creativity to navigate these limitations.
  • ⏱ Gen 3 performs well with time-lapse prompts, effectively showing transitions at different intervals, as demonstrated in the provided examples.
  • 📊 The presenter encourages users to rate their outputs to contribute to the improvement of Gen 3, which is still in its alpha phase.

Q & A

  • What is the main topic of the video transcript?

    -The main topic of the video transcript is the introduction and exploration of Runway ML's Gen 3 model, a successor to the Gen 2 model, and an ultimate prompting guide for using Gen 3 to create AI videos.

  • What significant change is mentioned in the video about Gen 3 compared to Gen 2?

    -Gen 3 is described as a significant step forward that allows for more descriptive prompting, less focused on spamming keywords, and better adherence to the user's prompt, which is a change from Gen 2.

  • What does the speaker do to demonstrate the progress made with Gen 3?

    -The speaker revamps a previous Gen 2 video with Gen 3 to showcase the improvements in AI video generation, highlighting the advancements made in a short amount of time.

  • What are some of the key elements that should be included in a prompt for Gen 3 according to the video?

    -Key elements for a prompt in Gen 3 include the subject, action, setting, shot type, and style, which help to maximize the generation of the desired video output.

  • How does the speaker suggest using keywords in prompts for Gen 3?

    -The speaker suggests incorporating keywords associated with subject, action, setting, shot, and style into the prompt, but also emphasizes the importance of descriptive prompting over just keyword spamming.

  • What is the purpose of the PDF mentioned in the video?

    -The PDF is a resource that includes a list of shot terms, prompts, and additional information to help users experiment with Gen 3 and improve their AI video generation.

  • What is an example of a prompt structure improvement suggested in the video?

    -An example of a prompt structure improvement is changing a simple prompt to a more descriptive one, such as 'long shot, in the distance a man in black robes calmly walks across a vast desert wasteland, the camera orbits to reveal a gunslinger watching him with steely resolve'.

  • What does the speaker mean by 'prompt spelunking'?

    -'Prompt spelunking' refers to the process of experimenting with different prompts to see what kind of AI video outputs can be generated, learning and iterating based on the results.

  • How does Gen 3 handle situations where it can't fulfill a specific part of the prompt?

    -If Gen 3 can't fulfill a specific part of the prompt, it often puts a cut or dissolve in the video to try and accomplish the mission set by the prompt.

  • What is the speaker's approach to maintaining the overall look of a generated video when iterating on a prompt?

    -The speaker suggests reusing the original seed and adjusting the prompt while keeping the seed constant to maintain the overall look of the generated video.

  • What is the potential issue with using certain keywords in Gen 3 prompts as mentioned in the video?

    -The potential issue is that using certain keywords, like those associated with copyrighted content like 'James Bond' or 'MCU', might trigger content systems and result in errors or refusal to generate the video.

  • What feature of Gen 3 is still in the alpha phase and expected to improve?

    -The Gen 3 model itself is in the alpha phase, and the speaker expects it to improve over time based on user feedback and ratings of the outputs.

  • What is the speaker's final call to action for the viewers of the video?

    -The speaker encourages viewers to share their findings and favorite prompts in the comments section of the video to contribute to the collective exploration and understanding of Gen 3's capabilities.

Outlines

00:00

🚀 Introduction to Runway ML Gen 3

The video script introduces Runway ML's Gen 3 model, a significant advancement over its predecessor, marking the beginning of a new era in AI video generation. The narrator has spent considerable time researching, testing, and studying Gen 3 to provide an ultimate prompting guide. The script showcases the evolution of AI video capabilities by comparing a first-look Gen 2 video from April 2023 to a revamped version made with Gen 3. It highlights the more descriptive prompting style of Gen 3, which allows for more detailed instructions and less reliance on keyword spamming. The narrator also emphasizes the importance of structuring prompts effectively to achieve better results, such as including subject, action, setting, shot, and style elements.

05:01

🔍 Exploring Prompting Techniques in Gen 3

This paragraph delves into the intricacies of prompting Gen 3, discussing the adherence of the model to user prompts and the creative results that can be achieved with descriptive and structured prompts. The narrator shares examples of prompts and the corresponding AI-generated videos, noting occasional morphing issues and the model's tendency to insert cuts or dissolves when it cannot fulfill a specific request. The paragraph also touches on the use of the word 'suddenly' to create dramatic transitions and the potential for reusing seeds from successful generations to maintain a consistent style. The narrator encourages experimentation with prompts and iterates on the idea of using keywords associated with different aspects of the video to enhance the generation process.

10:02

🎨 Creative Prompting and Community Insights

The script explores creative prompting techniques and insights from the community, demonstrating how specific words like 'suddenly' can elicit interesting video results. It also discusses Gen 3's ability to render text in video, showcasing a Marvel Cinematic Universe-style opening, followed by a community prompt about a miniature civilization living on the pages of an ancient scroll. The narrator shares personal experiences with creating a music video for Radiohead's 'Exit Music (For a Film)' using AI video and mentions the limitations of the model, such as its inability to generate videos from actual script pages. The paragraph concludes with a call to rate outputs to help improve the Gen 3 model and anticipation for future features like image-to-video capabilities.

Keywords

💡Runway ML Gen 3

Runway ML Gen 3 refers to the third generation of Runway ML's AI video generation model. It is a significant update to its predecessor, Gen 2, and represents a step forward in AI video generation technology. The video discusses the advancements and new features of Gen 3, highlighting its ability to follow more complex and descriptive prompts.

💡Prompting

In the context of AI video generation, 'prompting' is the process of providing the AI with a textual description or instructions to guide the creation of a video. The video emphasizes the importance of effective prompting for Gen 3, explaining that it allows for more descriptive and less keyword-focused prompts, which can lead to more accurate and detailed video outputs.

💡Descriptive Prompting

Descriptive prompting is a method of giving the AI detailed descriptions in the prompts rather than just a list of keywords. The video illustrates how Gen 3 can interpret these descriptive prompts to create more nuanced and contextually accurate video scenes, as opposed to the older, more keyword-focused approach.

💡Morphing Issues

Morphing issues refer to the visual anomalies that can occur in AI-generated videos where elements of the scene change unexpectedly or unrealistically. The video script mentions an example where a character suddenly has an umbrella, highlighting the challenges and areas for improvement in Gen 3's video generation capabilities.

💡Prompt Structuring

Prompt structuring is the organization of the information in a prompt to optimize the AI's understanding and the resulting video output. The video discusses how adding details and structuring the prompt can lead to improved video results, such as specifying the shot type, subject, action, setting, and style.
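The five buckets described above (subject, action, setting, shot, style) can be sketched as a small prompt builder. This is only an illustration under stated assumptions: the `ShotPrompt` class and its `render` method are hypothetical names invented here, not part of Runway or the video, and the output is simply text to paste into the prompt box.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """Hypothetical container for the five prompt buckets named in the video."""
    subject: str
    action: str
    setting: str
    shot: str = ""
    style: str = ""

    def render(self, order=("shot", "subject", "action", "setting", "style")):
        # Join the non-empty buckets in the requested order.
        parts = [getattr(self, name).strip() for name in order]
        return ", ".join(part for part in parts if part)


# Example values adapted from the desert prompt discussed in the video.
prompt = ShotPrompt(
    subject="a man in black robes",
    action="calmly walks",
    setting="across a vast desert wasteland",
    shot="long shot",
    style="cinematic, IMAX",
)
print(prompt.render())
# Swapping the order is an easy experiment, as the video suggests.
print(prompt.render(order=("subject", "action", "setting", "shot", "style")))
```

The `order` argument reflects the video's point that there is no right or wrong ordering: the same buckets can be rearranged and compared across generations.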

💡Shot Terms

Shot terms are specific types of camera shots used in filmmaking, such as 'wide angle,' 'close-up,' and 'long shot.' The video script provides examples of how including shot terms in the prompt can help Gen 3 understand the desired framing and perspective for a scene in the generated video.

💡Style

In the context of video generation, 'style' refers to the aesthetic or visual approach to how the scene is presented, such as 'cinematic,' 'IMAX,' or 'film noir.' The video explains how specifying style in the prompt can influence the overall look and feel of the generated video, as demonstrated with the 'IMAX' example.

💡Seed

In AI video generation, a 'seed' is a value used to initialize the random number generator, which influences the output's uniqueness. The video shows that rerunning a prompt with a new seed can change the result considerably, whereas copying the seed from a generation you like into the settings and then adjusting the prompt keeps the overall look, though not the exact output. This allows for iteration and experimentation in a consistent style.
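One way to keep track of this workflow is a small local log of prompts and seeds. The sketch below is an assumption-laden illustration: `Generation` and `iterate` are hypothetical names, the seed value is made up, and nothing here talks to Runway. In the video the seed is copied by hand from a finished generation and pasted into the settings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Generation:
    """Hypothetical local record of one generation (not a Runway object)."""
    prompt: str
    seed: int  # copied by hand from the generation you liked


def iterate(liked: Generation, new_prompt: str) -> Generation:
    # Keep the seed fixed and change only the prompt text.
    return replace(liked, prompt=new_prompt)


liked = Generation(
    prompt="A cyberpunk woman holding a katana strides confidently "
           "down a neon-lit street in a futuristic city",
    seed=1234567890,  # made-up value for illustration
)
variation = iterate(liked, liked.prompt + ", heavy rain")
print(variation.seed == liked.seed)
```

Keeping the record immutable makes it harder to overwrite the seed of a generation you wanted to return to.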

💡Community Ideas

Community ideas refer to the shared knowledge and creative input from the user community of a tool or platform. The video mentions leveraging community ideas, such as using the word 'suddenly' in prompts, to enhance the AI's video generation capabilities and achieve more dynamic and interesting results.
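The 'suddenly' idea can be sketched as a two-beat prompt template. The helper name `sudden_transition` is hypothetical, and the sentence layout is just one plausible way to phrase it; the example text is adapted from the rain-over-a-city prompt in the video.

```python
def sudden_transition(opening: str, reveal: str, style: str = "") -> str:
    """Join two beats of a shot with the word 'Suddenly' (hypothetical helper)."""
    beats = [
        opening.strip().rstrip("."),
        "Suddenly, " + reveal.strip().rstrip("."),
    ]
    if style.strip():
        # Optional style keywords go last, as in the video's example.
        beats.append(style.strip().rstrip("."))
    return ". ".join(beats) + "."


print(sudden_transition(
    opening="Rain falling over a city",
    reveal="we fast zoom down to the city street and enter the POV "
           "of someone running into a coffee shop",
    style="Cinematic, moody, dark atmosphere",
))
```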

💡Time Lapses

Time lapses are a filmmaking technique where time is condensed, showing hours or days passing in a matter of seconds. The video script notes that Gen 3 does a good job with time lapses, particularly when they involve different intervals, as demonstrated in the example of a woman sitting by a window with the days turning rapidly into night.

💡Alpha Phase

The 'alpha phase' refers to the early stages of development for a product or software, where it is still being tested and refined. The video mentions that Gen 3 is in its alpha phase, which means it is a work in progress and will continue to evolve based on user feedback and technological advancements.

Highlights

Introduction of Runway ML's Gen 3, a significant advancement in AI video generation.

Comparison of an early Gen 2 video with a Gen 3 remake, showcasing the gains in coherence and visual quality.

The new prompting style in Gen 3 allows for more descriptive prompts rather than keyword spamming.

Example of improved results with additional descriptive details in the prompt.

Importance of including subject, action, setting, and shot in the prompt for optimal results.

Availability of a free PDF with shot terms and prompts for Gen 3 experimentation.

The use of 'suddenly' in prompts can lead to interesting and dynamic video results.

Gen 3's ability to render text in video, such as 3D lettering revealed as the camera zooms out from flipping comic book pages.

The challenge of mimicking specific styles or franchises like the MCU opening in Gen 3.

Exploring community ideas and experimenting with prompts for creative video generation.

The potential of Gen 3 to create time-lapse videos with different intervals.

The limitations of Gen 3 in handling script pages for video generation compared to Dream Factory.

The importance of rating outputs to help improve Gen 3 during its alpha phase.

Speculations about future features in Gen 3, such as image-to-video capabilities.

Encouragement for the community to share findings and favorite prompts for collaborative exploration.

The presenter's anticipation for further experimentation and 'prompt spelunking' with the audience.

Transcripts

play00:00

so Runway ml's gen 3 has arrived this is

play00:03

obviously a successor to their wildly

play00:06

popular Gen 2 model this is a really

play00:09

significant step forward and really does

play00:11

cement Us in this new 2.0 era of AI

play00:15

video today we're going to dive into

play00:16

kind of an ultimate prompting guide for

play00:19

Gen 3 I have spent the last few days

play00:21

like really going on a deep dive of

play00:22

researching it testing it and studying

play00:25

it and I'm going to pass all of that

play00:26

along to you okay lot to cover let's get

play00:29

started

play00:30

briefly I did just want to take a quick

play00:32

moment to showcase exactly how far we've

play00:34

come back on April 26th of

play00:37

2023 uh I posted a first look at Gen 2

play00:41

which at the time was only text to video

play00:44

and uh strung some scenes together and

play00:45

made this

play00:56

[Music]

play01:01

it's charming warpy and morphy uh but

play01:03

this morning I just decided to revamp it

play01:06

for a V2 with Gen 3 and we got

play01:10

[Music]

play01:20

[Music]

play01:26

this so yeah we have come a long way in

play01:29

a very short amount of time let's

play01:31

start looking at prompting in gen 3

play01:33

which is much more akin to the modern

play01:37

style of prompting it allows you to be a

play01:39

lot more descriptive in your prompt and

play01:41

you know less focused on spamming

play01:43

keywords so for example giving gen 3 The

play01:46

Prompt the man in Black fled across the

play01:48

desert and the Gunslinger followed Dark

play01:50

Tower fans that one's for you we end up

play01:52

getting this shot which is not bad I

play01:54

mean there are some morphing issues like

play01:56

suddenly the man in Black has an

play01:57

umbrella here uh and you'll see in one

play01:59

second that the Gunslinger followed

play02:01

prompt ends up giving us this I mean

play02:03

he's not the worst Gunslinger ever but

play02:05

he also does kind of look like he you

play02:07

know was an extra in a film Noir movie

play02:08

and kind of wandered onto the wrong set

play02:10

but with some additional details into

play02:12

the prompt and some prompt structuring

play02:14

uh we end up with a shot like this which

play02:16

is obviously vastly improved there are

play02:18

uh you know a handful of problems here

play02:20

we'll talk about that in one second uh

play02:22

but the prompt here is long shot in the

play02:24

distance a man in Black robes calmly

play02:27

walks across a vast desert Wasteland the

play02:29

camera orbits to reveal a gunslinger

play02:31

watching him with steely resolve to note I

play02:33

did crib that orange and red color

play02:35

grading look uh part of the prompt from

play02:38

Nicolas Neubert which in these early

play02:40

days I think is actually really

play02:41

important to do as we're all learning

play02:43

what this model is capable of I'll shout

play02:46

out everyone and have links to profiles

play02:48

down below I don't necessarily think

play02:50

that the keyword version of prompting is

play02:52

necessarily better than the more

play02:54

descriptive version of prompting but I

play02:56

do think that there are certain buckets

play02:58

that you should probably hit if you're

play03:00

looking to maximize your generation so

play03:04

while you can go about writing your

play03:05

prompt in whatever fashion you like I do

play03:07

think that it's helpful to have keywords

play03:10

that are associated with these sections

play03:13

built into your prompt obviously first

play03:16

your subject your person place or thing

play03:18

whatever you are focusing on in your

play03:21

shot second would be the action that

play03:23

that thing is taking uh is it walking is

play03:26

it dancing is it staring intently uh I

play03:29

do note that that adjectives do work

play03:31

well here so uh angrily walking or

play03:34

dancing happily setting obviously refers

play03:36

to your location a castle a busy City

play03:39

street or a Dusty motel that said I do

play03:41

think that you can buy a little bit more

play03:43

if you attach kind of mood

play03:45

characteristics to it as well such as

play03:48

you know dark stormy clouds or a bright

play03:50

sunny day shot obviously refers to

play03:52

things like wide angle closeup and long

play03:54

shot I do have a list of shot terms that

play03:56

you can try out in gen 3 you don't need

play03:58

to worry about like screenshotting this

play04:00

or anything this is all available along

play04:02

with a number of prompts that we're

play04:03

looking at today in a PDF over on

play04:06

gumroad it is completely free if you

play04:09

know you see the little cost thing just

play04:11

put zero in there and you can download

play04:12

it although you are always welcome to

play04:14

leave a donation it is always highly

play04:16

appreciated once again just to reiterate

play04:18

there is no right or wrong way to prompt

play04:21

uh do feel free to swap the order around

play04:23

on these things put your shot first your

play04:25

subject uh in the middle see what

play04:28

happens just experiment and iterate

play04:30

rounding out with style this kind of

play04:31

reinforces the overall look that you're

play04:34

going for uh things like cinematic film

play04:36

I have noticed that calling out IMAX

play04:38

actually does seem to play a part for

play04:42

example taking this shot which is a

play04:44

woman striding through a misty Forest

play04:45

wearing a leather jacket um we end up

play04:47

with a result that looks like this

play04:49

however by adding in a keyword of IMAX

play04:52

uh we end up with this as a result uh

play04:55

which definitely looks a lot better um

play04:58

as a note I did prompt monster in the

play05:00

background as well with this generation

play05:02

we ended up with kind of like that weird

play05:04

dissolve that's definitely something

play05:06

that I've noticed that gen 3 does it

play05:08

tries very hard to adhere to your prompt

play05:10

to the point where if it can't do

play05:12

something it will often kind of put a

play05:14

cut or dissolve in there in order to

play05:17

accomplish the mission uh for example in

play05:19

this shot where I prompted a woman's

play05:21

green eye in a macro shot the camera

play05:23

pulls out to reveal the interior of an

play05:25

industrial spaceship muted cold

play05:27

atmosphere in the style of a modern

play05:28

Blockbuster um it gives us this so we

play05:32

get the macro shot of the eye but as we

play05:34

pull out we end up dissolving to our

play05:37

woman she also morphs and uh turns into

play05:40

like walking away from us but you know

play05:42

whatever it's still AI video you're

play05:44

still going to end up with weird stuff

play05:46

that said uh one trick if you do run

play05:48

across a generation that you like and

play05:49

you kind of want to iterate on that uh

play05:52

like for example here we have a cyber

play05:53

Punk woman holding a katana strides

play05:56

confidently down a neon lit Street in a

play05:58

futuristic city what we can do is if we

play06:00

just end up rerolling this prompt again

play06:03

so rerunning our prompt we end up with

play06:05

this and she looks way scarier than our

play06:07

first generation so if that is not the

play06:09

direction that you're aiming for one of

play06:10

the things that you can do is come back

play06:12

over to your seed here and just copy

play06:15

that hit reuse prompt and come down to

play06:17

settings over here and just change the

play06:19

seed over to that Original Seed and then

play06:23

generate from there now you're not going

play06:24

to end up with the same output and

play06:26

really what would be the point in that

play06:27

but at least stylistically it will

play06:29

maintain kind of the overall look I

play06:32

ended up utilizing this method to create

play06:35

a music video for Radiohead's Exit Music

play06:38

for a film that's posted over on X I

play06:40

can't post it here because of well

play06:41

copyright stuff um but as you can see

play06:44

here I managed to pretty much maintain

play06:47

the overall look of this idea of these

play06:51

people kind of uh disconnected walking

play06:54

through uh you know an apocalypse scene

play06:56

obviously a lot of layers here we have

play06:58

an AI video for a song called exit music

play07:01

for a film off of an album called OK

play07:04

Computer yeah I can be clever sometimes

play07:06

exploring more with Community ideas Tom

play07:08

Blake notes that the word suddenly seems

play07:11

to do some pretty interesting stuff when

play07:13

you use that in your prompt so I took

play07:15

that idea and tried out rain falling

play07:17

over a city suddenly we fast Zoom down

play07:20

to the city street and enter the POV of

play07:22

someone running into a coffee shop uh

play07:25

cinematic Moody dark atmosphere uh the

play07:27

result is I think pretty cool um we

play07:31

definitely got most of what we asked for

play07:34

here Although our POV person obviously

play07:36

did not make it into the coffee shop in

play07:38

this shot and is still stuck outside in

play07:39

the rain also those storm clouds are

play07:41

super intense gen 3 can also do text and

play07:45

one of the coolest examples that I've

play07:46

seen of this uh Blain Brown put together

play07:49

kind of in trying to mimic the MCU

play07:53

Marvel opening um yeah that is that is

play07:56

super impressive so the prompt here is a

play07:59

close-up of superhero comic book pages

play08:01

flipping with narrow depth of field on a

play08:03

wooden table as the camera zooms out the

play08:06

words uh bzen in 3D letters is revealed

play08:10

with several superhero comic panels on

play08:12

the word as textures uh you can read the

play08:15

rest and actually is all in the PDF as

play08:18

well now to note I did try to run that

play08:21

same prompt only swapping out uh

play08:23

Blaine's name for mine and I was getting

play08:26

errors I don't know if like the

play08:28

keywords of Marvel MCU ended up triggering

play08:32

some kind of content system I have been

play08:34

running into some weird issues with that

play08:37

by like name dropping things like James

play08:39

Bond and you know gen 3 being like I

play08:41

ain't doing that but I mean there are

play08:42

always ways around that for example uh I

play08:45

saw this generation put together by

play08:47

Heather Cooper which I thought was just

play08:49

super super cool The Prompt here is a

play08:52

miniature civilization living on the

play08:53

pages of an ancient scroll building tiny

play08:55

castles pyramids and cities from letters

play08:58

and paragraphs as the pages unroll the

play09:00

buildings become colorful and because

play09:02

today's Monday and I watched House of

play09:03

the Dragon last night uh I saw that and

play09:06

instantly thought about the old Game of

play09:08

Thrones intro uh decided to try to

play09:11

recreate that using Heather's prompt as

play09:14

kind of a base prompt so uh I ended up

play09:16

changing it to a miniature civilization

play09:18

living on a medieval fantasy map

play09:20

building tiny castles and cities from a

play09:22

map in a 3D rendered style and yeah I

play09:25

mean it's not uh Winterfell but you know

play09:29

kind of in the neighborhood admittedly

play09:31

no matter how much I experimented with

play09:32

the prompting I couldn't get like the

play09:34

buildings to sort of uh build in a time

play09:37

lapse as it does in the actual title

play09:39

sequence but that's okay I mean this is

play09:41

its own thing and again as much as I'm

play09:43

talking about like prompt formatting and

play09:45

keywords and all of that stuff uh you

play09:47

know sometimes you can just type in

play09:49

something stupid and get something

play09:50

awesome as always generating did here uh

play09:53

with a puppet talking to a man who does

play09:55

not want to be talking to a puppet I

play09:57

mean that is comedy gold what it cannot do is

play10:00

take actual script pages and make video

play10:02

out of that uh as we did in the Dream

play10:04

Factory video I took a portion from The

play10:06

Dark Knight screenplay this is the scene

play10:08

in the opening where you know the uh two

play10:10

henchmen zip line across the building uh

play10:13

running that we end up with uh with this

play10:16

um I mean this is pretty hilar yeah

play10:18

that's that's pretty amazing um you know

play10:20

what this reminds me of is uh if you

play10:22

ever saw Be Kind Rewind um you know the

play10:25

film with Mos Def and Jack Black where

play10:27

they're recreating famous movie scenes

play10:28

with like a VHS camcorder yeah that is

play10:31

this aoha AI notes that it does a really

play10:34

good job with time lapses particularly

play10:36

time lapses that are happening at two

play10:38

different intervals uh The Prompt here

play10:40

is a woman sits staring through a window

play10:42

outside the days turn rapidly into night

play10:45

at 100x so again not the longest most

play10:48

complex prompt but you know it got the

play10:49

desired result one thing that I do think

play10:51

would be very helpful is to make sure

play10:53

that you rate your outputs remember this

play10:55

is Gen 3 Alpha it is still very much in

play10:58

the alpha phase and the model will only

play11:00

continue to improve I mean just think

play11:03

back to that first gen 2 output we've

play11:06

also got a lot of exciting steps to come

play11:08

like image to video in gen 3 uh there is

play11:11

some question about how something like a

play11:13

motion brush will work in gen3 I'm not

play11:16

entirely sure I kind of suspect that it

play11:20

might look a little something like

play11:22

Boximator which we looked at a while back

play11:25

no inside information there that is just

play11:27

me speculating so overall there is a lot

play11:29

of exploring to do with this new model

play11:32

and I'm really looking forward to going

play11:34

like prompt spelunking with you so please

play11:36

do drop your findings and your favorite

play11:38

prompts in the comments down below thank

play11:41

you very much for watching my name is

play11:42

Tim

Related Tags
AI Video, Gen 3, Prompting Guide, Content Creation, Video Evolution, Creative AI, Artificial Intelligence, Tech Review, Innovation Trends, AI Art