GEN-3: The Ultimate Prompting Guide
Summary
TLDR: In this video, Tim explores the advancements of Runway ML's Gen 3, a significant upgrade from its popular predecessor. He provides an in-depth guide to effective prompting for Gen 3, sharing his research, testing, and insights. Tim demonstrates the AI's capabilities through various examples, highlighting improvements in video generation and the importance of descriptive prompts. He also discusses the model's adherence to prompts, its ability to handle text, and the potential for future features like image-to-video. The video concludes with an invitation for viewers to share their findings and favorite prompts.
Takeaways
- 🚀 Runway ML's Gen 3 is a significant upgrade from the popular Gen 2 model, marking a step forward in the 2.0 era of AI video generation.
- 🔍 The presenter has extensively researched, tested, and studied Gen 3 to provide an in-depth guide on how to use it effectively.
- 🎥 A comparison between Gen 2 and Gen 3 showcases the advancements in video quality and AI's ability to create more realistic and coherent scenes.
- 📝 Gen 3 allows for more descriptive prompting, moving away from spamming keywords to a style that is more narrative and detailed.
- 🌟 The importance of structuring prompts effectively is highlighted, with examples showing how additional details can vastly improve the outcome.
- 🎨 The script emphasizes the value of borrowing successful prompt structures from others in the community during the learning phase of the new model.
- 🔑 Keywords associated with subjects, actions, settings, and shots are identified as essential elements to include in prompts for optimal results.
- 🌈 The inclusion of style keywords, such as 'cinematic' or 'IMAX', can enhance the overall look of the generated video, as demonstrated with examples.
- 🔄 Gen 3's adherence to prompts is so strong that it may introduce cuts or dissolves to fulfill the user's request, even if it results in odd outcomes.
- 🔄 The presenter suggests reusing seeds from successful generations and making minor adjustments to maintain a consistent style while exploring variations.
- 🤖 Gen 3 can also render legible text inside a video, as shown by a community example that mimics the MCU opening sequence with 3D letters revealed among flipping comic book pages.
- 🚫 The script mentions encountering content system restrictions when using certain keywords, suggesting the need for creativity to navigate these limitations.
- ⏱ Gen 3 performs well with time-lapse prompts, effectively showing transitions at different intervals, as demonstrated in the provided examples.
- 📊 The presenter encourages users to rate their outputs to contribute to the improvement of Gen 3, which is still in its alpha phase.
Q & A
What is the main topic of the video transcript?
-The main topic of the video transcript is the introduction and exploration of Runway ML's Gen 3 model, a successor to the Gen 2 model, and an ultimate prompting guide for using Gen 3 to create AI videos.
What significant change is mentioned in the video about Gen 3 compared to Gen 2?
-Gen 3 is described as a significant step forward that allows for more descriptive prompting, less focused on spamming keywords, and better adherence to the user's prompt, which is a change from Gen 2.
What does the speaker do to demonstrate the progress made with Gen 3?
-The speaker revamps a previous Gen 2 video with Gen 3 to showcase the improvements in AI video generation, highlighting the advancements made in a short amount of time.
What are some of the key elements that should be included in a prompt for Gen 3 according to the video?
-Key elements for a prompt in Gen 3 include the subject, action, setting, shot type, and style, which help to maximize the generation of the desired video output.
How does the speaker suggest using keywords in prompts for Gen 3?
-The speaker suggests incorporating keywords associated with subject, action, setting, shot, and style into the prompt, but also emphasizes the importance of descriptive prompting over just keyword spamming.
What is the purpose of the PDF mentioned in the video?
-The PDF is a resource that includes a list of shot terms, prompts, and additional information to help users experiment with Gen 3 and improve their AI video generation.
What is an example of a prompt structure improvement suggested in the video?
-An example of a prompt structure improvement is changing a simple prompt to a more descriptive one, such as 'long shot in the distance a man in Black robes calmly walks across a vast desert Wasteland, the camera orbits to reveal a gunslinger watching him with steely eyes'.
What does the speaker mean by 'prompt spelunking'?
-'Prompt spelunking' refers to the process of experimenting with different prompts to see what kind of AI video outputs can be generated, learning and iterating based on the results.
How does Gen 3 handle situations where it can't fulfill a specific part of the prompt?
-If Gen 3 can't fulfill a specific part of the prompt, it often puts a cut or dissolve in the video to try and accomplish the mission set by the prompt.
What is the speaker's approach to maintaining the overall look of a generated video when iterating on a prompt?
-The speaker suggests reusing the original seed and adjusting the prompt while keeping the seed constant to maintain the overall look of the generated video.
What is the potential issue with using certain keywords in Gen 3 prompts as mentioned in the video?
-The potential issue is that using certain keywords, like those associated with copyrighted content like 'James Bond' or 'MCU', might trigger content systems and result in errors or refusal to generate the video.
What feature of Gen 3 is still in the alpha phase and expected to improve?
-The Gen 3 model itself is in the alpha phase, and the speaker expects it to improve over time based on user feedback and ratings of the outputs.
What is the speaker's final call to action for the viewers of the video?
-The speaker encourages viewers to share their findings and favorite prompts in the comments section of the video to contribute to the collective exploration and understanding of Gen 3's capabilities.
Outlines
🚀 Introduction to Runway ML Gen 3
The video introduces Runway ML's Gen 3, a significant advancement over its predecessor that marks the beginning of a new era in AI video generation. The narrator has spent several days researching, testing, and studying Gen 3 to provide an ultimate prompting guide. The video showcases the evolution of AI video capabilities by comparing an early Gen 2 video from April 2023 with a revamped version made in Gen 3. It highlights the more descriptive prompting style of Gen 3, which allows for more detailed instructions and less reliance on keyword spamming. The narrator also emphasizes the importance of structuring prompts effectively to achieve better results, such as including subject, action, setting, shot, and style elements.
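The five buckets described above (subject, action, setting, shot, style) can be sketched as a small prompt builder. This is only an illustration of the structure, not a Runway tool: the `ShotPrompt` class, its field names, and the `render` method are hypothetical names introduced here, and the comma-joined output is just one reasonable way to assemble the pieces.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """Hypothetical container for the five prompt buckets from the video."""
    subject: str        # person, place, or thing the shot focuses on
    action: str         # what the subject is doing, ideally with a modifier
    setting: str        # location, optionally with mood characteristics
    shot: str = ""      # e.g. "wide angle", "close-up", "long shot"
    style: str = ""     # e.g. "cinematic", "IMAX"

    def render(self, order=("shot", "subject", "action", "setting", "style")) -> str:
        # Join the non-empty buckets in the requested order.
        # The order is a parameter because the video encourages swapping it around.
        parts = [getattr(self, name).strip() for name in order]
        return ", ".join(p for p in parts if p)


prompt = ShotPrompt(
    subject="a woman wearing a leather jacket",
    action="striding confidently",
    setting="through a misty forest",
    shot="long shot",
    style="cinematic, IMAX",
)
text = prompt.render()
print(text)
```

Passing a different `order` tuple, for example `("subject", "action", "setting")`, is a quick way to try the "swap the order around and see what happens" advice without retyping the whole prompt.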
🔍 Exploring Prompting Techniques in Gen 3
This paragraph delves into the intricacies of prompting Gen 3, discussing the adherence of the model to user prompts and the creative results that can be achieved with descriptive and structured prompts. The narrator shares examples of prompts and the corresponding AI-generated videos, noting occasional morphing issues and the model's tendency to insert cuts or dissolves when it cannot fulfill a specific request. The paragraph also touches on the use of the word 'suddenly' to create dramatic transitions and the potential for reusing seeds from successful generations to maintain a consistent style. The narrator encourages experimentation with prompts and iterates on the idea of using keywords associated with different aspects of the video to enhance the generation process.
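The seed-reuse trick mentioned above can be pictured as keeping one value fixed while changing another. The sketch below models that bookkeeping only; it does not call Runway, and `GenerationRequest`, `vary_prompt`, and the seed value `1234567` are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical record of what gets submitted for one generation."""
    prompt: str
    seed: int


def vary_prompt(base: GenerationRequest, new_prompt: str) -> GenerationRequest:
    # Keep the seed from the generation you liked and change only the prompt,
    # which is the manual "copy seed, reuse prompt, paste seed" workflow.
    return replace(base, prompt=new_prompt)


# A generation whose look you want to keep (the seed value is made up).
liked = GenerationRequest(
    prompt=("a cyberpunk woman holding a katana strides confidently "
            "down a neon-lit street in a futuristic city"),
    seed=1234567,
)

# Small prompt adjustment, same seed.
variation = vary_prompt(liked, liked.prompt + ", rain falling")
print(variation.seed == liked.seed)
```

As the video notes, reusing a seed with a changed prompt is not expected to reproduce the same output; the aim is only to stay close to the overall look.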
🎨 Creative Prompting and Community Insights
The script explores creative prompting techniques and insights from the community, demonstrating how specific words like 'suddenly' can elicit interesting video results. It also discusses the use of text in prompts to generate video, showcasing examples like a Marvel Cinematic Universe-style opening and a miniature civilization living on the pages of an ancient scroll. The narrator shares personal experiences with creating a music video for Radiohead's 'Exit Music (For a Film)' using AI video and mentions the limitations of the model, such as its inability to generate videos from actual script pages. The paragraph concludes with a call to rate outputs to help improve the Gen 3 model and anticipation for future features like image-to-video capabilities.
Keywords
💡Runway ML Gen 3
💡Prompting
💡Descriptive Prompting
💡Morphing Issues
💡Prompt Structuring
💡Shot Terms
💡Style
💡Seed
💡Community Ideas
💡Time Lapses
💡Alpha Phase
Highlights
Introduction of Runway ML's Gen 3, a significant advancement in AI video generation.
Comparison of Gen 2 and Gen 3, showcasing the evolution from text-to-video to more advanced capabilities.
The new prompting style in Gen 3 allows for more descriptive prompts rather than keyword spamming.
Example of improved results with additional descriptive details in the prompt.
Importance of including subject, action, setting, and shot in the prompt for optimal results.
Availability of a free PDF with shot terms and prompts for Gen 3 experimentation.
The use of 'suddenly' in prompts can lead to interesting and dynamic video results.
Gen 3's ability to render text in video, such as 3D letters revealed among flipping comic book pages.
The challenge of mimicking specific styles or franchises like the MCU opening in Gen 3.
Exploring community ideas and experimenting with prompts for creative video generation.
The potential of Gen 3 to create time-lapse videos with different intervals.
The limitations of Gen 3 in handling script pages for video generation compared to Dream Factory.
The importance of rating outputs to help improve Gen 3 during its alpha phase.
Speculations about future features in Gen 3, such as image-to-video capabilities.
Encouragement for the community to share findings and favorite prompts for collaborative exploration.
The presenter's anticipation for further experimentation and 'prompt spelunking' with the audience.
Transcripts
so Runway ml's gen 3 has arrived this is
obviously a successor to their wildly
popular Gen 2 model this is a really
significant step forward and really does
cement Us in this new 2.0 era of AI
video today we're going to dive into
kind of an ultimate prompting guide for
Gen 3 I have spent the last few days
like really going on a deep dive of
researching it testing it and studying
it and I'm going to pass all of that
along to you okay lot to cover let's get
started
briefly I did just want to take a quick
moment to showcase exactly how far we've
come back on April 26th of
2023 uh I posted a first look at Gen 2
which at the time was only text to video
and uh strung some scenes together and
made this
[Music]
it's charming warpy and morphy uh but
this morning I just decided to revamp it
for a V2 with Gen 3 and we got
[Music]
[Music]
this so yeah we have come a long way in
a very short amount of time let's
start looking at prompting in gen 3
which is much more akin to the modern
style of prompting it allows you to be a
lot more descriptive in your prompt and
you know less focused on spamming
keywords so for example giving gen 3 The
Prompt the man in Black fled across the
desert and the Gunslinger followed Dark
Tower fans that one's for you we end up
getting this shot which is not bad I
mean there are some morphing issues like
suddenly the man in Black has an
umbrella here uh and you'll see in one
second that the Gunslinger followed
prompt ends up giving us this I mean
he's not the worst Gunslinger ever but
he also does kind of look like he you
know was an extra in a film Noir movie
and kind of wandered onto the wrong set
but with some additional details into
the prompt and some prompt structuring
uh we end up with a shot like this which
is obviously vastly improved there are
uh you know a handful of problems here
we'll talk about that in one second uh
but the prompt here is long shot in the
distance a man in Black robes calmly
walks across a vast desert Wasteland the
camera orbits to reveal a gunslinger
watching him with steel result to note I
did crib that orange and red color
grading look uh part of the prompt from
Nicolas Neubert which in these early
days I think is actually really
important to do as we're all learning
what this model is capable of I'll shout
out everyone and have links to profiles
down below I don't necessarily think
that the keyword version of prompting is
necessarily better than the more
descriptive version of prompting but I
do think that there are certain buckets
that you should probably hit if you're
looking to maximize your generation so
while you can go about writing your
prompt in whatever fashion you like I do
think that it's helpful to have keywords
that are associated with these sections
built into your prompt obviously first
your subject your person place or thing
whatever you are focusing on in your
shot second would be the action that
that thing is taking uh is it walking is
it dancing is it staring intently uh I
do note that adjectives do work
well here so uh angrily walking or
dancing happily setting obviously refers
to your location a castle a busy City
street or a Dusty motel that said I do
think that you can buy a little bit more
if you attach kind of mood
characteristics to it as well such as
you know dark stormy clouds or a bright
sunny day shot obviously refers to
things like wide angle closeup and long
shot I do have a list of shot terms that
you can try out in gen 3 you don't need
to worry about like screenshotting this
or anything this is all available along
with a number of prompts that we're
looking at today in a PDF over on
gumroad it is completely free if you
know you see the little cost thing just
put zero in there and you can download
it although you are always welcome to
leave a donation it is always highly
appreciated once again just to reiterate
there is no right or wrong way to prompt
uh do feel free to swap the order around
on these things put your shot first your
subject uh in the middle see what
happens just experiment and iterate
rounding out with style this kind of
reinforces the overall look that you're
going for uh things like cinematic film
I have noticed that calling out IMAX
actually does seem to play a part for
example taking this shot which is a
woman striding through a misty Forest
wearing a leather jacket um we end up
with a result that looks like this
however by adding in a keyword of IMAX
uh we end up with this as a result uh
which definitely looks a lot better um
as a note I did prompt monster in the
background as well with this generation
we ended up with kind of like that weird
dissolve that's definitely something
that I've noticed that gen 3 does it
tries very hard to adhere to your prompt
to the point where if it can't do
something it will often kind of put a
cut or dissolve in there in order to
accomplish the mission uh for example in
this shot where I prompted a woman's
green eye in a macro shot the camera
pulls out to reveal the interior of an
industrial spaceship muted cold
atmosphere in the style of a modern
Blockbuster um it gives us this so we
get the macro shot of the eye but as we
pull out we end up dissolving to our
woman she also morphs and uh turns into
like walking away from us but you know
whatever it's still AI video you're
still going to end up with weird stuff
that said uh one trick if you do run
across a generation that you like and
you kind of want to iterate on that uh
like for example here we have a
cyberpunk woman holding a katana strides
confidently down a neon lit Street in a
futuristic city what we can do is if we
just end up rerolling this prompt again
so rerunning our prompt we end up with
this and she looks way scarier than our
first generation so if that is not the
direction that you're aiming for one of
the things that you can do is come back
over to your seed here and just copy
that hit reuse prompt and come down to
settings over here and just change the
seed over to that Original Seed and then
generate from there now you're not going
to end up with the same output and
really what would be the point in that
but at least stylistically it will
maintain kind of the overall look I
ended up utilizing this method to create
a music video for Radiohead's Exit Music
(For a Film) that's posted over on X I
can't post it here because of well
copyright stuff um but as you can see
here I managed to pretty much maintain
the overall look of this idea of these
people kind of uh disconnected walking
through uh you know an apocalypse scene
obviously a lot of layers here we have
an AI video for a song called Exit Music
(For a Film) off of an album called OK
Computer yeah I can be clever sometimes
exploring more with Community ideas Tom
Blake notes that the word suddenly seems
to do some pretty interesting stuff when
you use that in your prompt so I took
that idea and tried out rain falling
over a city suddenly we fast Zoom down
to the city street and enter the POV of
someone running into a coffee shop uh
cinematic Moody dark atmosphere uh the
result is I think pretty cool um we
definitely got most of what we asked for
here Although our POV person obviously
did not make it into the coffee shop in
this shot and is still stuck outside in
the rain also those storm clouds are
super intense gen 3 can also do text and
one of the coolest examples that I've
seen of this uh Blain Brown put together
kind of in trying to mimic the MCU
Marvel opening um yeah that is that is
super impressive so the prompt here is a
close-up of superhero comic book pages
flipping with narrow depth of field on a
wooden table as the camera zooms out the
words uh bzen in 3D letters is revealed
with several superhero comic panels on
the word as textures uh you can read the
rest and actually is all in the PDF as
well now to note I did try to run that
same prompt only swapping out uh
Blaine's name for mine and I was getting
errors I don't know if like the
keywords of Marvel MCU ended up triggering
some kind of content system I have been
running into some weird issues with that
by like name dropping things like James
Bond and you know gen 3 being like I
ain't doing that but I mean there are
always ways around that for example uh I
saw this generation put together by
Heather Cooper which I thought was just
super super cool The Prompt here is a
miniature civilization living on the
pages of an ancient scroll building tiny
castles pyramids and cities from letters
and paragraphs as the pages unroll the
buildings become colorful and because
today's Monday and I watched House of
the Dragon last night uh I saw that and
instantly thought about the old Game of
Thrones intro uh decided to try to
recreate that using Heather's prompt as
kind of a base prompt so uh I ended up
changing it to a miniature civilization
living on a medieval fantasy map
building tiny castles and cities from a
map in a 3D rendered style and yeah I
mean it's not uh Winterfell but you know
kind of in the neighborhood admittedly
no matter how much I experimented with
the prompting I couldn't get like the
buildings to sort of uh build in a time
lapse as it does in the actual title
sequence but that's okay I mean this is
its own thing and again as much as I'm
talking about like prompt formatting and
keywords and all of that stuff uh you
know sometimes you can just type in
something stupid and get something
awesome as always generating did here uh
with a puppet talking to a man who does
not want to be talking to a puppet I
mean that is comedy gold what it cannot do is
take actual script pages and make video
out of that uh as we did in the Dream
Factory video I took a portion from The
Dark Knight screenplay this is the scene
in the opening where you know the uh two
henchmen zip line across the building uh
running that we end up with uh with this
um I mean this is pretty hilar yeah
that's that's pretty amazing um you know
what this reminds me of is uh if you
ever saw Be Kind Rewind um you know the
film with Mos Def and Jack Black where
they're recreating famous movie scenes
with like a VHS camcorder yeah that is
this aoha AI notes that it does a really
good job with time lapses particularly
time lapses that are happening at two
different intervals uh The Prompt here
is a woman sits staring through a window
outside the days turn rapidly into night
at 100x so again not the longest most
complex prompt but you know it got the
desired result one thing that I do think
would be very helpful is to make sure
that you rate your outputs remember this
is Gen 3 Alpha it is still very much in
the alpha phase and the model will only
continue to improve I mean just think
back to that first gen 2 output we've
also got a lot of exciting steps to come
like image to video in gen 3 uh there is
some question about how something like a
motion brush will work in gen3 I'm not
entirely sure I kind of suspect that it
might look a little something like
Boximator which we looked at a while back
no inside information there that is just
me speculating so overall there is a lot
of exploring to do with this new model
and I'm really looking forward to going
like prompt spelunking with you so please
do drop your findings and your favorite
prompts in the comments down below thank
you very much for watching my name is
Tim