10 Things About OpenAI SORA You Probably Missed
Summary
TLDR: Sora, the AI video generator released by OpenAI, is revolutionizing video production. It generates video from text prompts, lowering production costs and boosting efficiency. Although Sora's videos currently lack audio and fine-grained editing, combining it with other AI tools could soon make integrated audio-video generation possible. Sora heralds a new era of video production and may have far-reaching effects on the traditional video industry.
Takeaways
- 🎥 Sora is a video generator released by OpenAI on February 15, 2024 that creates video content from text prompts.
- 🔍 The release came with a lot of hype, but digging deeper reveals many capabilities that weren't obvious.
- 🎶 Sora currently generates video only, with no music or sound effects, but ElevenLabs has released an audio generator that can create entire soundscapes from text prompts.
- 💡 Combining audio and video generators will create an entirely new class of audiovisual production tools, dramatically lowering the cost for individuals to produce commercials and similar content.
- 🌟 Sora's genuinely new capabilities include video extension and loop generation, features that may become standard in video editing software.
- 🚀 Sora dramatically lowers the barrier to entry, making high-quality video production far easier and more accessible.
- 📌 Sora cannot yet edit fine details, but as the technology develops, precise control and editing of video details should become possible.
- 📖 From a single text prompt, Sora can generate a complete story, hinting that entire movies or shows may eventually be generated from text alone.
- 🕒 Sora's stage of development resembles GPT-3 before ChatGPT arrived, suggesting AI video technology will advance rapidly and gain many more capabilities within a few years.
- 📷 Sora will change how stock footage and clip libraries are produced, letting individuals and small teams generate custom footage at minimal cost.
- 🌐 The technology could eventually generate entire 3D worlds, with revolutionary implications for video production, game development, and other fields.
Q & A
Which company released Sora?
-Sora was released by OpenAI.
When was Sora released?
-Sora was released on February 15, 2024.
What important element was missing from Sora at launch?
-Audio: all the example videos were silent, with no background music or sound effects.
What did ElevenLabs release to complement Sora's video generation?
-ElevenLabs released a new sound generator that can create entire soundscapes from text prompts.
What is one new Sora capability that wasn't possible before?
-Sora can extend a video, seamlessly generating brand-new footage that transitions into the beginning of an existing clip.
How does Sora change the cost of video production?
-It drastically lowers costs, making high-quality videos that once demanded substantial time and resources quick and simple to produce, even for a single person handling the entire process.
How editable is Sora's output?
-Editability is currently limited; if a client wants a detail changed, the whole scene may need to be regenerated. As the technology matures, more editing tools are likely to emerge to meet that need.
Can Sora generate a story from a single prompt?
-Yes, Sora can generate a complete story from a single text prompt, which shows its potential for content creation.
Where does Sora sit in the development of AI video?
-Sora is to AI video what GPT-3 was to large language models: a powerful tool that still awaits refinement and mass adoption.
What potential impact does Sora have on the video production industry?
-It could change many aspects of the industry, including lowering costs, improving efficiency, creating new content formats, and even affecting how stock footage and licensing are managed and sold.
Can Sora generate 3D worlds?
-Potentially: through a technique called Gaussian splatting, its videos can be converted into 3D models and used further in game engines such as Unity.
Outlines
🤖 The potential and challenges of the Sora AI video generator
This section covers the background of Sora's release and the author Igor's deep dive into it. Sora, released by OpenAI, sparked huge discussion. Igor spent a great deal of time studying Sora's capabilities and potential, noting that audio matters as much as visuals. While Sora's videos have no music or sound effects, ElevenLabs released a sound generator that could be combined with Sora to create full soundscapes. Igor predicts that AI will soon generate background music, sound effects, and character dialogue automatically, transforming the audiovisual production pipeline.
🎥 Sora's new capabilities and cost reduction
This part discusses the genuinely new features Sora brings, such as video extension and loop generation, and how it lowers production costs. Igor notes that Sora can seamlessly extend a clip by generating entirely new frames, which was not possible before. Sora can also create videos that loop endlessly, which could become a new internet meme. Igor also points out that Sora will sharply cut the cost of producing high-quality video, making projects that once required significant manpower and resources easy to realize.
🌐 A new era of video editing and personalized content
Here Igor explores Sora's impact on the video editing industry and how whole stories can be generated from a single text prompt. He notes that Sora and similar AI tools will make editing far simpler, even generating content in specific formats and styles. Igor predicts that as the technology develops, AI will be able to make fine-grained edits and adjustments to video based on user feedback and needs.
📹 Future trends in video production and personalized libraries
In this section Igor discusses Sora's long-term impact on video production, especially the creation of personalized footage libraries. He predicts that video makers will generate specific clips and backgrounds with Sora instead of buying or shooting real footage, dramatically lowering costs, raising efficiency, and making video production more personal and creative.
🌍 3D world building and Sora's limitless possibilities
The final section focuses on Sora's potential for 3D world building and simulation. Igor notes that Sora not only generates lifelike video but that this video can be converted into 3D models as assets for game engines such as Unity. Moreover, a simple text prompt is enough for Sora to generate worlds in a specific style, such as Minecraft. Igor is excited about where this technology is headed, while also finding the possibilities both thrilling and daunting.
Keywords
💡Sora
💡AI video generator
💡Audio generation
💡Video editing
💡Cost reduction
💡Video production
💡Technical report
💡Social media
💡Footage libraries
💡3D world generation
Highlights
Sora, the video generator by OpenAI, was released on February 15, 2024.
Sora currently generates video only; all examples are silent, with no music or sound effects.
ElevenLabs released a new sound generator that creates entire soundscapes from text prompts.
Sora can extend videos, which wasn't possible before: it generates new footage leading into a clip from scratch.
Sora can generate extra frames that let a video loop seamlessly.
Sora dramatically lowers the cost of producing video, which could reshape the video production industry.
Sora's editability is limited; fine details of a generated video cannot be modified yet.
Runway ML introduced a multi motion brush tool that animates only selected parts of a video.
Sora can generate an entire story from a single prompt.
Sora's leap is like jumping straight from GPT-2 to GPT-3, signaling how fast AI video is developing.
Sora could spell the end of stock footage, since it can generate clips at negligible cost.
Sora can generate video in any format, from phone portrait to widescreen.
Sora can be used for 3D and world generation, converting video into 3D models.
Sora's videos are temporally consistent and can be used to create 3D environments and characters.
Sora's release marks a new era of AI video generation and hints at limitless possibilities.
Sora's capabilities and applications are evolving rapidly, with more breakthroughs likely ahead.
Sora's release demos and a very limited interactive demo can be found on OpenAI's official page.
Transcripts
Sora the video generator by OpenAI was
released on February 15th 2024 and I've
spent pretty much every hour of my life
scouring the internet and researching
what else this could do and there's
actually a lot of things that weren't
obvious in the middle of all the hype
that accompanied the release of this AI
video generator I studied the technical
report in detail watched all the YouTube
videos spent an unhealthy amount of time
on Twitter looking for all the
discussions and the little findings
people had matter of fact since release
I didn't even leave the
apartment
if we haven't met yet I'm Igor I made
it my full-time calling to research what
AI has to offer and how to put it to
work in your everyday life and before
doing that with The AI Advantage I had a
video production company that operated
for eight years in Central Europe I
helped clients with everything from
corporate video trainings to directing
smaller commercials and even shooting
festivals nightclub videos when it comes
to videography I've really seen it all
and this stuff is exactly in the middle
between technology and video production
so I can't wait to dive into all of this
all right so without further Ado let's
look at all the implications of Sora
that you might have not been aware of
right away okay so first of all I want
to talk about audio because Sora only
generates video right all the examples we saw
were muted without music or sound
effects in the background and a lot of
people rightfully pointed out that hey
in film it's really 50/50 at the very
least it's 50% visuals and another 50%
audio and there's many layers to that
right you might have the actor's voice
as one track but then there's also sound
effects of things happening around them
and then you have foley which is the
background sound that just persists
you're not really consciously aware of
it but it's there and if it's not there
the shot is missing something so surely
audio must be a complicated issue too
right well not really because ElevenLabs
actually reacted to the Sora release and
they released a new sound generator that
from text prompts is able to generate an
entire soundscape okay so today we don't
have access right but if OpenAI hooked
up Sora to this audio generator you
would have an audiovisual generator
where you create full soundscapes have a
quick listen
and sure a sound designer could do this
manually but again if you're a one-man
show and you're producing a commercial
like I did so so many times you're doing
everything yourself from planning to
recording editing doing the sound design
doing the color grading doing feedback
rounds with the client invoicing and
often times you don't have budget for a
sound designer so you bet that there's
going to be models I don't know if Sora
or others that combine both they're
going to give you audio visual outputs
this is not a question that's just a
straight fact at this point and with
tools like Suno AI out there already that
can generate full songs including lyrics
at a decent quality with AI well you're
going to be able to generate the
background music the background sound
effects the voices that are in the scene
because voice generators are a thing and
they're virtually indistinguishable
already right and now the video
components so we really have the full
stack for audiovisual production it's
just a question of time now and from my
estimate it looks to be months not years
till we'll get there okay my next point
is all about the capabilities of Sora
that are actually brand new because a
lot of the stuff that we saw just
drastically reduce the cost of what it
takes to produce a clip like this or an
animated video like this you might be
aware that movies like this exist right
it just cost a lot of money to produce
this so first of all let's talk about
the things that are actually brand new
and not just a cost reduction although
that has its implications too and we'll
talk about that but the things that are
actually new are first of all you can
extend videos okay so this is
beautifully outlined in a technical
paper here and it shows the example of a
San Francisco subway car so as you can
see this clip is the same in all three
instances but if you back up a little
bit they extended the beginning of it
okay so as you can see the video
generated by Sora is different every
single time and it seamlessly
transitions into the subway car so this
is something that was not possible up
until now okay it generates this video
from scratch now I guess you could argue
that you could recreate this entire
scene in 3D and then create the frames
before that and seamlessly transition
into it but you have to realize that at
a certain point this is going to become
a feature in every editing software
right you'll have just an image and it
will turn it into a video and then you
can extend it to any duration you can
add a clip before add a clip after
you'll be able to turn your old family
photos into Vivid memories sort of that
is really scary but it's going to be a
thing and you bet apps like Instagram at
one point I don't know when are going to
have a feature where you're going to be
able to turn a photo into video and then
extend that indefinitely another new
capability is you're going to be able to
loop videos okay and this is also
something that you could kind of but not
really achieve today definitely not in
this form okay you'll give it a video
clip and it will be generating extra
frames that will seamlessly let the
footage loop I had a good chat with a
friend and we kind of talked about how
this could be the new Rick rolling on
the
internet because if you do this to a
longer clip you just don't realize that
it's looping and that it's just playing
forever so you could send somebody a
clip and it might take them minutes to
realize that the whole thing is looping
and just repeating over and over again
anyway this is something that was
not really possible and some people went
ahead and tried this anyway in
videography there was this whole Trend a
few years back where people were trying
to seamlessly transition one thing into
another like for
example and my shirt is gone magic now
those are the simplest way to do it but
here we will have the capability of
generating brand new frames and things
will be able to loop indefinitely okay
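As a quick aside, the looping idea above can be sketched in a few lines. This is only a naive cross-fade between the tail and the head of a clip, the classic editor's trick, not what Sora does (Sora generates genuinely new in-between frames); frames here are just flat lists of pixel values, and `make_loop` and `overlap` are names I made up for illustration.

```python
def make_loop(frames, overlap):
    """Cross-fade the last `overlap` frames into the first `overlap`
    frames so the clip can repeat without a visible jump.  Each frame
    is a flat list of pixel intensities."""
    if overlap <= 0 or 2 * overlap > len(frames):
        raise ValueError("overlap must fit twice into the clip")
    body = frames[overlap:len(frames) - overlap]   # middle is kept as-is
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)                # weight shifts tail -> head
        tail, head = frames[len(frames) - overlap + i], frames[i]
        blended.append([(1 - w) * t + w * h for t, h in zip(tail, head)])
    # the blend starts where the body ends and fades back toward the body's
    # first frame, so playing the result on repeat hides the seam
    return body + blended
```

The trade-off is that the cross-faded region ghosts two images on top of each other, which is exactly why generating real new frames, as described above, is the interesting part.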
so those are the new features you can
expect in editing software somewhere
down the line but then there's a lot of
the ones that are just simple cost
reduction this is why people refer to it
as the death of Hollywood in many cases
now I don't know if that's an accurate
assessment in my opinion I think they're
going to use this Tech to Advantage to
lower the prices of production and pump
out even more content we'll also talk
about that soon but let's finish up the
segment and talk about the things that
were already available but now it's just
a 10,000x cost reduction for that
calculation I see a subscription price
that is somewhere around the ChatGPT Plus
plan so what's going to be possible at
this super low cost is first of all
generating images we're able to do that
with other image generators right sure
these are hyper realistic and very high
quality just like Midjourney and so on but
then its capability to turn images into
videos that is very very big in my
opinion because it's going to make it so
easy to craft compelling videos like I
feel like most people that talk about
this don't appreciate how much this is
going to lower the barrier for entry for
videography and high quality videography
that is because you're going to get
access to things like this so even if
you've seen this before I think I have a
bit of a different perspective here so
look here on the left you have the Drone
image here on the right you have this
butterfly right and here in the middle
you have the mix of the two where the
Drone is flying through something like
the Coliseum and then it morphs into a
butterfly and look I could do this
today okay this just takes about 3 to 5
hours of work depending on your skill
level you just go into after effects and
you rotoscope out this butterfly meaning
you go frame by frame that's 25 frames
every single second and you make sure
you animate a mask exactly in the form
of the butterfly's wings and you redo
that for every movement now yes there's
tools that help you but a lot of times
you're stuck with manual labor there so
it might just turn out that the 3 to 5
hour task turns into 15 to 20
hours and then you can bring the
butterfly into here and morph it into
the Drone with something like a morph
cut inside of Premiere Pro now if none
of that means anything to you that's
fine I'm just saying hours of work are
going to be done like
this and this is just one simple example
in many others a one-man crew could never
do this right all these animation
related examples where they turn an
image into an animation like this are
usually just not feasible for a one-man
show it takes too much time to animate
all the little things you might be able
to do it for a few shots but if you do a
whole one minute trailer you'll find
that you spend 2 weeks at the computer
if you really animate all the little
details like in this shot and you have a
lot of different shots so that's my
second point it lowered the bar by a
factor that is larger than most people
realize I don't know if it's 1,000x or
10,000x but a lot of these things were
unthinkable for small crews or one-man
shows and now they will be doable like
for example before
after Okay so this point is all about
the editability of the video and here on
Twitter Owen Fern went ahead and he
criticized the fact that hey yes these
Generations are absolutely incredible
but what if the client has feedback and
this is very very appropriate criticism
in my opinion because clients always
have feedback and if you're going to use
this for a job if this is supposed to be
the death of Hollywood just between
directors and producers there is so much
feedback going on in the post-production
of any advertisement movie heck even if
it's an event video I had clients that
went back and forth 10 times and gave
feedback over and over again and I had
to adjust things so he points out here
that yeah there's going to be a lot of
little details that will need to be
changed about these scenes and with Sora
you're not really able to go back and
change little details right you're going
to have to regenerate the whole scene
and maybe you like the character here
but you just don't like the fact that
this is not a thumb it just looks like a
fifth finger and we would like to give
it the look of a thumb can we do that and
his point is the answer has to be no and
then you have a dissatisfied client
which is a very fair point but as I've
been following this very closely over
the last months there's one tool and one
research that needs to be pointed out
here okay first things first Runway ML
the previous so-to-say leader in AI
video a few weeks ago introduced a
feature called the multi motion brush tool
which allowed you to use multiple
brushes on the video to just animate
specific parts now that is for animation
but over in Midjourney and many other
image generators you're able to do
something called inpainting where you
just paint in a little part of the image
and then edit just that you can reprompt
it so on images today you could actually
go in and just paint in this thumb and
say regenerate the thumb why would that
not be possible on video eventually it
will be and further than that ByteDance
the creator of TikTok actually
published a research paper less than a
week ago about this so-called Boximator okay
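Since inpainting keeps coming up, the masking logic behind it can be shown with a toy sketch. Real inpainting in Midjourney or Stable Diffusion conditions the new pixels on the surviving context with a diffusion model; here `regenerate` is just a stand-in callback, and all the names are mine for illustration.

```python
def inpaint(image, mask, regenerate):
    """Toy inpainting: keep every unmasked pixel and call the generator
    only for masked ones.  `image` is a 2-D list of pixel values,
    `mask[y][x]` is True where the user painted over the image, and
    `regenerate(x, y)` stands in for a model sampling a new pixel."""
    return [
        [regenerate(x, y) if mask[y][x] else image[y][x]
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]
```

So "paint over the thumb and reprompt just that" boils down to: everything outside the mask is pinned to the original, and only the masked region is resampled, which is why extending the idea to video mainly means masking along the time axis as well.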
so I didn't cover it on the channel
because I like to cover things that are
available today or truly truly
revolutionary this kind of falls into this
in between zone of hey really
interesting but it's not available and
in my eyes probably not worth a
dedicated video but look the whole point
of this is you draw different boxes in
the scene and thereby you can control
the scene in great detail so if you
select the balloon and say it's going to
fly away in this direction and then you
select a girl and she's going to run in
a different direction exactly that is
going to happen so between tools like
the Boximator and inpainting in Midjourney
it's just a question of time
where you're going to be able to use a
mix of these tools and also inpaint on
top of AI video now sure there's going
to be a temporal axis there right
because on images you only have the X
and Y axes and in video there's also the
time axis and sometimes you even have
movement in z-space but between this
research and inpainting I can totally see
that happening for AI video too down the
line plus as we know with prompt
engineering today for language based
models there's a lot of control that you
have in the text prompt you just have to
be really detailed if you look at a lot
of these prompts they're good but
they're not as detailed as they could be
some of the best stable diffusion
prompting is extremely detailed also in
mid journey in stable diffusion if you
keep your prompts relatively simple
you're going to get varied results even
if you roll the dice and create a new
scene it's going to be very similar plus
let's refer back to Midjourney again
they just recently announced a
new character tool where it's going to
maintain character consistency based on
a character that you pick in a tool so
all of these AI image features that
we've been talking about and I've been
tracking regularly they're going to
apply to video tools it's just going to
take longer but I absolutely believe
that we'll be able to implement all of
this little feedback into AI video and
therefore this actually being production
ready at some point okay so my next
point here is one that I didn't expect right
in the beginning which is that you can prompt
stories into existence from a single
prompt okay so here's an example from
Bill Peebles from the OpenAI team and he
generated an entire story of two dogs
that should walk through NYC then a taxi
should stop to let the dogs pass at a
crosswalk then they should walk past the
pretzel and hot dog stand and finally
they should end up at Broadway signs and
if you follow this channel you might
know how much context you can add to text
prompts to achieve exceptionally
accurate results from things like
ChatGPT if you added way more details here I
believe they would be reflected in it
and then the story can develop and as
right now you already have tools that
can manipulate someone's mouth to speak
in another language so it looks
natural also that will be possible
here so you will be able to create these
long shots like they have in movies
which are incredibly difficult to
achieve I mean some movies like Dunkirk
took it so far where they turned the
movie into a single take it all flows
seamlessly and Sora is able to do it too
and that I didn't expect at the
beginning also they didn't share this
example right off the bat I think this
is actually very very impressive and if
now we're already able to generate
stories from a single Simple Text prompt
it's just a question of time until we
arrive at something like this where you
just type in a prompt and you get a full
movie back or a full show I mean at some
point it's just a question of having
enough gpus this is obviously just a
mockup but something to think about
especially because this is the worst
the tech is ever going to be and you know
what let's talk about that point that is
actually my next one so where are we in
the timeline of this okay it was really
helpful to look into some of the
discussions that are happening online to
orient myself in terms of where we
actually are today Emad Mostaque from
Stability AI actually had a fantastic
take here he compared this to the GPT-3
of video models so if you didn't
know GPT-3 was the predecessor to ChatGPT
okay it was available before but the
interface was not as intuitive and you
actually had to prompt it differently
unlike ChatGPT which had
reinforcement learning from human
feedback which means a lot of humans
rated the outputs to make it more
user friendly for humans and that's
where this is at right now okay it's not
at the ChatGPT point where it's going to
be really easy to use and it's going to
gain mass popularity and then we got
GPT-4 and all the additional features and
it's just crazy capable now and he even
said that all the image generators like
Stable Diffusion were more comparable to
GPT-2 where the quality of the output was
not nearly as good as GPT-3 so as in
large language models this puts us on
the timeline somewhere in the middle of
2022 because the ChatGPT GPT-4 Llama
and Mistral equivalents will come over the next few
years at the pace that we're
moving ahead right and on this topic
there's another fantastic thread by Nick
St. Pierre here on X and he ran all the
exact prompts that Sora generated
through Midjourney and then paired them
with the results and the thing is
they're shockingly similar right so
people are already joking that hey is
Midjourney just OpenAI in disguise probably
they're just using very similar training
data right but look at that all of these
examples are very similar now I'm sure
these are the ones that were the most
similar right to create this illusion of
it essentially being the same model here
I mean if you look closer the beaver is
very different but the point is these
are not night and day right sure these
helmets are completely different but the
Cinematic look is very similar with
slightly different color grading down
here fair but the point that I'm trying
to make here is that we literally
skipped two to three years ahead in AI
video because what we had up until now was
something like GPT-1 or
GPT-2 now we got GPT-3 that
is actually usable and can create useful
outputs that are essentially hyper
realistic but we're not even at the chat
GPT moment yet where you get editability
and things like audio generation that we
talked about here that is all yet to
come but again at this pace of
development we should probably be
thinking in days and weeks and maybe
months and not years or decades I guess
that poses the question at which point
in the development do we reach the
Matrix and I don't know the answer to
that question I'm turning 30 next month
and it does feel like it will happen in
this lifetime or something akin to that
right who knows moving on okay so my
next Point goes back to my original
video where I stated that you know this
is going to be the death of stock
footage I sold it myself for almost a
decade and there's just no way people
are going to be paying $50 or $100 per
clip if they can just generate them for
a few cents and yeah I think that one is
an obvious one but beyond that it really
got me thinking about what this means
for video creation especially for the
smaller crews and one-man shows well
you're going to be able to generate
entire video libraries for yourself hear
me out so right now if you have a video
let's say this is the A-roll right this
is the main story of the video me
talking presenting to you all my
findings and then on top of that we have
something that we refer to as B-roll
these are the clips that are there to
add an additional layer of information
they add visual interest keep you more
engaged and really allow us to get
the most out of this audiovisual medium
and right at this very moment you're
consuming both audio and video at the
same time so we're trying to make the
most out of all these layers I do my
best to keep my speech and presentation
concise because I value your time and
then in the editing we do our best to
add as much information on top and right
now that is done for B-roll so we pay for
various libraries where we take these
shots that enhance our videos and we
also pay for various music libraries to
add the right type of music to enhance
the atmosphere of the video but with
models like Sora this will really change
the game because you're going to be able
to generate an entire library for
yourself for that specific project
because the cost goes down so much
you're going to be able to prompt things
into existence that beforehand you would
have to research download and compile
and usually they don't even match and
you have to do color correction and
color grading on top of them and here as
you can see from a single text prompt we
got five video frames and all of these
can be upscaled with something like
Topaz Video AI right that tool is paid
it costs a few hundred dollars but you can
upscale 1080p Clips to 4K with AI really
effectively but here you're just going
to be able to prompt them and then again
just looking over at all the AI Imaging
tools all the features that we see in
the Imaging tools are going to be
available to the video tools so
something like a oneclick upscale to 4K
quality is going to be there can you
regenerate this or can you generate four
more just like this is going to be there
you can think about the whole Midjourney
interface in Discord being something
that you can do with these videos
upscale re-roll more like this use a
different version of the model and
after a few minutes you'll have a whole
library of B-roll that can enhance your
video now I as a video creator can't
wait for this I know that eventually the
end point of all of this is the
technology really replacing a lot of
content and who knows if I'll be sitting
here and presenting the news to you if
an AI can do it in real time minutes
after the release of something and you
will be able to get it exactly in the
voice that you prefer while it also
respects your context right so in this
video I kind of have to assume your
knowledge level right so at certain
points I also have to assume that
somebody never created a video before
but some of you might be experienced
directors that know all these Concepts
and know how the industry works well the
AI is eventually going to be able to
create that exactly for your context but
I digress the point here is that at
least for the footage at least for the
production of this video I could have a
custom library that is going to enhance
all the visuals and maybe we could be
taking a trip through Tokyo as of now
where I present these ideas there's
going to be some point where I'm just
going to be able to take my voice and
use my digital Avatar let him walk
through Tokyo and explain these Concepts
in a very practical manner without ever
leaving my desk I don't think at this
point that is a stretch a week or two
ago it seemed a bit unreal to think of
lifelike video the best we had was
animations that were good and talking
head videos that looked okay they looked
convincing for a second or two if you
weren't looking for AI but again if this
is the gpt3 of AI video then what is the
chat GPT and the GPT 4 going to look
like that's what I'm already thinking
about and some of these Advanced
capabilities are outlined in the
technical paper too here it clearly
states that you're going to be able to
create videos in any format okay so from
1920x1080 to 1080x1920 so you know
phone format all the way to widescreen
and then cropping into cinematic formats
from this is easy right all you need to
do is add black bars at the top and
bottom and you have all the cinematic
formats so really there's going to be a
lot of variability and you're going to
be able to get exactly the B-roll that
you need for your project
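The letterboxing just mentioned is pure arithmetic; here is a small sketch (the helper name is mine) that computes how tall the black bars must be to present a frame at a wider cinematic ratio.

```python
def letterbox_bar_height(width, height, target_ratio):
    """Height of the black bar needed at the top and at the bottom to
    present a width x height frame at a wider aspect ratio, e.g.
    turning 16:9 footage into 2.39:1 'scope'.  The picture itself is
    not scaled, only masked."""
    visible = width / target_ratio        # picture rows that stay visible
    if visible > height:
        raise ValueError("target ratio is narrower than the source frame")
    return (height - visible) / 2
```

For a 1920x1080 clip and a 2.39:1 target this gives roughly 138 pixels per bar, which matches what editors eyeball with a cinemascope overlay; an editor or a pad/crop filter would then apply the actual bars.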
eventually AI is going to be creating
the scripts and editing the video itself
according to all the other videos it saw
and how they were edited right I mean
that might take a lot of time and we do
so much manual work with these videos
that there's always going to be a style
expression and a handwriting to the
post-production of a video I think but
it's crazy to see that you know a week
ago thinking about the fact that you
would have a library of b-roll for a
specific video well you had to go out
there and shoot it in the real world or
you had to purchase stock footage and
then it was scattered and all over the
place here you're going to be able to
get the best of both worlds going to get
great b-roll and all from the same scene
and it's going to cost virtually nothing
or if you have some b-roll that you
already use going to be able to extend
that or maybe you have some phone
pictures and you're going to turn those
into b-roll it's really a whole new
world for video production I I can't
overstate that but it doesn't end there
and this brings me to my last point
which is 3D World and World Generation
because in the technical paper they
actually refer to this as a world
simulator and I think that's a big claim
but it's also a Justified one because if
you take some of the clips at face value
it's incredible it's temporally
consistent these houses are not
warping right you're moving through the
scene like a drone would you have these
people on their horses going about their
daily business it's incredible but what
you have to realize is that beyond that
you can apply this in something like
Gaussian splatting which simply put is a
technology that creates this so-called
Gaussian splat that is a 3D representation of
the video in even simpler terms it turns
a video into a 3D model and this is what
it looks like in practice now look this
is a simple video that wasn't even
intended for this purpose but you could
easily imagine a drone shot where the
Drone parallaxes around the subject and
gets it from all angles and then you can
create 3D objects of something that
doesn't even exist so right here manov
Vision took exactly this drone clip and
he recreated it as a Gaussian splat and
then brought it into Unity a real-time
game engine and then you can animate the
camera and insert characters and do all
sorts of things right the important fact
here is that Sora doesn't have to do
everything from A to Z you can still
have a human write the script you can
still have a human in front of a green
screen acting it out you can have your
favorite actors in these scenes but it's
going to be so much cheaper to produce
because you're just going to generate
old environments like this and then
everything is going to be shot in front
of a green screen until AI perfectly
synthesizes the actor's voices which if
you follow this channel you know that it
already has and then the last missing
piece is really the human part it's
character consistency and the ability to
edit little details so it aligns with
the vision of everybody involved in the
movie's creation and then if you take
that thought experiment even a step
further you end up in Minecraft because
in the technical paper you can see these
clips that are not recorded from within
Minecraft these have been generated by
Sora by simply including the word
Minecraft in the prompt it saw so much
Minecraft footage that it was able to
recreate Minecraft perfectly and if it
can do it with Minecraft now how long
until it will do it to all of this world
I don't know but I'm scared and excited
at the same time but one thing is for
sure I want to stay on top of all of
this I'm going to keep my eye on it and
if you want to follow me along for the
ride subscribe to this channel subscribe
to our Weekly Newsletter that is
completely free and keeps you up to date
once a week with all the Revolutionary
breakthroughs and that's really all I
got for today except if you want to try
out Sora there is actually a very very
limited demo here on this page if you
haven't tried this yet I recommend it
because it's the closest you can get to
trying it and it's this little interface
here where you can change these
variables so you can go from an old man
to an adorable kangaroo and then there's
a few more variables that you can change
out here okay Antarctica and for now
this is the closest we get to playing
with this thing so I hope you enjoy this
let me know which one of these was new
or interesting to you and if you have
even more facts that I might have not
considered yet also leave those below
and if you haven't seen the original
video about the announcements and all
the video clips they presented that is
over here all right I can't wait to see
how this develops and what the
competition comes up with this is a
whole new world and I'm here for it see
you soon