The First High Res FREE & Open Source AI Video Generator!

MattVidPro AI
6 Jun 2023 · 16:22

Summary

TLDR: The video explores the emerging field of AI video generation, highlighting Google's Imagen Video and Runway ML's Gen 2. It introduces Potat One (styled "potat1"), an open-source, high-resolution text-to-video model built on ModelScope. With 1024×576 output and the option to run on a local GPU, Potat One offers a promising open alternative to Gen 2 and invites community-driven improvement and experimentation in AI video generation.

Takeaways

  • 😀 AI text generation, such as chatbots, is currently the most popular form of AI, followed closely by text-to-image generation and manipulation.
  • 🌟 The next frontier in generative AI is video generation, with Google's Imagen Video and Runway ML's Gen 2 as notable examples.
  • 🚀 Runway ML's Gen 2 is a multi-modal system that can generate videos from text, images, or video clips, but it is not open source and public access is limited.
  • 🌐 An open-source competitor to Gen 2 has emerged: Potat One, which is based on the ModelScope text-to-video model and offers higher frame rates and resolutions.
  • 📹 Potat One generates 1024×576 video, a significant step into HD territory for open-source text-to-video models.
  • 💧 Despite the higher resolution, Potat One videos sometimes still include watermarks, inherited from the base ModelScope model's Shutterstock-watermarked training footage.
  • 🔗 The Potat One GitHub repository is public, letting users run the model on their own machines and access the training scripts.
  • 🔄 Potat 2 is in development, promising higher resolution and more coherent video generation.
  • 🎥 Generation with Potat One is slow, especially on the free Google Colab tier, but can be faster on better hardware or paid services.
  • 🤖 Potat One shows promise in coherency and resolution, making it a strong open-source alternative to proprietary models like Gen 2.

Q & A

  • What are the two main types of AI that are currently popular?

    -The two main types of AI that are currently popular are AI text generation, such as chatbots, and text-to-image generation and manipulation.

  • What is the next level of AI image generation mentioned in the script?

    -The next level of AI image generation mentioned is AI video generation.

  • What is Google's contribution to AI video generation as mentioned in the script?

    -Google's contribution to AI video generation is Imagen Video, a model that produces high-resolution, high-frame-rate videos, though it was only presented as a research paper and is not publicly available.

  • What is Runway ML's Gen 2 and why is it significant?

    -Runway ML's Gen 2 is a multi-modal system that can generate novel videos from text, images, or video clips. It is significant because it is one of the few AI video generation tools that is accessible to the public.

  • Why is open-source software important in the context of AI video generation?

    -Open-source software is important because it allows for modification and building upon existing video generators, expanding the possibilities and improving the technology.

  • What is 'Potat One' and how does it relate to AI video generation?

    -'Potat One' is an open-source 1024×576 text-to-video model announced by camenduru. It is significant because it breaks into HD territory for open-source text-to-video models.

  • What are the main features of 'Potat One' that make it competitive with Runway ML's Gen 2?

    -Potat One is competitive with Runway ML's Gen 2 because it generates video at a higher frame rate and higher resolution, and it is fully open source.

  • What is the significance of the model being able to generate videos at 1024×576?

    -It represents a step into HD territory for open-source text-to-video models, offering higher quality than previous open models.

  • What is the role of the GitHub repository in the context of 'Potat One'?

    -The GitHub repository is where the source code and training scripts for 'Potat One' are available, allowing users to modify and improve the model.

  • How can users try out 'Potat One' without installing anything on their own machine?

    -Users can try out 'Potat One' for free using Google Colab, which provides a simple setup and generates videos without any local installation (see the code sketch at the end of this Q&A section).

  • What are some of the limitations or challenges mentioned in the script regarding AI video generation?

    -Some limitations or challenges include the generation time, which can be slow, and the complexity of setting up the model locally, which may require experience with installing GitHub repos and running Python.
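For readers who want something concrete to start from, here is a minimal sketch of what a generation cell like the one in the video's Colab might look like, using Hugging Face diffusers' text-to-video pipeline. The ModelScope checkpoint id below is the published base model; Potat One's own weights would be loaded the same way from their checkpoint id, which the video does not spell out, so treat the ids and parameter values here as illustrative rather than as the Colab's exact code.

```python
# Minimal text-to-video sketch using Hugging Face diffusers.
# Loads the ModelScope base model that Potat One is built on; swap in
# the Potat One checkpoint id (not given in the video) to match its setup.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",  # ModelScope base checkpoint
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # helps fit Colab's ~15 GB GPU

result = pipe(
    prompt="lemon character dancing on the beach, bokeh",
    negative_prompt="blurry, low quality, watermark",
    num_inference_steps=50,  # the step count the video mentions
    guidance_scale=9.0,      # illustrative value
    num_frames=24,           # 24 frames at 24 fps -> a one-second clip
    width=1024,              # Potat One's native resolution; the base
    height=576,              # ModelScope model was trained lower-res
)

# Newer diffusers versions return a batch of clips; take the first.
export_to_video(result.frames[0], "output.mp4", fps=24)
```

On the free Colab GPU a clip like this takes on the order of ten minutes; on a 16 GB card such as the RTX 4080 mentioned in the video it should be noticeably faster.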

Outlines

00:00

🚀 Introduction to AI Video Generation

The paragraph introduces AI video generation as the next frontier in AI technology, following the popularity of AI text and image generation. It mentions Google's Imagen Video and Runway's Gen 2 as existing technologies, but highlights their limited accessibility. It then introduces Potat One, an open-source AI video generator that produces higher-resolution videos than previous open models. It also discusses the potential of open-source software, Potat One's availability via a Twitter announcement, and the promise of future improvements with Potat 2.

05:01

🍓 Exploring Potat One Video Generations

This paragraph delves into the capabilities of Potat One, showcasing demo videos such as animated fruits, a trippy line-art drawing, and an astronaut in a world of blobs. It emphasizes the higher resolution and coherency of Potat One's videos compared to the base ModelScope model. It also covers access to the model through Google Colab, the availability of training scripts, and the potential for community contributions through GitHub and Discord.

10:03

🛠 Setting Up and Using Potat One on Google Colab

The paragraph provides a walkthrough of setting up and using Potat One on Google Colab. It describes the process of generating videos, including the time it takes and the resolution achieved (see the worked timing estimate below). It also touches on higher frame rates and the ability to generate longer videos by increasing the frame count. The paragraph includes a brief tutorial on the Colab interface, the installation of the necessary packages, and the generation of a video from a sample prompt.
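A quick sanity check on the timing figures quoted in the transcript below, under the assumption (stated only loosely in the video) that the 50 denoising steps apply to the whole clip at once rather than to each frame: at roughly 11 seconds per step, 50 × 11 s ≈ 550 s, a little over 9 minutes per clip, which matches the "about 10 minutes" the video reports for the free Colab tier.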

15:04

🌐 Open Source Potential and Future of AI Video Generation

The final paragraph discusses the open-source nature of Potat One and its implications for the future of AI video generation. It mentions Potat One's integration into Blender and the ease of use that offers. It concludes with an invitation for viewers to share their creations and to look forward to the next generation, Potat 2.

Keywords

💡AI Text Generation

AI Text Generation refers to the use of artificial intelligence to create written content. In the video, this concept is introduced as the most prevalent form of AI, with chatbots like ChatGPT highlighted as a popular application. It exemplifies AI's ability to mimic human-like conversation, a significant aspect of the current AI explosion discussed in the video.

💡Text to Image Generation

Text to Image Generation is a generative AI technology that converts textual descriptions into visual images. The video mentions this as a close second to AI text generation in popularity, with platforms like Midjourney and DALL·E noted for their capabilities. This technology is significant because it showcases AI's ability to understand and visualize concepts described in text.

💡AI Video Generation

AI Video Generation is the next evolutionary step beyond text and image generation, where AI creates video content. The video discusses this as a cutting-edge area, with Google's Imagen Video as an example. It represents the frontier of AI's creative capabilities, moving from static outputs to dynamic, time-based media.

💡Runway ML's Gen 2

Runway ML's Gen 2 is mentioned as a multi-modal system capable of generating novel videos from various inputs. Despite its capabilities, the video notes that access is limited, which underscores a theme of the video about the desire for more accessible and open AI tools. Gen 2 serves as a comparison point for the open-source alternatives discussed.

💡Open Source

Open Source in the context of the video refers to software whose source code is available for anyone to use, modify, and enhance. The video emphasizes the importance of open-source models like Potat One, suggesting that they democratize AI technology by allowing broader access and community-driven innovation, a key narrative in the discussion of AI's future.

💡ModelScope AI Video Generator

ModelScope AI Video Generator is described as an early attempt at AI video generation that produced lower-quality videos. It is positioned as the precursor to more advanced models like Potat One, illustrating the progression in AI video generation technology and the video's theme of continuous improvement.

💡Potat One

Potat One is introduced as the first open-source 1024×576 text-to-video model, a significant advance in the resolution and quality of open-source AI video generation. It is central to the video's narrative of promoting open-source alternatives and represents a milestone in the accessibility of high-definition AI video generation.

💡Coherency

Coherency, in the video, refers to the logical and temporal consistency of the generated video content. It is a critical criterion for evaluating AI video generation, since it determines how realistic and believable the output looks. The video discusses how Potat One and similar models are improving coherency, which the host considers the most important property to get right before frame rate and generation speed.

💡Google Colab

Google Colab is mentioned as a platform that provides free access to models like Potat One. It is highlighted for making AI technology more accessible, allowing users to run complex models without owning capable hardware. This ties into the video's broader message about the democratization of AI tools.

💡Resolution

Resolution in the video pertains to the pixel dimensions of the video output; Potat One achieves 1024×576, a significant advance for open-source AI video generation and higher than Gen 2's 768×448 output. High resolution is emphasized as a key factor in the realism and detail of the generated videos, a central theme in the video's exploration of AI's creative potential.

Highlights

AI video generation is emerging as the next frontier in AI technology.

Google's Imagen Video showcases high-resolution, high-frame-rate capabilities.

Runway ML's Gen 2 stands out as a multi-modal system for video generation.

The limitations of Gen 2 include restricted access and a lack of open-source availability.

Open-source AI video generators like the ModelScope model are gaining traction.

Potat One is introduced as an open-source, high-definition text-to-video model.

Potat One is based on ModelScope and outputs higher resolution than Gen 2 (1024×576 versus 768×448).

The video generation process includes customizable parameters like FPS and frame count.

Potat One can be run on the free Google Colab tier within its 15 GB of GPU memory, making it accessible to the public (see the memory-saving sketch after this list).

The generation quality is competitive with Runway ML's Gen 2, despite the model being at an early stage.

The open-source nature of Potat One allows for community-driven improvements and modifications.

Potat One's coherency and resolution are significant advancements for open-source AI video generation.

The video generation process can be time-consuming, especially on free platforms like Google Colab.

Potat One's integration into software like Blender expands its usability.

The community is encouraged to experiment with Potat One and share their creations.

Potat 2 is in development, promising further enhancements in AI video generation.

The video concludes with a call to action for viewers to engage with the AI technology and share their experiences.
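Because Potat One shares ModelScope's architecture, diffusers' standard memory optimizations for that pipeline should carry over when running at home on a roughly 16 GB card, as the Colab highlight above suggests. A short sketch, with the caveat that the checkpoint id is hypothetical and should be taken from the project's GitHub page:

```python
# Memory-saving options for a consumer GPU (e.g., a 16 GB RTX 4080).
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "camenduru/potat1",         # hypothetical id; check the GitHub repo
    torch_dtype=torch.float16,  # half precision roughly halves VRAM use
)
pipe.enable_model_cpu_offload()  # keep only the active submodule on the GPU
pipe.enable_vae_slicing()        # decode frames in slices to cap peak memory
```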

Transcripts

00:00

As we drift through the new AI explosion happening in our society, two main AIs keep popping up. The first is obviously AI text generation, such as ChatGPT, probably the most popular form of AI at the moment. The second, a very close second, is of course text-to-image generation and manipulation. This form of generative AI is super popular: Midjourney, DALL·E, Bing, it's all really great stuff. But what's the next step, the next level of AI image generation? That happens to be AI video generation. We've seen a lot from AI video generation so far. Google's mind-blowing Imagen Video comes to mind, but of course we don't have access to any of that; it was just a cool paper Google released showing higher-resolution, higher-frame-rate video. The only one we can actually use is Runway Research's Gen 2, which is really a multi-modal system that can generate novel videos from text, images, or video clips, though it also does plain text-to-video. One of the main issues is that most people still do not have access to the Gen 2 app; the only real access anyone has is through the Runway ML Discord server. And while this video generation truly is the best the public has at the moment, you can only generate four-second videos at a fairly low frame rate, and of course it's not open source. As we saw with the Stable Diffusion image generator, open source opens up the possibilities: it allows for the modification of these video generators and for building upon them.

01:44

So viewers, I'm happy to share with you today that we actually have a Gen 2 competitor that is fully open source. It's based on the ModelScope AI video generator, which was all right; it was the very, very baby steps of AI video (all of this is still baby steps, but ModelScope is definitely behind Gen 2). The open-source, free AI video generator I'm showing you today is actually fairly competitive with Runway ML's Gen 2: we're talking higher frame rate, higher resolution video generations. Our journey starts on Twitter, where camenduru announces the first open-source 1024×576 text-to-video model, Potat One. We're now actually breaking into HD territory with these open-source text-to-video models, which is awesome. It is heavily based on ModelScope, the open-source model that produced rather cruddy, blurry little videos, often with watermarks. You still get watermarks sometimes with this, but it's much higher resolution, and I think the quality is substantially better. There are lots of other thanks here too: ModelScope is the main one, plus Lambda Labs and some other devs. You can try it through the link, and camenduru also says that Potat 2 is "in the oven", meaning a better version is in the works, potentially higher resolution and more coherent.

03:13

Either way, the demo video here is only about a second long, but it looks pretty promising: a nice still background that isn't moving too much, and colorful, beautiful 3D-animated fruits flying through the air. It's a pretty good little demo, the kind of quality I would expect out of Gen 2. Further down there's a GitHub link; the whole thing is open source, you can run it on your own machine, anyone can modify it, and training scripts are also available, which is really cool. Another example shows an astronaut jumping through a world of fuzzy little blobs. It's a really trippy, weird video, but it would be cool for a music video or something. Again, this is something I would expect from the likes of Runway ML's Gen 2, but it comes from a fully open-source, free-to-download-and-use model. The higher resolution really is kicking butt here; high-resolution, high-fidelity, high-frame-rate text-to-video is what we needed, and we're slowly but surely getting there. I would be much happier for us to get there with open-source models, because open source means anyone can modify it, anyone can improve it, anyone can have access to it. And the VRAM requirements to run it locally are not too bad: you can run it completely free in Colab with the 15 GB of GPU memory Google supplies, and if you have a graphics card with more than 15 GB of VRAM, you could run it at your house. Nvidia was kind enough to send me an RTX 4080 with 16 GB of VRAM, so I could run this on my own GPU at home without going to a server or a service like Gen 2.

05:02

We have another one: a little video of fruits seemingly bouncing off the ground. You can see the fruits actually blur as they get closer to the camera, which is a sign of good coherency. The fruits are clearly strawberries and maybe some avocados; I'm not really sure what's going on, but they all seem to be bouncing and landing around, and it's a pretty decent-looking video. There's also something more trippy (almost a flash warning) that looks like a line-art drawing you'd see in a cool music video. There isn't a Hugging Face demo, which always seems to be the easiest way to try these things out, but there is a Colab demo, which is also fairly easy. Keep in mind, viewers, this is a prototype model: it was trained with lambdalabs.com on an A100 GPU for 10,000 training steps, and you have full access to the dataset and config. There are several fine-tuning tools here: two text-to-video fine-tunings, a video BLIP-2 preprocessor, and PySceneDetect, plus links to the base ModelScope model. You can try it completely free with the Colab; here is the GitHub link, and if you have trouble figuring out how to run it, they have a Discord server, which is nice. There are a bunch of different Colabs, but the main one is the Potat One text-to-video Colab, along with a tutorial video that I didn't think was super helpful.

06:26

We've got some nice examples to look at down here, and these are higher resolution and more coherent than base ModelScope video. One is a giraffe underneath a microwave, and you can see the giraffe inside the microwave; a lot of these prompts are throwbacks to other text-to-video generators we've seen from Meta AI and Google AI. The giraffe and the microwave came out all right: it's literally just a giraffe sitting in a microwave. There's a goldendoodle playing in the park by a lake, which is all right as well, not super coherent. The panda bear driving a car is honestly pretty good: he's just sitting there, the car is very still and very coherent, and he makes sense in the passenger seat. The teddy bear running in New York City mostly looks like he's hobbling around on his butt, but the actual panning motion through the city looks pretty decent; this is one of the weaker generations, though. A very complex shot, a drone fly-through of a fast-food restaurant on a dystopian alien planet, actually does look a lot like drone footage in how smoothly it pans into the restaurant, and you can definitely tell it's some sort of building on a dystopian alien planet, although the building isn't all that coherent.

07:42

We've also got a dog wearing a superhero outfit with a red cape flying through the sky, and this one is really good: he does a full 180-degree turn and the whole body stays pretty coherent. My favorite part is the physics on the cape; it flies and floats around and makes sense quite a lot. This generation is impressive, and seeing something this coherent come out of the model shows it's doing good work. We still have the Shutterstock logo here; it's really an artifact on all of these, inherited from the ModelScope model this is based on, but again, ModelScope is open source, which is what allows projects like Potat One to improve on it. I really think the dog generation, with the cape flying around and the full 180 against a coherent background, is truly marvelous; that is a good generation. There are three more: a monkey learning to play the piano, which definitely looks like a little monkey scrabbling around on the keys, not super coherent, but the background is rock solid, the piano is rock solid, and the monkey looks decent. A litter of puppies running through the yard manages multiple puppies at once, which is pretty good, but they morph into each other; a pretty weak, scrambly generation. And finally a robot dancing in Times Square, which honestly came out shockingly well: Times Square moves in the background in a cinematic way, the light reflects off the floor realistically, as you'd expect from all the screens in Times Square, and the robot stays coherent throughout the whole video while the camera does a nice pan. So while these are short, low-frame-rate videos, they look very promising in terms of coherency, and for me that is the most important thing to nail down in text-to-video before we start improving frame rates and overall generation time. This is looking a lot more coherent than base ModelScope video, almost as coherent as Gen 2 in a lot of situations, and again fully open source, which is really the crown jewel of this whole Potat One text-to-video generator.

10:00

So viewers, as I said earlier, you can use this entire thing for free on Google Colab, which is awesome. The Colab is super simple to set up (I'll show you how in a little bit), but it does take quite a long time, about 10 minutes, to generate a video. That's on the free Colab; it could be a lot faster by paying for a better Colab tier or running it at home on your own GPU. Here's where you put the prompts; it also supports negative prompts, which is good, plus the number of denoising steps, a guidance scale, the FPS (24 in this case), and the total number of frames, so you can generate longer videos if you want; it will just take a really long time in Colab. I did two videos with the base prompt "duck", so let's see how those turned out. Here is our first generation, actually at 24 FPS. It's not too bad; it looks pretty rough, pretty rudimentary, but the resolution is decently high. It's a little tough to look at, but I do like the way the ducks flap their wings, considering this is brand-new, baby-stage technology. As you can see, it's 24 FPS at 1024×576, and what's really cool is that's a higher resolution than Gen 2 outputs, which are 768×448. So you're getting somewhat higher resolution than even regular Gen 2, although the coherency might not fully be there yet.

11:26

When you first open the Google Colab notebook, run the first play button all the way at the top. This sets up the requirements to make everything work: it installs a bunch of different GitHub packages and essentially installs the AI onto the Colab notebook. Once that's done, it's very simple: type in your prompt (in this case I'm doing "lemon character dancing on the beach, bokeh") and click the little run button, and eventually it starts generating frames. You can change the number of frames if you want; that just increases the length of the video. It's a very simple Colab to use and set up. You'll see a warning that says it could not find TensorRT; that's okay, it will still generate, it just takes a really long time. By the way, if you make anything really cool, please feel free to share it on my Discord server; I love seeing all the cool generations you create with this new AI technology. You could even turn up the FPS or the total number of steps to get a clearer generation; it will all just take longer. As you can see, the GPU RAM starts to climb toward that 15 GB total as we slowly generate our first frame. This is why it takes so long: it takes about 11 seconds to generate a single step, and with 50 steps per generation and 20 frames, it starts to get pretty long. But that's how it is on this very limited, completely free Google Colab; on your own GPU at home this might be blazing fast.

13:05

While this generates, I'm going to see if I can run it on my own machine at home. This is very complex if you're not used to running this stuff locally: you have to download Python and have that Python runtime pip-install all of these different GitHub packages. But let's give it a shot and see if I can make it work; I'm an inexperienced person when it comes to this. So viewers, here is my lemon character dancing on the beach. It's a little disturbing, I won't deny. I did try to get this set up and working on my own computer, but I wasn't able to figure it out in the limited time I have today. If you viewers want me to show you how to install it locally, please leave a comment down below and I'll do a dedicated video. If you're more accustomed to this kind of thing, it's going to be a lot easier. If you want faster generation, you can lower the total number of steps. By the way, once you run a new generation, your old MP4 files get deleted. We could have each generation use just a single step, and when we rerun it, it generates a lot faster, but one step really isn't going to give you a clear video: run it at a single step and you just get a very blurry, basic image with no real clear content. I think it's fair to say this thing is really meant for someone with a little bit of experience installing GitHub repos on their own system, and I know that's a lot of you viewers at home.

14:37

As you can see, camenduru is happy that people are using Potat One to create stunning videos. Here's another example: a few videos combined to create something a little bigger than a single clip. If we play it, it's all footage of a very similar area, mountains with a waterfall, and it looks pretty decent. Not super coherent, of course; it's not nearly as good as regular text-to-image, but these are the baby steps of text-to-video, and it's really nice that we now have an alternative to Gen 2 that is, in some ways, fully open source and fully modifiable. Nothing is held back from you when you use this thing, it creates decently high-resolution video footage, and you can make clips longer than a second if you're willing to wait for them. The generation time really is the main downside.

15:25

One more thing, viewers: Potat One is also available in Blender thanks to tin2tin. You can essentially integrate it directly into Blender, and it's a lot easier to use than the typical setup; maybe I'll actually do a video on the Blender integration, because that's probably a little easier than installing it through Python on your own system. Viewers, I love covering really cool, cutting-edge AI projects like this one, especially when they're open source and free for everyone to download and use. Let me know if you create anything cool with this; I'd love to see some videos in my Discord server, which is linked in the description. Do you think this is better than Gen 2? Do you prefer it over Gen 2 just because it's open source? I'm very excited to see how the next generation, Potat 2, turns out. That's going to be it for me today; tune in at the end of the week for a larger AI news recap, and I'll see you in the next video. Goodbye!


Related Tags

AI Technology, Text-to-Video, Open Source, Video Generation, Generative AI, ModelScope, Potat One, Google Colab, AI News, Innovation