Massive AI News from Stability, Adobe, Ideogram, and More!

Theoretically Media
1 Mar 2024 · 10:18

Summary

TLDR This week is packed with creative AI news. Following updates to LTX Studio, Pika, and Runway, this video covers a new video generation platform from Stability AI, an impressive lip-sync tool, a big Adobe release in the music space, details on how Sora actually works, and the first fully AI-generated film to be screened in movie theaters. Morph Studio has partnered with Stability AI to create an AI filmmaking platform, Emo Talker is reinventing lip sync, Adobe announced Project Music, and Ideogram released its 1.0 update. Researchers have also been reverse-engineering Sora. The video closes with the historic moment of a fully AI-generated film getting a theatrical screening.

Takeaways

  • 🚀 Morph Studio has partnered with Stability AI to build an AI filmmaking platform featuring an intuitive node-based workflow.
  • 🎥 The new video generation platform lets users combine multiple videos, including style transfers, and export the result.
  • 👄 Emo Talker, developed by Alibaba, is a tool that adds lip sync and enhanced emotional expression to any still image.
  • 🎵 Adobe announced "Project Music", a new push to bring AI into music production.
  • 🎨 Newly revealed details on how Sora works show that it uses "SpaceTime latent patches" to achieve smooth video continuity.
  • 🎬 The first fully AI-generated film is getting a historic theatrical screening, pointing to AI's potential in filmmaking.
  • 📝 Ideogram, an AI image generator specializing in images that contain text, released its 1.0 update with improved aesthetic quality.
  • 🤝 The platform aims to build an active community by letting users share workflow templates with one another.
  • 💼 Emo Talker was trained on 250 hours of video and more than 150 million images, and supports multiple languages.
  • 🌐 AI is blurring the traditional boundaries between filming, editing, and post-production, turning them into one continuous process.

Q & A

  • What new video generation platform was announced?

    -Morph Studio partnered with Stability AI to create a new AI filmmaking platform called Morph Cloud.

  • What is unique about Morph Cloud's workflow?

    -Morph Cloud uses a node-based interface that allows users to visually connect different AI generated video clips and style transfers together into a final video.
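
    As a rough illustration of how such a node graph might be modeled in code (the class names and the influence-weighting scheme here are hypothetical, not Morph Cloud's actual API):

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class ClipNode:
        """A generated video clip, reduced to a label for illustration."""
        name: str

    @dataclass
    class ExportNode:
        """Combines upstream clips, each connected with an influence weight."""
        inputs: list = field(default_factory=list)  # (ClipNode, weight) pairs

        def connect(self, clip, weight):
            self.inputs.append((clip, weight))

        def render(self):
            # A real platform would blend actual video frames; here we just
            # report the normalized mix so the graph wiring is visible.
            total = sum(w for _, w in self.inputs)
            return {c.name: w / total for c, w in self.inputs}

    export = ExportNode()
    export.connect(ClipNode("girl_at_camera"), 0.5)
    export.connect(ClipNode("city_street"), 0.3)
    export.connect(ClipNode("style_transfer"), 0.2)
    print(export.render())  # {'girl_at_camera': 0.5, 'city_street': 0.3, 'style_transfer': 0.2}
    ```

    The appeal of the node approach is exactly this visibility: every clip and its contribution to the final export is laid out explicitly rather than hidden in a timeline.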

  • What does the new Emo Talker tool allow?

    -Emo Talker allows adding lip syncing and emotive facial expressions to any still image.

  • What music AI project did Adobe announce?

    -Adobe announced Project Music GenAI Control, which can generate and extend music using AI.

  • What updates were made to the Ideogram image generator?

    -Ideogram 1.0 improves image quality and aesthetics. It also adds a Magic Prompt feature to help flesh out prompts.

  • What AI method does Sora apparently use?

    -Researchers found Sora likely uses SpaceTime Latent Patches to understand video in both space and time dimensions.
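
    The patch idea can be sketched in a few lines of NumPy. This is a toy patchify of a raw video tensor, not Sora's actual implementation (real systems first compress frames with a learned encoder, then patch the latents):

    ```python
    import numpy as np

    def spacetime_patches(video, pt, ph, pw):
        """Split a video tensor (T, H, W, C) into non-overlapping spacetime
        patches of size (pt, ph, pw, C), flattened into one token each."""
        T, H, W, C = video.shape
        assert T % pt == 0 and H % ph == 0 and W % pw == 0
        v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
        v = v.transpose(0, 2, 4, 1, 3, 5, 6)   # group the three patch axes together
        return v.reshape(-1, pt * ph * pw * C)  # (num_patches, patch_dim)

    video = np.zeros((8, 32, 32, 3))            # 8 frames of 32x32 RGB
    tokens = spacetime_patches(video, pt=2, ph=8, pw=8)
    print(tokens.shape)  # (64, 384)
    ```

    Because each token spans a slab of frames rather than a single image, a transformer trained on these tokens can model motion across time as well as layout within a frame, which is where the smooth continuity comes from.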

  • What was the first AI generated film screened in theaters?

    -An AI-generated remake of Terminator 2, created by 50 artists, was screened in theaters.

  • When and where can the AI Terminator 2 film be watched online?

    -The film will be live-streamed online on March 9th so people worldwide can watch alongside the cast and crew.

  • What post-production work was done for the AI Terminator 2 film?

    -Additional work was required to conform the AI footage to theatrical standards for picture and sound.

  • Who created the AI Terminator 2 film?

    -The film was a collaborative project created by a group of talented artists who have worked with AI generative art.

Outlines

00:00

🚀 New AI Filmmaking Platform and Lip-Sync Tool News

This section covers the latest advances in AI tools, in particular Stability AI's new video generation platform, an improved lip-sync tool, Adobe's entry into the music space, details on how Sora works, and the first case of a fully AI-generated film screening in a movie theater. It also details the AI filmmaking platform born of the Morph Studio and Stability AI partnership, Emo Talker's lip-sync capabilities, and other AI tool updates.

05:00

🎶 Latest in AI Music Extension and Image Generation

The second section focuses on Adobe's AI music project, Project Music, and the update to Ideogram, an AI image generator known for its text rendering. It also covers popular YouTube channel owner Marques Brownlee's hands-on experience with Sora and the theatrical screening of the first full-length AI-generated film.

10:01

🌟 This Week's AI Roundup and a Look Ahead

The final section summarizes the week's AI developments, emphasizing how many announcements and updates there were, previews what may be coming next week, thanks viewers, and builds anticipation for the next update.

Keywords

💡Creative AI tools

Creative AI tools are artificial-intelligence technologies that support creative processes such as art, design, and filmmaking. This video covers LTX Studio, updates to Pika and Runway, and many other tools, all examples of how AI is revolutionizing the creative industries.

💡Node-based workflow

A node-based workflow represents processes or tasks visually as nodes and connects them with lines that show the flow of data. The filmmaking platform developed through the Morph Studio and Stability AI partnership uses this type of interface, which has the advantage of making complex processes intuitive to follow.

💡Style transfer

Style transfer is an AI technique that applies the style of one image (its colors, textures, and so on) to another. In the Morph Studio demonstration, style transfer is shown as a way to keep styling consistent across different videos.
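
One classical way to capture "style" numerically is the Gram matrix of a feature map's channels, the statistic that neural style transfer tries to match. A minimal NumPy sketch, with random arrays standing in for real network features:

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a feature map (C, H, W).
    Matching these statistics transfers style while ignoring layout."""
    C, H, W = features.shape
    f = features.reshape(C, H * W)
    return f @ f.T / (H * W)

rng = np.random.default_rng(0)
content = rng.normal(size=(4, 16, 16))  # stand-in for content-image features
style = rng.normal(size=(4, 16, 16))    # stand-in for style-image features

# Style loss: distance between the two images' style statistics.
loss = np.mean((gram_matrix(content) - gram_matrix(style)) ** 2)
print(float(loss))
```

In an actual style-transfer pipeline the features come from a pretrained network and the content image is optimized (or a model is trained) to drive this loss down.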

💡AI filmmaking

AI filmmaking refers to applying AI at every stage of movie production, from script generation through editing and visual effects. The video presents a new workflow in which AI blurs the boundaries of traditional filmmaking and unifies the creative process.

💡Community sharing

Community sharing is a feature that lets users share their work and workflow templates with other users. The platform shown in the video is designed so that user-created templates can be shared through a gallery, encouraging creative collaboration.

💡Lip-sync tool

A lip-sync tool is an AI technique that matches mouth movement on a still image to speech. The video introduces Emo Talker, which can add lip movement and expression to still images, a capability useful for bringing animated and video-game characters to life.

💡AI music project

An AI music project is an effort to build AI into the music-creation process. Adobe's newly announced Project Music is one example, letting users create, edit, and extend music with AI.

💡AI image generator

An AI image generator is an AI technology that produces images from text descriptions. The video covers the 1.0 update to Ideogram, an image generator that turns text prompts into high-quality images.

💡Sora

Sora is an AI model specialized in video generation. The video notes that while Sora is highly anticipated as a new video model, it has not yet been released to the public. The model takes a new approach to video generation and broadens the possibilities for creative expression.

💡AI-generated film

An AI-generated film is a movie created with AI throughout the process, from script to footage to editing. The video highlights an AI-generated cover version of "Terminator 2" as the first full-length AI-generated film to be screened in theaters, an event that shows how much AI is beginning to affect filmmaking.

Highlights

New video generation platform Morph Studio launched in partnership with Stability AI

EmoTalker adds high-quality lip sync to still images along with emotive facial expressions

Adobe releases Project Music GenAI Control for AI-powered music generation and extension

Ideogram 1.0 update improves image quality and adds handy features like Magic Prompt

Insights into how OpenAI's Sora video generator may work, based on reverse-engineering research

First ever fully AI-generated remake of Terminator 2 to be screened in theaters

Morph Studio offers easy nodal workflow for generating and combining AI videos

EmoTalker trained on large dataset of video and images to enable multilingual speech

Project Music GenAI Control focuses on AI-powered music generation, extension and control

Ideogram's new 1.0 model improves image quality and aesthetics

Researchers attempting to reverse engineer Sora and develop similar video generators

Full AI remake of Terminator 2 involved 50 artists and extensive post work

Morph Studio allows combining multiple AI-generated videos into one workflow

EmoTalker utilizes Stable Diffusion but has limitations in tracking body movements

AI Terminator 2 remake getting theatrical premiere and livestream

Transcripts

play00:00

so it's been a pretty crazy week for

play00:02

Creative AI Tools in my last video I

play00:04

went over LTX Studio as well as the

play00:06

updates to Pika and Runway So today

play00:09

we're hitting all the other stuff well I

play00:11

mean at least as much as I can pack in

play00:13

today we've got news from Stability AI

play00:15

on a new video generation platform a

play00:18

really impressive new lip sync tool a

play00:21

big release from Adobe in the music

play00:22

space details on how Sora actually works

play00:26

and the first fully AI generated film to

play00:29

be screened in movie theater all right

play00:31

grab a cup of coffee and buckle up

play00:34

kicking off morph Studio have partnered

play00:36

with Stability AI to create an AI film

play00:38

making platform that has a really kind

play00:41

of cool workflow taking a look at this

play00:43

shot from a video that they've released

play00:45

you can see that it sort of has a

play00:46

vaguely ComfyUI node-based structure to

play00:50

it but I do assure you this is much more

play00:52

simple than ComfyUI if you've never used

play00:54

node-based workflows it's it takes a

play00:56

minute to get your head wrapped around

play00:58

but once you do it actually makes a lot

play01:00

of sense since you can see everything

play01:01

visually laid out here we have three

play01:03

different videos um with the style

play01:05

transfer on the third one and then as

play01:07

you connect them together you can export

play01:09

them out having control over the amount

play01:12

of influence each one gives not calling

play01:14

anybody out here but I did catch a typo

play01:16

in that first prompt a grill is looking

play01:18

at the camera uh again I'm not one to

play01:21

judge you guys catch me misspelling

play01:22

stuff all the time it does look like

play01:24

this video generator will be able to

play01:26

spell uh for example in that first video

play01:28

Morph Cloud we see the prompt come

play01:31

up that says a cloud that spells morph

play01:33

Billows out morph's co-founder XII I I

play01:37

hope I pronounced that correctly uh said

play01:39

filming editing and post-production

play01:40

used to be separate steps in traditional

play01:42

film making but AI blurs the boundaries

play01:45

of these stages and turns them into one

play01:47

continuous process if you aren't happy

play01:49

with the shot you can regenerate it on

play01:51

our canvas AI has introduced a new

play01:54

workflow to Film Production the platform

play01:56

aims to create a Vibrant Community by

play01:58

allowing users to to share their

play02:00

workflow templates with one another via

play02:03

the gallery this one does fall under

play02:04

weight list alert I just signed up

play02:06

myself so once I get access I will

play02:08

definitely be bringing you a full look

play02:11

and I know any new video model that

play02:12

comes out in the back of everyone's head

play02:14

is like Sora Sora Sora I know we'll

play02:16

talk about that in just a second next up

play02:18

we have Emo talker which will not only

play02:20

add lip syncing to any still image but

play02:23

it also adds heavy eyeliner to all of

play02:25

your

play02:28

characters

play02:31

seriously no matter what kind of music

play02:32

you're into go listen to the Black

play02:34

Parade it is a 10/10 album that

play02:36

transcends any genre good is good anyhow

play02:38

emo talker which is actually emote

play02:41

Portrait Alive which is not an acronym

play02:43

for emo is brought to us by Alibaba

play02:46

let's take a quick look at it in action

play02:47

here crying is the most beautiful thing

play02:49

you can do I encourage people to cry I

play02:52

cry all the time and I think it's the

play02:54

most healthy expression of how you're

play02:57

feeling there are a number of other

play02:58

examples that you can check check out at

play03:00

the link down below most of them have

play03:02

music on them though and like in this

play03:04

case this is Eminem's Rap God uh which

play03:08

is a very quick way for you know uh

play03:10

Marshall's lawyers to show up on the

play03:11

channel doorstep with a copyright strike

play03:13

so I can't play it here overall the

play03:16

thing that I'm actually super impressed

play03:17

with is kind of the emotive aspects of

play03:20

emo talker's performance there is still

play03:22

some issues I feel with like the lip

play03:24

flap but this is also an Eminem track in

play03:26

which he is rapping extremely fast uh

play03:30

but there is some issues with kind of

play03:32

like the lip movement tracking but where

play03:34

I think it really flies is with animated

play03:36

or you know kind of CG characters as we

play03:38

see in this example from the sleeper

play03:41

game Detroit: Become Human is really

play03:43

great when I was a kid I feel like you

play03:46

heard the thing you heard the term don't

play03:48

cry you don't need to cry digging into

play03:51

the paper emo talker was apparently

play03:52

trained on 250 hours of video and more

play03:56

than 150 million images it can also

play04:00

speak essentially in multiple languages

play04:02

it does apparently also use stable

play04:04

diffusion as its foundational framework

play04:06

although that said the results are very

play04:08

impressive but there are limitations for

play04:10

one you can only lip sync to still

play04:12

images so you can't like resync to video

play04:16

the paper also notes that they did not

play04:18

use any explicit control signals to

play04:20

control character movement which as they

play04:23

note uh may result in the inadvertent

play04:25

generation of other body parts such as

play04:26

hands leading to artifacts in the video

play04:29

so you know basically weird AI video my

play04:32

favorite emo talker has not been

play04:33

released yet but Pika did release their

play04:35

lip sync feature I did cover that in the

play04:37

last video link is down below moving on

play04:39

Adobe have released a new AI music

play04:42

project called Project Music GenAI

play04:45

Control I mean Adobe come on you got to

play04:47

step it up with the names here Photoshop

play04:49

Symphony InDesign Rhymes Lightroom

play04:51

lullabies I mean I'm not giving you any

play04:53

more for free call me Project Music was

play04:55

developed in collaboration with

play04:56

researchers at the University of

play04:58

California and Carnegie Mellon they

play05:00

released a promo video explaining some

play05:02

of the things that you can do with

play05:04

Project Music uh we'll take a look at

play05:06

the section on extending

play05:13

music all right here's the the

play05:15

lengthened

play05:25

one project music is just a research

play05:28

project we may see it in the future but

play05:31

not yet sliding back over to imagery

play05:33

Ideogram the free AI image generator

play05:36

that spells better than I do has

play05:39

released a 1.0 update yeah this one is

play05:41

really cool Ideogram always kind of sits

play05:43

in my back pocket when I'm trying to

play05:45

generate up something with text uh you

play05:47

know Midjourney claims that it's doing

play05:49

text but usually it's still kind of a

play05:51

garbled mess Ideogram has always really

play05:53

had an edge on that front the new 1.0

play05:56

model has really upped Ideogram's

play05:59

aesthetic game and it actually now has a

play06:01

magic prompt button that you can turn on

play06:03

that kind of fills out your text if you

play06:04

want to use it the Ideogram Community

play06:06

Feed is one of my favorites amongst all

play06:09

of the image generation Community feeds

play06:11

uh this one is actually really cool this

play06:12

is from Devil's tuna or this awesome

play06:14

penguin with the text cool people play

play06:16

base and use After Effects I don't know

play06:18

who made this but whoever did you are a

play06:19

cool person the best part about Ideogram

play06:21

is that it is free it allows for 25

play06:24

Generations per day obviously if you

play06:26

want more you can then move into one of

play06:28

their subscription tiers Ideogram also

play06:30

just secured $80 million in additional

play06:32

funding so that's great news hopefully

play06:34

it keeps that free tier free moving over

play06:36

to some Sora news YouTube's own Marques

play06:39

Brownlee got to play with it uh these

play06:40

are some of the generations that he got

play06:42

out of it I guess when you have 18.5

play06:45

million subscribers on YouTube you get

play06:46

to play with Sora and yes that is my

play06:49

subtle way of asking for like 18.4

play06:51

million of YouTube kindly hit the

play06:52

Subscribe button anyhow as great as Sora

play06:55

looks I still do not think that we will

play06:57

be getting it anytime soon although in

play06:59

my my last video I did mention that

play07:01

someone asked Cristóbal Valenzuela

play07:02

Runway ML's CEO if there would be Sora

play07:07

like outputs coming out of Runway

play07:08

anytime soon and he did say better on

play07:10

the heels of that a paper was recently

play07:12

released with a group of researchers who

play07:15

were basically reverse engineering what

play07:18

they saw in Sora I'm still going through

play07:20

the paper it is obviously very dense and

play07:22

it makes my head hurt quite a bit the

play07:25

paper does indicate that Sora utilizes

play07:27

SpaceTime latent patches which basically

play07:30

Break Down the video into smaller

play07:32

controllable pieces that it can

play07:34

understand in both space and time which

play07:36

allows for that sort of smooth

play07:38

continuity again the paper is super

play07:40

dense and while I have read a number of

play07:42

white papers at this point you know I'm

play07:44

still pretty much I'm like a caveman

play07:46

that's sitting in on a meeting at JPL

play07:48

you know I I can I can nod but you know

play07:50

at the same time do I really understand

play07:52

what's happening here but my overall

play07:54

point is that very smart people have

play07:56

already started to pull it apart and

play07:58

they are in the process of developing

play08:00

their own Sora like models even if open

play08:02

AI does not release Sora in say the next

play08:04

6 months I do think that we're going to

play08:07

see something that looks like Sora

play08:10

appearing within that time frame or

play08:13

maybe a little bit later basically

play08:14

anytime between tomorrow 6 months from

play08:16

now or some point in the future I've

play08:18

stopped predicting things because I'm

play08:20

always wrong I'm always wrong rounding

play08:21

out a historical event to my knowledge

play08:24

at least the first full-length AI

play08:27

generated film to be screened in a

play08:29

theater so a little while back I was on

play08:31

the nerdy novelist podcast where I said

play08:33

this I think a full feature movie is

play08:35

kind of out of the question currently with

play08:38

with the way the technology is right now

play08:39

although there are some lunatics right

play08:40

now that are creating um their own

play08:43

version of Terminator 2 wholly generated by

play08:46

Ai and just to be clear when I say

play08:47

lunatics I do mean that as a term of

play08:49

endearment but yeah they did it it is 50

play08:52

artists doing essentially a cover

play08:53

version of Terminator 2 wholly generated

play08:56

by AI uh you know obviously as we see

play08:59

here you know a lot of extra work went

play09:01

into that AI but it is all still AI

play09:04

generated the entire project is

play09:06

basically this Rogues gallery of really

play09:08

awesome and talented artists uh many of

play09:11

whom have been featured on this channel

play09:13

as well uh so yeah kudos to every single

play09:16

one of you for pulling this Insanity off

play09:19

the film will have a theatrical Premiere

play09:22

in Los Angeles at the Nuart Theatre on

play09:24

March 6th uh but don't worry if you

play09:26

don't live in Los Angeles or near Los

play09:28

Angeles you can still see the movie

play09:30

online there'll be a live stream for it

play09:32

on March 9th in which the cast and crew

play09:35

will be in attendance so you can watch

play09:37

it alongside them I did also want to

play09:39

point out that because this is being

play09:40

theatrically screened there was like a

play09:42

ton of work put into it even after

play09:45

everything was done because you know you

play09:46

still have to conform the picture to

play09:48

theatrical standard and like the sound

play09:50

mix has to be ready for you know a

play09:53

theater system overall from the stuff

play09:55

that I've seen from this remake I mean

play09:56

it's it's a parody it's hilarious

play09:59

definitely please do check it out link

play10:01

is down below well that's it for this

play10:02

week I mean that's it like there was

play10:04

like 80 things that happened this week

play10:05

and I did not even get to everything but

play10:07

I don't know we'll see what's in store

play10:08

for next week I thank you for watching

play10:10

my name is

play10:16

Tim