Why Midjourney was created? And The Man Behind it

Goda Go
31 Jan 2023 08:47

Summary

TLDR: The video explores Midjourney, the independent research lab behind the largest Discord server (9 million users), described in the video as the first consumer-driven AI model. Founded by the respected Silicon Valley entrepreneur David Holz, the lab uses more than 10,000 GPUs and did not train its first models itself, building instead on open-source components. Midjourney's long-term vision, as outlined by Holz, is to build new human infrastructure that fosters reflection, imagination, and coordination. The platform's success is attributed to its agile, fully remote team and its use of Discord for user testing and community building. Despite its impact, Midjourney keeps a low marketing profile: Holz describes the market as computationally limited for now, so the company does not need a product for everybody, and he suggests that custom chips could eventually ease that constraint.

Takeaways

  • 🚀 **Midjourney's Unique Position**: Midjourney is a significant player in the tech field, being the largest Discord server with 9 million users and one of the world's largest GPU users with over 10,000 GPUs.
  • 💡 **Founder's Influence**: David Holz, a respected Silicon Valley founder, is behind Midjourney, leveraging his reputation to secure resources like GPUs without needing venture funding.
  • 📈 **Leap Motion's Legacy**: David was also the mastermind behind Leap Motion, a company that pioneered mid-air gesture control before being acquired by Ultraleap in 2019.
  • 🌐 **Midjourney's Vision**: The long-term vision for Midjourney extends beyond creating image generation tools, aiming to build new human infrastructure and expand human imaginative powers.
  • 🤖 **Discord as a Testing Ground**: Midjourney chose Discord for its first user testing and continues to use it due to the team being fully remote and the platform's unique social layer for collective creativity.
  • 🧪 **User Interaction Insights**: Early user testing showed that users became more imaginative when interacting with the Midjourney bot in a social environment with others.
  • 🧩 **Open Source and Customization**: Midjourney built upon open-source technology, including OpenAI's CLIP, and trained their own model, version four, which took nine months.
  • 🌟 **Katherine Crowson's Contribution**: An independent researcher, Katherine Crowson, played a significant role in the foundation of diffusion models, which Midjourney utilized.
  • 🌍 **Global GPU Usage**: Midjourney's image production leverages GPUs across eight different regions worldwide, balancing usage based on local time zones to optimize efficiency.
  • ⚙️ **Scalability Challenges**: Scaling Midjourney to billions of users would require a significant increase in global computing resources, which presents both physical and energy expenditure challenges.
  • 🔮 **Future of Computing**: David predicts that the future may involve custom chips that could drastically reduce the computational requirements, potentially integrating neural networks directly onto the chip.

Q & A

  • Why was Midjourney created?

    -Midjourney was created with the vision of exploring new mediums of thought and expanding the imaginative powers of the human species. It aims to build new human infrastructure and focuses on themes like reflection, imagination, and coordination.

  • What is the significance of Midjourney being the largest Discord server?

    -Being the largest Discord server with 9 million users signifies the vast community engagement and collective creativity that Midjourney fosters. It also highlights the platform's popularity and the social layer it provides for interaction with a bot.

  • How does Midjourney utilize GPUs?

    -Midjourney is one of the largest GPU users in the world, utilizing more than 10,000 GPUs. An image may be produced in any of eight regions worldwide, and the company shifts load to regions such as Korea or the Netherlands while it is nighttime there to keep GPU usage balanced.

  • Who is behind the creation of Midjourney?

    -David Holz, a well-respected second-time founder in Silicon Valley and the mastermind behind Leap Motion, is behind Midjourney. His reputation and connections played a significant role in the rapid access to GPUs.

  • What is the long-term vision for Midjourney?

    -The long-term vision for Midjourney is to build new human infrastructure around three pillars: reflection, imagination, and coordination. It also aims to create an environment that enhances people's imagination and their beliefs about what they can do.

  • Why are Midjourney images of high quality?

    -The video attributes Midjourney's image quality to its approach: the team first combined and customized open-source components, such as OpenAI's CLIP, to understand the generative AI art space without training its initial models, and only later trained its own model, version four.

  • Why did Midjourney choose Discord for its platform?

    -Midjourney chose Discord because the team is fully remote, and it provided a unique social layer for collective creativity. Additionally, it allowed for agile performance and user testing within the same environment the team was using.

  • How does user interaction evolve on Midjourney's Discord platform?

    -On their own, users tend to type simple prompts and quickly lose interest. Placed in a shared environment with strangers, they build on each other's prompts and become more imaginative and engaged.

  • What is the role of Katherine Crowson in the development of Midjourney?

    -Katherine Crowson, an independent researcher, played a significant role in the foundation for diffusion models. Her work contributed to the understanding of generative AI art space, and she has recently started working for Stability AI.

  • Why is there limited media coverage and marketing for Midjourney?

    -According to the founder, the market is computationally limited for now, so Midjourney does not need to have a product for everybody. That is one reason the company stays relatively quiet on the marketing side.

  • What are the future challenges and opportunities for scaling Midjourney?

    -The future challenges for Midjourney include the need for significant computational power, which could lead to a computationally limited market for some time. Opportunities lie in the development of custom chips and new forms of infrastructure that could potentially increase efficiency and reduce energy consumption.

  • What is the potential future of AI and chip technology integration according to David?

    -David predicts a future where neural networks may be directly integrated into chips, eliminating the need for traditional memory structures. This could lead to a more efficient and energy-saving process where electricity inputs directly result in image outputs.

Outlines

00:00

🤔 Midjourney's Creation and Vision

The paragraph introduces the questions surrounding the creation of Midjourney, its long-term vision, and the exceptional quality of its images. It notes the lack of marketing and interviews, despite the company's significant achievements. Midjourney is highlighted as the largest Discord server with 9 million users and a major GPU user, leading to curiosity about how it rapidly acquired such resources. The paragraph also introduces David Holz, the respected founder behind Midjourney and previous successful ventures, emphasizing his influence in securing resources without venture funding. The founder's goals extend beyond creating an image generation tool, aiming to build new human infrastructure and expand human imaginative powers through themes of reflection, imagination, and coordination. The unique social layer provided by Discord for collective creativity is mentioned as a key aspect of Midjourney's success.

05:00

🌐 Midjourney's Technical Insights and Future Scaling Challenges

This paragraph delves into the technical aspects of Midjourney's operations, explaining that the company did not train its initial models but instead utilized and customized open-source components, including OpenAI's CLIP. It highlights Katherine Crowson's significant contributions to the foundation of diffusion models and her recent affiliation with Stability AI. The paragraph discusses how Midjourney leverages GPU resources globally to maintain operational efficiency and the minimal media coverage due to the company's consumer-driven AI model approach. It also addresses the immense scaling challenges faced by Midjourney and similar AI models, considering the limitations of current cloud computing and GPU capabilities. The founder, David, predicts potential scenarios for the future of computation, including the development of custom chips that could revolutionize the field. The paragraph concludes by noting the company's quiet marketing stance, reflecting its focus on a computationally limited market rather than trying to cater to everyone.
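
The time-zone balancing described above can be illustrated with a minimal Python sketch. It is not Midjourney's scheduler: the video only says that GPUs in regions such as Korea or the Netherlands are used while it is nighttime there. The region table, the fixed UTC offsets, the 22:00 to 06:00 night window, and the function names `is_night` and `night_regions` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical region table with fixed UTC offsets in hours (no daylight saving).
# The video names only Korea and the Netherlands; "us_west" is an invented example.
REGION_UTC_OFFSETS = {
    "korea": 9,
    "netherlands": 1,
    "us_west": -8,
}


def is_night(local_hour: int, night_start: int = 22, night_end: int = 6) -> bool:
    """Return True if local_hour falls in the assumed night window (wraps past midnight)."""
    return local_hour >= night_start or local_hour < night_end


def night_regions(now_utc: datetime) -> list[str]:
    """Return the regions where it is currently night, sorted by name."""
    selected = []
    for region, offset_hours in REGION_UTC_OFFSETS.items():
        local_time = now_utc + timedelta(hours=offset_hours)
        if is_night(local_time.hour):
            selected.append(region)
    return sorted(selected)


if __name__ == "__main__":
    # Show which assumed regions would be preferred right now.
    print(night_regions(datetime.now(timezone.utc)))
```

A real system would also need to weigh capacity, latency, and cost per region; this sketch only captures the idea of following the night around the globe.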

Keywords

💡Midjourney

Midjourney is an independent research lab, and the largest Discord server, that focuses on exploring new mediums of thought and expanding the imaginative powers of the human species. It is notable for its size, with 9 million users, and its use of over 10,000 GPUs. The lab aims to build new human infrastructure and is led by David Holz, a respected figure in Silicon Valley.

💡Discord

Discord is a platform primarily used for communication, on which Midjourney runs the largest server, with 9 million users. It provides a social layer to collective creativity and community interaction, which is essential for Midjourney's operations and user engagement.

💡GPUs (Graphics Processing Units)

GPUs are essential for the operation of Midjourney, as the lab utilizes over 10,000 of them to generate images. GPUs are critical for the computational power needed to handle the complex tasks of image generation and AI processing.

💡David Holz

David Holz is a well-respected second-time founder in Silicon Valley and the mastermind behind Midjourney. His reputation and connections have been a significant advantage in acquiring resources like GPUs and cloud services for the project.

💡Leap Motion

Leap Motion is a company founded by David Holz that was doing mid-air gesture control in 3D before Windows supported touchscreens. It represents Holz's previous venture in the tech field and showcases his history of innovation.

💡Open Source

Midjourney has utilized open-source technologies like OpenAI's CLIP in their development process. Open-source refers to software where the source code is available to the public, allowing for collaborative development and customization.

💡Generative AI Art

Generative AI art is a field where AI algorithms create original artwork. Midjourney is involved in this space, using AI to generate images based on textual prompts, showcasing the potential of AI in creative domains.

💡Katherine Crowson

Katherine Crowson is an independent researcher who contributed significantly to the foundation of diffusion models, which are a type of generative model used by Midjourney. Her work has been influential in the development of the lab's AI capabilities.

💡Stable Diffusion

Stable Diffusion is a model developed by Stability AI, which Katherine Crowson reportedly started working for. It represents another significant development in the field of generative AI and is mentioned as a noteworthy topic in the script.

💡Computational Limitations

The script discusses the computational limitations faced by AI models like Midjourney, particularly in terms of scaling to billions of users. It highlights the physical and energy constraints of increasing computational power through traditional means like GPUs.

💡Custom Chips

Custom chips are a potential solution to the computational limitations faced by AI models. The script suggests that significant investment in custom chips could lead to a dramatic increase in efficiency, possibly allowing for the integration of neural networks directly into the chip hardware.

💡Ethics of AI Art

The ethics of AI art is a topic that the script touches on, suggesting that there are moral and philosophical considerations in the creation and use of AI-generated art. This includes questions about creativity, authorship, and the role of humans in the artistic process.

Highlights

Midjourney is the largest Discord server with 9 million users, surpassing Genshin Impact's server which had around 1 million users.

Midjourney is one of the largest GPU users globally, utilizing over 10,000 GPUs.

David Holz, a well-respected Silicon Valley founder, is behind Midjourney and has leveraged his reputation to secure resources like GPUs.

David Holz was also the mastermind behind Leap Motion, a company focused on mid-air gesture control in 3D.

Midjourney's goals extend beyond creating a new image generation tool; it aims to build new human infrastructure and expand human imaginative powers.

Discord provides a unique social layer to collective creativity and community interaction with a bot, which is why Midjourney is on Discord.

Midjourney's team is fully remote, and the platform was tested with the same bot the team used, leading to significant discoveries.

User testing revealed that an environment with strangers can boost creativity and change users' beliefs about their capabilities.

Midjourney did not train their first two models but instead used and tinkered with open-source components like OpenAI's CLIP.

Version four of Midjourney, which took 9 months to train, was a significant milestone and partly due to Katherine Crowson's foundational work on diffusion models.

Katherine Crowson, an independent researcher, has recently started working for Stability AI, the company behind Stable Diffusion.

Midjourney's image generation can utilize GPUs from eight different regions worldwide to balance usage based on local time zones.

Scaling Midjourney to accommodate more users would require a significant increase in global data center and machine capacity.

David Holz predicts that the future of scaling AI might involve new custom chips that could reduce energy requirements by a factor of 10.

One potential development is a chip that has neural networks burned directly into it, eliminating the need for memory and reducing energy consumption.

Midjourney has been relatively quiet on the marketing side because they don't need to cater to every potential user due to computational limitations.

The founder of Midjourney, David Holz, did not directly respond to the inquiry, but his goals and vision for the platform have been inferred through research and interviews.

Transcripts

play00:00

dear founder of Midjourney, why did you create it?

play00:02

What is the long-term vision of it?

play00:04

And why Midjourney images are so damn good.

play00:07

I couldn't find answers to these questions online,

play00:10

There is no marketing, no interviews.

play00:12

it produces incredible images, and

play00:15

yet Nobody is interviewing the CEO.

play00:18

Nobody is talking.

play00:19

What is the future of it?

play00:21

so I did what any normal person would do...

play00:24

I wrote to the founder

play00:25

which is one reason why we actually are

play00:27

relatively quiet on the marketing side is because

play00:29

but before we dive in into what I discovered,

play00:32

we need to understand the context first,

play00:34

Why Midjourney is a big deal in the

play00:36

first place and what makes it so unique?

play00:38

I think at the end of this video

play00:39

you'll leave quite surprised.

play00:42

Let's cover basics.

play00:43

Did you know that Midjourney is the

play00:44

largest Discord server with 9 million users

play00:48

Previously Genshin Impact

play00:49

Server was the king of Discord.

play00:51

Around 1 million users.

play00:52

Midjourney is also one of the

play00:55

largest GPU users in the world

play00:58

utilizing more than 10,000 GPUs,

play01:01

that just the beginning.

play01:03

Suppose you are asking yourself how

play01:05

less than a year old research lab got

play01:08

access to this many GPUs So quickly?

play01:12

In that case, first we need to

play01:13

talk who is behind Midjourney?

play01:17

Tech field they kind of know that whatever

play01:18

David works on it's going to be cool right?

play01:20

meet David Holz, a well-respected

play01:23

second time founder in Silicon Valley,

play01:25

is a huge advantage of just

play01:26

like being a known factor.

play01:28

So when I needed to find a cloud

play01:30

vendor to give me 10,000 GPUs.

play01:31

I could just email the head of the cloud vendor

play01:33

and say: Hey, this is David doing a thing.

play01:36

And they go this David he's

play01:37

doing a thing and they would.

play01:38

If they could give me all the resources and it was

play01:40

kind of, I didn't need to have venture funding to

play01:42

do that because people effectively knew who I was,

play01:45

he is also the mastermind behind Leap Motion,

play01:47

before Windows supported touchscreen, Leap

play01:50

Motion was doing mid-air gesture control in 3D.

play01:54

and I think that's the same company

play01:56

which was quite heavily criticized

play01:58

for not really considering that.

play02:01

people want the whole day

play02:04

to do this while working.

play02:06

You get the drill.

play02:07

I, I don't need to explain to you.

play02:09

Okay.

play02:10

But it's still quite impressive though.

play02:12

Very physical things I can

play02:13

do, like, you know, Um,

play02:16

systems were simply not designed

play02:18

at the time for this level.

play02:20

Human machine interaction

play02:22

And yeah, Leap Motion was similar

play02:24

to early Consequently, Leap Motion

play02:26

was acquired by Ultra Leap in 2019.

play02:29

While I was naively waiting for David to reply,

play02:33

I came across only one interview with him

play02:36

behind the paywall, $12 later, let me tell you

play02:39

his goals with Midjourney is far beyond when

play02:42

creating just a new image generation tool.

play02:45

on Midjourney website it says, Midjourney

play02:47

is an independent research lab exploring

play02:50

new mediums of thought and expanding

play02:53

the imaginative powers of human species.

play02:56

What does it even mean?

play02:58

One of my goals at Midjourney, is to build new.

play03:01

Human infrastructure.

play03:02

And I think about like the world's going

play03:05

to need a lot of new things and we need

play03:06

infrastructure on which to build new things.

play03:08

And so I kind of think a lot about building

play03:10

like new pillars of infrastructure.

play03:11

So I needed my themes and my, my pillars were

play03:14

reflection, imagination, and coordination.

play03:17

You have to like reflect about

play03:18

who you are, what you want.

play03:19

You have to imagine what could be, and

play03:20

you have to coordinate to get there.

play03:21

It turned out that Discord provides this unique

play03:24

social layer to collective creativity and

play03:26

community aspect of interacting with a bot.

play03:29

Fun fact.

play03:30

The real reason why Midjourney is

play03:32

on Discord drum rolls, it's because

play03:34

Midjourney team is fully remote.

play03:38

And Midjourney was quite agile and performed

play03:41

its first user testing inside Discord with

play03:43

exactly the same bot that team was using.

play03:46

And what we discovered was game changer.

play03:48

When I did user testing, cuz

play03:49

it's, it was kind of unbelievable.

play03:50

It's like, you want like a let a person

play03:52

discover the product by themselves?

play03:53

We would do this and, and we'd

play03:54

be like, okay, here's a machine.

play03:55

It'll let you do a picture of anything you want,

play03:57

anything you can imagine, what do you want?

play03:58

And then just go.

play04:02

And, and, and it'll show 'em like

play04:03

a photo of a dog and they go.

play04:05

It's like, well, no, come on,

play04:06

because you're there first.

play04:08

What?

play04:08

On a little bit more than that.

play04:09

They go Big dog, and then

play04:11

it's like a, and keep pushing.

play04:13

They go big, fluffy dog, and it's still,

play04:15

and at the end they're so uninterested.

play04:17

It's like, this isn't as interesting,

play04:18

why would I care about this?

play04:19

But then you throw these people into the same

play04:21

environment all of a sudden with complete

play04:22

strangers, they go: Dog and someone else

play04:24

goes, space dog, space dog with lasers,

play04:27

space dog with lasers and angel wings.

play04:29

And all of a sudden this person's like, oh my God.

play04:31

Yeah.

play04:31

And they like, they've been put

play04:33

into this imaginative environment

play04:34

and it starts to kind of change...

play04:36

their, beliefs about themselves and what they

play04:38

can do and, and all of a sudden, like it's

play04:40

creating an imaginative environment that actually

play04:43

makes people more imaginative too.

play04:45

And you might be wondering,

play04:46

how does it actually do that?

play04:48

let's talk about things which I think you didn't

play04:50

know, but you should know about Midjourney

play04:53

First, Midjourney didn't train their first model.

play04:56

They didn't train their second model.

play04:58

In fact, they put a lot of open source

play05:00

stuff together and just started tinkering

play05:02

and making custom stuff with it

play05:04

They didn't need to train anything to

play05:07

understand generative AI art space.

play05:09

And some open source things included.

play05:11

OpenAI's CLIP, which even though didn't generate

play05:14

images but helped with the language stuff.

play05:17

Midjourney only trained their version

play05:18

four, which took them 9 months.

play05:21

two, and I'm excited about this one.

play05:23

a lot of first model training and

play05:25

foundation for diffusion models

play05:27

is thanks to this woman named Katherine Crowson

play05:32

she was independent researcher in the

play05:34

middle of nowhere, just training stuff.

play05:36

She never worked for anybody else.

play05:38

and you can't find much information about her.

play05:41

However, it seems like as of recently

play05:43

she started working for Stability AI,

play05:45

The guys and girls behind Stable Diffusion.

play05:48

watch out for a video on Stable

play05:50

Diffusion on this channel

play05:51

and we are getting to a very important stuff here

play05:53

right now, if you make an image with

play05:55

Midjourney, your image might be produced

play05:57

in eight different regions in the world.

play06:00

Midjourney will use a lot of

play06:01

GPUs in Korea or Netherlands

play06:04

while it is a nighttime,

play06:05

so basically they're racing the darkness

play06:08

of a night to keep GPU usage balanced,

play06:11

and you will understand very soon why...

play06:13

There is not much media coverage on Midjourney.

play06:16

There is no marketing, no interviews.

play06:19

And let me explain you why.

play06:21

when Midjourney got on everyone's radar in 2022,

play06:24

they have been the first consumer driven AI model.

play06:28

I would argue that changed with chatGPT

play06:31

entering the scene in November, 2022.

play06:33

however, both are bound by the

play06:35

physics of cloud computing and GPUs

play06:38

and to scale their access, not million.

play06:41

But billions of people,

play06:43

They will need dramatically

play06:45

rethink the fundamentals.

play06:47

To give you an idea how significant this is.

play06:49

if they wanted to scale their users and usage

play06:53

by the factor of 10 today, they would run out of

play06:56

all the data centers and machines in the world.

play07:00

it's more of a question of like, if we need a

play07:01

thousand times compute in the cloud, there's

play07:03

nothing, there's, there's almost like having

play07:05

a thousand times more GPUs in the cloud,

play07:07

is going to be an incredible physical.

play07:10

Expenditure of energy, not, not electricity,

play07:12

but it's like literally just like making that

play07:14

many machines and making that many data centers.

play07:16

And so if we actually do need that much more

play07:19

compute, I think the question opportunity is

play07:21

probably in saying what maybe isn't a GPU?

play07:24

And to scale.

play07:25

David predicts with two scenarios.

play07:28

I don't know what it will look

play07:29

like, there's kind of two worlds.

play07:31

One is that it just takes us

play07:33

seven years to scale a thousand.

play07:35

and for the next seven years we're just,

play07:37

the market is computationally limited,

play07:38

which would be really interesting.

play07:39

Uh, and then the other argument is like sometime

play07:41

in that period was like huge new forms, like,

play07:45

like significant energy put into custom chips,

play07:47

which could maybe drive it down another factor of

play07:49

10, and then all of a sudden happens in like one

play07:51

year So I don't know what will happen there.

play08:00

I'm aware of like one really cool

play08:01

chip effort, which like maybe in a few

play08:03

years you would like burn the, the, the

play08:05

neural network into the chip directly.

play08:06

And then like, it's like there aren't,

play08:08

there aren't even any memory anymore.

play08:09

It's like the, the transistors

play08:10

themselves hold the weights.

play08:12

So like electricity comes in and images come out.

play08:14

There's like, not even necessarily even

play08:15

a clock and like you could just kind of

play08:17

make these, that would be really cool.

play08:19

And so for the next few years I think

play08:21

these markets are going to be like limited

play08:23

computationally more than anything which is

play08:26

one reason why we actually are relatively

play08:28

quiet on the marketing side is because we, we

play08:30

don't have to have a product for everybody.

play08:32

So did David answer me?

play08:34

No.

play08:34

But in the meantime, I interviewed

play08:36

chatGPT on ethics of AI art,

play08:39

and I think some of that stuff

play08:40

might really surprise you,

Related Tags
AI Art, Discord, Generative AI, Silicon Valley, Leap Motion, David Holz, Cloud Computing, GPU Usage, Innovative Tech, Imagination, Infrastructure, Remote Teams, Interview Insights, Ethics of AI, Computational Limitations, Chip Technology, Energy Efficiency