Why Was Midjourney Created? And the Man Behind It
Summary
TLDR: The video explores Midjourney, an independent research lab that runs the largest Discord server (9 million users) and one of the first consumer-driven AI models. Founded by David Holz, a well-respected Silicon Valley entrepreneur, the lab uses more than 10,000 GPUs and did not train its first models itself, building instead on open-source components. Holz's long-term vision is to build new human infrastructure around reflection, imagination, and coordination. The video attributes the platform's success to its agile, fully remote team and to its use of Discord for user testing and community building. Midjourney keeps a low marketing profile because the market is computationally limited, so it does not yet need a product for everybody; Holz expects that custom chips could change this.
Takeaways
- 🚀 **Midjourney's Unique Position**: Midjourney is a significant player in the tech field, being the largest Discord server with 9 million users and one of the world's largest GPU users with over 10,000 GPUs.
- 💡 **Founder's Influence**: David Holz, a respected Silicon Valley founder, is behind Midjourney, leveraging his reputation to secure resources like GPUs without needing venture funding.
- 📈 **Leap Motion's Legacy**: David was also the mastermind behind Leap Motion, a company that pioneered mid-air gesture control before being acquired in 2019 by Ultrahaptics, which then rebranded as Ultraleap.
- 🌐 **Midjourney's Vision**: The long-term vision for Midjourney extends beyond creating image generation tools, aiming to build new human infrastructure and expand human imaginative powers.
- 🤖 **Discord as a Testing Ground**: Midjourney chose Discord for its first user testing and continues to use it due to the team being fully remote and the platform's unique social layer for collective creativity.
- 🧪 **User Interaction Insights**: Early user testing showed that users became more imaginative when interacting with the Midjourney bot in a social environment with others.
- 🧩 **Open Source and Customization**: Midjourney built upon open-source technology, including OpenAI's CLIP, and trained their own model, version four, which took nine months.
- 🌟 **Katherine Crowson's Contribution**: An independent researcher, Katherine Crowson, played a significant role in the foundation of diffusion models, which Midjourney utilized.
- 🌍 **Global GPU Usage**: Midjourney's image production leverages GPUs across eight different regions worldwide, balancing usage based on local time zones to optimize efficiency.
- ⚙️ **Scalability Challenges**: Scaling Midjourney to billions of users would require a significant increase in global computing resources, which presents both physical and energy expenditure challenges.
- 🔮 **Future of Computing**: David predicts that the future may involve custom chips that could drastically reduce the computational requirements, potentially integrating neural networks directly onto the chip.
Q & A
Why was Midjourney created?
-Midjourney was created with the vision of exploring new mediums of thought and expanding the imaginative powers of the human species. It aims to build new human infrastructure and focuses on themes like reflection, imagination, and coordination.
What is the significance of Midjourney being the largest Discord server?
-Being the largest Discord server with 9 million users signifies the vast community engagement and collective creativity that Midjourney fosters. It also highlights the platform's popularity and the social layer it provides for interaction with a bot.
How does Midjourney utilize GPUs?
-Midjourney is one of the largest GPU users in the world, utilizing more than 10,000 GPUs. An image may be generated in any of eight regions worldwide, and the company shifts load toward regions where it is currently nighttime (the video mentions Korea and the Netherlands) to keep GPU usage balanced.
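The night-following idea can be illustrated with a minimal Python sketch. It is not Midjourney's scheduler: only Korea and the Netherlands are named in the video, and the fixed UTC offsets, the 22:00 to 06:00 night window, and every function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical region table: only Korea and the Netherlands are named in the
# video. Offsets are fixed approximations that ignore daylight saving time.
REGION_UTC_OFFSETS = {
    "korea": 9,
    "netherlands": 1,
}

NIGHT_START_HOUR = 22  # assumed start of the local night window
NIGHT_END_HOUR = 6     # assumed end of the local night window


def local_hour(utc_now: datetime, utc_offset_hours: int) -> int:
    """Return the local hour (0-23) in a region for a given UTC time."""
    return (utc_now + timedelta(hours=utc_offset_hours)).hour


def is_night(hour: int) -> bool:
    """True if the hour falls inside the window that wraps around midnight."""
    return hour >= NIGHT_START_HOUR or hour < NIGHT_END_HOUR


def night_regions(utc_now: datetime) -> list[str]:
    """List the regions where it is currently night, in table order."""
    return [
        name
        for name, offset in REGION_UTC_OFFSETS.items()
        if is_night(local_hour(utc_now, offset))
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("Regions currently in their night window:", night_regions(now))
```

A real scheduler would also weigh capacity, latency, and queue depth; this sketch only answers the question "where is it night right now?".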
Who is behind the creation of Midjourney?
-David Holz, a well-respected second-time founder in Silicon Valley and the mastermind behind Leap Motion, is behind Midjourney. His reputation and connections played a significant role in the rapid access to GPUs.
What is the long-term vision for Midjourney?
-The long-term vision for Midjourney is to build new human infrastructure resting on three pillars: reflection, imagination, and coordination. It also aims to create an environment that makes people more imaginative and changes their beliefs about what they can do.
Why are Midjourney images of high quality?
-The video credits Midjourney's image quality to its approach: the team combined and customized open-source components, such as OpenAI's CLIP, to learn the generative AI art space before training a model of its own (version four, which took nine months).
Why did Midjourney choose Discord for its platform?
-Midjourney chose Discord because the team is fully remote, and it provided a unique social layer for collective creativity. Additionally, it allowed for agile performance and user testing within the same environment the team was using.
How does user interaction evolve on Midjourney's Discord platform?
-User interaction on Midjourney's Discord platform evolves from simple, individual prompts to more imaginative and collaborative creations when users are placed in an environment with strangers, sparking a more imaginative and engaging experience.
What is the role of Katherine Crowson in the development of Midjourney?
-Katherine Crowson, an independent researcher, played a significant role in the foundation for diffusion models. Her work contributed to the understanding of generative AI art space, and she has recently started working for Stability AI.
Why is there limited media coverage and marketing for Midjourney?
-According to David, the market is computationally limited: there is not enough compute to serve everyone. Because Midjourney does not need a product for everybody at this stage, it has little reason to market extensively.
What are the future challenges and opportunities for scaling Midjourney?
-The future challenges for Midjourney include the need for significant computational power, which could lead to a computationally limited market for some time. Opportunities lie in the development of custom chips and new forms of infrastructure that could potentially increase efficiency and reduce energy consumption.
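To make the scale of that challenge concrete, here is a small Python sketch of the arithmetic behind the two scenarios David describes in the transcript: a thousandfold increase in compute over seven years, or custom chips that cover one factor of 10. The constant-yearly-growth model and the function name are assumptions for illustration, not figures from the video.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    if total_factor <= 0 or years <= 0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1.0 / years)


# Scenario one from the video: a thousandfold increase over seven years.
per_year = annual_growth_factor(1000.0, 7.0)

# Scenario two: custom chips are assumed to cover one factor of 10,
# leaving the remainder to conventional build-out.
remaining_factor = 1000.0 / 10.0

print(f"implied growth per year: {per_year:.2f}x")
print(f"remaining factor after a 10x chip gain: {remaining_factor:.0f}x")
```

Under these assumptions, the seven-year scenario implies compute growing by roughly 2.7x every year, which gives a sense of why the market could stay computationally limited for a while.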
What is the potential future of AI and chip technology integration according to David?
-David predicts a future where neural networks may be directly integrated into chips, eliminating the need for traditional memory structures. This could lead to a more efficient and energy-saving process where electricity inputs directly result in image outputs.
Outlines
🤔 Midjourney's Creation and Vision
The paragraph introduces the questions surrounding the creation of Midjourney, its long-term vision, and the exceptional quality of its images. It notes the lack of marketing and interviews, despite the company's significant achievements. Midjourney is highlighted as the largest Discord server with 9 million users and a major GPU user, which raises the question of how it acquired such resources so quickly. The paragraph also introduces David Holz, the respected founder behind Midjourney and, before that, Leap Motion, emphasizing how his reputation helped secure resources without venture funding. His goals extend beyond creating an image generation tool: he aims to build new human infrastructure and expand human imaginative powers through the themes of reflection, imagination, and coordination. The unique social layer that Discord provides for collective creativity is mentioned as a key aspect of Midjourney's success.
🌐 Midjourney's Technical Insights and Future Scaling Challenges
This paragraph delves into the technical aspects of Midjourney's operations, explaining that the company did not train its initial models but instead utilized and customized open-source components, including OpenAI's CLIP. It highlights Katherine Crowson's significant contributions to the foundation of diffusion models and her recent affiliation with Stability AI. The paragraph discusses how Midjourney spreads GPU usage across regions worldwide to keep it balanced. It also addresses the immense scaling challenges faced by Midjourney and similar consumer-driven AI models, given the limits of current cloud computing and GPU capacity. The founder, David, outlines two scenarios for the future of computation, including the development of custom chips that could revolutionize the field. The paragraph concludes by tying the company's quiet marketing stance to this computationally limited market: it does not need to cater to everyone yet.
Keywords
💡Midjourney
💡Discord
💡GPUs (Graphics Processing Units)
💡David Holz
💡Leap Motion
💡Open Source
💡Generative AI Art
💡Katherine Crowson
💡Stable Diffusion
💡Computational Limitations
💡Custom Chips
💡Ethics of AI Art
Highlights
Midjourney is the largest Discord server with 9 million users, surpassing Genshin Impact's server which had around 1 million users.
Midjourney is one of the largest GPU users globally, utilizing over 10,000 GPUs.
David Holz, a well-respected Silicon Valley founder, is behind Midjourney and has leveraged his reputation to secure resources like GPUs.
David Holz was also the mastermind behind Leap Motion, a company focused on mid-air gesture control in 3D.
Midjourney's goals extend beyond creating a new image generation tool; it aims to build new human infrastructure and expand human imaginative powers.
Discord provides a unique social layer to collective creativity and community interaction with a bot, which is why Midjourney is on Discord.
Midjourney's team is fully remote, and the platform was tested with the same bot the team used, leading to significant discoveries.
User testing revealed that an environment with strangers can boost creativity and change users' beliefs about their capabilities.
Midjourney did not train their first two models but instead used and tinkered with open-source components like OpenAI's CLIP.
Version four, the first model Midjourney trained itself, took 9 months to train and built in part on Katherine Crowson's foundational work on diffusion models.
Katherine Crowson, an independent researcher, has recently started working for Stability AI, the company behind Stable Diffusion.
Midjourney's image generation can utilize GPUs from eight different regions worldwide to balance usage based on local time zones.
Scaling Midjourney to accommodate more users would require a significant increase in global data center and machine capacity.
David Holz predicts that the future of scaling AI might involve new custom chips that could reduce compute requirements by a further factor of 10.
One potential development is a chip that has neural networks burned directly into it, eliminating the need for memory and reducing energy consumption.
Midjourney has been relatively quiet on the marketing side because they don't need to cater to every potential user due to computational limitations.
The founder of Midjourney, David Holz, did not directly respond to the inquiry, but his goals and vision for the platform have been inferred through research and interviews.
Transcripts
Dear founder of Midjourney, why did you create it?
What is the long-term vision for it?
And why are Midjourney images so damn good?
I couldn't find answers to these questions online.
There is no marketing, no interviews.
It produces incredible images, and
yet nobody is interviewing the CEO.
Nobody is talking.
What is the future of it?
So I did what any normal person would do...
I wrote to the founder.
which is one reason why we actually are
relatively quiet on the marketing side is because
But before we dive into what I discovered,
we need to understand the context first:
why is Midjourney a big deal in the
first place, and what makes it so unique?
I think at the end of this video
you'll leave quite surprised.
Let's cover the basics.
Did you know that Midjourney is the
largest Discord server, with 9 million users?
Previously, the Genshin Impact
server was the king of Discord,
with around 1 million users.
Midjourney is also one of the
largest GPU users in the world,
utilizing more than 10,000 GPUs,
and that's just the beginning.
If you are asking yourself how a
research lab less than a year old got
access to this many GPUs so quickly,
then first we need to
talk about who is behind Midjourney.
Tech field they kind of know that whatever
David works on it's going to be cool right?
Meet David Holz, a well-respected
second-time founder in Silicon Valley.
is a huge advantage of just
like being a known factor.
So when I needed to find a cloud
vendor to give me 10,000 GPUs.
I could just email the head of the cloud vendor
and say: "Hey, this is David doing a thing."
And they go, "This is David, he's
doing a thing," and they would,
if they could, give me all the resources. And it was
kind of, I didn't need to have venture funding to
do that because people effectively knew who I was.
He is also the mastermind behind Leap Motion.
Before Windows supported touchscreens, Leap
Motion was doing mid-air gesture control in 3D.
And I think that's the same company
that was quite heavily criticized
for not really considering whether
people want to do this the whole
day while working.
You get the drill.
I don't need to explain it to you.
Okay.
But it's still quite impressive though.
Very physical things I can
do, like, you know, Um,
Systems were simply not designed
at the time for this level of
human-machine interaction.
And yeah, Leap Motion was similar
to early... Consequently, Leap Motion
was acquired by Ultraleap in 2019.
While I was naively waiting for David to reply,
I came across only one interview with him,
behind a paywall. $12 later, let me tell you:
his goals with Midjourney go far beyond
creating just a new image generation tool.
On the Midjourney website it says: Midjourney
is an independent research lab exploring
new mediums of thought and expanding
the imaginative powers of the human species.
What does it even mean?
One of my goals at Midjourney is to build new
human infrastructure.
And I think about like the world's going
to need a lot of new things and we need
infrastructure on which to build new things.
And so I kind of think a lot about building
like new pillars of infrastructure.
So I needed my themes and my, my pillars were
reflection, imagination, and coordination.
You have to like reflect about
who you are, what you want.
You have to imagine what could be, and
you have to coordinate to get there.
It turned out that Discord provides this unique
social layer for collective creativity and the
community aspect of interacting with a bot.
Fun fact.
The real reason why Midjourney is
on Discord (drum roll) is that
the Midjourney team is fully remote.
And Midjourney was quite agile and performed
its first user testing inside Discord with
exactly the same bot that the team was using.
And what they discovered was a game changer.
When I did user testing, cuz
it's, it was kind of unbelievable.
It's like, you want like a let a person
discover the product by themselves?
We would do this and, and we'd
be like, okay, here's a machine.
It'll let you do a picture of anything you want,
anything you can imagine, what do you want?
And then just go.
And, and, and it'll show 'em like
a photo of a dog and they go.
It's like, well, no, come on,
because you're there first.
What?
On a little bit more than that.
They go Big dog, and then
it's like a, and keep pushing.
They go big, fluffy dog, and it's still,
and at the end they're so uninterested.
It's like, this isn't as interesting,
why would I care about this?
But then you throw these people into the same
environment all of a sudden with complete
strangers, they go: Dog and someone else
goes, space dog, space dog with lasers,
space dog with lasers and angel wings.
And all of a sudden this person's like, oh my God.
Yeah.
And they like, they've been put
into this imaginative environment
and it starts to kind of change...
their, beliefs about themselves and what they
can do and, and all of a sudden, like it's
creating an imaginative environment that actually
makes people more imaginative too.
And you might be wondering,
how does it actually do that?
Let's talk about things which I think you didn't
know, but should know, about Midjourney.
First, Midjourney didn't train their first model.
They didn't train their second model.
In fact, they put a lot of open source
stuff together and just started tinkering
and making custom stuff with it.
They didn't need to train anything to
understand the generative AI art space.
The open source pieces included
OpenAI's CLIP, which doesn't generate
images itself but helped with the language side.
Midjourney only trained their version
four, which took them 9 months.
Two, and I'm excited about this one:
a lot of the first model training and
the foundation for diffusion models
is thanks to a woman named Katherine Crowson.
She was an independent researcher in the
middle of nowhere, just training stuff.
She never worked for anybody else,
and you can't find much information about her.
However, it seems that recently
she started working for Stability AI,
the guys and girls behind Stable Diffusion.
Watch out for a video on Stable
Diffusion on this channel.
And we are getting to some very important stuff here.
Right now, if you make an image with
Midjourney, your image might be produced
in any of eight different regions in the world.
Midjourney will use a lot of
GPUs in Korea or the Netherlands
while it is nighttime there,
so basically they're racing the darkness
of the night to keep GPU usage balanced,
and you will understand very soon why...
There is not much media coverage on Midjourney.
There is no marketing, no interviews.
And let me explain why.
When Midjourney got on everyone's radar in 2022,
it was the first consumer-driven AI model.
I would argue that changed with ChatGPT
entering the scene in November 2022.
However, both are bound by the
physics of cloud computing and GPUs,
and to scale their access to not millions
but billions of people,
they will need to dramatically
rethink the fundamentals.
To give you an idea of how significant this is:
if they wanted to scale their users and usage
by a factor of 10 today, they would run out of
all the data centers and machines in the world.
it's more of a question of like, if we need a
thousand times compute in the cloud, there's
nothing, there's, there's almost like having
a thousand times more GPUs in the cloud,
is going to be an incredible physical
expenditure of energy, not, not electricity,
but it's like literally just like making that
many machines and making that many data centers.
And so if we actually do need that much more
compute, I think the question opportunity is
probably in saying what maybe isn't a GPU?
As for how to scale,
David predicts two scenarios.
I don't know what it will look
like, there's kind of two worlds.
One is that it just takes us
seven years to scale a thousand times,
and for the next seven years we're just,
the market is computationally limited,
which would be really interesting.
Uh, and then the other argument is like sometime
in that period was like huge new forms, like,
like significant energy put into custom chips,
which could maybe drive it down another factor of
10, and then all of a sudden happens in like one
year. So I don't know what will happen there.
I'm aware of like one really cool
chip effort, which like maybe in a few
years you would like burn the, the, the
neural network into the chip directly.
And then like, it's like there aren't,
there aren't even any memory anymore.
It's like the, the transistors
themselves hold the weights.
So like electricity comes in and images come out.
There's like, not even necessarily even
a clock and like you could just kind of
make these, that would be really cool.
And so for the next few years I think
these markets are going to be like limited
computationally more than anything, which is
one reason why we actually are relatively
quiet on the marketing side is because we, we
don't have to have a product for everybody.
So did David answer me?
No.
But in the meantime, I interviewed
ChatGPT on the ethics of AI art,
and I think some of that stuff
might really surprise you.