Ilya Sutskever | The future of AGI will be like what you see in the movies
Summary
TLDR: This video transcript explores the complexity of neural networks and their applications in artificial intelligence. It discusses why neural networks are hard to interpret precisely, and how that very property may be an advantage for building true intelligence. It also reviews the history of neural networks, from early struggles to the rapid progress of recent years, covers the founding motivations of OpenAI, and looks ahead to future AI technology, especially its application to programming, law, and other white-collar professions.
Takeaways
- 🧠 Writing a neural network differs from conventional programming: you write a complicated equation inside a loop, and once it runs it is hard to understand precisely what it does — the root of neural networks' lack of interpretability.
- 🤖 That lack of interpretability may be a feature rather than a bug: genuine intelligence is inherently hard to understand, just like our own cognitive functions.
- 📈 Part of neural networks' success stems from the fact that they are hard to reason about mathematically, in contrast to other approaches in early AI.
- 🕵️♂️ Early neural-network researchers needed long-term perseverance, because the AI field dismissed neural networks until around 2010, when they finally gained attention.
- 🚀 As computing power grew, the potential of neural networks was reappraised and expectations for them rose accordingly.
- 🌐 OpenAI was founded on the vision of tightly merging science and engineering, together with a sober understanding of AI's potential impact.
- 🛡️ One of OpenAI's goals is to ensure AI is used safely and beneficially, while also attending to policy and ethical questions.
- 🔮 The future of neural networks is not only about technical progress but also about addressing the challenges and problems AI creates.
- 🔢 The scaling behavior of neural-network models shows that performance keeps improving as compute and datasets grow.
- 💡 Generalization is one of neural networks' core strengths, but today large amounts of data are still needed to compensate for its shortfall relative to humans.
- 🎨 Neural networks perform surprisingly well on creative tasks, likely because generative models — which show great promise for art — are central to how they work.
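The scaling takeaway above can be sketched numerically. This is a hedged illustration only: losses on large models are often modeled as a power law in compute, L(C) = a·C^(−b), and the coefficients below are invented for the demo, not measured values.

```python
# Illustrative power-law scaling curve: loss as a function of compute.
# The coefficients a and b are hypothetical, chosen only to show the shape;
# real scaling-law exponents are fit empirically.

def loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Power-law loss model L(C) = a * C**(-b)."""
    return a * compute ** (-b)

if __name__ == "__main__":
    for c in [1e3, 1e6, 1e9, 1e12]:
        print(f"compute={c:.0e}  loss={loss(c):.3f}")
```

A consequence of the power-law shape: every fixed multiplicative increase in compute lowers the loss by the same constant factor, which is why "make it larger and it will be better" has kept paying off.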
Q & A
Why does Sutskever consider the difficulty of understanding neural networks a feature rather than a bug?
-If we want to build intelligence, it should not be simple to understand. Just as we cannot explain how we perform cognitive functions like seeing, hearing, and understanding language, computers producing objects that are similarly difficult to understand may mean we are on the right track.
When did Sutskever first become interested in neural networks and start working with Geoff Hinton?
-He became interested in the early 2000s and began working with Geoff Hinton in 2003.
Why does Sutskever think part of neural networks' success comes from being hard to reason about mathematically?
-Because they resist mathematical analysis, neural networks avoided the seductive trap of theorem-proving and kept the focus on what actually drives progress in the field.
What are Sutskever's hopes for the future of AI?
-He hopes we can build not only powerful and useful AI but actual AGI, and ensure it is used to solve a large number of problems and create many amazing applications.
Why did Sutskever leave Google and co-found OpenAI?
-One motivation was his belief that merging science and engineering is the right way to make progress in AI. Another was that he had come to see AI in a more sober way: he wanted a company aware of the challenges AI will pose and committed to addressing them.
What were OpenAI's initial vision and goals?
-The initial vision was to merge science and engineering with as little distinction as possible between them, infusing the science with engineering discipline and the engineering with scientific ideas. He also wanted OpenAI to operate with an awareness of AI's complexity and to work on making AI safe and beneficial.
Why does Sutskever consider merging science and engineering important?
-As the field matures, small-scale tinkering is no longer enough to make substantive progress. Merging the two ensures that scientific discoveries are carefully engineered and that engineering is guided by scientific ideas.
Why is data availability critical to neural-network performance?
-Neural networks need large amounts of training data to generalize well. In data-rich domains such as language they can perform very well, but in data-poor domains such as law they may not reach the same level.
How does Sutskever view neural networks applied to programming, such as the Codex model?
-He is excited about Codex because it shows a neural network can go from natural language to code — a significant advance for AI. Codex can take a requirement described in natural language and produce runnable code, which may change the nature of the programming profession.
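The training objective behind Codex — predict the next token of code — can be shown at toy scale. This is a minimal sketch, not how Codex works internally: it counts token bigrams in a tiny invented corpus and predicts the most frequent successor, whereas Codex uses a large Transformer trained on vast amounts of real code.

```python
from collections import Counter, defaultdict

# Toy next-token predictor over code, illustrating the objective behind
# models like Codex (which use a large Transformer, not bigram counts).
# The corpus is invented for the demo.
corpus = [
    "def add ( a , b ) : return a + b",
    "def sub ( a , b ) : return a - b",
]

# Count how often each token follows each other token.
successors = defaultdict(Counter)
for line in corpus:
    tokens = line.split()
    for cur, nxt in zip(tokens, tokens[1:]):
        successors[cur][nxt] += 1

def predict_next(token: str) -> str:
    """Return the successor seen most often in training."""
    return successors[token].most_common(1)[0][0]

print(predict_next("return"))  # "a" — the only token seen after "return"
print(predict_next("("))       # "a" — the only token seen after "("
```

Even this crude model recovers bits of syntax from prediction alone; scaled up by many orders of magnitude, the same objective starts to recover semantics, which is the point made in the interview.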
Why might neural networks do better on creative tasks than on simple robotics tasks?
-Creative tasks such as image and text generation have enormous amounts of digital data to learn from, while robotics tasks are limited by how much data can be collected in the real world.
How does Sutskever view the generalization ability of neural networks?
-Generalization is one of their core strengths: it lets a network respond correctly to situations unlike its training data. However, this ability is not yet at a human level, so large amounts of data are needed to compensate.
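The interview's analogy of the memorizing student versus the fundamentals student can be sketched as: a lookup-table "memorizer" nails the training set but fails on anything unseen, while a model that captures the underlying rule generalizes. The rule y = 3x + 1 and the data here are invented purely for illustration.

```python
# Sketch of the memorizer-vs-generalizer contrast from the interview.
# Training data follows an invented rule y = 3x + 1.
train = {1: 4, 2: 7, 3: 10}

def memorizer(x):
    """First student: perfect recall of seen examples, no underlying rule."""
    return train.get(x)  # None on any unseen input

def generalizer(x):
    """Second student: learned the underlying rule from the same data."""
    return 3 * x + 1

for x in [2, 50]:  # one seen input, one unseen
    print(x, memorizer(x), generalizer(x))
```

Both agree on the training points; only the generalizer can answer x = 50. Today's networks sit between these extremes, which is why they still need far more data than a human to reach the same competence.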
What does Sutskever expect for the future of neural networks?
-He expects their generalization to keep improving, reducing the dependence on large amounts of data. He also expects them to get better at understanding and generating code, changing the nature of programming and other white-collar professions.
Why does Sutskever think creative applications are especially well suited to neural networks?
-Generative models play a central role in machine learning, and the way they generate new, plausible data is analogous to the artistic process.
Outlines
🤖 神经网络的不可解释性与智能发展
第一段主要讨论了神经网络与传统编程的区别,强调了神经网络的复杂性和不可解释性。作者认为,神经网络的这种特性并不是缺陷,而是其作为智能系统的一部分,反映了人类智能的复杂性。此外,提到了作者在2000年初对神经网络产生兴趣,并与Jeff Hinton合作,以及他对神经网络未来潜力的期待和对AI领域发展方向的早期看法。
🌐 AI技术的双刃剑特性及其全球影响
第二段中,讨论了AI技术潜在的积极影响和挑战,强调了AI的复杂性,以及它可能带来的问题和对人类生活的改善。作者表达了对AI技术未来发展的深刻认识,包括对政策和技术安全性的考虑,以及AI技术在全球范围内的分布和应用将如何定义未来几十年的世界格局。
🔬 科学与工程的融合以及AI的未来发展
第三段讲述了作者创立OpenAI的初衷,包括将科学与工程融合以推动AI发展,以及对AI技术潜在影响的深思熟虑。作者认为,科学与工程的紧密结合是AI领域成熟的标志,并且对于如何使用AI技术解决挑战和政策问题有着清晰的愿景。
📈 深度学习的扩展性与数据的重要性
第四段深入探讨了深度学习模型的扩展性,特别是随着计算能力和数据量的增加,模型性能的提升。作者指出,尽管有些人担心我们已经达到了深度学习的极限,但历史表明这种担忧往往是短视的。同时,作者也提出了数据量对于模型性能的影响,尤其是在数据较少的领域,如法律领域,AI的发展可能会面临挑战。
🛠️ Codex模型:从自然语言到代码的转变
第五段讨论了Codex模型,这是一个大型的GPT神经网络,它被训练来预测代码中的下一个词,而不是文本。作者强调了这种模型的实用性和创新性,以及它如何改变编程领域。Codex模型能够将自然语言转换为机器可执行的代码,这不仅展示了AI在编程领域的潜力,也预示着编程职业未来的变革。
🎨 AI在创意领域的应用及其对社会的影响
最后一段探讨了AI在创意领域,如艺术和写作中的应用,以及这些应用对社会的潜在影响。作者提到,尽管人们普遍认为简单的机器人任务将是自动化的首批目标,但实际上创意任务受到了较大的影响。AI在生成艺术作品和文本方面取得了显著进展,这可能会对社会经济和职业结构产生深远的影响。
Mindmap
Keywords
💡神经网络
💡解释性
💡深度学习
💡Transformer
💡预测
💡理解
💡数据集
💡计算能力
💡Codex
💡AGI(通用人工智能)
💡创新
💡泛化能力
💡创造性任务
Highlights
神经网络与传统编程方式不同,通过循环中的复杂方程式来实现,难以精确理解其功能。
神经网络的不透明性可能是其特性而非缺陷,类似于人类智能的复杂性。
作者在2000年代初对神经网络产生兴趣,并在2003年开始与Jeff Hinton合作。
AI领域曾长期处于探索阶段,需要大量坚持和毅力。
神经网络之所以成功,部分原因是它们在数学上的难以推理性。
OpenAI的创立初衷是将科学和工程融合,以推动AI发展。
AI技术的发展需要同时关注技术进步和安全性、政策等方面。
AI的发展将深刻影响国家间的力量对比和世界的未来走向。
跨学科融合是创新的重要途径,如苹果公司硬件与软件的结合。
GPT模型展示了深度学习模型在工程上的极致要求和科研上的创新需求。
预测能力与理解能力紧密相关,好预测需要一定程度的理解。
Transformer架构和模型规模的扩大是GPT模型成功的关键。
数据量的多少将影响深度学习模型在特定领域的应用效果。
深度学习历史上每年都有声音认为达到了极限,但每年都有新突破。
Codex模型能够将自然语言转换为代码,展示了AI在编程领域的潜力。
AI的发展可能首先影响创意产业,而非简单重复性劳动。
神经网络的泛化能力是AI进步的基础,但当前仍需大量数据支持。
生成模型在艺术创作中的应用,展示了AI在创意领域的潜力。
Transcripts
because it's a very different way of
writing code normally you write code and
you can kind of think it through and
understand yep whereas a neural network
it's this you write an equation a
complicated equation inside a loop and
then you run the loop and a good luck
figuring out what it does precisely and
that connects to neural Nets not being
interpretable but it could also argue
that the difficulty of understanding
what neural networks do is not a bug but
it's feature
like we want to build
intelligence intelligence is not simple
to understand we can't explain how we do
the cognitive functions that we do how
we see how we hear how we understand
language so therefore if we can get if
computers can produce objects that are
similarly difficult to understand not
impossible but similarly difficult it
means we're on the right track and so
all those things helped me uh converge
on neural networks fairly early on
yeah what year was it when you sort of
like remember initially getting excited
about neural networks and being pretty
convicted like early 2000s I started
working with Jeff Hinton in 2003 yeah so
quite a while ago now long before I mean
obviously the craze kind of started
around 2010 and so there was a there's a
good so like I think this is a common
theme whenever you look at sort of any
uh anybody who works like in in any sort
of field that becomes very big but
there's a long stretch of like you know
wandering in the desert maybe is one way
to put it yeah I mean definitely lots of
perseverance is required because you
don't know how long the how long you
want to stay in the desert you just got
to endure yeah and that's very helpful
and did you expect like I mean obviously
today neural networks do some pretty
incredible things like do did did you
expect back in 2003 or early 2000s that
like in your lifetime you would see sort
of the the things that we're seeing now
with AI machine
learning I was hoping but I did not
expect it back
then the field of AI was on the wrong
track it was in a mindset of rejection
of neural networks right and the reason
for that is that neural networks are
difficult to reason about mathematically
while other stuff you can prove theorems
about and there is something very
seductive and dangerous about proving
theorems about things y because it's a
way to to Showcase your skill but it's
not necessarily aligned with what makes
the most progress in the field but I
think that neural networks are as
successful as they are precisely because
they're difficult to reason about
mathematically and so anyway my earlier
hope was to Simply to convince the field
that they should work on neural networks
rather than the other stuff that they
were doing yeah but then when computers
started to get fast then my my level of
excitement and about their potential has
increased as well yeah and so what what
are your aspirations today like what do
you what is the like in your lifetime
what's the thing you I mean I think it's
obvious from the uh open ey Mission set
but yeah I mean exactly right so now now
thep now the hopes are much larger now I
think we can really try to build not
only really powerful and useful AI but
actually AGI make it useful make it
beneficial use it to solve and make it
so that it will be used to solve a large
number of problems and create lots of
amazing applications that's that's what
I'd like that's what I hope to see
happen yep
um and then you know obviously along the
way you had been doing a lot of uh a lot
of This research and doing a lot of
groundbreaking work at Google and then
you sort of left and started open AI
with Sam mman and and Greg Brockman and
a bunch of others um what was kind of
the what were your kind of goals with
starting opening eye at the outset what
was what was sort of like the initial um
conception or the initial vision and um
and and what did you hope to accomplish
by by starting sort of a new lab there
were multiple motivations on my for
starting open AI so the first motivation
was that I felt that the way to make the
most progress in AI was by merging
science and engineering into a single
hole into a unit to make it so that
there is no distinction or as little
distinction as possible between science
and engineering so that all the science
is in is infused with engineering
discipline and careful execution and all
the engineering is infused with the
scientific ideas and the the reason for
that is because the field is becoming
mature and so it is hard to just do
small scale tinkering without having a
lot of engineering skill and effort to
really make something work so that was
one motivation I really wanted to have a
comp a company that will be operating on
this principle another motivation was
that I came to
see AI technology in a more sober way I
used to think that AI will just be this
endless good and now I see it in a more
complex way where I think there will be
a lot of truly incredible inconceivable
applications that will improve our lives
in dramatic ways right but I also think
that there will be challenges I think
that there will be lots of problems that
will be posed by the misapplication of
AI and by its peculiar properties that
may be difficult for people to
understand stand and I wanted a company
that will be operating with this
awareness in mind and that will be
trying to address those challenges by
you know as best as possible by not only
working on advancing the technology but
also working on making it safe and also
working on the policy side of things as
much as is rational and reasonable to
make the whole be as useful and as
beneficial as possible totally and I
think something we agree on I mean I
think one thing that is very obvious to
me is that AI is something that's going
to you know which countries have access
to AI technology and the ways in which
they use them are going to Define how
the world plays out over the course of
the next few decades like it's just I
think that's how uh that's the path
we're on as a as a world that's right
among many other things right right and
you know I and I this thing that you
mentioned around sort of bringing
together the science and engineering I
think it's quite it's uh it's quite
profound I think for a few reasons right
because like one is that um uh you know
first of all I think a lot of the the
best the most incredible Innovative
things happen often times from sort of
blurring the lines between disciplines
like apple is one of the best examples
where from the very beginning they were
always like Hey we're blending hardware
and software and that's that's our
special sauce and obviously it's uh
produced some incredible things and I
think a lot of other research Labs you
know they operate in very sort of
scientists tell the engineers what to do
mindset which is counterproductive
because you really need to understand
both very well to understand what the
what kind of the limits of the
technology are yeah that's right and on
that point you may even say isn't it
obvious that the science and the
engineering should be together and on
some level it is but it just so happens
that historically it hasn't been this
way there's a certain kind of taste like
empirically it has been the case in the
past less so now that people who
gravitate to research would have a
certain taste that would also make them
less drawn to engine engineering and
vice versa and I think now because
people are also seeing this reality on
the ground that to do any kind of good
science you need a good engineering then
people then you have more and more
people who are strong in both of these
axes totally yeah and I think that you
know uh Switching gears a little bit to
to kind of the GPT models I this is a
great illustration right because the GPT
models are impossible without incredible
engineering like the that's sort of um
uh and and but yet they still require
novel research they still require novel
science to be able accomplish and they
they've obviously been some of the
biggest breakthroughs in the field of AI
as of late and sort of um blown open
many people's imaginations about what AI
can accomplish or at least increase
people's confidence that AI can
accomplish incredible things you know
I'm kind of curious about uh originally
when when at opening ey when you guys
were you've been working on these
language models for some time what were
the original sort of research
Inspirations behind it and what were the
original sort of um things that led you
all to say hey this is something that's
worth working on worth scaling up worth
continuing to double down on so there'
have been
multiple lines of thinking that led us
to convergent language
models there has been an idea that we
believed in relatively early on that you
can somehow link understanding to
prediction and specifically to
prediction of
whatever data you give to the to the
model where the idea is well let's let's
let's work out an example so before
before diving into the example I'll I'll
start with the conclusion first the
conclusion is that if you can make
really good guesses as to what's going
to come next you can't make it perfectly
it's impossible but if you can make a
really good guess you need to have a
meaningful degree of understanding you
know in the example of of a book suppose
that you read a book and it's a mystery
novel and in the last chapter all the
pieces are coming together and there is
a critical sentence and you start to
read you read the first word and the
second word now you say okay the
identity of some person is going to be
relieved and your mind is honing you
know like it's either this person or
that person you don't know which one it
is now maybe someone who read the book
and thought about it very carefully says
you know I think it's probably this
person maybe that but probably this so
what this example goes to show that
really good
prediction is connected to understanding
and this kind of thinking has led us to
EXP experiment with all kinds of
approaches of hey can we predict things
really well can we predict the next word
can we predict the next pixel and study
their their properties and through this
line of work we were able to get to we
did some work before the gpts before
Transformers before the Transformers
were
invented and with um something that we
call the sentiment neuron which is a
neural net which was trying to predict
the next word the next sorry the next
character in reviews of Amazon
products and it was a small neural net
because it was maybe four years ago but
it did prove the principle that if you
predict the next character well enough
you will eventually start to discover
the semantic properties of the text and
then with the gpts we took it further we
said okay well we have the Transformer
it's a better architecture so we have a
stronger effect and then later there
realization that if you make it larger
it will be better so let's make it
larger and it will be better yeah I mean
there's there's a lot of uh there's a
lot of great nuggets and what you just
mentioned right I think first is the
Elegance of this concept which is like
hey if you get really good at predicting
the next whatever get really good at
prediction you get that obligates you to
be good at all these other things if if
you're really good at that and it's it's
you know I think it's probably like
under underrated how um that required
some some Vision because it's like early
on you know you try to get really good
at predicting things and you know you
got the sentiment NE on which is cool
but it's like that's like a it's like a
a blip relative to what we obviously
have seen with the large language models
and so that I think is significant and I
think the other significant piece is um
where you just mentioned which is kind
of um scaling it up right and uh I think
you know you guys had had uh released
this paper about this kind of like a
scaling laws of what you have found as
you scaled up um compute data model size
sort of in concert with one another but
I'm I'm kind of curious like what's the
um obviously there's there's some
intuition which just like hey scaling
things up um is good and you see you see
great behaviors what's kind of your
intuition behind um sort of if you think
from now over the next few the next few
years or even the next few decades like
what what does scaling up mean why is it
likely to continue resulting in in great
results and and um what do you think the
limits are if any I think two the two
stat two statements are true at the same
time on the one hand it does look like
our models are quite large can we keep
scaling them up even further can we keep
finding more data for the scale up and I
want to spend a little bit of time on
the data question because I think it's
not obvious at all yeah
traditionally because of the roots of
the field of machine learning because of
the roots of because the field has been
fundamentally academic and fundamentally
concerned with discovering new meth
methods and less with the development of
very big and powerful systems the
mindset has been someone
builds someone creates a fixed
Benchmark so a data set of a certain of
a of a certain of certain shape of
certain characteristics and then
different people can compare their
methods on this data set but what it
does is that it forces everyone to work
with a fixed data set yep the thing with
the the gpts have shown in particular is
that scaling requires that you increase
the compute and the data and tend them
at the same time and if you do this then
you keep getting better and better
results and in some domains like
language there is quite a bit of data
valuable in other maybe more specialized
subdomains the amount of data is a lot
smaller and that could be for example if
you want to have an automated lawyer so
I think
your big language model will know quite
a bit about language and it will be able
to converse very intelligently about
many topics but it may perhaps not be as
good at being a lawyer as we'd like it
will be quite formidable but will it be
good enough so this is unknown because
the amount of data there is smaller but
any time where data is abundant then
it's possible to apply the Deep the
magic deep learning formula and to
produce these
increasingly good and increasingly more
powerful model
and then in terms of what are the limits
of scaling so I think one thing
that's notable about the history of deep
learning over the past 10 years is that
every year people said okay we had a
good run but now we've hit the limits
and that happened year after year after
year and so I think that I think that we
absolutely may hit the limits at some
point but I also think that it would
be unwise to bet against deep learning
yeah you know there's a number of things
I want to dig in here um because they're
they're all pretty interesting one is is
this this um just I think this you you
certainly have this mental model um that
I think is is is quite good which is
kind of like hey uh Mo's law is this
incredible this an incredible accelerant
for everything that we do right and the
more that there's Mor's law for
everything you know Mor's law for
different inputs that go into the deep
into the machine learning life cycle you
know we're just going to like push all
these things to the map and we're going
to see just incredible performance and I
think is significant because as you
mentioned about this data point it's
like hey if we if we get more efficient
at at we get more efficient at compute
which is something that's happening we
get more efficient at uh producing data
or finding data or or generating data we
get more effic obviously there's more
efficiency out of the algorithms you
know all these things are just going to
keep enabling us to do the next
incredible thing and the next incredible
thing the next incredible thing um so
first I guess like do like I we've
talked about this a little bit before so
I know you agree with that but like how
what do you think is
um where do you think like is there any
flaw are there any flaws to that logic
what would you be worried about in terms
of how everything will scale up over the
next few years I mean I think I think
over the next few years I don't have too
much concern about Contin continued
progress I think that we will we will
have faster computers we will find more
data and we'll train better models I
think that is I don't see to I don't see
particular risk there I think moving
forward we will need to start being more
creative about okay so what do you do
when you don't have a lot of data can
you somehow intelligently use the same
compute to compensate for that lack of
data and I think those are the questions
that we and the the field will need to
to Grapple these to continue our
progress yeah no and I think this point
about data the other thing I wanted to
touch on because this is this is
something obviously at scale that we're
we focus on and I think that the large
language models thankfully because you
can leverage the internet really like
all the fact that like all this dat has
you know existed and been accumulating
for a while you can show some pretty
incredible things in all new domains you
need efficient ways to to generate lots
of data right and I think that there's
there's this whole question was like how
do you make it so that you know each
ounce of human effort that goes into
generating some data produces as much
data as possible um and I think that
like something that we're passionate
about that I think we've talked a little
bit about is like how do you get like a
mors law for data right how do you get
you know more and more efficiency out of
like a human effort in producing data
and that might require novel new um
paradigms but uh is something that I
think is required for in this lawyer for
example uh that you mentioned like we
have a pretty finite set of lawyers how
do we get those lawyers to produce
enough data so you can create some great
legal uh legal AI the choices that we
have is either improve our methods so
that we can do more with the same data
or do the same with less data and the
second is like you say somehow increase
the efficiency of the
teachers yep and I think both will be
needed to make the most progress yeah
well it's kind of like you know I I
really think Mor's law is instructive
right like to get these chips performing
better people try all sorts of random
crap and then the end output is that you
have you have chips that have more
transistors right and I think this like
if we think about is like do we have
models that perform better with like
certain amounts of data or certain
amounts of of teaching um how do we how
do we make that go
yeah I mean I think I'm I'm sure that
there will be ways to do that I mean for
example if you ask the human teachers to
help you only in the hardest cases I
think that will allow you to move faster
uh I want to switch gears to uh one of
the offshoots of the large language
model efforts which is particularly
exciting especially to me as an engineer
probably most uh people who spend a lot
of time coding which is codex uh which
demonstrated some pretty incredible
capabilities of going from uh sort of
natural language and to code and sort of
being able to to uh interact with with a
program in in a very novel New Way um
you know I'm kind of curious for you
what what excites you about this effort
what do you think is the what do you
think are the reasonable expectations
for what codex and codex like systems
will enable in the next few years what
about far beyond that and and uh
ultimately why are you guys so excited
about
it for for some context codex is pretty
much a large GPT neural network that's
trained on code instead of training to
predict the next word in text it's
trying to predict the next word in code
the next I guess token in code and the
thing that's cool about it is that it
works at
all like I don't think it's self-evident
to most people
that it would be possible to train a
neural
net in such a way
so that if you just give it some
representation of text that describes
what you want and then the neural
network will just process this text and
produce code and this code will be
correct and it will run and it's
exciting for a variety of reasons so
first of all it is useful it is new it
shows
that I'd say when it code has been a
domain that hasn't really been touched
by AI too much
even though it's obviously very
important and it touches on aspects
where AI has been you know today's AI
deep learning has been perceived as weak
which is reasoning and
carefully laying
out plans
and not being
fuzzy and so so it turns out that in
fact they can do a quite quite a good
job here and like one analogy one
distinction between between codex and
language models is that the the Codex
models the code models they allow you to
they in effect they can control the
computer it's like they have the
computer as an actuator and so that
makes them much more it greatly expand
it it makes them much more useful you
can do so many more things with them and
of course we want to make them better
still I think they can improve in lots
of different ways those those are just
the preliminary code models
I expect them to be quite useful to
programmers and especially in areas
where you need to know random apis
because these neural networks they so
one thing that I think is small
digression the GPT neural networks they
don't learn quite like
people a person will often have somewhat
narrow knowledge in great depth while
these neural networks they want to know
everything that exists and they really
try to do that so their knowledge is
encyclopedic it's not as deep it's
pretty deep but not as deep as a
person and so because of that these
neural networks in their in the way they
work today they complement people with
their breaths so you might say I want to
do something with a library I don't
really know it could be some existing
library or maybe the neural network had
read all you know the code of all my of
all my colleagues and it knows it knows
what they've written and so I want to
use some Library I don't know how to use
the network will have a pretty good
guess of how to use it youd still need
to make sure that what it said is
correct because such is its level of
performance today you cannot trust it
blindly especially if the code is
important for some domains where it's
easy to undo anything that it writes any
code that it writes then I think you can
trust it just fine but if you actually
want have real code you want to check
it but I expect that in the future those
models will continue to improve I expect
that the neural network that the code
neural networks will keep getting better
and I think the nature of the
programming of the programming
profession will change in response to
these models I think that in a like in a
sense it's a it's a natural
continuation
of how in in in the software engineering
World we've been using higher and higher
level programming languages at first
people wrote assembly then they had
foron then they had C now we have python
now we have all these amazing python
libraries that's a layer on top of that
and now we can be a little bit more imp
precise we can be a little bit more
ambitious and the model the neural
network will do a lot of the work for us
and I do think that I should say I
expect something similar to happen
across the board in lots of other white
color professions as well you know
there's if you think about the economic
impact of AI there's been an inversion I
think I think there's been a lot of
thinking that uh maybe simple robotics
tasks will be the first you know the
first ones to be hit by automation but
instead we are finding that the creative
tasks counterintuitively they seem to be
affected quite a bit if you look at the
generative neural networks in the way
you generate images now now it's you can
find it on Twitter all kinds of stunning
images being generated ated you know
generating cool text is happening as
well but the images are getting most of
the attention and then with things like
code things like a lot of writing tasks
this is the uh White Collar tasks they
are also being affected by these AIS and
I do expect that Society will change as
progress continues to make Society will
change and I think that it is important
for economists and people who think
about these questions to pay pay careful
attention to these Trends so that as
technology continues to improve there
are good ideas in place to like in
effect to to be ready for this
technology yeah there's a number of like
really again uh interesting nuggets in
there I think one is that um I think
like the one of the big Ideas behind
codex or codex like models right is that
you go from being able to go from human
language to machine language right and
and you kind of mention like oh all of a
sudden the machine is an actuator and if
you think about like I think many of us
when we think about AI we think about
like the Star Trek computer you know you
can just ask a computer and it'll do
things you know that's that this is this
is a key enabling step right because if
if all of a sudden you can go from how
we speak how human speak to things that
a machine can understand then you like
bridge this like key translation step so
I think that's super interesting you
know another thing that this inversion
that you just mentioned about is is
super interesting because I think that
you know one of the things that um my
beliefs on this is like hey this is the
reason that some things have become much
easier than others you know it's all a
product of availability of data right
there's some areas where we've had there
just exists lots and lots of Digital
Data that you can kind of suck up into
the algorithms and it can do quite well
and then in things like robotic tasks or
setting a table or you know all these
things that that are that we've had very
a lot of trouble um building machines to
do you're like fundamentally limited by
amount of data you have first just by
the amount of data that's been collected
so far but also like you know you can
only have so much stuff happening in the
real world to collect that data I'm
curious how like it how do you think
about that or do you think it's actually
something intrinsic to the sort of like
creative tasks that is uh that is
somehow more uh suited to current heral
networks I think it's both I think it is
unquestionably true that with the so we
can take a step
backwards at the base of all AI
progress that has happened at least in
in all of deep learning and arguably
more is the ability of neural networks
to generalize now generalization is a
technical term which
means that you understand something
correctly or take the right action in a
situation that's unlike any situations
that you've seen in the past in your
experience and you can see and so now a
system generalizes better if from the
same data it can do the right thing or
understand the right situation in a
broader set of
situations and so to make an analogy
suppose you have a student at a
university studying for an exam that
student might say this is a very
important exam for me let me memorize
this let me make sure I can solve every
single exercise in the textbook you know
such a student will be very well
prepared and could achieve a very very
high grade in the
exam now consider a different student
who might say you know what I don't need
to to learn to to figure to know how to
solve all the exercise in the textbooks
As Long As I Got the fundamentals right
I read the first 20 pages and I feel I
got the fundamentals if the if that
second student also achieves a high
grade in the exam that second student
did something harder than the first
student that second student exhibited a
greater degree of generalization they
were able to even though the questions
were the same the situation was less
familiar for the second student and the
first student and so our neural networks
are a lot like the first students they
they have an incredible ability to
generalize for a
computer but we could we could do more
and because they generalization is
not yet perfect definitely not yet at a
human level we need to compensate for it
by training on very large amounts of
data that's where the data comes in the
the better you generalize the less data
you need slash the further you can go
with the same data so maybe once we find
figure out how to make our neural
networks generalize a lot better then
all those small domains we don't have a
lot of data it actually won't matter the
neuro electric will say it's okay I know
what to do well enough even with this
limited amount of data but today we need
a lot of data but now when it comes to
the creative applications in particular
there is some way in which they are
especially well suited for neural
networks and that's because generative
models play a very Central role in
machine learning and the nature of the
generations of generative models are
somehow analogous to the artistic
process it's not perfect it doesn't
capture everything and very and there is
certain kinds of art which our models
cannot do yet but I think this second
connection the the the generative aspect
of Art
and the ability of generative models to
generate new plausible data is another
reason why art has been we've seen so
much progress in generative art
Browse More Related Video
Ilya Sutskever | AI neurons work just like human neurons | AGI will be conscious like humans
Geoffrey Hinton: The Foundations of Deep Learning
Interview with Dr. Ilya Sutskever, co-founder of OPEN AI - at the Open University studios - English
Super Humanity | Transhumanism
Ilya Sutskever | AI will be omnipotent in the future | Everything is impossible becomes possible
Heroes of Deep Learning: Andrew Ng interviews Geoffrey Hinton
5.0 / 5 (0 votes)