Ilya Sutskever | The birth of AGI will subvert everything | AI can help humans but also cause trouble
Summary
TLDR: In this conversation, OpenAI's representative discusses the definition of artificial general intelligence (AGI) and its potential capabilities. He weighs current techniques, such as the relative strengths of Transformers and LSTMs, and stresses the importance of model scaling. Safety is another key topic, particularly the potential risks of extremely powerful AI and how to address them. He also shares expectations for future AI technology and offers practical advice for entrepreneurs building on large language models, including focusing on unique data and on future development trends. The discussion is full of reflection on, and anticipation of, AI's future potential and challenges.
Takeaways
- 🧠 AGI is defined as a computer system that can automate the great majority of intellectual labor; it can be thought of as a coworker with human-level intelligence.
- 📈 Transformer models are already very powerful, but more efficient or faster models may still emerge in the future.
- 🔍 Although Transformers may not be the final answer, they are already good enough, and performance keeps improving as they are scaled up.
- 🤖 Compared with Transformers, an LSTM with suitable modifications and training could still go very far, though likely not as far as a Transformer.
- 📊 Scaling laws show a strong relationship between the inputs fed to a neural network and simple performance metrics, but the relationship does not always extend to more complex tasks.
- 🚀 The growth in neural networks' capabilities, especially in coding, is a surprising development: they went from barely being able to program at all to generating code efficiently.
- 🔐 AI safety is critical, especially as AI becomes extremely powerful; alignment with human values must be ensured to avoid potential risks.
- 🌐 International organizations can play an important role in setting AI standards and regulations, particularly for superintelligent systems.
- ⏳ For products built on top of large language models, it is important to consider where the technology will be in a few years and plan accordingly.
- 🛠️ Unique datasets can give a product a competitive edge; also consider how to exploit models' current and potential capabilities.
- 🔮 Anticipating future gains in model reliability and performance helps entrepreneurs and developers plan their product roadmaps.
Q & A
What is AGI, and how does it differ from an ordinary computer system?
- AGI, artificial general intelligence, is a computer system that can automate the great majority of intellectual labor. Unlike ordinary computer systems, an AGI is considered to have human-level intelligence, able to work like a human coworker and respond sensibly to all kinds of problems.
What role do Transformer models play in achieving AGI?
- Transformers are one of the key technologies on the current path to AGI. Using the attention mechanism, they process sequential data effectively and have shown strong capabilities across many domains. Transformers may not be the only route to AGI, but they are among the most effective architectures known today.
Why is the quality of Transformer models not binary?
- Whether Transformers are "good enough" is not an absolute question but a spectrum. As models are scaled up they keep improving, though the gains may slow gradually. This means that while there is room for improvement, today's Transformers are already powerful enough.
How do LSTMs differ from Transformers in the pursuit of AGI?
- An LSTM is a recurrent neural network; with suitable modifications and scaling it could, in theory, approach Transformer-level results. But because far less work has gone into training and tuning LSTMs, Transformers usually perform better in practice.
How should we understand scaling laws?
- Scaling laws describe the relationship between model scale and performance. The relationship is very strong for simple metrics, but for more complex tasks, such as predicting a model's emergent capabilities, it becomes hard to predict.
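A scaling law of the form L(N) = a * N^(-alpha) is a straight line in log-log space, so its exponent can be recovered with ordinary least squares. A minimal sketch using synthetic data (not OpenAI's actual measurements; the function and variable names are illustrative):

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**(-alpha) by linear regression in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(loss) for loss in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
            / sum((x - mean_x) ** 2 for x in xs)
    alpha = -slope                        # loss falls as size grows
    a = math.exp(mean_y - slope * mean_x)
    return a, alpha

# Synthetic losses generated from L(N) = 10 * N**-0.1
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [10 * n ** -0.1 for n in sizes]
a, alpha = fit_power_law(sizes, losses)
print(round(alpha, 3))  # 0.1
```

The point made in the answer above is exactly that this fit works well for simple metrics like next-word prediction loss, but extrapolating it to emergent capabilities is much harder.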
Which emergent capabilities surprised you during AGI's development?
- Although the human brain can perform many complex tasks, it is still surprising that neural networks can do them too. The rise of code generation in particular went from nothing to quickly exceeding what the field of computer science had previously hoped for.
Why does AI safety matter, and how does it relate to AI capability?
- AI safety is directly tied to AI capability. As AI grows more powerful, its potential risks grow with it. Ensuring safety, especially at the superintelligence level, is the key to preventing that power from being misused.
What is superintelligence?
- Superintelligence is AI that far exceeds human intelligence. It could solve unimaginably hard problems, but if not managed well it could also pose enormous risks.
How should we think about the challenge of natural selection as AI develops?
- Natural selection applies not only to organisms but also to ideas and organizations. Even if we manage superintelligence's safety and ethical problems successfully, we must still consider the long-run evolution of technology and society, and how they adapt to a changing environment.
What practical advice do you have for entrepreneurs using large language models?
- Entrepreneurs should focus on two things: leveraging unique data, and planning products around the technology's long-term trajectory. This helps them stay competitive as AI keeps advancing.
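One concrete instance of planning for where the technology will be: code that chunks documents to fit a small context window can be written so that a larger window simply means fewer chunks, with no redesign. A hedged sketch (the token counting here is a crude whitespace approximation, not a real tokenizer):

```python
def chunk_for_context(text: str, max_tokens: int) -> list[str]:
    """Split text into pieces that each fit a model's context window.
    Tokens are approximated by whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

doc = "one two three four five six seven eight"
print(len(chunk_for_context(doc, 3)))    # 3 chunks under a small window
print(len(chunk_for_context(doc, 100)))  # 1 chunk once windows grow
```

The design choice is to treat the window size as a parameter rather than a constant baked into the pipeline, so the product degrades gracefully today and simplifies automatically as windows expand.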
What does the current unreliability of AI technology imply for future product development?
- Today's unreliability suggests that future product development must account for the technology's maturity and reliability. Entrepreneurs need to stay alert to AI's progress and be ready to adapt quickly as the technology matures.
Outlines
🧠 Defining AGI and the outlook for intelligent systems
This segment discusses the definition of artificial general intelligence (AGI), quoting the OpenAI Charter's description of it as a computer system that can automate the great majority of intellectual labor. It covers the intuitive notion of AGI as a computer system on par with human intelligence, the central role of Transformer models in current AI progress, whether the Transformer architecture alone suffices for AGI, and the potential and efficiency of other algorithms such as the LSTM.
📊 Neural network scaling laws and predictive power
This segment digs into neural network scaling laws, the relationship between inputs and performance metrics, and how well we understand it. Although the relationship is strong, what we really care about is indirect performance, such as the ability to solve coding problems, rather than raw next-word prediction accuracy. It also mentions OpenAI's work on scaling laws for more complex tasks during GPT-4's development, along with predictions of future AI capabilities and their potentially astonishing performance.
🔮 AI safety and the coming challenges of superintelligence
This segment discusses the AI safety problems we face as the technology advances, especially with the arrival of superintelligence. It stresses superintelligence's potential power and the enormous risks that come with it, including the alignment problem, i.e., ensuring that AI behavior stays consistent with human values. It also covers the role of international organizations in setting high standards and rules, and the positive changes superintelligence could bring, such as solving global problems and raising quality of life.
🛠️ Startup advice for building on large language models
This segment offers practical advice for entrepreneurs using large language models, stressing the importance of unique data and of anticipating where the technology is headed. Entrepreneurs should look not only at the current state of the art but at where it will be in a few years, and at how that affects their products and business models. It also discusses the technology's current unreliability and how observation and experimentation can help anticipate and prepare for future changes.
🚀 Future trends and opportunities
The final segment discusses technology trends, in particular growing context windows and improving model reliability, and how these changes create new opportunities for entrepreneurs. It emphasizes using a model's current behavior and potential to anticipate and prepare for future progress, and so gain an edge in a competitive market.
Keywords
💡AGI
💡Transformer
💡Neural networks
💡Superintelligence
💡Alignment problem
💡AI safety
💡Natural selection
💡Coding ability
💡Scaling laws
💡Entrepreneurship
Highlights
- AGI is defined as a computer system that can automate the majority of intellectual labor, akin to an intelligent human coworker.
- Transformer models are the foundation of current AI research, but more efficient or faster models may appear in the future.
- Although an LSTM may be less efficient than a Transformer, with suitable adjustments and training it could still go very far.
- Our understanding of scaling laws is incomplete but has some scientific footing, particularly for predicting the ability to solve coding problems.
- The effectiveness of neural networks came as a surprise, since early on few believed in them.
- Improved code generation is one of neural networks' most surprising emergent capabilities.
- AI safety grows more important as AI capabilities grow, especially once AI becomes extremely powerful.
- Regulation and standards for superintelligence are necessary to ensure it is safe and beneficial to humanity.
- The challenges of superintelligence include goal alignment, conflicts among human interests, and the effects of natural selection.
- AI development must balance innovation and safety, avoiding over-regulation that would stifle innovation.
- Entrepreneurs using large language models should consider the uniqueness of their data and the technology's trajectory over the next few years.
- Entrepreneurs should note models' unreliability and think through how their products would change if the models became reliable.
- Think about the potential growth of model capabilities, such as larger context windows, and their impact on future products.
- Observe and experiment with models, running thought experiments to prepare for near- to medium-term changes.
Transcripts
It's interesting. Let's start with: what's your definition of AGI? What's your mental picture?

Yeah, so, AGI. At OpenAI we have a document which we call the OpenAI Charter, which outlines the goal of OpenAI, and there we offer a definition of AGI. We say that an AGI is a computer system which can automate the great majority of intellectual labor. That's one useful definition. In some sense, the intuition there is that it's a computer that's as smart as a person, so you might, for example, have a coworker that's a computer. That would be a definition of AGI which I think is intuitively satisfying. The term is a bit ambiguous, because in AGI the G means general, so it's generality that we care about in the AGI, but it's actually a bit more than generality: we care about generality and competence. It needs to be general in the sense that it can respond sensibly when you throw things at it, but it needs to be competent, so that when you ask it a question or ask it to do something, it will do it.

I like that very practical definition, because at the end of the day it gives you some measurement where you can figure out how close you are. Do you think we have all the ingredients to get to AGI? If not, what's missing in the stack? It's a complicated stack already. Is a Transformer really all we need, paying homage to the famous attention paper?

You know, I won't be overly specific in my answer to this question, but I'll comment on the second part: is Transformers all we need? I think the question is a bit wrong, because it implies something binary; it implies Transformers are either good enough or not good enough. I think it's better to think about it in terms of where we are: we have Transformers, and they're pretty good. Maybe we could have something better, something more efficient or faster. But as we know, when you make Transformers large, they still become better; they might just be becoming better more slowly. So while I am totally sure that it will be possible to improve very significantly on the current architectures that we have, even if we didn't, we would be able to go extremely far.

Do you think it matters what the algorithm is? So, for example, an LSTM versus a Transformer, just scaled up sufficiently. Maybe that's an efficiency delta or something like that, but don't we end up in the same place at the end?
I would say almost entirely yes, with a caveat, or actually two caveats. I'm just thinking of what level of detail to go into here; maybe I'll skip the details. How many people in the audience know what an LSTM is?

Oh, I see, quite a crowd around here, so I think we're mostly okay. Let's dig in, then.

So I would argue that if we made a few simple modifications to the LSTM (their hidden states are quite small, so if you somehow made them larger), and then we were to go through the trouble of figuring out how to train them, because LSTMs are recurrent neural networks and we kind of forgot about them, we haven't put in the effort. You know how neural network training works: you have the hyperparameters; well, how do you set them? You don't know. How do you set your learning rates? If it doesn't learn, can you explain why? This kind of work has not been done for LSTMs, so our ability to train them is more limited. But had we done that work, so that we were able to train the LSTMs, and we just did some simple things to increase their hidden state size, I think they would be worse than Transformers, but we would still be able to go extremely far with them.

Okay. How good is our understanding of scaling laws? If we scale these models up, how confident are you in being able to predict the capabilities of these particular models? How good is that science?

That's a very good question. The answer is: so-so.

I was hoping for a more definitive answer.

Well, "so-so" is a very definitive answer. It means we are not great, but we are not absolutely terrible either; but definitely not great. What the scaling law gives you is a relationship between the inputs that you put into the neural network and some kind of simple-to-evaluate performance measure, like your next-word prediction accuracy. That relationship is very strong. But what is challenging is that we don't really care about next-word prediction; we care about it indirectly, we care about the other incidental benefits that we get out of it. For example, you all know that if you predict the next word accurately enough, you get all kinds of interesting emergent properties. Those have been quite hard to predict, or at least I'm not aware of such work, and if anyone is looking for interesting research problems to work on, that would be one. I will mention one example, something that we've done at OpenAI in our run-up to GPT-4, where we tried to do a scaling law for a more interesting task, which is predicting accuracy at solving coding problems. We were able to do that very accurately, and that's a pretty good thing, because this is a more tangible metric. It's an improvement over next-word prediction accuracy as far as things that are relevant to us. In other words, it's more relevant to us to know what the coding accuracy is going to be, the ability to solve coding problems, compared to just the ability to predict the next word. It still doesn't answer the really important question of whether you can predict some emergent behavior that you haven't seen before.

Okay.
Speaking of these emergent capabilities: which one surprised you the most as these models scaled? What was the thing where you said, well, I'm kind of astonished these models can do this?

It's a very difficult question to answer, because it's too easy to get used to where things are. There have definitely been times when I was surprised, but you adapt so fast, it's kind of crazy. I think maybe the big surprise for me, and it may sound a little odd to most people in this audience, is that neural networks work at all. Because when I was starting my work in this area, they didn't work. Or let's define what it means to work at all: they could work a little bit, but not really, not in any serious way, not in a way that anyone except the most intense enthusiasts would care about. And now we see that those neural nets work. So I guess the artificial neuron really is at least somewhat related to the biological neuron, or at least that basic assumption has been validated to some degree.

What about an emergent property that sticks out to you, like, for example, code generation? Or maybe it was different in your mind: once you saw that neural nets can work and can scale, did you think, of course all these properties will emerge, because at the limit point we're building a human brain, and humans know how to code and know how to reason about tasks, and so on? Did you just expect all of that?

I've definitely been surprised, and I'll mention why. The human brain can do those things, it's true, but does it follow that our training process will produce something similar? So it was definitely very amazing. Seeing the coding ability improve quickly, that was quite a sight to be seen. And for coding in particular, because it went from no one has ever seen a computer code anything at all, ever. There was a little area of computer science called program synthesis, and it was very niche, and it was very niche because they couldn't have any accomplishments; they had a very difficult experience. And then these neural nets came in and said, oh yeah, code synthesis, we're going to accomplish what you were hoping to achieve one day, like tomorrow. So that was, yeah, deep learning.

Just out of curiosity, when you write code, how much of your code is yours and how much is, I mean, a collaboration?

I do enjoy it when the neural net writes most of it.

All right, let's switch tack here
a little bit. As these models get more and more powerful, it's worthwhile to also talk about AI safety. OpenAI has released a document just recently where you're one of the signatories, and Sam has testified in front of Congress. What worries you most about AI safety?

Yeah, I can talk about that. Let's take a step back and talk about the state of the world. You've had this AI research happening, and it was exciting, and now you have the GPT models, and now you all get to play with all the different chatbots and assistants, Bard and ChatGPT, and you say, okay, that's pretty cool, it can do things. And indeed, you can perhaps start worrying about the implications of the tools that we have today, and I think that is a very valid thing to do, but that's not where I allocate my concern.

The place where things get really tricky is when you imagine fast-forwarding some number of years, a decade let's say: how powerful will AI be? Of course, with this incredible future power of AI, which I think will be difficult to imagine, frankly, you could do incredible, amazing things that are perhaps even outside of our dreams. But the place where things get challenging is directly connected to the power of the AI. It is going to be unbelievably powerful, and it is because of this power that the safety issues come up, and I'll mention three; I personally see three.

You alluded to the letter that we posted at OpenAI a few days ago, actually yesterday, about some ideas that we think would be good to implement to navigate the challenges of superintelligence. Now, what is superintelligence? Why did we choose to use that term? The reason is that superintelligence is meant to convey something that's not just like an AGI. With AGI we said, well, you have something kind of like a person, kind of like a coworker. Superintelligence is meant to convey something far more capable than that. When you have such a capability, can we even imagine how it will be? But without question it's going to be unbelievably powerful. It could be used to solve incomprehensibly hard problems. If it is used well, if we navigate the challenges that superintelligence poses, we could radically improve the quality of life. But the power of superintelligence is so vast.

So, concern number one has been expressed a lot, and this is the scientific problem of alignment. You might want to think of it as an analog to nuclear safety: you build a nuclear reactor, you want to get the energy, you need to make sure that it won't melt down even if there's an earthquake, and even if someone tries to, I don't know, smash a truck into it. This is superintelligence safety, and it must be addressed in order to contain the vast power of superintelligence; this is called the alignment problem. One of the suggestions that we had in the post was an approach where an international organization could create various standards at this very high level of capability. And I want to make this other point about the post, and also about our CEO Sam Altman's congressional testimony, where he advocated for regulation of AI: the intention is primarily to put rules and standards of various kinds on the very high level of capability. You could maybe start looking at GPT-4, but that's not really what is relevant here; it's something vastly more powerful than that. When you have a technology so powerful, it becomes obvious that you need to do something about this power. That's the first concern, the first challenge to overcome.

The second challenge to overcome is that, of course, we are people, we are humans with interests, and if you have superintelligences controlled by people, well, who knows what's going to happen. I do hope that at this point we will have the superintelligence itself try to help us solve the challenging world that it creates. This is no longer an unreasonable thing to say: if you imagine a superintelligence that indeed sees things more deeply than we do, much more deeply, that understands reality better than us, we could use it to help us solve the challenges that it creates.

Then there is the third challenge, which is maybe the challenge of natural selection. You know what the Buddhists say: change is the only constant. So even if you do have your superintelligences in the world, and we managed to solve alignment, and no one wants to use them in very destructive ways, and we managed to create a life of unbelievable abundance, not just material abundance but health, longevity, all the things we don't even try dreaming about because they're so obviously impossible; if you've got to this point, then there is the third challenge of natural selection. Things change. Natural selection applies to ideas, to organizations, and that's a challenge as well. Maybe the Neuralink solution of people becoming part AI will be one way we will choose to address this, I don't know. But I would say that this kind of describes my concern. And just as the concerns are big, if you manage to overcome them it is so worthwhile, because then we could create truly unbelievable lives for ourselves, lives that are completely unimaginable. So it is a challenge that's really, really worth overcoming.

I very much like the idea that there needs to be a threshold above which we really, really should pay attention, because, speaking as a German, if it's European-style regulation, often from people that don't really know very much about the field, you can also completely kill innovation, which would be a little bit of a pity. But let's change tack here a little bit. This is a room mostly filled with entrepreneurs, lots of whom are actually using tools from OpenAI. So, just practically speaking, what are the main things, or the main pieces of advice, you would give folks that are building on top of large language models? What is, let's say, the canonical set of things they should read and think about in using these models?
Well, yeah, advice, practical advice, with a few minutes to spare. I'll point out, with the caveat that I am not in similar shoes, that I think two things are worth keeping in mind. One is obvious: some kind of special data that cannot be found anywhere else; that can be extremely helpful. And I think the second one is to always keep in mind not just where things are right now, but where things will be in two years, in four years, and try to plan for that. I think those two things are very helpful. The data is helpful today, but also try to get an intuitive sense for yourself of where you imagine things being in, say, three years, and how that will affect some of the basic assumptions of what the product is trying to do. I think that can be a helpful thing.

So what's the sort of thing... when I think about this, I think about: oh, we used to be in a world with really small context windows, and then, you know, I have embeddings, I page things into context windows, all the classic stuff, but maybe that just goes away; maybe context windows become really large or something like that. So I'm trying to extrapolate from these sorts of past things. Is that what you mean, something like this?

I think it's worth trying. I'll give you another example. Say you're playing with a model and you can see that the model can do something really cool, maybe amazing, if it were reliable, but it's so unreliable that you kind of say, forget it, there's no point using it. That's the kind of thing which can change: something which is unreliable can become reliable enough. And so if you're experiencing those models, paying attention to what people are sharing, and you say, oh, look at this cool thing which works once in a while; but if it worked, what would happen? These kinds of thought experiments, I would argue, can help prepare for the near to medium-term future.

That's super good advice. I think we are, unfortunately, at time. We could do this forever. Please join me in thanking Ilya.

[Applause]