OpenAI Insider Talks About the Future of AGI + Scaling Laws of Neural Nets
Summary
TLDR: This video discusses the development of artificial intelligence (AI) and the possibility of eventually reaching artificial general intelligence (AGI). In particular, it suggests that AGI may become feasible once a neural network's parameter count approaches that of the human brain. It also notes that Scott Aaronson, a researcher at OpenAI, works on AI safety and alignment, and touches on the impact AGI would have on humanity and the ethical questions it raises.
Takeaways
- 🧠 The human brain has roughly 100 trillion parameters (synapses), and an AI model's capability generally scales with its parameter count.
- 📈 GPT-3 has 175 billion parameters, roughly the scale of a cat's brain; reaching full AGI (artificial general intelligence) is thought to require a brain-scale count of around 100 trillion parameters.
- ⚡ AI's recent progress has been driven mainly by more data and more computing power; the basic ideas behind neural networks date back to the 1950s.
- 🔮 Some experts argue that the technical details needed for AGI are already solved, and that what remains is waiting for hardware to catch up.
- 🤖 AGI is predicted to eventually handle almost any task a remote worker can do.
- 🕵️‍♂️ Scott Aaronson moved from quantum computing research to AI safety and alignment work at OpenAI.
- 🤔 The author is skeptical of the leaked claim that GPT-4 or another recent OpenAI model has reached 100 trillion parameters.
- 📝 Scott Aaronson's paper "The Computational Complexity of Linear Optics" gives evidence that quantum computers cannot be efficiently simulated by classical ones.
- 🔑 The key ideas needed for AGI may number only around ten, and they may already be hidden in existing research texts.
- ✍️ The author concedes the video is somewhat disjointed but finds the AGI topic increasingly interesting.
Q & A
When is AI expected to reach human-level capability?
-According to the video, AI is predicted to reach human-level capability when its parameter count approaches the number of synapses in the human brain, roughly 2×10^14. This prediction has a wide range, however: the low end is around GPT-3's size, and the high end is about 10,000 times the human brain's count.
What kind of model is GPT-3?
-GPT-3 is a large language model released in 2020. It has 175 billion parameters and showed a degree of reasoning and text generation that surprised many people, though it was still far from human-level intelligence.
Who is Scott Aaronson?
-Scott Aaronson is a quantum computing researcher who joined OpenAI's AI safety and alignment effort around mid-2022. He raises alarms about the risks of AI while also arguing that its progress should not simply be halted.
Does AI performance improve as the parameter count grows?
-Yes, broadly speaking. More parameters let a neural network absorb more data and recognize more complex patterns. Simply adding parameters is not enough, though; suitable training data and algorithms matter as well.
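The scaling trend described above is usually stated as a power law: loss falls smoothly as parameter count N grows. Below is a minimal sketch in Python; the exponent and constant are the ones reported by Kaplan et al. (2020) and are used here as illustrative assumptions, not figures from the video.

```python
# Illustrative neural scaling law: test loss falls as a power law in the
# parameter count N. The constant n_c and exponent alpha are assumptions
# borrowed from Kaplan et al. (2020) for illustration only.

def loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Approximate cross-entropy loss as L(N) = (N_c / N) ** alpha."""
    return (n_c / n_params) ** alpha

# GPT-2 scale, GPT-3 scale, and brain-synapse scale.
for n in [1.5e9, 175e9, 2e14]:
    print(f"N = {n:.0e}: predicted loss ~ {loss(n):.3f}")
```

The point of the sketch is only the shape of the curve: each order of magnitude of parameters buys a modest, predictable drop in loss, which is why parameter count alone is a usable (if crude) performance predictor.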
Are the technical details needed for AGI considered already solved?
-Yes, some experts such as John Carmack argue that the technical details of AGI were largely worked out over past decades of research, and that what remains is securing computing power and data. This view is contested, however.
What exactly is a parameter in a neural network?
-A parameter is a weight on a connection between nodes, or a bias value applied to a node's input. Adjusting these parameters tunes the network's output. They correspond roughly to synaptic connections in a biological brain.
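As a concrete illustration, a single artificial neuron fits in a few lines; its trainable parameters are exactly the weights and the bias described above. All names and numbers here are invented for illustration.

```python
import math

# A single artificial neuron. Its parameters are the connection weights
# plus one bias value; training means adjusting these numbers.

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, squashed by a sigmoid into (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# Two inputs -> three parameters (two weights + one bias).
out = neuron([0.5, 0.8], weights=[0.9, -0.3], bias=0.1)
print(f"activation = {out:.4f}")
```

Counting parameters in a real model is just this bookkeeping repeated at scale: every connection contributes one weight, every neuron one bias.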
What is the transformative AI (TAI) model discussed by Ajeya Cotra?
-TAI ("transformative AI") is the concept analyzed in a report by Ajeya Cotra, discussed on LessWrong. A transformative model is defined roughly as one able to do almost all the tasks that remote human workers can do, and the report estimates the parameter counts such a model would need.
Besides parameter count, what other factors matter for AI performance?
-The quality and quantity of training data, the algorithms used, and the model architecture all have a major impact on performance. Increasing parameters alone does not guarantee better results; sound design and training are essential.
Is the key to AGI a brand-new technical breakthrough?
-The video suggests that the important technical details for AGI may already have been uncovered by past research. Rather than a new breakthrough, the key may be combining and applying existing knowledge.
What is the main claim of this video?
-The main claim is that AI progress can accelerate through larger parameter counts combined with the right data, and that the technical details needed to reach AGI may already exist. It also notes that concrete timeline predictions remain highly uncertain.
Outlines
👾 Skepticism about AI and the evolution of GPT's capabilities
This section introduces the skeptical view that systems like GPT and ChatGPT are mere probability models or function approximators, fundamentally different from the human brain or consciousness. At the same time, it notes that with GPT-3 and ChatGPT, AI began to appear to understand text, learn, and judge moral questions, and that this could change civilization significantly.
🧠 The role of parameters in neural networks
This section explains the role of parameters (weights and biases) in neural networks. Parameters represent the strength of connections between neurons, and models with more parameters can learn more complex patterns. Pavlov's dog experiment serves as an analogy between conditioning and parameter updates in a neural network.
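The Pavlov analogy above can be sketched as a toy Hebbian-style update, where each bell-plus-food trial strengthens one connection weight. The learning rate and threshold are made-up illustration values, not anything from the video, and real language models are trained quite differently.

```python
# Toy model of the Pavlov analogy: each co-occurrence of "bell" and
# "food" strengthens a single connection weight (a Hebbian-style update).
# Learning rate and threshold are invented for illustration.

weight = 0.05        # initial bell -> food association (nearly none)
learning_rate = 0.3

for trial in range(10):                        # ring bell, then serve food
    weight += learning_rate * (1.0 - weight)   # nudge the weight toward 1.0

salivates = weight > 0.9                       # strong enough to fire alone
print(f"final weight = {weight:.3f}, salivates on bell alone: {salivates}")
```

After ten trials the weight has climbed to about 0.973, so the "bell" input alone now triggers the response, which is the conditioning effect the section describes.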
🔢 Comparing parameter counts of the human brain and AI models
This section cites a figure of roughly 100 trillion synapses (the biological counterpart of parameters) in the human brain and compares it with AI models. GPT-3's 175 billion parameters put it near the scale of a cat's brain, while matching the human brain would take around 100 trillion parameters. Since parameter count correlates with performance, a human-scale parameter count is argued to be important for human-level AI.
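The comparison above reduces to simple arithmetic. The snippet below uses the rough synapse estimates cited in the video (2×10^14 for the human brain per the paper, 250 billion for a cat); these are order-of-magnitude figures, not precise counts.

```python
# Rough counts quoted in the video (order-of-magnitude estimates).
BRAIN_SYNAPSES = 2e14    # ~200 trillion synapses (the paper's figure)
GPT3_PARAMS = 175e9      # GPT-3: 175 billion parameters
CAT_SYNAPSES = 250e9     # cat brain: ~250 billion synapses

print(f"GPT-3 vs human brain: {GPT3_PARAMS / BRAIN_SYNAPSES:.2%}")  # ~0.09%
print(f"GPT-3 vs cat brain:   {GPT3_PARAMS / CAT_SYNAPSES:.0%}")    # ~70%
print(f"gap to brain scale:   {BRAIN_SYNAPSES / GPT3_PARAMS:.0f}x") # ~1143x
```

So GPT-3 sits just under cat-brain scale and roughly three orders of magnitude short of the human-brain figure, which is the gap the scaling discussion is about.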
📈 Predicting the parameter count needed for AGI
This section covers predictions of the parameter count a transformative model (roughly, AGI) would need: one report gives a median estimate with an 80% confidence interval running from about GPT-3's size up to 10^18 parameters. It also introduces Scott Aaronson, a quantum computing researcher now working on AI safety at OpenAI, and the mix of hope and concern around AGI progress.
Keywords
💡Neural network
💡AGI (artificial general intelligence)
💡Parameter
💡Scaling laws
💡GPT-3
💡Quantum computing
💡Data
💡Backpropagation
💡Transformer model
💡Pavlov's experiment
Highlights
The video opens with Scott Aaronson's take on 'deflationary claims' about AI systems ('it's just a next-token predictor'): each person making them thinks they are the first, yet the claims are reductionist and never state a principle separating human intelligence from the same 'just a' framing.
The author mentions Scott Aaronson, who worked for OpenAI on AI alignment and safety, and his blog post titled 'Letter to His 11-Year-Old Self' where he discusses the creation of an AI that can converse like humans and the ethical implications surrounding it.
The author expresses skepticism about a leaked paper claiming that GPT-4 or another model built in 2022 has 100 trillion parameters, stating that they don't buy it.
The author discusses the concept of scaling laws and when digital neural networks might exceed the complexity of the human brain, defining AGI as the ability to perform any intellectual task that a smart human can.
The author explains the basics of neural networks, including neurons, connections (parameters/synapses), and how they are trained through forward and backward propagation to adjust the weights and biases to produce desired outputs.
The author highlights John Carmack's belief that we've had the technical details of AGI solved for many decades, but lacked the computing power, data, and internet infrastructure to achieve it.
The author cites a figure that the human brain has around 100 trillion to 200 trillion synaptic connections (parameters), which is used as a benchmark for comparing the parameter count of AI models.
The author discusses a paper that predicts AI performance by parameter count, showing that as models approach the parameter count of the human brain, they are expected to reach human-level abilities.
The author mentions Ajeya Cotra's notion of a 'transformative model' (TAI), defined by its ability to perform the tasks that remote human workers can do, as a practical indicator of AGI.
The author presents estimates for the number of parameters required for a 'transformative model' (potentially AGI) to achieve human-level abilities, ranging from the size of GPT-3 to one quintillion (10^18) parameters, which is 10,000 times more than the human brain.
The author expresses uncertainty about how much of the information presented is believable but finds Scott Aaronson's work on the computational complexity of linear optics and his role at OpenAI's AI safety and alignment team particularly interesting.
The author acknowledges that the video might have been a bit disjointed but expresses that the topic of AI and AGI is becoming increasingly interesting, with more to come soon.
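The forward-pass/backward-pass "hotter or colder" picture from the highlights above can be condensed to a one-parameter gradient-descent loop. This is a minimal sketch of the idea with invented numbers, not code from the video.

```python
# Minimal "hotter/colder" training loop: the forward pass scores a guess,
# and the backward step nudges the parameter toward lower error. One
# weight, one target -- a sketch of gradient descent, not an LLM trainer.

target = 4.0    # the output we want the "network" to produce
w = 0.0         # the single parameter, starting cold
lr = 0.1        # learning rate: how far each nudge turns the dial

for step in range(100):
    prediction = w * 1.0           # forward pass (input fixed at 1.0)
    error = prediction - target    # how far off are we?
    w -= lr * 2 * error            # backward step: gradient of error**2

print(f"learned w = {w:.4f}")      # converges toward 4.0
```

Training a real network is this same loop repeated over billions of parameters at once, with backpropagation computing each parameter's share of the error.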
Transcripts
I'm going to call it the religion of
Justaism okay so so there's like the you
know there's this whole sequence of
deflationary claims right like each
person who makes them thinks that
they're like the first one right and
they you know there there's like I've
seen like like 500 different variants of
this now right ChatGPT you know it
doesn't matter how impressive it looks
because it is just a stochastic parrot it
is just a next token predictor it is
just a function approximator it is just
a gargantuan autocomplete right and what
these people never do what it never
occurs to them to do is to ask the next
question what are you just a
right right aren't you just the bundle
of neurons and synapses right I mean
like we could take that deflationary
reductionistic stance about you also
right or or if not then we have to give
some principle that separates the one
from the other right you know it is our
burden to give that principle so the way
that someone was putting it on my blog
was okay you know they they gave this
giant litany you know look GPT does not
interpret sentences it seems to
interpret them it does not learn it
seems to learn it does not judge moral
questions it seems to judge moral
questions and so I just responded to
this I said you know that's great and it
won't change civilization it will seem
to change
it so the person that was talking his
name is Scott Aaronson and he recently
went to work for OpenAI on AI
alignment and safety his previous work
in research was into Quantum Computing
and so he started working for open AI in
2022 probably around the middle of the
year and by the end of the year he put
out a blog post titled letter to his
11-year-old self in it he says this
there's a company building an AI that
fills giant rooms eats a Town's worth of
electricity and has recently gained an
astounding ability to converse like
people we can write essays or poetry on
any topic it can Ace college level exams
it's daily gaining new capabilities that
the engineers who tend to the AI can't
even talk about in public yet those
Engineers do however sit in the company
cafeteria and debate the meaning of what
they're creating what will it learn to
do next week which jobs might it render
obsolete should they slow down or stop
so as not to tickle the Tail of the
Dragon but wouldn't that mean someone
else probably someone with less Scruples
would wake the Dragon first is there an
ethical obligation to tell the world
more about this is there an obligation
to tell it less and he's saying that his
job at the company is to develop a
mathematical theory of how to prevent
the AI and its successors from wreaking
havoc so that's Aaronson right there
this is from this paper that was leaked
uh and my take is I I think this is BS
after reading it and trying to verify
some of it I mean it's I just don't buy
it here's the thing it starts out really
good it it had me going but at some
point it kind of rapidly falls apart and
it's trying to push this idea that GPT 4
or some other model that they built in
2022 has 100 trillion parameters now
again I I don't buy it I'll post it down
below if you guys want to take a look at
it but anyways my take is is a lot of
this is is nonsense but in this PDF
there are three interesting links to
papers or things that other very
credible researchers have wrote and
specifically this guy talking
about AI is also really interesting so
in this video Let's briefly look at
scaling laws and when we can expect
digital neural Nets to exceed kind of
the complexity of the human brain and
basically the definition of AGI that
the author kind of states is it can do
any intellectual task that a smart human
can 2020 was the first time I was
shocked by an AI system that was gpt3 so
the world was catching up to something
that these people were interacting with
years before so people were surprised by
its ability to reason even as early as
gpt3 which gpt3 I feel like most people
haven't even interacted with this
because ChatGPT the big thing that most
people got their hands on that was GPT
3.5 kind of an updated version and he's
saying that somewhere in there there was
this massive leap because before that
chatbots had no ability to respond
coherently at all why was gpt3 such a
massive leap and so here we're getting
into parameter count so really fast
exactly what is a parameter so in neural
networks we're kind of replicating the
human brain so here's kind of a diagram
of a neural network these little round
things they're called neurons which is
the digital version of the neuron that's
in our brain basically these neurons
connect to each other and pass
information back and forth so let's say
there's a neuron in your brain that's
responsible for food I'm simplifying
obviously but let's say there's another
one that's it's the smell of cooking so
when you when you smell something
delicious cooking on the stove that
triggers this neuron now obviously the
actual brain is much more complicated
there's a whole it's not like one neuron
does this or that but just for
illustrative purposes like let's say
this neuron is food and this neuron is
smell of cooking each time you smell
something cooking and then you get food
the connection between these two neurons
gets a little bit stronger over time it
gets stronger and stronger and stronger
until the smell of cooking gets to be
kind of a predictive thing for you
getting food whenever you smell cooking
you know that there's food around you're
going to get food this is kind of how
your brain is able to predict the future
if you will and this is how brains work
in humans now so in dogs if a dog smells
something cooking or the smell of food
whatever smells trigger food for them
you know it might start salivating
because it knows food is coming so one
day this handsome fella decided to
decided to see if he could trick these
dogs into creating other neural
connections that aren't triggered by
smells but instead by something kind of
random like ringing a bell so this is
Ivan Pavlov uh if you've ever heard that
term pavlovian response that's kind of
his doing he would ring a bell every
time before he served dogs food so he'd
ring a bell and give him some food and
this would go over a course of however
along ring a bell give him food ring a
bell give the dog food it was a whole
thing they really went all in on this
now obviously beforehand if you just
rang a bell the dog didn't really have
any response to it it didn't mean
anything to the dog but after doing this
for a Time the dogs started salivating
after hearing a bell the dogs were
conditioned to salivate and expect food
whenever that bell rang by the way this
is why the office was such a great show
cuz that whole prank that Jim plays on
Dwight with the breath mints was
literally him conditioning that
pavlovian response by giving him a
breath mint every time there was the
Microsoft Office ding or whatever but
the point here is with the dogs and
Dwight I guess as this this thing kept
happening where a bell would ring and
then he and the dog would get a treat
the actual like physical wiring in the
brain these neural connections would get
stronger and stronger so the Bell became
a stronger and stronger signal for you
know there's food coming until the dog
was like okay anytime I hear a bell that
means I get food like I was convinced of
that so in the neural Nets in the AI
weights and biases they determine kind
of the strength of that connection how
often those connections get called how
strong they are so for example before
the pavlovian conditioning of the dog
you know a bell ringing might have a
very low connection to you know getting
food the dog doesn't connect a bell to
getting food but as he keeps hearing you
know bell food bell food this
connection gets stronger to where
there's a stronger there's a stronger
predictive ability between the bell and
the food between the Bell and the food
and all these various connections are
referred to as parameters and so the
more parameters the more connections the
more possible I guess predictive
abilities and so when we refer to the
the size of the AI model the size of the
llm we refer to it as the number of
parameters the number of total
connections and then when we train the
model when we give it data you can think
of all these as little knobs and dials
that we kind of twist and turn to try to
create these connections that make sense
then we then we have our input and the
output and we try to understand like how
good is this brain this series of
connections weights and biases how good
is it at producing the response if it's
way off then we have a process called
back propagation where we go back and
kind of like flip these dials into
different positions and we try again and
these back and forward passes over time
set all the little dials and knobs into
the correct position to get the outputs
that we're looking for so I kind of
think of this as that game where you say
if you're getting hotter or colder right
so you move in a certain direction
that's the forward pass and then the
person you're playing with goes you're
getting warmer and so you do the back
propagation so that's where you know
maybe you turn in a slightly different
direction and you head in that direction
right and then the person goes oh you're
almost there you're getting hot right so
basically the hotter you get the less
changes you make to what you're doing if
they're saying oh your ice cold then you
make a lot of changes and you flip all
these dials in in different directions I
mean slightly more complicated than that
but I feel like what I've described is a
pretty good analogy and so the paper
continues deep learning is a concept
that essentially goes back to the
beginning of AI research in the 1950s
the first neural network was created in
the ' 50s and modern neural networks are
just deeper meaning they contain more
layers these are the layers so there's
just more more layers across the network
and most of the major techniques in AI
today are rooted in the basic 1950s
research combined with a few minor
engineering Solutions like back
propagation and Transformer models yeah
it's not it's just a few minor tweaks I
think most people would say that these
are kind of I mean big deals but his
point is a lot of this the idea of
neural networks isn't exactly new so
he's saying there's only two reasons for
the recent explosion of AI capability
size and data uh so maybe a different
way of saying that is just we can have
massive progress massive improvements
by just improving the size and the
data like this alone will create massive
progress without necessarily other
breakthroughs I think that's fair to say
and a growing number of people in the
field are beginning to believe we've had
the technical details of AGI solved for
many decades we just didn't have the
computing power we didn't have the data
and we didn't have the internet for all
the data so this is John Carmack so he
was the guy that created the original
Doom him and John Romero and so he's
he's kind of a big deal like he's
well-known highly respected here's him
with Elon Musk here's him with Notch
that's the creator of Minecraft who sold
it for billions I think to Microsoft I
think he like single-handedly coded
Minecraft way back in the day here's him
on stage with Steve Jobs I think he
worked for meta on the whole virtual
reality for for quite some time and on
the Lex Fridman podcast he talked
about kind of this very idea and he
actually recently announced that
he started his own AGI lab and he's
saying like for the first time in kind
of human history just one or a handful
of people can have like an incredible
result on the world this leverage by
creating AGI and he kind of said
something similar that we probably have
the technical details of AGI like we
we've had it solved and I believe he
also said that if you had to write
out like all the things that you needed
to know to solve AGI it would probably
fit on a napkin like there might be 10
things that we kind of needed to solve
that would allow for AGI to happen and a
lot of them are probably hidden away in
various texts and textbooks over the
past you know many decades so this
idea that it's probably not going to
come from some brand new thing that no
one has expected but rather from
something that has been already talked
about right like just like neural Nets
you know the first one was created in
the 50s right so it's been around for a
while and so they're saying what is this
parameter well it's kind of like a
synapse synapse synapse however you
pronounce that so it's like a synapse in
a biological brain connection between
neurons and each neuron in the
biological brain has roughly a thousand
connections to other neurons and of
course digital neural networks are
analogous to biological brains so this
is interesting how many parameters right
synapses or parameters are in a human
brain so the figure that is commonly
cited is 100 trillion so keep that
number in mind 100 trillion 100 trillion
parameters in the human brain so with
AGI we're trying to achieve something
similar to a human brain or the human
brain's capabilities the general
intelligence so in nature that's 100
trillion quote unquote parameters so Yale
Neuroscience 100 trillion synaptic
connections human brain there are more
neurons in a single brain than there are
stars in the Milky Way and a cat has 250
billion synapses a dog has 530 billion
synapses synapse count generally
seems to predict higher intelligence
this guy is now just talking crap about
cats I'm not sure how I feel about that
and he know so yeah there's there's some
exceptions for example elephants have
higher count than humans yet display
lower intelligence and he kind of
explains that that the quality of data
might answer for those uh exceptions so
human brains evolved from higher quality
socialization and communication data
than elephants but the point is synapse
count is definitely important and so
gpt2 the synapse count is less than a
mouse's brain gpt3 is approaching a
cat's brain so it's intuitively obvious
that an AI system the size of a cat's
brain would be superior to an AI system
the size of a mouse's brain so all other
things equal certainly that's the case
predicting AI performance in 2020 after
the release of the 175 billion parameter
gpt3 many speculate about the potential
performance of a model that is 600 times
larger so that's the 100 trillion
parameters kind of where it's equivalent
to the human brain so gpt3 is like
.175% so it's like a tenth of 1% of what
the human brain is in terms of
parameters so is it possible to predict
AI performance by parameter count and as
it turns out the answer is yes and so
here's the paper so saying there are
roughly 2 * 10^14 synapses in
the human brain
which that's that's 200 trillion so it's
double what Yale quoted earlier
Yale Neuroscience so here they're saying
200 you know double what double that
amount right and this line here looks
like it's this line on the chart that's
where the parameters equal synapses in
the brain so this is kind of where that
line when we cross it over that's when
neural networks match the parameters in
of the human brain according to this
article and the dark green line that's
the next line here so again this is
where it matches the human brain this is
the TAI the transformative model so this
is Ajeya Cotra and so in this little speech
she gave the introducer said Ajeya so I'm
just going to go with that Ajeya so this is
Ajeya and this is from lesswrong.com
I'll post this in the show notes we
probably it's a little bit older at this
point August 2022 but we might look into
it but she talks about TAI which is I
mean you can think of it as AGI so
basically kind of a similar idea so like
at what point is it going to get to the
point of maybe replacing human workers
uh or or at least being as capable as
human workers but but this kind of
jumped out at me so she was saying when
writing my report I was imagining that a
transformative model would likely need
to be able to do almost all the tasks
that remote human workers can do and
again we might do a deep dive into this
article but I got to say in my mind I
think this is a much better sort of
conceptual way thinking about AGI like
at what point can it do anything that a
remote human worker can do so basically
if you have a job where a person doesn't
need to come into the office and you
communicate through emails and they do
whatever whatever it is that they do
whether that's Excel or writing or
coding design whatever like when will
TAI or AGI when can AI kind of just do
all of that or in other words if you're
a remote worker and currently you spend
half of the time that you're
supposed to be working playing
Helldivers 2 at what point can you spend all
of your time playing Helldivers 2 but
the point is this green line that's the
median that's the average estimate for
the number of parameters in that
transformative model that TAI and the
80% confidence interval so kind of is
between these two number of parameters
so meaning at what point are we fairly
certain that we've achieved that at how
many parameters have we achieved AGI
well it's as low as gpt3 and uh and as
high as so 10 to the 18th that's one
quintillion so if I'm doing my math right
so that's 10,000 times more than what
the human brain would be so this
extrapolation shows that AI performance
will reach human level abilities as it
reaches human level size parameter count
so what we just went over I don't know
how much of it I believe I went through
it I ended up cutting most of it but
this Scott Aaronson guy really jumped out
to me here's a paper that he did the
computational complexity of linear
Optics talking about giving new evidence
that quantum computers cannot be
efficiently simulated by classical
computers it's interesting that a
quantum computer guy is working on
OpenAI's you know AI safety and alignment
there's some more interesting stuff
ahead I apologize if today's video was a
little bit disjointed but this whole
thing is getting uh a lot more
interesting more to come very soon