SHOCKING Robots EVOLVE in the SIMULATION plus OpenAI Leadership Just... LEAVES?
Summary
TLDR: The video script discusses recent advancements in AI, focusing on the rapid progress of Figure AI's robotics, particularly their robot's ability to perform tasks like handing over an orange using visual reasoning. It also touches on the ethics of kicking robots for demonstrations and the potential legal issues surrounding AI and copyright. The script introduces Dr. Eureka, an AI agent that automates the process of training robots in simulation and bridging the gap to real-world deployment. Additionally, it covers the departure of two executives from OpenAI and the potential implications for AI-generated content and copyright law. The video also explores the concept of multi-token prediction to improve language models' efficiency and the release of Devin 2.0, an AI agent capable of performing complex tasks. Finally, it mentions the development of wearable AI devices, such as open-source AI glasses that can provide real-time information and assistance.
Takeaways
- The advancements in AI robotics are significant, with Figure 01 showcasing an AI-driven robot that can identify healthy food options through visual reasoning.
- Figure 01 utilizes pre-trained models via OpenAI to output common sense reasoning, indicating a trend toward integrating AI with robotics for enhanced functionality.
- The robot's ability to grasp an orange is driven by an in-house trained neural network, highlighting the role of neural networks in translating visual data into physical actions.
- Concerns are raised, half-jokingly, about the practice of kicking robots for demonstration purposes and then training AI on the internet footage of that treatment.
- Dr. Jim Fan discusses training a robot dog to balance on a yoga ball purely in simulation, with zero-shot transfer to the real world and no fine-tuning.
- Dr. Eureka, an LLM agent that writes code to train robot skills in simulation and bridge the simulation-reality gap, represents a step toward automating the entire robot learning pipeline.
- Eureka's ability to generate novel rewards for complex tasks suggests that AI can devise solutions that differ from human approaches, potentially offering better outcomes for advanced tasks.
- Two senior executives at OpenAI, Diane Yoon and Chris Clark, have left the company, raising questions about the reasons behind their departure and the impact on the organization.
- A paper shared by Ethan Mollick discusses copyright issues for AI-generated content, proposing a framework for compensating copyright owners based on their contribution to AI-generated content.
- The paper suggests that 'reading' copyrighted material (training on it) may not be copyright infringement in itself; the infringement would lie in reproducing similar works.
- Research indicates that training language models to predict multiple future tokens at once yields higher sample efficiency and faster inference, which could significantly improve the performance of large language models.
- Devin 2.0, an AI agent, is capable of performing complex tasks such as creating a website to play chess against a language model and visualizing data, although it may encounter bugs that need fixing.
Q & A
What is the significance of Figure 01's robot and its AI capabilities as mentioned in the transcript?
-Figure 01's robot, equipped with AI, is significant because it demonstrates the integration of robots and AI as the next frontier in technology. The robot showcased on '60 Minutes' is capable of common sense reasoning and can perform tasks like selecting a healthy food item over an unhealthy one based on visual cues, which is a step towards more autonomous and intelligent machines.
How does the robot in the transcript determine which object to pick based on the request for something healthy?
-The robot uses visual reasoning via its cameras to identify objects within its field of view. It is connected to a pre-trained model via OpenAI, which helps it to output common sense reasoning. When asked to hand over something healthy, it recognizes the orange as the healthy choice instead of the chips.
What is the role of Dr. Jim Fan in the development of AI and robots as described in the transcript?
-Dr. Jim Fan is involved in training robots using simulations and transferring those skills to the real world without fine-tuning. He is also associated with the development of Dr. Eureka, an LLM agent that writes code to train robot skills in simulation and bridges the simulation-reality gap, automating the pipeline from new skill learning to real-world deployment.
What is the concern raised about training AI on internet footage?
-The concern, raised half-jokingly, is that we kick robots (or kick what they stand on) for demonstrations, record the footage, post it to the internet, and then train AI brains on that internet data, so future AI systems may effectively learn from a record of how humans have treated robots.
How does the proposed Dr. Eureka system differ from traditional simulation-to-real transfer methods?
-The Dr. Eureka system automates the process of transferring skills from simulation to the real world, which traditionally required domain randomization and manual adjustments by expert roboticists. Instead of tedious manual work, Dr. Eureka uses AI to search over a vast space of sim-to-real configurations, enabling more efficient and effective training of robots.
What is the potential impact of GPT-5 on the process described in the transcript?
-The potential impact of GPT-5, as inferred from the capabilities of GPT-4, could be significant. It suggests that with the advancement to GPT-5, the process of sim-to-real transfer and the tuning of physical parameters such as friction, damping, and gravity could become even more efficient and accurate, potentially leading to better performance in real-world applications.
What is the main idea behind training robots in simulation as discussed in the transcript?
-The main idea is to allow robots to learn and master various skills in a simulated environment that mimics the real world's physics. This enables the robots to learn complex tasks like walking, balancing, opening doors, and picking up objects, which can then be transferred to real-world scenarios, increasing efficiency and reducing the need for physical trials and errors.
What is the role of NVIDIA's Isaac Gym in the context of the transcript?
-NVIDIA's Isaac Gym is mentioned as a platform where the physics of the simulated world behave just as in the real world, but simulations can run at a much faster pace. This high-speed simulation capability is crucial for training robots efficiently and testing various scenarios before deploying them in the real world.
How does the Eureka algorithm contribute to the automation of the robot learning pipeline?
-The Eureka algorithm taught a robot hand to perform complex tasks like pen spinning within a simulation by iteratively generating and refining reward functions. Dr. Eureka takes this a step further by automating the entire pipeline from learning new skills in simulation to deploying those skills in the real world, reducing the need for human intervention in the training process.
What is the proposed framework for dealing with copyright issues for AI-generated content as mentioned in the transcript?
-The proposed framework aims to compensate copyright owners based on their contribution to the creation of AI-generated content. The metric for contributions is determined quantitatively by leveraging the probabilistic nature of modern generative AI models, suggesting a potential solution for the debate on copyright infringement in AI training.
What is the potential impact of multi-token prediction in training language models as discussed in the transcript?
-Multi-token prediction could lead to higher sample efficiency and improved performance on generative benchmarks, especially in tasks like coding. It also suggests that models trained this way can have up to 3 times faster inference, even with large batch sizes, which could significantly enhance the development of algorithmic reasoning capabilities in AI.
Outlines
Advancements in AI and Robotics
The video discusses recent breakthroughs in AI and robotics, highlighting Figure 01's advancements and the potential of combining AI with robots. Brett Adcock, the founder of Figure AI, is featured discussing a robot that uses pre-trained models to perform tasks like selecting a healthy food item. The video also touches on the ethical concerns of training robots through internet footage and the progress in sim-to-real transfer of skills, as demonstrated by a robot dog balancing on a yoga ball. Dr. Jim Fan's work on training robots in simulation is mentioned, along with Dr. Eureka, an AI agent that automates the process of training robot skills in simulation and deploying them in the real world.
Eureka's Iterative Learning and Novel Rewards
The script explains the iterative process behind Eureka's learning mechanism, where GPT-4 is used to generate and test reward functions for training robots. As tasks become more complex, Eureka generates novel rewards that differ from human-crafted solutions, proving to be more effective for advanced tasks. The video also discusses the latest advancements with Dr. Eureka, which focuses on language-model-guided sim-to-real transfer, taking the learnings from simulated environments and applying them to real-world robotics, as showcased by a robot balancing on a ball.
Domain Randomization and OpenAI Drama
The video talks about domain randomization in simulations to prepare robots for the imperfections of the real world, making them more robust. It then shifts to discussing recent departures at OpenAI, where two senior executives left the company. The content also covers a paper shared by Ethan Mollick on dealing with copyright issues for AI-generated art, proposing a framework for compensating copyright owners based on their contribution to AI-generated content.
Multi-Token Prediction in Language Models
The script delves into a paper suggesting that training language models to predict multiple future tokens at once can lead to higher sample efficiency and better performance on generative benchmarks like coding. It discusses the potential for faster inference times with large models and the implications for algorithmic reasoning capabilities. The video also mentions a speech by Andrej Karpathy, who has worked on AI agents and large language models, and his insights into the development and future of these technologies.
Devin 2.0 and AI's Impact on Software Development
The video discusses the release of Devin 2.0, an AI agent capable of performing tasks like creating websites and data visualizations. It provides examples of Devin's capabilities, including handling multiple asynchronous tasks and generating live apps. The speaker expresses skepticism about extreme views on AI replacing software developers, suggesting that Devin and similar AI agents will likely improve the efficiency of developers rather than replace them.
Wearable AI Devices and Their Potential
The video introduces a wearable AI device that uses augmented reality and artificial intelligence to assist users in various tasks. The device, called Frame, has an AI assistant named Noah that can orchestrate a potentially infinite number of AI models. It demonstrates the device's ability to provide fashion advice, find matching clothing, and generate images based on the user's surroundings. The speaker expresses enthusiasm for the potential of such devices to become a part of everyday life.
Frame AI Glasses Shipping Update
The video concludes with an update on the shipping status of Frame AI glasses, which are set to start shipping to customers in the coming weeks.
Keywords
AI News
Figure 01
Simulation-to-Real Transfer
GPT-4
Domain Randomization
Eureka Algorithm
Wearable Device
AI Agents
Multi-Token Prediction
Copyright Issues for AI
Devin (AI Assistant)
Highlights
Figure 01, an OpenAI-backed robot, is advancing rapidly in AI integration.
Brett Adcock, founder of Figure AI, discusses the potential of robots plus AI as the next great frontier.
AI on Figure 01's robot demonstrated common sense reasoning during a 60 Minutes feature.
The robot's neural network maps camera pixels to robot action at 10 Hz.
Dr. Jim Fan's team trained a robot dog to balance and walk on a yoga ball using simulation, requiring no fine-tuning for real-world application.
NVIDIA's Isaac Gym allows for AI training at accelerated speeds, similar to Dragon Ball Z.
Dr. Eureka, an LLM agent, automates the pipeline from new skill learning to real-world deployment.
Eureka's algorithm teaches a five-finger robot hand to spin a pen in simulation.
GPT-4 is utilized to improve the robot training process through positive and negative reinforcements.
Eureka generates novel rewards, offering solutions that surpass human-generated models for complex tasks.
Dr. Eureka's language model guides sim-to-real transfer, enhancing real-world robot performance.
OpenAI faces internal changes with two senior executives leaving the company.
A framework is proposed to compensate copyright owners for their contributions to AI-generated content.
Multi-token prediction in language models could lead to faster and more efficient AI systems.
Andrej Karpathy discusses the potential impact of new AI research on large language models.
Devin 2.0 showcases capabilities in creating complex applications like a chess-playing website and a data visualization map of Antarctica's sea temperatures.
Devin's asynchronous task handling allows for multiple tasks to be processed simultaneously.
Brilliant Labs introduces an open-source, lightweight AI glasses platform with an AI assistant named Noah.
Noah, the AI assistant in the glasses, can generate images and perform tasks based on the user's visual input.
Transcripts
there's some massive AI news on the
horizon and it's quite
shocking. first on the scene is Figure 01,
the OpenAI backed robot that is
advancing quite rapidly. here's Brett
Adcock, the founder of Figure AI, the
company, he's saying robots plus
AI are the next great Frontier we demoed
some of the latest AI on our robot
during 60 Minutes last week I'm not sure
if 60 Minutes is a Twitter account or
what, but seems big. Figure 01 is
connected to a 'pertained', I'm guessing
that's pre-trained, model via OpenAI to
output common sense reasoning, so when
bill says hand me something healthy the
robot by visual reasoning via robot
cameras knows this is the orange in the
scene and not the chips. the behavior of
this robot grappling the orange was an
in-house trained neural network, the
model is mapping camera pixels at 10 Hz
to robot action
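to make "mapping camera pixels at 10 Hz to robot action" concrete, here is a minimal sketch of a fixed-rate perception-to-action loop; the helpers `get_camera_frame`, `policy`, and `send_joint_commands` are hypothetical placeholders, not Figure's actual stack:

```python
import time

PERIOD = 1.0 / 10  # 10 Hz control rate, i.e. a 100 ms budget per step

def control_loop(get_camera_frame, policy, send_joint_commands):
    """Run a fixed-rate perception-to-action loop (hypothetical helpers)."""
    while True:
        start = time.monotonic()
        pixels = get_camera_frame()    # raw camera pixels in
        action = policy(pixels)        # neural network maps pixels to an action
        send_joint_commands(action)    # motor commands out
        # sleep off whatever is left of the 100 ms budget
        time.sleep(max(0.0, PERIOD - (time.monotonic() - start)))
```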
speaking of robots you know how like we
tend to kick them for demonstrations
like we got to stop doing that now it's
even worse we're kicking whatever
they're standing on from underneath them
or trying to at least we're doing that
we're recording the evidence we're
taking the footage we're putting that
footage on the internet and then we
train the AI brains on the internet like
is no one else concerned about this no
just me all right fine so Dr Jim fan is
saying we trained a robot dog to balance
and walk on top of a yoga ball purely in
simulation and then transfer zero-shot
to the real world no fine-tuning just
works and we've been talking about this
on this channel for quite a bit this
idea of training robots in simulation
having them figure out the skills to to
walk to move to balance to open doors to
pick up and transfer packages around
something like NVIDIA's Isaac Gym can run
these simulated worlds where the
physics exists just like they would in
our world but they're able to run it at
10,000 times the speed like Dragon Ball
Z and he continues I'm excited to
announce Dr. Eureka, an LLM agent that
writes code to train robot skills in
simulation and writes more code to
bridge the difficult simulation reality
Gap it fully automates the pipeline from
new skill learning to real world
deployment if you recall when we covered
the original Eureka paper on this
channel you realize how fast this is
moving the yoga ball task is
particularly hard because it's not
possible to accurately simulate the
bouncy ball surface yet Dr Eureka has no
trouble searching over a vast space of
sim-to-real configurations and enables
the dog to steer the ball on various
terrains, even walking sideways.
traditionally the sim-to-real transfer
is achieved by domain randomization a
tedious process that requires expert
human roboticists to stare at every
parameter and adjust by hand that was
the big breakthrough that we saw in
Eureka instead of tedious work done by
humans we just quote unquote Outsource
it to GPT 4 what do you think happens
when we have GPT 5 in the scene might
this process get better maybe Dr Jim fan
continues: frontier LLMs like GPT-4 have
tons of built-in physical intuition for
friction, damping, stiffness, gravity, etc.
we are mildly surprised to find that Dr.
Eureka can tune these parameters competently
and explain its reasoning well. Dr. Eureka
builds on our prior work
Eureka the algorithm that teaches a
five-finger robot hand to do pen
spinning it takes one step further on
our quest to automate the entire robot
learning pipeline by an AI agent system
one model that outputs strings will
supervise another model that outputs
torque control all right so here's
Eureka this was the original one right
so this is not what he's talking about
this is the predecessor let's see when
this was published because it wasn't
that long ago so looks like revised
April 30th looks like originally
appeared October 2023 and Eureka did
something really cool it taught a robot
hand this Shadow hand inside a
simulation right inside of something
like NVIDIA's Isaac Gym to spin a pen
or a pencil you know how kids do in
school sometimes in various different
right directions
upside down right side up Etc different
configuration of fingers Etc which has
never been done before it was a very
complex problem but how they did it was
perhaps even more interesting because
what they did is they set up this
architecture where so this is GPT 4
right so large language model in this
case GPT-4, keep this in mind, this
area is improving
constantly so this was you know at the
end of last year this is how good it was
where is this going to be a year from
now 5 years from now right so we have
the environment code and the task
descriptions the environment code is the
Isaac Gym so we're writing reward
functions for these robots we're
basically saying when it does something
right we give it an attaboy or if it's
doing something wrong we write in a
penalty we're trying to teach it how to
do things with positive and negative
reinforcements it's the environment code
we feed it into GPT-4 and GPT-4 produces
samples of the various reward functions
that it could run, so it just produces a
bunch of them, right, and they're
getting tested they're put into the GPU
accelerated reinforcement learning so
the simulation if you will right and
these shadow hands right sit there in
the simulation and uh spin these things
at you know 10,000 times the speed of
the flow of time in our universe if you
will right and then we get the results
back and we put those results back, we
provide it to GPT-4 and they're like
well here's how well your code did right
and it's saying please carefully analyze
the policy feedback and provide a new
improved reward function right so it
does it we test it we give it back to it
and we say here's your results do better
and it does it again and we're like do
better and it does it again and we're
like do better and it figures out how to
train these robots better and better
over the different iterations it keeps
improving. so these are the Eureka
iterations, right, so you see as it goes
from 1, 2, 3, 4, 5,
the success rate keeps improving.
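the loop just described can be sketched compactly; everything below is illustrative, with `ask_llm_for_reward_functions` standing in for the GPT-4 call and `train_policy_in_sim` standing in for GPU-accelerated reinforcement learning in a simulator like Isaac Gym, neither of which is a real API:

```python
import random

def ask_llm_for_reward_functions(env_code, task, feedback, n_samples=4):
    # Hypothetical stand-in for prompting GPT-4 with the environment source,
    # the task description, and feedback from the previous round.
    return [f"# candidate reward {i} for: {task}" for i in range(n_samples)]

def train_policy_in_sim(reward_fn_code):
    # Hypothetical stand-in for compiling the reward function and running
    # RL in simulation; returns a success rate in [0, 1].
    return random.random()

def eureka_loop(env_code, task, iterations=5):
    best_score, best_code, feedback = float("-inf"), None, ""
    for _ in range(iterations):
        # the LLM samples a batch of candidate reward functions
        candidates = ask_llm_for_reward_functions(env_code, task, feedback)
        # each candidate is scored by actually training a policy with it
        scored = [(train_policy_in_sim(c), c) for c in candidates]
        score, code = max(scored)
        if score > best_score:
            best_score, best_code = score, code
        # results go back to the model: "here's how your code did, do better"
        feedback = f"best success rate so far: {best_score:.2f}"
    return best_code

print(eureka_loop("<isaac gym env source>", "spin the pen"))
```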
but there's one last thing that's going
to blow your mind, and that is this:
Eureka generates novel rewards
so I encourage everybody to read this
paper because I'm not going to do it
justice by explaining it simply if you
want to dive deep into it I strongly
suggest you read it but basically it
seems like for simple tasks you know
both the AI trained model and the human
results, right, they're similar, both
the AI and the humans write
similar types of code for these reward
models but where it gets interesting is
as those tasks get more and more
complicated right so for example with
pen spinning, or this thing where it has
that Rubik's Cube or whatever it's
like rotating, or these little
helicopters, quadcopters, whatever you
want to call them, so for those more
complex tasks, the harder the task is, the
less correlated the Eureka rewards, and
the rewards that it writes, they're novel
so that means as the tasks get harder
humans have a hard time coming up with
the good Solutions the AI comes up with
different solutions than the humans
would come up with and those Solutions
are better for those Advanced tasks if
you're getting what I'm saying it's
probably kind of blowing your mind but
but getting back to this new thing that
just came out Dr Eureka so it's language
model guided sim-to-real transfer now
we're beginning to to take the learnings
of those robots in simulation and we're
applying it to the real world and we're
getting this dog that's able to balance
on top of a ball and as far as I can
tell I mean I don't think these are
cherry-picked results. I'm sure it
fell off once or twice during testing
but I got to say number one certainly
it's better than any human can balance
on something like that but I almost feel
like we're getting to the point where
maybe it's better than any animal I mean
it's either already better than any
animal that size could balance on it I
mean without like using its claws to dig
in notice it's pretty smooth the legs
are smooth right so maybe it's got some
rubber padding on there for friction but
it's not really gripping the ball it's
specifically take a look at this one
right here on the right in the center
what it does when it starts slipping
off of the ball there it goes so it's
kind of like getting faster and faster
almost overcorrects still catches itself
and still gets to kind of the center
it's it's insane anyway so this is the
Dr Eureka component so it's probably
very similar to the original Eureka I
haven't read the paper yet but I I am
planning to we might have to do a deep
dive because this is absolutely out of
this world but as far as I can tell so I
mean we're taking this simulation right
similar to what we did with Eureka in
fact this is the Eureka reward design
initial policy learning so that's what
we covered previously in the Eureka
right, and that reward function that GPT-4
or whatever comes up with, that gets fed
into this final sim-to-real policy
learning, right, so it learns those things
in the simulation, and then that
code, those neural nets, that sort of data
gets loaded into real world deployment
and here they have kind of the uncut
deployment video so we'll just take a
look at this for just a little bit so
here I mean it's looking very good and
they have more of a long shot of this
thing moving around I'll post it below
if you want to take a look at it but but
I kind of fast forwarded through this
thing as far as I can tell it never
falls off it stays on that ball walking
for 5 minutes straight doesn't mess up
doesn't fall off and here's one where
they're actually deflating the ball as
it as it goes I mean the sound is on
just because you can hear the the thing
deflating so it's rapidly deflating as
the thing is balancing and here I'll
skip forward just a little bit so it's
like maybe 80% inflated and it's still
managing to stay on top of it even
though it's a completely kind of
different environment and it looks like
finally kind of falls off and I think
that has uh something to do with the
domain randomization. so basically, if you
saw those DeepMind soccer robots,
basically one of the things they were
talking about there is they randomize a
lot of the little things that go into
that simulation, that world, right, so
they'll slightly randomize the friction
coefficient, they'll slightly randomize
you know how much electrical
current is going to the robot's left leg
versus its right leg, right, so just like
in the real world not everything is
perfect, so we can't simulate it as
being perfect, so we throw these random
minor tweaks in there of various physics
that the robots have to deal with, and
what they're saying is that that tends
to make the
robots very robust when they emerge in
the real world, they're able to deal with,
I mean as you saw there, if the thing's
deflating it's able to adapt.
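in code, the idea being described looks roughly like this; the parameter names and ranges below are made up for illustration, and `make_sim` / `run_episode` are hypothetical, not taken from any particular simulator:

```python
import random

def randomized_physics():
    # every value is perturbed a little, so no two episodes share a "perfect" world
    return {
        "friction": random.uniform(0.4, 1.2),        # vary surface friction
        "damping": random.uniform(0.8, 1.2),         # joint damping multiplier
        "gravity": random.uniform(-10.2, -9.4),      # m/s^2, around Earth's -9.81
        "motor_strength": random.uniform(0.9, 1.1),  # e.g. uneven current per leg
    }

# each training episode runs in a slightly different world, so the policy
# can't overfit to one exact set of physics and becomes robust in reality
for episode in range(3):
    params = randomized_physics()
    print(episode, params)
    # env = make_sim(params)    # hypothetical simulator constructor
    # run_episode(env, policy)  # hypothetical training step
```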
anyways we have to come back and do the whole thing
but I got to say at one point these
things are going to get too fast too
good and they might want to have a chat
with us about all the stuff that we've
put them through so I just want to say
for the record I, Wes Roth, am against
this practice this is bad I do not stand
for this I am and always have been on
the robot side for the record don't kill
me. next up we have more OpenAI drama,
OpenAI is nothing without a drama: two
senior OpenAI executives leave the
company. two senior OpenAI executives,
vice president of people Diane Yoon and
Chris Clark, head of nonprofit and
strategic initiatives, left the company
earlier this week, a company spokesperson
said. and yeah it's strange to hear this
because you know both executives were
among the most long-tenured managers at
the developer of ChatGPT, at OpenAI, right,
recently worth 86 billion in an employee
share sale so I mean if you've been with
the company from the beginning it grew
to be worth 86 billion what would make
you leave so that's according to the
information I'm not going to to read the
article because I mean it's their scoop
it's the work that they did so I want to
be respectful of that I'll link it down
below the information is subscription
based but if you're into AI I got to say
and I'm not being paid to say this but
that's the only subscription to like a
publication that I have I think yeah
like if you think about news
publications I'm certain this is the
only subscription that I have
right now. Ethan Mollick posts a paper about
how to deal with copyright issues for AI
art, and this is interesting, somebody
has a plain English rewrite of that
paper, we're probably going to see more
of this, so it has kind of an overview, plain
English explanation technical
explanation critical analysis conclusion
so I'm not going to read the AI
generated stuff because well it could be
really good it could be really accurate
but you really got to make sure you do
it correctly cuz it's very easy to get
nonsense but this is the abstract from
the actual paper so this isn't AI
generated this is the actual abstract
from the actual paper so it's saying
generative, you know, AI systems are trained
on large data, there's a growing concern
that this may infringe on copyright
interests, right, so if Midjourney
is training on artists' images, right,
those artists now have a competing AI
model and they're not getting paid for
it and there's been a whole debate about
whether that's correct or not right so
the people that are saying this is
copyright infringement of course are
saying well you can't train the model
on that data, whereas the people kind of arguing
against it are saying copyright
protections were always about
reproducing something not reading it
right, so like George R.R. Martin, the writer
of Game of Thrones, was a big fan of
Tolkien, he even borrowed the whole R.R.,
like J.R.R. Tolkien, we know he read the
works and he created something similar
but we're not saying it's copyright
infringement even though we know he read
Tolkien's works. so the reading isn't an
infringement, but if he produced
something very similar that would be
infringement, and so I think the big
question is how we're going to treat AI
models, right, does an AI model
reading something, is that copyright
infringement, because we haven't
really applied that criteria before so
it's it's an interesting legal situation
that will be unfolding and basically
here they're proposing a framework that
compensates copyright owners
proportionately to the contribution to
the creation of AI generated content, the
metric for contributions is
quantitatively determined by leveraging
the probabilistic nature of modern
generative AI models, from cooperative
game theory and economics. so it sounds like
basically if let's say you're an artist
and your artwork comprises 1% of the
data the model is trained on and then
they find that that model tends to use
that data that portion of the data or
whatever, there's a certain probability
of relying on that to produce artwork,
then maybe there's a formula, I haven't
read the full paper.
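as a toy illustration of that proportional idea, with completely made-up contribution scores (the paper derives real ones from the model's output probabilities using cooperative game theory):

```python
def split_royalties(total_fee, contributions):
    """contributions: {owner: nonnegative score}; payout is proportional."""
    total = sum(contributions.values())
    return {owner: total_fee * score / total
            for owner, score in contributions.items()}

# hypothetical numbers: two artists whose work contributed 3% and 1%
# to a generated piece that earned a $100 fee
print(split_royalties(100.0, {"artist_a": 0.03,
                              "artist_b": 0.01,
                              "everyone_else": 0.96}))
# -> roughly {'artist_a': 3.0, 'artist_b': 1.0, 'everyone_else': 96.0}
```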
and maybe something like this will be
adopted in fact this is what people are
talking about but again this
presupposes that reading something or
looking at something by these AI systems
is breaking copyright now of course if
you've ever had a website you know how
many crawl Bots hit your website you
have Google and Bing and nowadays a lot
of open Ai and just tons of other stuff
right so they they're constantly
crawling your site right we're not
saying that's copyright infringement but
if they copy and paste it to their own
site word for word then it becomes
copyright infringement right if I make a
photocopy of a book that's not copyright
infringement if I start selling those
copies then that is copyright
infringement so again I'm not arguing
one way or another but we got to get
some clarity on is reading something is
training on something is that in itself
copyright infringement or is copyright
infringement strictly the act of
reproducing something that that's
similar. and this is Elvis Saravia (@omarsar), so
we've looked at a few of his, uh, he
covers various papers in AI and I think
he just recently launched a YouTube
channel, so Elvis Saravia, so really good
person to follow really enjoy his stuff
so he breaks this paper down I'll link
everything down below if you want to
follow him but basically my
understanding is that the paper is about
predicting multiple tokens that come
next. so tokens are the smallest unit
that these large language models like
ChatGPT, GPT-4, that they deal with, right,
and usually it's a word, right, 'the' is one
token, 'sky' is one token, 'is' is one token,
but sometimes bigger words are multiple
tokens, or if you have a misspelling that
could be multiple tokens. so if I recall
correctly, like a million tokens will
be
750,000 English words on average, so
that's kind of the rough breakdown uh
and that can vary of course but how
these large language models work right
so we say 'the sky is', right, and it uses
the neural networks to kind of figure
out, okay, like what's the most likely
word that comes next, is it 'the'? no,
usually 'blue', you know, is out of all
the choices the most likely one, the
sky is blue, and that's kind of the
underlying mechanism for a lot of this,
it's prediction. when they say inference
it means prediction, sort of the output
of these models.
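a tiny numerical illustration of that next-token step, with made-up scores:

```python
import math

# made-up model scores for the word after "the sky is"
scores = {"the": 0.1, "sky": 0.2, "is": 0.1, "blue": 3.5, "green": 0.8}

# softmax turns scores into probabilities
total = sum(math.exp(s) for s in scores.values())
probs = {w: math.exp(s) / total for w, s in scores.items()}

print(max(probs, key=probs.get))  # -> "blue", the most likely next token
```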
but this paper is asking what would happen if instead of one
token prediction so the sky is blue one
token what if we ask to predict multiple
tokens and this could be kind of a big
deal, because as Elvis says, the most
exciting paper of the week was that one
from Gloeckle et al. that aims to train
better and faster LLMs via multi-token
prediction, it's an impressive research
paper. but it sounds like, in this work,
this is the paper that we're talking
about, Better and Faster Large
Language Models via Multi-token
Prediction, or I guess multi-token
prediction is the right emphasis there
in this work we suggest that training
language models to predict multiple
future tokens at once results in higher
sample efficiency. gains are especially,
so they're saying it works well, that's
the point of this paper, they're
saying these gains in effectiveness
and efficiency are especially pronounced
on generative benchmarks like coding,
coding is kind of a big deal as far as
LLMs are concerned, and they're saying
our models consistently outperform
strong baselines by several
percentage points, right, the smaller 13
billion parameter model solves 12% more
problems on one benchmark, 17% more on
another experiments on these like
smaller tasks demonstrate that this new
approach, the multi-token approach, is
favorable so it works better for the
development of algorithmic reasoning
capabilities and as an extra benefit
these models are up to 3x faster at
inference, even with large batch sizes,
so meaning the outputs are three times
faster even when we're working with
large amounts of data and output.
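a minimal sketch of the multi-token idea in PyTorch: one shared trunk with k output heads, where head i predicts the token i+1 positions ahead. the dimensions are arbitrary and this is not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class MultiTokenPredictor(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, k=4):
        super().__init__()
        # a simple embedding stands in for a full transformer body
        self.trunk = nn.Embedding(vocab_size, d_model)
        # k independent heads: head i predicts the token i+1 positions ahead
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(k)
        )

    def forward(self, tokens):
        h = self.trunk(tokens)                   # (batch, seq, d_model)
        return [head(h) for head in self.heads]  # k logits tensors

model = MultiTokenPredictor()
logits = model(torch.randint(0, 32000, (2, 16)))
print(len(logits), logits[0].shape)  # 4 heads, each torch.Size([2, 16, 32000])
```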
there's a speech by Andrej Karpathy, he used to work on Tesla's
autonomous driving, then he went to
OpenAI and recently he left OpenAI, or not
recently I think it was last year and
started his own YouTube channel also
another great resource for understanding
how these large language models work he
in the past worked closely with Dr Jim
fan and his main kind of area of work is
AI agents but he was saying how when he
was working for OpenAI they have their
internal slack Channel where they're all
messaging and I guess they post papers
like this as they're coming out and I
think he was saying that you know on an
internal slack Channel they'll say
things like oh yeah no we try that
that's never going to work that's a dead
end or they'll be like oh yeah they're
onto something here so in other words it
sounds like you know for us we see
something like this and you know time
will tell we'll see how big of an impact
this will make on large language models
maybe it'll be huge and it'll drastically
improve how good they are, how fast they
are how easy it is to build them or
maybe it'll be a dead end but there's
this Slack channel at OpenAI where
they're kind of like, they just know
ahead of time, since they are kind of
more advanced than a lot of the
stuff that we're seeing, which is kind
of interesting to think about I think
next we have the release of Devin 2.0
now I don't have access to this uh and
you know as we've talked about on this
channel you know a lot of people think
oh Devin has been debunked or whatever,
I have my doubts, I really have
my doubts that it's vaporware or
anything like that as more and more
people are getting their hands on it I
mean the results seem to be good right
now it's important to understand that
right there's some people that are kind
of on this spectrum and they're saying
like Devin these AI agents they're just
they're nothing right they're completely
useless and there's no applications and
it's just all fake right and there's
people kind of on this side they're like
that's it this is the end of you know
this will put 100% of software
developers out of business out of work
etc right and the reality is I mean
neither one of those stances is correct
I mean if there's one thing we know is
the the hyperbolic stuff is probably not
correct I think Devin and these various
AI agents I mean they're probably like
somewhere here right and then as they
improve they'll probably move like
somewhere here right maybe over time
they'll get to 100% but we're seeing
this progress kind of happen kind of
fast are they going to put people out of
business or are they going to just augment
how well software developers can work
you know we'll see pretty soon what I'm
saying is don't expect a complete flop I
guarantee you it's not we've seen people
credible people posting their results of
Devin it obviously does some cool stuff
we've also seen it do some not so smart
stuff right so also don't expect this
where it just takes over the whole world
overnight, it's somewhere here, and
we're going to see exactly where it is
as it gets rolled out. so here Andrew Gao,
he's saying my first task I asked, of course
this is I think he's talking about Devin
2.0 or maybe it's either the original or
the new and improved version actually
he's saying this is so March 12th so I
think this is the original so he's
saying he asked it to create a website
where you play chess against a large
language model you make a move the moves
communicated to GPT-4, GPT-4 replies, etc.
right so here was the prompt given to
Devin. Devin starts working, and Devin, if
you're not aware, so it has like four
windows, it's got like its to-do list
where it's breaking the tasks into
subcategories, it's got a browser, a
development environment, and a
way to chat back and forth with you
the user and it kind of works on its own
time right so it kind of goes off and
does stuff and it might run for a while
right so it's not like immediate
response it can do long Horizon tasks
and so it looks like Devin goes through
tons of stuff to figure out how to build
this thing asking for API Keys it looks
like and then you know it's currently
trying to debug something with the
chessboard not rendering but in the mean
time asked it for you know this other
task, which is interesting, right, so you
can have multiple sort of asynchronous
tasks running.
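for readers who want the flavor of "asynchronous tasks" in code, here is a toy sketch using Python's asyncio; the task names and bodies are placeholders, not anything Devin actually runs:

```python
import asyncio

async def agent_task(name, seconds):
    # stand-in for a long-horizon job an agent might chew on for a while
    print(f"{name}: started")
    await asyncio.sleep(seconds)
    print(f"{name}: done")

async def main():
    # both tasks make progress concurrently, the way the video describes
    # handing Devin a second task while the first is still in flight
    await asyncio.gather(
        agent_task("chess website", 2),
        agent_task("antarctica map", 1),
    )

asyncio.run(main())
```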
also he's asking to create a map of Antarctica sea temperatures in the
last 50 years it looks like it opens up
web pages that it needs to find the
information that it needs so Andrew had
to specify to make the hot temperatures
blue and cold temperatures red,
which is unintuitive, right, so giving it
kind of a weird prompt, not something
that's expected, and Devin notes
the color preference. oh wow, it
deployed to Netlify automatically, a
live app, and now the first version is
fully functional but it has a bug which
is to be expected, now let's see if Devin can
fix it. but in conclusion it got stuck on
the chess thing but it worked for the
Antarctica data visualization, even
deployed it, here's the kind of live app
that you can go and check out. it nailed
the prompt to create a fully functional
Chrome extension that turns a GitHub
repo into a Claude prompt, so let me know
in the comments what you think of what
Devin could do, cuz to me it almost seems
like a, what is it called, a Rorschach
test, right, we're all kind of looking at
the same thing and we're all seeing just
drastically different results different
things cuz yeah again some people are
like oh it's horrible it's useless right
it's zero this will have no effect some
people are like oh put everybody out of
a job but I feel like those two extremes
are not reasonable right I mean it's
obviously somewhere in the center it's
doing cool stuff it's doing stuff we
would think is impossible like 5 years
ago and it still kind of sucks at some
of it and can't quite figure out how to
get around certain issues but this thing
isn't released to the public yet right
we're judging this thing by, right, its
like pre-release capabilities, of this
brand new, never-before-seen
technology, I don't know, call me crazy, but
I think it's going to have a big impact
take a look at the founders uh if you
look at who the founders are I mean
they're pretty like insanely impressive
people take a look at the backers take a
look at the investors,
they're people who, let's say, tend not to
lose a lot, they tend to invest
in unicorns and hit it out of the
park. so I don't know, I have no idea
what's going to happen but I would not
be betting against Devin and finally
I'll leave you off with this
yet another wearable device and yes I
know we're all kind of maybe hesitant to
get too excited about these wearable
devices this one looks weird but fear
not I was concerned too this orange
piece in the center I think that's just
like a placeholder to make sure that
it's you know for shipping uh purposes
this does does get removed this is not
what your face is going to look like
when you wear them I'm putting it
towards the end because I know some
people don't like me showcasing products
they think I'm getting paid to Showcase
this I I'm not taking any payments it is
interesting to see what's coming a lot
of these things will be flops a lot of
them won't be good but some of them will
be some of them will stick and become
part of our everyday lives and I
personally always wanted something that
could whisper comebacks in your ear if
you remember that Seinfeld episode where
somebody insults George Costanza and
later he's driving down the road and he
comes back he comes up with the perfect
comeback and so he recreates the
scenario days later just so he can get
up and say well the jerk store called and
they're running out of you and it was
the it was supposed to be the perfect
comeback it wasn't but my point still
stands imagine a life where if somebody
says something that's annoying or
insulting immediately your AI assistant
pops into your ear I believe these are
little microphones here on the side
these microphones I'm not sure those
might be bone conduction microphones so
they transmit the vibrations right so
your AI assistant Whispers The Perfect
comeback in your ear and you just
immediately recite it I don't know I
would pay for that seems useful make
sure you're subscribed to this channel
we're rolling out something really cool
right now so you don't want to miss it
like for real you don't want to miss it
I'll leave you off of this quick video
from frame the open source AI glasses
for developers hackers and
superheroes. my name is Wes Roth and thank
you for watching. hi I'm Bobak, co-founder
of Brilliant Labs, and today I want to
share with you a device that is as
simple as it is capable this is
frame. frame sits at the intersection, um,
of artificial intelligence and augmented
reality. it is designed to be open source,
light and affordable, and we believe this
is important as we enter a new era of
embodied AI.
the AI in frame is Noah. Noah is our
fun little AI assistant in the cloud and
its job is to orchestrate across a
potentially infinite number of AI models,
and new models can always be added to
his library either by us or by a third-party
developer. but let's take Noah for a
spin. hey Noah, you ready to show people
what you're capable
of absolutely let's do it Noah can see
what I
see take a look at what I'm wearing am I
dressed appropriately for the
weather? a navy blue hooded sweatshirt
should be suitable for the partly cloudy
conditions, with a current temperature
of 72°F.
now let's take this one step further,
find me a matching beanie and tell me where
I can buy it and how much it's going to
cost. it says you can find a navy blue
beanie at Target, Macy's, San Frano. prices
range from $7.99 to $65 depending on the
brand and style cool now let's make this
actionable for me
choose the closest location and tell me
the address so I can walk
there the closest location is San frano
located at 505 to sedo
Street great now Noah can also generate
images and we thought this would be
super fun because generative AI is new,
especially when it comes to AI being
embodied and seeing what we see in the
world around us.
so let's try that out I'll just double
tap on the side
here. show me in that navy blue beanie as
a springtime superhero.
frame is exiting the factories
starting this week and heading toward
fulfillment centers in the weeks that
follow