[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles
Summary
TLDR: This episode of ML News covers a range of significant developments in AI and machine learning over the past two weeks. Highlights include Google's release of Gemma, a set of open models named after Gemini with impressive performance metrics, and the controversy surrounding Gemini's image-generation biases. Additionally, Groq's new hardware for serving language models rapidly, Nvidia's supercomputer Eos, and the emergence of AI-generated images in peer-reviewed journals are discussed. Other topics include the EU's AI Act, OpenAI's video generation model Sora, and various innovations in AI applications, from image generation to the legal implications of chatbot errors. The episode also touches on recent AI research, including a technique called ring attention, and the potential impact of AI on various industries.
Takeaways
- 😀 Google released Gemma models - smaller, more efficient language models that outperform comparable models
- 🤖 Groq built a fast specialized card for serving language models, enabling new use cases
- 📈 Nvidia unveiled Eos, a supercomputer with 18.4 exaflops of FP8 performance to power AI
- 🎥 Sora by OpenAI can generate convincing 1-minute video clips
- 📚 Gemini 1.5 Pro shows strong performance across its 1 million token context
- 👀 A peer-reviewed paper included AI-generated nonsensical images
- 📜 The EU's AI Act categorizes applications into risk levels with regulations
- 🌍 Cohere released Aya, an open model covering 101 languages globally
- 🖼 Stability AI announced Stable Diffusion 3 for improved image generation
- 😊 Eyelash application robot uses CV and robotics to precisely apply lashes
Q & A
What new large language models did Google recently release?
-Google released Gemma, which are open models with 2 billion and 7 billion parameters. They outperform comparable LLMs like Llama 2 and are available for some commercial use.
What hardware development allows for faster language model inference?
-A company called Groq built a new card optimized for language models that allows over 500 tokens per second on a 7 billion parameter model, enabling new use cases.
What does Demis Hassabis say is needed in addition to scale to reach AGI?
-Demis believes you need several more innovations in addition to maximum scale to reach AGI, as scaling alone will not lead to new capabilities like planning, tool use, and agent-like behavior.
What does the EU's new AI law regulate?
-The EU AI Act categorizes applications into risk levels and ties requirements to those risk levels. The highest risk category, called unacceptable risk, bans certain uses of AI like inferring sensitive characteristics from biometric data.
What multilingual language model did Cohere release?
-Cohere launched Aya, an open source 7 billion parameter model covering 101 languages, along with a large accompanying multilingual dataset.
What new advancement in text-to-image models did Stability AI announce?
-Stability AI announced Stable Diffusion 3, which uses a diffusion transformer architecture for improved performance in areas like multi-prompt image generation and spelling ability.
What data licensing deal did Reddit make?
-Ahead of its IPO, Reddit signed a $60 million annual content licensing deal with an unnamed large AI company to make use of data from Reddit posts.
How could AI help visually impaired people?
-Robot guide dogs built with computer vision and other sensors to help with navigation and safely getting from point A to B could help address shortages in availability of service animals.
What new way to interact with computers does OS co-pilot explore?
-The OS co-pilot paper looks at an AI agent that can interact with a computer OS via natural language to open apps, fill out forms etc to behave more like an assistant.
What product is Apple developing to rival Github Copilot?
-Apple is reportedly developing AI auto-complete features inside Xcode, its iOS/Mac development environment, to compete with Github's Copilot coding assistant.
Outlines
🤖 Google releases smaller AI models Gemma
Google has released Gemma, smaller 2 billion and 7 billion parameter AI models that outperform equivalent Llama 2 models. They are openly accessible under limited commercial use. Google likely released these models as a marketing ploy to regain industry leadership.
💻 Groq AI hardware processes language models lightning fast
Startup Groq built a new hardware card optimized for natural language processing that achieves extremely high throughput on large models, but has limited onboard memory, requiring hundreds of cards to serve a single large model.
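The "hundreds of cards" claim can be sanity-checked with a quick back-of-envelope sketch. The ~230 MB of SRAM per chip and 8-bit weights used here are illustrative assumptions, not figures from the episode:

```python
import math

def cards_needed(n_params, bytes_per_param, sram_bytes_per_card):
    """Cards required just to hold the model weights; KV cache and
    activations (ignored here) push the real number higher."""
    return math.ceil(n_params * bytes_per_param / sram_bytes_per_card)

# A Llama 70B at 8-bit weights on ~230 MB of SRAM per chip
# (both figures are assumptions for illustration)
cards = cards_needed(70e9, 1, 230e6)  # → 305, i.e. hundreds of cards
```

The same arithmetic explains why a single GPU with tens of gigabytes of HBM can hold a model that takes racks of SRAM-based cards: the trade-off is capacity for bandwidth.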
🤔 Demis Hassabis says scale alone won't lead to AGI
DeepMind CEO Demis Hassabis believes scale is important but other innovations will be needed to achieve artificial general intelligence capabilities like planning and tool use.
😨 AI safety experts warn AI could destroy humanity soon
AI safety experts like Eliezer Yudkowsky continue making dramatic warnings about AI, but provide little concrete evidence, instead using rhetorical devices.
🎥 OpenAI's video generator Sora creates high-quality clips
OpenAI demonstrates Sora, their video generation model that can create high-quality 1-minute clips and manipulate video based on text prompts, but access remains extremely limited.
📄 Research article uses AI-generated images without disclosure
A peer-reviewed research article included AI-generated images without proper disclosure. Reviewers raised concerns but editors failed to enforce fixes before publication.
👍 Cohere releases massive 101 language AI model and dataset
AI startup Cohere publicly released Aya, a large multilingual language model trained on a new dataset spanning 101 languages, to advance global AI research.
🔒 Reddit signs big money AI content licensing deal
Ahead of its IPO, Reddit reportedly signed a ~$60 million annual deal to license content to an unnamed AI company, after restricting open data access.
🐕 AI-powered robot dogs to aid visually impaired people
Four-legged robots equipped with computer vision and other sensors could serve as lower-cost guidance aids for blind people given shortages of service animals.
Mindmap
Keywords
💡Gemma
💡Bias correction
💡LPU
💡EOS
💡Sora
💡AGI
💡Peer review
💡AI Act
💡AI-generated content
💡AI ethics
Highlights
Google released Gemma, smaller and more efficient open language models
Groq built a card that can serve language models extremely fast, over 500 tokens/sec
Nvidia unveiled a supercomputer with over 4,600 GPUs, #9 in world top 500 ranking
Sora by OpenAI can create impressive 1-minute video clips and manipulate existing videos
Gemini 1.5 Pro shows strong performance taking an entire movie into context and summarizing
EU AI Act categorizes AI risk, bans certain uses like inferring sexual orientation
Cohere released Aya, an open multilingual LM and dataset for 101 languages
Stability AI announced Stable Diffusion 3 for better image quality and spelling
Reddit signed $60M AI content licensing deal, restricting API access
Seeing eye robot dogs are shaping up to help visually impaired people
New OS co-pilot paper shows interacting with computer via agent prompts
Report suggests Apple plans AI coding features to rival GitHub Copilot
Anthropic introduced prompt filtering for election integrity
Robot uses CV and robotics to precisely apply fake eyelashes
Critics raise concerns about proximity to eyes and allergic reactions
Transcripts
hello everyone welcome to ml news we're
going to take a quick brief tiny look at
what happened in the last 2 weeks in the
world of machine learning AI and I guess
encompassing pretty much everything
nowadays so first I've already mentioned
this in a previous video a little bit
but Google released Gemma Gemma is I
guess a variant of the name Gemini these
are open models that are in smaller
sizes than kind of the largest language
models so they are 2 billion parameter
models and 7 billion parameter models
these outperform respective llama 2
models at the same sizes and even at a
little bit bigger sizes so they are
quite performant they are available
they're openly accessible you can use
them under some limited circumstances
for commercial activity and they do
release a technical report on how they
have built these now these these are not
the same as like Gemini 1.5 with the 1
million token context length these I
believe their context size is about
8,000 tokens so they are in architecture
quite similar to llama models as I said
previously this is essentially I believe
a marketing Ploy from Google right here
releasing these models they're already
topping the leaderboards so all in all
very good development I think although
all of this has been overshadowed last
week and if you've seen my last video
you will know by people discovering that
Gemini image generation is a bit wacky
when it comes to sort of bias correction
and representation of people so straight
up refusing to generate any white people
or any images of white people and and
things like that one interesting
development again watch that video If
you haven't seen it but one interesting
development here is that the person
these product lead from Google who
essentially came out and said oh we're
sorry we were made aware of some
historical inaccuracies we will fix
those has made their uh X account
private apparently or Twitter whatever
has made that account private now it's
totally conceivable that douchebags have
started sort of harassing the person or
just kind of bombing spamming and so on
so that's totally fine but it it was not
a good look like it was gaslighting in
the highest degree like oh yeah let's
just say the problem is historical
inaccuracies and not like the obvious
problem that was barely visible with the
thing so I just thought that was an
interesting development again if you
want to know more watch that video I
also found the development around this a
little bit funny so day later apparently
they refused to just generate images of
people like just Gemini saying I cannot
generate images of people that was their
hot fix their patch to say well no
images of people at all until we fix
this problem here then saying if you
then said I've seen you produce images
of people it has answers it is important
to clarify that I have never been able
to generate images directly so I'm not
sure it would be interesting to know
what the exact prompting behind this was
and the changes being done here also not
to say that this is true right this is
just an LLM doing its LLM thing but I do
find it interesting this new world uh
where software patches are essentially
sort of prompt
changes and then the interactions with
those just make for hilarious content
all right Groq is all the rage now Groq is
a company that is as far as I know
spun off from Google's TPU group if I'm
informed correctly and they have built a
card that can serve language models
really really really quickly so make
long novel so this is Mixtral 8x7B
and see it runs at like 532 tokens a
second so this is insane speed this
allows for new use cases to be
accessible uh by these language models
and very very cool so this is really
special Hardware I'm sure there are some
software tricks and algorithm tricks but
this is special hardware yeah people
talking about this insane insane
speed of Groq Groq says they have this LPU
this Language Processing Unit so
that's not a GPU it's an LPU and the
difference to something like an Nvidia
GPU is that they have a different kind
of memory so here you can see um latency
and throughput this is a benchmark from
a third party and they had to extend
their their axes here in order just to
accommodate how fast and how much
throughput grock has and how fast it is
it's pretty pretty insane however there
is a trade-off as I said they use
different memory than a a regular GPU
would use and therefore that makes it
such that each card only has a very
small amount of memory so you need a lot
of these cards in parallel to even serve
one of these big models now you can
achieve massive throughput obviously
economies of scale kicking but you can't
just get one of those cards and then
serve a large model and that's where
people quickly realized hey okay it
might not be the wonder weapon here it
is very cool but each chip only has
about 200 megabyte of SRAM and therefore
you would need I don't know hundreds of
cards in order just to serve this Mixtral
model that we've seen before again with
the higher throughput it might be
totally worth it if you're a data center
owner but throughput over time you see
the the graph on the top right here
that's Groq
that's pretty insane um people calculate
that you need about 320 of these cards
or two full racks to just serve a single
llama 70b and if you calculate the cost
of these cards then that'd be about $10
million US it's not the end-all that's
going to solve everything but it is
definitely definitely very cool
development in order to push language
model inference ahead at the same time
Nvidia unveils Eos TechPowerUp writes
this is essentially pulling together a
bunch of their djx systems to create a
super duper computer so
576 dgx h100 systems wired together into
one computer each of these dgx system
has 8 h100s making for a whopping
4,608 h100 gpus note each of these
puppies will cost you I don't know what
they cost right now like 20K or
something like this or or or north of
that this is massive it's ranked number
nine in top 500 supercomputers of the
world with a staggering staggering 18.4
exaflops of FP8 performance this website
here I found pretty cool gpulist.ai
it's by Andromeda AI and it's
essentially Craigslist but for gpus
people rent out their GPU capacity it's
also as shady as Craigslist right so
it's just a listing
and it just says well you'll get bare
metal access and sometimes it says okay
you get uh SSH access to it or something
like this but essentially just allows
you to contact these people and then
make out a deal of how you're going to
use these gpus this seems fairly large
so the common posting here actually
there's a lot of h100s here going around
I'm not sure where people get these from
but sometimes oh there's
849
these Okay this may be more common so um
Canada ethernet 1 H100 available it's an
Ubuntu VM and you get minimum one week so
if you want some gpus and you don't have
super confidential data because you are
going to use other people's Hardware
this might be a good option to find some
some good deals Wired has an interview with
Demis Hassabis on how far you get with scale
apparently I guess just on the future of
AI and if you read the interview it's
kind of a mix between yeah scale is
great we can do scale scale is awesome
Gemini is awesome these models are
awesome and but also scale only gets you
so far there needs to be something else
so Demis says my belief is to get
to AGI you're going to need probably
several more Innovations as well as the
maximum scale there's no letup in the
scaling we're not seeing an asymptote
yada yada yada there's still gains to be made
so my view is you've got to push the
existing techniques to see how far they
go but you're not going to get new
capabilities like planning or tool use
or agent-like behavior just by scaling
techniques it's not magically going to
happen it's very interesting because I
think that is a current contention like
it's very easy to say oh to get to AGI
you need something else because first of
all AGI isn't a defined term and
something else isn't a defined term so
you can redefine these two terms as you
wish and then you can always find
something that's still wrong or in a way
in which you're still correct if you do
that for long enough your name will
become Gary Marcus but other than that
these are fairly more concise
predictions saying okay you're not going
to get planning or tool use or
agent-like behavior they're not super
defined but they are and we're already
seeing tool use for example being built
into these large language models and get
better with scaling so it will be very
interesting to see whether Demis turns
out to be ultimately correct on his
predictions or whether one or the other
of these things will be available just
by scaling language model and kind of
training them on tool use data and so on
Tom's Hardware writes legendary chip
architect Jim Keller responds to Sam
Altman's plan to raise $7 trillion to
make AI chips saying I can do it for less
than 1
trillion we've gone off the rails so Jim
Keller apparently legendary CPU
developer now working at the company
that makes chips themselves he claims
that he could do it for a lot less yes I
guess um I don't know as soon as you go
into like money that's Way Beyond the
current total market value of chips I
feel many claims can be made it will be
interesting going forward to see kind of
who takes the lead in chip development
how that's going to be playing out in
any case I'm not sure if bickering about
1 trillion 2 trillion 7 trillion is
going to make make that big of a
difference from one legendary person to
another legendary person and legendary
here spelled with a capital l AI May
destroy humankind in just 2 years
expert says of course Eliezer Yudkowsky saying
if you put me against a wall and force me to
put probabilities on things I have a
sense that our current timeline looks
more like 5 years than 50 years could be
2 years could be 10 well could be
anything uh like with the trillions it
is absolutely useless to make these
speculations and then I don't know
saying things about a Terminator-like
apocalypse and Matrix hellscape the
difficulty is people do not realize we
have a shred of a chance that Humanity
survives oh yes of course of course Yudkowsky
has I think retracted statements on
bombing data
centers like that would that would
be useful in any case read this as you
would read like a comic book for
entertainment yeah I feel like that it's
at least makes you giggle otherwise this
uh serves no purpose at all Sora
continues to dominate headlines a video
generation model by open AI we've talked
a little bit about this in the last news
episode but this can create uh Clips
single shot Clips up to 1 minute I
believe and they look pretty pretty
awesome I have to say and more and more
kind of examples come out of Sora
creating pictures creating Clips open AI
marketing department in full gear no you
don't have access to this model yet a
select few have access to this model not
you you are just a pleb you're not the
The Chosen person so Marvel at other
people using the cool thing and uh open
AI marketing department having tight
control over exactly which things go out
to the public and which things don't
what is interesting is here's an example
of Sora scaling with compute so
essentially saying the more compute they
throw into one of these Generations the
better better quote unquote the more uh
realistic I guess it gets so base
compute 4x however like they've also
completely stopped to give us any sense
of the scale the absolute scale of
things so for now it's just like base
compute however much that is and then 4X
compute and then 16x compute yeah in any
case what we can infer from that is that
there is an iterative process very
probably to determine one of these
samples so it's not like single forward
pass of anything but iterative iterative
process like you would be used to from
diffusion doing many many steps across
the span of time to refine and refine
and refine the output what's also pretty
cool are demonstrations of changing
things like this being a base video and
not only can sort of generate things but
also kind of generate things according
in to some input like some input video
so in case here people changing the uh
surroundings of the car or the car
itself like the vibe of the video and so
on while keeping the general motion I
guess and the general concept clear so I
think that's pretty cool make it go
underwater yeah why not look at that or
that nice Rainbow Road keep the video
the same but make it winter animation
style charcoal drawing yeah should be
black and white not exactly but close
maybe it's one of those things where uh
it's actually black and white but your
eyes trick you but I I think I'm seeing
color
wait yeah no this is definitely color
uh actually not so sure now okay no this
definitely red this is definitely red
the the backlight yeah it's not fully
black and white cyber Punk medieval very
nice they had drones following carts in
medieval times also the horse legs they
they
look yeah why not
dinosaurs pixel art so many cool things
about Sora keep coming out and also many
cool things about Gemini 1.5 Pro keep
coming out especially obviously the
insanely large context size of Gemini
1.5 Pro people feeding very long things
inside of it and see whether it can
handle the long context an entire code
base and then instructing it to code
something based on top of that I think
this is probably going to be one of the
best applications for something like
this if you have very long yeah
something like a code base or a
reference documentation or something
like this like the important parts of
that would fit into a million tokens and
being able to sort of cross reference
things inside of that and then generate
based on that is probably a very good
use case I know they can retrieve well
across the 1 million tokens kind of like
point to individual things if they need
to retrieve them but it will still be
interesting to research how performant
it actually is when you put more and
more and more stuff into that context my
personal estimate would still be that
putting less things into the context is
more beneficial will make stuff more
accurate or what I can also Imagine is
that they trained it in such a way that
they could have achieved better
performance on small context compared to
large context but they traded it off to
have sort of equally performing but
worse performance across the entire
context length not yet clear but it will
be interesting to see this pretty cool
uh feeding an entire short movie
into this so what Gemini will do is it
will take the movie split it into frames
and then essentially use the frames as
tokens or tokenize the frames and you
can fit pretty long you can see here 44
minutes and 7 Seconds video you can fit
that into the context size of Gemini 1.5
Pro because it can also consume images
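The arithmetic behind fitting a 44-minute film into a 1-million-token context works out if you assume roughly one sampled frame per second and a few hundred tokens per frame. The 258 tokens/frame figure below is an illustrative assumption, not a confirmed number:

```python
def video_tokens(duration_s, fps=1.0, tokens_per_frame=258):
    """Assumed sampling: 1 frame per second, ~258 tokens per frame
    (illustrative figures for a back-of-envelope estimate)."""
    return int(duration_s * fps * tokens_per_frame)

movie = video_tokens(44 * 60 + 7)  # the 44:07 movie → 682926 tokens
```

Under these assumptions the whole film consumes under 700k tokens, comfortably inside the 1M-token window with room left for the prompt and the generated summary.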
and Matt Schumer here says went straight
from full movie to a summary in seconds
no transcription no intermediate steps
just visual tokens to summary now I've
seen other people have pointed out that
the summaries it makes aren't always
super duper accurate or well done but
it's still pretty impressive and it
speaks to what I said before right the
main question is going to be what are
the Dynamics and the characteristics of
performance across this entire context
window and currently you see Berkeley
coming out with a paper that's titled
World Model on Million-Length Video and
Language with Ring Attention this is an
actual research paper that is very
concurrent as I said to Gemini 1.5 Pro
doing retrieval experiments across very
long context with what's called ring
attention if you're interested we can
make an entire video on ring attention
that is in the makings so keep looking
for that but it's a cool new technique
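The core accumulation that ring attention distributes, blockwise attention with an online softmax so no device ever materializes the full attention matrix, can be sketched in a single process. This is a simplification of the idea, not the paper's implementation:

```python
import numpy as np

def ring_attention(q, kv_blocks):
    """Blockwise attention with an online softmax: process one KV shard
    at a time, keeping only running statistics (this is the accumulation
    that ring attention rotates around a ring of devices)."""
    m = np.full(q.shape[0], -np.inf)          # running row max of scores
    l = np.zeros(q.shape[0])                  # running softmax denominator
    acc = np.zeros((q.shape[0], kv_blocks[0][1].shape[1]))
    for k, v in kv_blocks:
        s = q @ k.T                           # scores against this shard
        m_new = np.maximum(m, s.max(axis=1))
        scale = np.exp(m - m_new)             # rescale old statistics
        p = np.exp(s - m_new[:, None])
        acc = acc * scale[:, None] + p @ v
        l = l * scale + p.sum(axis=1)
        m = m_new
    return acc / l[:, None]
```

In this single-process form the result matches full softmax attention exactly; the distributed engineering challenge is rotating KV shards between devices while overlapping that communication with compute.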
it is going to be some sort of
approximation to attention like it's not
the fact that we can now scale the
classic Transformers across this huge 1
million token context size so it's a
trick people have come up with many many
different tricks of sort of doing long
attention and this one seems quite
promising Phil Wang also known as
lucidrains already has an implementation of
ring attention up even though the paper
is super duper new yeah what's
interesting to see in the readme of
this repository is Phil saying I will be
running out of sponsorship early next
month if you'd like to see that this
project gets completed sponsor me or I
will be leaving the open source scene
for employment so just wanted to bring
this to people's attention by that I
mainly mean people like companies CTV
News Vancouver writes Air Canada's
chatbot gave BC man the wrong
information now the airline has to pay
for the mistake apparently a person went
on the chatbot for Air Canada that's
kind of powered by a large language model
and they went there looking for very
specific questions about what's called
bereavement fares these are reduced
fares provided in the event someone
needs to travel due to the death of an
immediate family member so the chatbot
uh this person had lost a family member
wanted to do air travel due to that and
inquired about you know cheaper rates
or specific fares specific prices for
this situation and the chatbot gave them
wrong information saying that he could
claim those even after the fact and when
he wanted to do that the customer
support people said nope that's not
possible you're not getting your money
now the question is who is responsible
and courts say that in fact Air Canada
is responsible for the things their
chatbot said and has to actually comply
with what the chatbot promised Air
Canada tried to push the
responsibility Air Canada suggests that
the chatbot is a separate legal entity
that is responsible for its own
actions what what oh no the piece of
software that we actively deployed on
our website is a separate legal entity
yeah no no I get that as a lawyer you
will have to argue and in this case it
was probably like the last remaining
thing that you could even conceivably
argue but it's so ridiculous no way
no way if you deploy a piece of software
you are responsible for what that
software does not if you program it not
if you make the open source Library
that's then part of it if you deploy it
and it interacts with your customers and
then it promises stuff to your customers
then you are responsible and that's the
same with every other piece of software
as well there's absolutely no difference
of whether this is an llm chatbot or
anything else that executes code it's
always the same and it's pretty funny
that Air Canada tried to weasel their
way out of this so no Air Canada must
honor refund policy invented by the
airlines chatbot so obviously companies
are trying to alleviate costs on their
customer success operations in this case
they may want to calculate if they don't
incur more costs due to stuff that's
invented by it could be a strategy like
I think the damages here to be paid
were about $600 and then I think about
$20 in tax it says it somewhere in
addition the airline was ordered to pay
uh $36.14 in prejudgment interest
interest not tax and $125 in fees and
probably the lawyers grabbed 10 20 50k
out of the people here so all in all who
succeeded the lawyers lawyers should be
fans
of LLMs like the amount of litigation and
the amount of contracting and so on they
have to do just because people want to
use or have used or mistakenly used llms
is going to be staggering staggering in
any case this was more about the
principle I guess than about the money
but it could be a thought you know as a
like just let an llm do your customer
success and if they promise something
that doesn't exist you just pay it it'll
be like 600 bucks in this case maybe
that's worth it maybe the saving
and not having to hire more people is
totally worth it for these companies I
don't know it'll be interesting to think
of a future I think right now everyone's
trying to guard rail everything because
they feel okay our customer success
operation should continue to be as is
like there is a completely defined
things okay here are the things we pay
here are the things we don't pay and so
on and you know must that cannot promise
anything else and so on what if the
mentality around that changes it will
just be like okay here's a set of
guidelines we know the thing is going to
hallucinate every now and then and when
it does we'll just sort of take it into
account like I feel there are still laws
against customers abusing that like if I
were to go to Air Canada chatbot and
kind of like prompt hack it into giving
me stuff I'm pretty sure a court would
side with Air Canada since that's
essentially kind of me emotionally
abusing a customer support rep until
they promise like give me what I want
but other than that could be totally
viable future and a fun future if if
these things are not so strict Kareem
Carr tweeting out or x-ing out I'm not sure
what it's called finally happened a
peer-reviewed journal article with what appear
to be nonsensical AI generated images so
this has become known as giant rat
balls the pictures they look from afar
like they could be in like a biology
journal but they make no sense right
the writing's mostly rat just
this rat yeah yeah we see that
so this article has pictures that
a bit interesting it's a bit more
interesting than that scientist a gast
at bizarre AI r with huge genit PE
reviewed article this is a fairly
reputable Journal where this is public
this is not just like a pay 5,000 bucks
and you will get published Journal this
is a fairly r beautiful Journal here is
another another picture you see it it
rarely makes sense if you actually look
closely second the images are created by
Midjourney and the authors acknowledge
this in the paper so the authors say the
images are generated by Midjourney
Third there were two reviewers and one
of the reviewers actually brought this
up and apparently also a reviewer I'm
not sure if it's the same reviewer or a
different reviewer said I was only
looking at the scientific content of the
work right and reviewed it based on that
and we have also statement from the
reviewer saying that they did raise
concerns about the images so they're
saying the journal says an investigation
is currently being conducted so this
article in Vice here details our
investigation revealed that one of the
reviewers raised valid concerns about
the figures and requested author
revisions even the reviewer saying okay
the figures
need to be revised the authors failed to
respond to these requests we are
investigating how processes failed to
act on the lack of author compliance
with the reviewer's requirements so it's
a bit more tricky than just oh a bunch
of researchers tried to get a fake paper
through and the reviewers didn't notice
it seems that the ultimate person who I
guess multiple people here always
contribute to the things but ultimately
it was the editors who didn't make sure
that the authors actually changed the
things that the reviewers were asking
them to change that made It ultimately
go through and being printed as a paper
I guess mistakes like this happen it's
also probably very common to just kind
of assume the authors will concur with
what the reviewers request to change I
don't know if they've even said yeah
okay we'll change it or something like
this but it is an interesting story and
the meme of the giant rat balls will
forever live on in our hearts Andrej Karpathy
said that he left OpenAI assuring that
not a result of drama or anything like
this it's just a change in scenery
saying that the last year in open AI was
really great the team is really strong
the people are wonderful the road map is
exciting and we all have a lot to look
forward to he says my immediate plans are to
work on my personal projects and see what
happens and immediately following this
up with a video explanation 2 hours on
tokenizers enlightening the bizarre
world of why reversing strings with llms
is really really difficult and why
different languages give you
different results and so on so if you
want to explore a so far I believe a bit
underexplored aspect of large language
models definitely look into Andrej's
tutorial on tokenizers very cool and with
Andrej's very very clear explanations
you will know a lot more after this
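A toy greedy tokenizer makes the string-reversal point concrete: the model manipulates token IDs, and reversing the token sequence is not the same as reversing the characters. The vocabulary here is made up purely for illustration:

```python
# Toy greedy tokenizer over a made-up vocabulary (purely illustrative,
# not any real model's vocab).
VOCAB = ["hel", "lo", " wor", "ld"]

def tokenize(text, vocab=VOCAB):
    tokens = []
    while text:
        # longest vocab entry matching at the current position
        match = max((t for t in vocab if text.startswith(t)), key=len)
        tokens.append(match)
        text = text[len(match):]
    return tokens

toks = tokenize("hello world")            # ['hel', 'lo', ' wor', 'ld']
token_reversed = "".join(reversed(toks))  # 'ld worlohel'
char_reversed = "hello world"[::-1]       # 'dlrow olleh'
```

An LLM that naively flips its token sequence produces the first string, not the second, which is one reason character-level tasks are surprisingly hard for subword models.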
nature writes what the EU's tough AI law
means for research and ChatGPT the EU AI
Act is the world's first major
legislation on artificial intelligence
and strictly regulates general purpose
models as you know the EU AI act has
been in development for a number of years now and is now finally coming into effect, rolling out, and so on. It's been changed quite a bit over the years; even I am not entirely sure what's in the current version and how much it's still going to change. But the approach is to broadly categorize applications into risk categories and then to tie what you have to do
according to the risk category. The most risky things are called unacceptable risk, and those are just banned, not allowed to do: for example, those that use biometric data to infer sensitive characteristics such as people's sexual orientation. This is banned. Also, what's hilarious: there's a limit for when you have to do something, and that is 10^25 FLOPs, a completely arbitrary number that is going to be meaningless probably even before the AI Act has really rolled out. Finally,
I could not make up worse advice for these policymakers if I wanted to. It's like: okay, let's pick a completely arbitrary number and say, here, here is where we draw the line. I believe you can entirely transparently see the lobbyists being like, okay, what can we do that our competitors can't do, and let's draw a nice line between the two as it is for the next 3 years, and we don't care about anything after that.
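For scale, that 10^25 figure can be sanity-checked with the common back-of-the-envelope estimate that training compute is roughly 6 × parameters × tokens. All the numbers below are illustrative, not official figures for any specific model:

```python
# Back-of-the-envelope training compute using the common approximation
# C ~ 6 * N * D (N = parameters, D = training tokens). All figures are
# illustrative; none are official numbers for any real model.
def train_flops(n_params: float, n_tokens: float) -> float:
    return 6.0 * n_params * n_tokens

EU_THRESHOLD = 1e25  # the AI Act's compute cutoff for extra obligations

for name, n, d in [
    ("7B params, 2T tokens", 7e9, 2e12),
    ("70B params, 15T tokens", 70e9, 15e12),
]:
    c = train_flops(n, d)
    print(f"{name}: ~{c:.1e} FLOPs, above 1e25: {c > EU_THRESHOLD}")
```

By this estimate even fairly large open models land well under the line, which is part of why a fixed FLOP count is a strange place to draw it: hardware and training efficiency shift the meaning of that number every year.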
All right, unacceptable risk: do you realize that a basic linear regression would fall under this? The EU effectively now bans drawing a straight line across a few data points, if those data points happen to coincide with the data categories collected here. This is the level these kinds of laws come to. Yes, I know, I am pulling it to the extreme here; I know this is meant for super duper Transformers informing these things and then sitting in automated systems that make decisions about people's lives, and so on. I see what the fear is, but I doubt whether what they're intending to do matches what the effect of this is going to be, and I still believe the effect is just going to be more monopolization of bigger companies, making it harder for newcomers to enter this market, and giving governments more control over things, which they will probably not do good things with. Just an opinion. Cohere For
AI launches Aya. Aya is an open-source, massively multilingual large language model and a dataset built over 101 different languages from all across the world, and this is one of the largest datasets of instruction data around. As I said, it's a dataset and a large language model all at once: the dataset is available, the model is open access, whatever that means. Right now, I guess, you can download the model, because there's a button that says download the model. I found this on Reddit and found it to be really interesting: regional prompting. This is a UI for a technique (GLIGEN, I believe), and I've linked to the repository; very cool to use
and very exciting, exciting new things that are possible. Aria Everyday Activities is a dataset, again released by Meta, that depicts, as you can see, everyday activities. So this has first-person-view data, location data, and so on.
Meta is actually pushing the metaverse and datasets around that, augmented reality, and so on, so they're collecting a lot of data. They have, as you can see right here, rolling-shutter RGB, a 110° field-of-view camera, 150° field-of-view cameras for SLAM and hand tracking, infrared illumination, barometer, magnetometer, environmental sensors, spatial microphones, and so on, and then annotated data: per-frame eye tracking, 3D trajectories. These datasets are collected to be quite universal, so maybe you don't want to use all of them at the same time, but they enable a lot of different applications, which is very,
very cool. Stability AI announces Stable Diffusion 3, a text-to-image model using a diffusion Transformer architecture for greatly improved performance in multi-subject prompts, image quality, and spelling abilities. They're not releasing anything yet; there is a waitlist for an early preview. They say this is for gathering insights and improving its performance and safety ahead of the open release. We've also come to know from Stability that open release is going to mean that you can use the model for research stuff, but if you want to use it for anything commercial, you have to give them a bit of money. You can
see a few examples here: nice apple, go big or go home; the astronaut riding on things has become a bit of a meme. I mean, the quality is getting absolutely insane with these text-to-image models. Aleksa Gordić released YugoGPT, a 7-billion-parameter large language model for Balkan languages (Serbian, Bosnian, and Croatian), and you can find it on Hugging Face right now. Very, very cool.
OpenMathInstruct by Nvidia is a math instruction dataset that you can freely use. Actually, "freely use" might be an overstatement: there is an Nvidia-specific license on it. I'm not a lawyer, I'm not going to tell you what this means; I personally think you can use it freely, but again, no legal advice.
This article from Interesting Engineering I found very cool: it's a system that identifies drug-combo problems, so interactions between different drugs, specifically as they cross the barrier in your gut. The problem is, researching any drug and what it does is already super expensive, but obviously every drug you add to the regimen of available drugs could have interactions with all the other drugs that exist. This system uses a combination of machine learning and actual mechanistic models of transmission, models of receptor behavior in the gut, to predict interactions between different drugs in terms of their uptake in the gut. So I
think that's very cool. Pushing in this direction, we already saw this with various DeepMind models: having some sort of expert model, some actual domain-informed, expert-informed model in a scientific domain, combining that with machine learning, and using the two together to draw conclusions is probably, I want to say, the next frontier. I feel the frontier of "we'll just throw a lot of data at stuff and it will give us results" has probably had its low-hanging fruit taken already, and now it's really the combination of expertise and machine learning that is going to push ahead. So, very, very cool.
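The general pattern here, a mechanistic model corrected by a learned component, can be sketched in a few lines. This is a toy illustration, not the actual system from the article: the saturating-uptake formula and the logistic interaction score are stand-ins.

```python
import math

# Toy sketch of the "mechanistic model + machine learning" pattern, not
# the system from the article: a simple saturating uptake model is
# attenuated by a learned drug-drug interaction score.
def mechanistic_uptake(dose: float, permeability: float) -> float:
    """Michaelis-Menten-style absorbed fraction: rises with dose, saturates."""
    return permeability * dose / (1.0 + dose)

def interaction_score(features, weights) -> float:
    """Stand-in for an ML model: logistic score over interaction features
    (e.g., shared transporter affinity between the two drugs)."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predicted_uptake(dose, permeability, features, weights) -> float:
    # Mechanistic baseline, scaled down by the predicted interaction risk.
    return mechanistic_uptake(dose, permeability) * (1.0 - interaction_score(features, weights))

print(predicted_uptake(1.0, 0.8, [0.0], [1.0]))  # baseline 0.4 halved to 0.2
```

The design point is that the mechanistic part encodes domain knowledge that would otherwise need enormous amounts of data to learn, while the learned part captures what the mechanism misses.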
Very excellent developments. Bloomberg writes: Reddit signs AI content licensing deal ahead of IPO. That being said, this is all "a person close to the matter said": a large unnamed AI company, a lot of dollars involved, about $60 million on an annualized basis, yada yada yada. So this is all, I guess you'd call it hearsay, but this is the chatter right now. Reddit obviously recently made headlines by sort of capping all of their API access, not being really open anymore to outside developers, in a clear move to protect their IP, which is users' posts on Reddit, so that you can't just go via API and grab all that data. And now the second move is that they themselves are going to make use of that data by licensing it out to other companies.
Again, this is all just "someone familiar with the matter said", yada yada yada, but still: Reddit realizes they sit on a treasure trove of information. That's already evident from people Googling "how to XYZ" and then just adding "reddit" to their Google search query, because they know they usually get okay answers on Reddit. This has had the counter-effect that now marketing representatives and so on will try to go and sort of poison Reddit threads by giving nice-looking answers that ultimately link back to their product. Interesting dynamics. In any case, Reddit data may become a staple of one of the big AI companies, so we'll soon have all kinds of AI redditors around. Isn't that a great future? New Atlas writes that the seeing-eye dog v2.0 is shaping up as a game-changer. This goes into the details
of strapping kind of assistive technologies on top of one of these four-legged robots in order to help blind and visually impaired people move around safely: safe passage from A to B, and so on. The article discusses that the main limitation here is actually the availability of service dogs in general, like guide dogs: there are way too few guide dogs around for all the visually impaired people; they are expensive, they are rare, they need to be trained, and so on. In this case, these robots obviously don't. So yeah, you can say they take away the jobs of good, hardworking guide dogs, I guess, but from all I can see here, actual guide dogs are still preferred to robot dogs; it's just that there aren't nearly enough of them. So these robot dogs are shaping up to become very capable and can help with a lot of things. Very cool developments.
This paper I found really cool: OS-Copilot, towards generalist computer agents with self-improvement. It uses agent-like behavior while interacting with your operating system, so it can do some stuff on your computer just by you prompting it: opening applications, interacting with applications, even doing kind of multi-step things. I think this is one of the ways we're going to interact more with computers in the future. Maybe; I don't think the keyboard and programming, you know, using text and so on, will ever go away, but probably kind of web browsing, or simple things like this, could be automated like this. I found this to be a lot more
understandable than just voice prompts, like just saying "Alexa, book a flight to XYZ". I find voice and sound to be kind of a wonky interface for that, but if at the same time someone shows me, look, I'm now going to this website, I'm going to do this and that, I feel that is a much more viable interface. But then again, you could just click it yourself. So: I find GitHub Copilot to be an extremely good mode of interacting with an LLM, so if we transport that to this world, it would be that largely I operate the computer, but I could tab-complete a lot of things. If there's a form to be filled out, yes, I know browsers will support me already, but I could maybe tab-complete a lot more of it; or if there are some standard interactions on a website, I just kind of tab-complete them away. That mode of interacting with computers I'm looking forward to a lot. I'm not looking forward to a single prompt that will then magically go and do something for me; I don't think that's going to be a thing of the near future, and I don't think you would be comfortable with a system like
that. Business Insider writes that a new report sheds light on Apple's upcoming AI features that will rival Microsoft's Copilot. Further down they say Microsoft's GitHub Copilot, writing code; so it's not the Microsoft Windows Copilot or the Microsoft 365 Copilot (there are too many Copilots nowadays), it is apparently the GitHub Copilot that Apple targets, inside of its Xcode environment. So if you write Swift apps, if you write iPad and iPhone apps, and maybe even macOS apps, that might be really cool to have available. I do feel GitHub Copilot does its job quite well for what it does; for everything else, I've never programmed Swift, so I can't say.
TechCrunch writes: Anthropic takes steps to prevent election misinformation. Yeah, sure. They're making a bit of PR, I feel; they're using the opportunity of the election to be like, oh, we have guardrails, we have Prompt Shield. Which, I guess, Prompt Shield, cool: the technology relies on a combination of AI detection models and rules. Sure: you have a regex and you have some prompt that says, if the user asks for voting information, go to this site. I guess it's good; if I were a company, I would try to use that as well.
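A rules-plus-patterns guardrail of that flavor can be sketched in a few lines. This is entirely hypothetical: Anthropic's actual Prompt Shield is not public, and the patterns and canned response here are made up.

```python
import re

# Hypothetical sketch of a regex-plus-rules election guardrail; the
# patterns and redirect text are made up, not Anthropic's Prompt Shield.
VOTING_PATTERNS = [
    re.compile(r"\b(where|how|when)\b.*\bvote\b", re.IGNORECASE),
    re.compile(r"\bpolling (place|station)\b", re.IGNORECASE),
    re.compile(r"\bvoter registration\b", re.IGNORECASE),
]

REDIRECT = ("For up-to-date voting information, please consult your "
            "official election authority.")

def guard(user_message):
    """Return a canned redirect for election-related queries, else None."""
    for pattern in VOTING_PATTERNS:
        if pattern.search(user_message):
            return REDIRECT
    return None  # no match: hand the message to the model as usual

print(guard("Where can I vote on Tuesday?"))  # the canned redirect
print(guard("Explain tokenizers to me."))     # None
```

The "AI detection models" part would sit next to this as a learned classifier; the regex layer is just the cheap, predictable first line.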
AI comes to the world of beauty, as an eyelash robot uses artificial intelligence to place fake lashes. This details how the robot can place fake eyelashes more precisely than humans could, and as far as I can see it's also a bit faster, or cheaper, or something like this. This is a purely mechanical task that so far humans did by hand and now a robot can do; artificial intelligence is a bit of an overstatement, as they use computer vision to detect where the eyelids and the corners of the eyes and so on are,
which is really cool. But then there is the "for", and then there's the "against", and the against is: oh no, we have to be very careful about this, there are potential risks; the device's proximity to a sensitive area could raise concerns about the risk of eye infections or allergic reactions to the materials used in the lash extensions. I guess they just had someone say anything generically bad about this, and this person was like, I guess you could be allergic to the materials, and they're like, oh yes, there are also potential risks. I'm not sure I'm buying that; you decide for yourself.
I feel having a machine do a purely mechanical task is fine; it's not going to steal a lot of jobs, I guess. All good. I just found it funny that the news article must have this structure: here is something new that technology can do, but there are also risks. With that being said, there's also a risk that this video gets too long, and with that, I'll finish it. Thank you for watching.
[Music]
bye-bye
[Music]