GPT Q* Strawberry Imminent, Sam Altman Trolls (Model Already Secretly Live??)

Matthew Berman
7 Aug 2024 · 09:51

Summary

TLDR: The video script discusses recent speculation around OpenAI's potential release of a new model, possibly named 'Strawberry' or 'GPT-5', which is believed to have advanced reasoning and planning capabilities. It covers Sam Altman's cryptic tweet, the appearance of mysterious models on lmsys.org, and the community's reactions. The script also explores the rumored features of 'Strawberry', including the ability to navigate the internet autonomously and perform deep research, and compares it to other AI advancements. The presenter runs a few reasoning tests on the new models while questioning whether the hype is justified.

Takeaways

  • Sam Altman's tweet with a picture of a garden and strawberries fueled speculation about the possible release of the 'Strawberry' AI model, thought to be the next iteration from OpenAI.
  • Two anonymous models appeared on lmsys.org, a platform where OpenAI has previously dropped models anonymously, but the presenter could not find them at the time of recording.
  • 'Jimmy Apples,' known for leaking OpenAI information, reported on a new model named 'anonymous-chatbot', which claims to be based on the GPT-4 architecture and fine-tuned for chat interactions.
  • The 'Strawberry' project, previously known as 'Q*' (Q-star), is rumored to be a significant advancement in AI, potentially enabling models to think ahead and plan, which is crucial for math, logic, and reasoning tasks.
  • The script mentions the rumored ability of 'Strawberry' to perform deep research and navigate the internet autonomously, which would be significant steps towards AGI (Artificial General Intelligence).
  • There is skepticism about the rumored capabilities of 'Strawberry,' with some suggesting that other labs, like Google's DeepMind, have already made strides in math reasoning, potentially reducing the advantage of OpenAI's new model.
  • 'Pliny the Prompter' managed to 'jailbreak' the 'anonymous-chatbot' model, showing that some people had already gained access to the models rumored to be connected to 'Strawberry'.
  • 'sus-column-r' is another model mentioned, which appears to have a sophisticated chain-of-thought process and correctly answered a logic puzzle about a marble and a glass.
  • The script also discusses the competitive landscape of AI development, noting that OpenAI needs to release a substantial update to maintain its position in the market.
  • There is anticipation and speculation about when 'Strawberry' will be officially announced, with some suggesting it could be imminent based on social media activity.
  • The video script concludes with the presenter's intention to run a full suite of tests on the new models to evaluate their reasoning and logic capabilities.

Q & A

  • What did Sam Altman tweet on August 7th that sparked rumors about a new AI model?

    -Sam Altman tweeted a picture of a garden with strawberries, which led to speculation about the next frontier model from OpenAI, often referred to by the community as 'Strawberry' or 'GPT-5'.

  • What is the significance of the models appearing anonymously on lmsys.org?

    -Dropping models anonymously on lmsys.org is a strategy OpenAI has used for previous iterations, which suggests the new models might be new versions of, or updates to, its existing models.

  • What is the role of Jimmy Apples in the AI community, and what did he discover about the new model?

    -Jimmy Apples is known as a notorious OpenAI leaker. He found that the new model, listed as 'anonymous-chatbot', claims to be based on the GPT-4 architecture, specifically fine-tuned for chat-based interactions.

  • What is the difference between 'Q*' and 'Project Strawberry' mentioned in the script?

    -Q* and Project Strawberry are the same project; OpenAI reportedly renamed Project Q* to Project Strawberry. It aims to give large language models the ability to think ahead and plan, which is considered a significant step towards AGI (Artificial General Intelligence).

  • What are some of the rumored capabilities of 'Project Strawberry'?

    -Rumored capabilities of Project Strawberry include the ability to generate answers, plan to navigate the internet autonomously, perform deep research, and engage in post-training fine-tuning to optimize performance.

  • What is the significance of the 'Chain of Thought' in AI models?

    -'Chain of Thought' refers to a prompting technique in which a model works through a problem step by step. According to the video, this lets models think more strategically, plan longer-term, and explain their reasoning in a way that leads to higher-quality outputs.

  • What does the acronym 'AGI' stand for, and why is it important in the context of Project Strawberry?

    -AGI stands for Artificial General Intelligence. It is important because Project Strawberry aims to advance towards AGI by improving reasoning, planning, and the ability to perform complex tasks.

  • What is the 'Arena Battle' mode on lmsys.org, and how does it relate to accessing new models?

    -The 'Arena Battle' mode on lmsys.org is a feature where users send a prompt to two unnamed AI models and vote on the responses. It is the only way to reach the new models, because the site reveals which models were used only after the user has voted.

  • What is the 'marble in a glass' logic problem, and why is it significant in testing AI models?

    -The 'marble in a glass' problem is a complex logic and reasoning test where an AI must explain the location of a marble after a series of actions. It is significant because it tests the AI's ability to understand and explain its reasoning process.

  • What is the correct answer to the 'marble in a glass' logic problem, and how did the AI models perform in the script?

    -The correct answer is that the marble would be on the table, outside of the microwave. In the script, most of the AI models got this wrong, with only 'sus-column-r' providing the correct reasoning and answer.
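
The reasoning behind that answer can be traced as a tiny state simulation. The sketch below is only an illustration and is not shown in the video: the function name `trace_marble`, the step strings, and the simplified rule that lifting an upside-down glass leaves its contents on the table are all assumptions made for this example.

```python
def trace_marble(steps):
    """Follow the marble through the puzzle's steps and return its final location."""
    marble = "outside the glass"
    for step in steps:
        if step == "put marble in glass":
            marble = "in the glass"
        elif step == "turn glass upside down on table":
            if marble == "in the glass":
                # The open end now faces the table, so the marble rests on
                # the table surface, enclosed by the glass walls.
                marble = "on the table, under the glass"
        elif step == "move glass to microwave":
            if marble == "in the glass":
                # An upright glass carries the marble along with it.
                marble = "in the glass, in the microwave"
            elif marble == "on the table, under the glass":
                # Lifting an upside-down glass leaves the marble behind.
                marble = "on the table"
        else:
            raise ValueError(f"unknown step: {step}")
    return marble


# The three actions described in the puzzle (labels are hypothetical).
PUZZLE = [
    "put marble in glass",
    "turn glass upside down on table",
    "move glass to microwave",
]

print(trace_marble(PUZZLE))  # prints "on the table"
```

The models that answered "inside the microwave" effectively skipped the middle step, which is the one that separates the marble from the glass.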

Outlines

00:00

Speculations on OpenAI's 'Strawberry' Model

The video discusses the buzz around Sam Altman's tweet showing a garden with strawberries, which the AI community has interpreted as a potential reference to OpenAI's next-generation model, possibly named 'Strawberry' or 'GPT-5'. The video examines the rumors and the appearance of two anonymous models on lmsys.org, a strategy OpenAI has used before to introduce new models. The narrator explains that although he could not find the models himself, there are credible reports of their existence. It also covers the report from 'Jimmy Apples', an OpenAI leaker, who shows that the new model claims to be based on the GPT-4 architecture and notes that it does not show significant improvements in reasoning, though it may be better at math. The video then turns to the rumored capabilities of 'Project Strawberry', which is believed to give large language models the ability to think ahead and plan, pushing towards Artificial General Intelligence (AGI).

05:00

Analyzing the New Models' Reasoning Abilities

This section focuses on the reasoning capabilities of the new models, comparing them with GPT-4. It highlights the importance of Chain of Thought for improving output quality in AI models. The video mentions a Twitter account, 'iruletheworldmo', which is considered a major hype man for the 'Strawberry' project and posts about it extensively. The narrator also discusses the difficulty of finding the new models on lmsys.org and shares a method to reach them through Arena Battle mode. The video includes two logic tests, the 'killers' question and the marble-in-a-glass problem, comparing responses from GPT-4 and the new 'sus-column-r' model, which takes a more step-by-step reasoning approach. The section concludes with the narrator's intention to run a full suite of tests and an invitation for viewers to share their thoughts on the potential release of 'Strawberry'.

Keywords

Sam Altman

Sam Altman is a prominent figure in the tech industry, known as the CEO of OpenAI. In the context of the video, he is discussed as a potential 'troll' due to a tweet that caused a stir in the AI community, suggesting a possible release of a new AI model, which is a central theme of the video.

GPT-5

GPT-5 refers to the speculated next-generation AI model from OpenAI, which is believed to be more advanced than its predecessors. The video discusses the hype and rumors surrounding GPT-5, indicating its potential capabilities and the community's anticipation for its release.

Strawberry

In the video, 'Strawberry' is used as a codename for a supposed new AI model from OpenAI, which is thought to be a successor to the current models. It is tied to the narrative of the video by being the subject of speculation and excitement among AI enthusiasts.

AI Twitter Sphere

This term refers to the community of individuals on Twitter who are interested in artificial intelligence. The video script mentions that this community 'went nuts' following a tweet by Sam Altman, highlighting the influence of such a community in spreading and discussing AI-related news.

lmsys.org

lmsys.org is mentioned as a platform where AI models are tested anonymously. The video discusses how two new models appeared on this site, sparking further speculation about the release of a new AI model, which is a key point in the video's exploration of AI advancements.
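
The anonymous testing described here can be pictured as a blind pairwise battle: two models answer under neutral labels, and their identities are shown only after a vote. The sketch below is a rough illustration of that flow under stated assumptions. The function names, the model list, and the data layout are hypothetical and do not reflect how the site is actually implemented.

```python
import random


def start_battle(models, rng):
    """Pick two distinct models and hide them behind neutral labels."""
    first, second = rng.sample(models, 2)
    return {"Model A": first, "Model B": second}


def vote_and_reveal(battle, winner_label):
    """Record a vote, then reveal which model sat behind each label."""
    if winner_label not in battle:
        raise ValueError(f"unknown label: {winner_label}")
    return {"winner": battle[winner_label], "revealed": dict(battle)}


# Hypothetical model pool; the last two names are the ones mentioned in the video.
MODELS = ["model-x", "model-y", "anonymous-chatbot", "sus-column-r"]

battle = start_battle(MODELS, random.Random(0))
# Before voting, a user would see only the labels "Model A" and "Model B".
result = vote_and_reveal(battle, "Model B")
print(result["revealed"])
```

Because the pairing is random, reaching one specific model this way means repeating battles until it happens to be drawn, which matches the narrator's description of trying a bunch of prompts.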

Jimmy Apples

Jimmy Apples is referred to as a 'notorious OpenAI leaker' in the script, indicating that he is known for sharing insider information about OpenAI's developments. His comments on a new model's capabilities are used in the video to provide insight into the potential features of the rumored AI model.

Q*

Q* (Q-star) is the earlier name of the OpenAI project that has reportedly been renamed 'Strawberry'. The video describes it as an effort to give large language models the ability to think ahead and plan, which would make them better at math, logic, and reasoning.

AGI

AGI stands for 'Artificial General Intelligence,' which is the idea of machines having the ability to understand, learn, and apply knowledge across a wide range of tasks at a level equal to or beyond that of a human. The video discusses how the rumored features of 'Strawberry' could be a step towards achieving AGI.

Fine-tuning

In the context of AI, fine-tuning refers to the process of further training a model on a specific task after its initial training phase. The video mentions that 'Strawberry' might involve post-training fine-tuning, which is a key feature that differentiates it from previous models.

sus-column-r

sus-column-r is the name of one of the anonymous models that appeared in the Arena. The video uses this model's performance on certain logic puzzles to illustrate possible advances in AI reasoning and problem-solving.

Chain of Thought

The term 'Chain of Thought' in the video refers to a method or technique that allows AI models to think more strategically and explain their reasoning in a step-by-step manner. It is highlighted as a potential feature of the new AI model 'Strawberry' and is used to illustrate the improvements in AI's ability to provide logical and reasoned responses.
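
As a minimal sketch of the idea, the snippet below contrasts a direct prompt with a step-by-step prompt for the "which is larger, 9.11 or 9.9" question from the video, and checks the expected answer with exact decimal arithmetic. The helper names and prompt wording are assumptions made for illustration; no model is called, and this is not how any particular model implements chain of thought internally.

```python
from decimal import Decimal


def direct_prompt(question: str) -> str:
    """Ask for the answer only, with no visible reasoning."""
    return f"{question}\nAnswer with the final result only."


def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to reason step by step before answering."""
    return f"{question}\nExplain your reasoning step by step, then state the final answer."


def larger(a: str, b: str) -> str:
    """Return whichever decimal string is numerically larger."""
    return a if Decimal(a) > Decimal(b) else b


question = "Which is larger, 9.11 or 9.9?"
print(direct_prompt(question))
print(chain_of_thought_prompt(question))
print(larger("9.11", "9.9"))  # prints "9.9"
```

The step-by-step wording mirrors the narrator's own marble prompt, which ends with "explain your reasoning step by step".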

Highlights

Sam Altman's tweet about summer in the garden sparks speculation about the release of 'GPT-5 Strawberry', the next big AI model from OpenAI.

Two anonymous models appear on lmsys.org, indicating a potential new release from OpenAI.

Jimmy Apples, known for leaking OpenAI information, interacts with a new model named 'anonymous-chatbot'.

The new model claims to be based on the GPT-4 architecture, fine-tuned for chat interactions.

Jimmy Apples notes no significant reasoning improvements but mentions potential advancements in math capabilities.

Haider (@slow_developer) provides an in-depth breakdown of 'Project Strawberry', suggesting it could bring planning and reasoning abilities to large language models.

Project Strawberry is believed to enable AI models to autonomously navigate the internet and perform deep research.

The concept of continuous fine-tuning and learning in AI models is presented as a significant advancement towards AGI.

Pliny the Prompter claims to have 'jailbroken' the 'anonymous-chatbot' model, showcasing its capabilities.

Bindu Reddy from Abacus AI suggests that other labs, including Google, have made progress in math reasoning, potentially reducing the advantage of 'Strawberry'.

A model named 'sus-column-r' demonstrates an advanced chain of thought in its responses to logical reasoning questions.

The term 'chain of thought' is discussed as a method to improve AI reasoning and strategic planning.

The Twitter account 'iruletheworldmo' is highlighted as a significant hype generator for 'Project Strawberry'.

Sean Ralston provides a method to access the new models on lmsys.org through the Arena Battle mode.

The new model 'sus-column-r' correctly answers a complex logic problem about a marble in a glass and a microwave.

The video concludes with a teaser for a full suite of tests on the new models and a call to action for viewer engagement.

Transcripts

play00:00

Sam Altman is either an enormous troll or

play00:02

GPT-5 Strawberry is right around the

play00:05

corner let's break down all of the

play00:07

rumors and the hype that have been

play00:09

really building over the last couple

play00:11

days so just today as of recording this

play00:14

video on August 7th, 8:29 a.m. Pacific, Sam

play00:18

Altman tweets out I love summer in the

play00:20

garden what a troll he took a picture of

play00:23

a garden with strawberries and if you're

play00:25

not familiar, Strawberry or Q* or

play00:28

whatever you want to call it is what

play00:30

everybody thinks is the next big version

play00:33

the next frontier model from OpenAI and

play00:36

of course after this tweet the AI

play00:38

Twitter sphere went nuts and everybody

play00:41

started commenting on it but that's not

play00:43

it there were actually two Anonymous

play00:46

models that just appeared in lmsys.org

play00:50

this is the same strategy that OpenAI

play00:52

used for their previous iterations of

play00:54

models just anonymously dropping the

play00:56

models in lmsys.org now I went there

play00:59

this morning and I could not find either

play01:01

of these models but there's been enough

play01:03

reports throughout the internet people

play01:04

that I hopefully can trust that have

play01:07

showed these models in action so here is

play01:09

Jimmy Apples the notorious OpenAI

play01:13

leaker: new model in lmsys Arena Battle

play01:16

only as we can see here the model name

play01:19

is anonymous-chatbot now for other

play01:21

people it's showing up as a different

play01:23

name which means it might actually be a

play01:25

completely different model we'll get to

play01:26

that so here he asked what model are you

play01:28

I'm based on OpenAI's GPT-4 architecture

play01:31

specifically you're interacting with a

play01:33

version of GPT-4 that has been fine-tuned

play01:34

for chat based interactions blah blah

play01:37

blah so not much but it is saying it is

play01:39

GPT-4 architecture but who knows if

play01:42

that's true or false Jimmy Apples goes

play01:44

on to say from some very rough and

play01:46

limited personal testing I'm not seeing

play01:47

any reasoning improvements but I've seen

play01:49

some in math maybe someone with better

play01:51

personal evals on math can test it I

play01:53

wish I could test it I cannot find it

play01:55

anywhere in lmsys.org then we have Haider

play01:58

@slow_developer who broke down

play02:00

everything we know about Q* Strawberry

play02:02

so let me just quickly go over this so

play02:04

it's confirmed OpenAI is close to

play02:05

announcing its next frontier model

play02:07

possibly GPT-5 OpenAI has renamed

play02:10

Project Q* to Project Strawberry and

play02:12

for those of you who are asking what is

play02:14

Project Strawberry what is Q* I've made

play02:17

multiple videos about them in the past

play02:18

the gist is it is finally giving large

play02:22

language models the ability to think

play02:23

ahead to plan which allows them to get

play02:26

better at math to get better at logic

play02:28

and reasoning and really is an enormous

play02:31

unlock towards AGI if true so there's

play02:34

been a bunch of rumors about Q* about

play02:36

strawberry and here are just a few of

play02:39

what people think it might be capable of

play02:41

it will generate answers but also plan

play02:43

enough to navigate the internet

play02:45

autonomously and reliably to perform

play02:47

deep research deep research planning

play02:49

actually being able to think through a

play02:51

prompt rather than just immediately

play02:53

responding with whatever it is trained

play02:54

on it involves a specialized way of

play02:57

processing an AI model after it has been

play02:59

pre-trained on large data sets

play03:01

so typically how it works is a model is

play03:03

initially trained and then it's kind of

play03:04

Frozen in time until it's fine-tuned

play03:06

later but this idea that it can be

play03:08

consistently fine-tuned and consistently

play03:11

learning rather than just a knowledge

play03:13

base frozen in time is an incredible

play03:15

and elusive idea in the world of AI so

play03:18

some key points this reasoning is key to

play03:21

AGI and ASI OpenAI wants models to

play03:24

browse the web with the assistance of

play03:26

computer-using agents and take actions

play03:28

based on their findings they want

play03:30

Strawberry to perform long-horizon tasks

play03:32

to perform a series of actions over an

play03:34

extended period and this is something

play03:36

Sam Altman has talked about in previous

play03:38

interviews it will engage in

play03:39

post-training fine-tuning that optimizes

play03:41

performance after the regular training

play03:43

phase and of course Pliny the Prompter

play03:45

got his or her hands on this model so

play03:49

Model A anonymous-chatbot and was

play03:51

already able to jailbreak it Pliny the

play03:53

Prompter is ruthless however Bindu Reddy

play03:56

from Abacus AI has a slightly different

play03:59

take on it but yes this is in reference

play04:01

to Project Strawberry Q* the reasoning

play04:03

project OpenAI has been rumored to be

play04:05

working on the problem however is that

play04:07

several other labs including Google have

play04:09

cracked a bunch of techniques around

play04:10

math reasoning and synthetic data now

play04:13

what she is specifically referring to is

play04:15

just about a week ago DeepMind's model, a

play04:17

company that is owned by Google, was able

play04:20

to absolutely dominate the math

play04:22

olympiads so basically the whole math

play04:24

reasoning thing is nearly solved so she

play04:26

goes on to say it's unlikely that

play04:28

strawberry is going to give them much

play04:29

advantage over Opus 3.5 or Gemini 2.0

play04:32

now here's that other model I was

play04:34

telling you about this is a screenshot

play04:36

from a DTS Singh let's take a look the

play04:38

model is sus-column-r what a name so

play04:42

here's one of the questions that has

play04:44

been going around the internet I've been

play04:45

including it in my LLM tests and the

play04:47

question is which is larger 9.11 or 9.9

play04:51

and not only did it give the answer but

play04:53

it also gave the reasoning as to how it

play04:56

arrived at the answer and it did give

play04:58

the correct answer but a lot of models

play05:00

have been struggling with this very

play05:02

simple prompt and the DT says sus-column-r

play05:04

seems to have insane CoT (chain of

play05:07

thought) built in maybe Q* Chubby also

play05:11

somebody who is a great follow on

play05:12

Twitter says why Chain of Thought and

play05:14

not tree of thought so they're really

play05:16

talking about algorithms or really just

play05:18

prompting techniques to allow models to

play05:20

think more strategically to think more

play05:22

longterm to plan and to really explain

play05:25

their reasoning in a way that allows

play05:27

them to have much better quality outputs

play05:29

and I can't end this video about

play05:31

Strawberry without talking about

play05:34

iruletheworldmo and this is a newish

play05:37

Twitter account at least new to me and

play05:39

is possibly the biggest troll the

play05:41

biggest hype man for Strawberry Q*

play05:43

there possibly is I don't know who he is

play05:46

I think he's actually just an ALT of

play05:47

chubby but maybe not maybe he's an

play05:49

insider at OpenAI who do you think he

play05:51

is by the way drop your comments in the

play05:53

description below and maybe he'll show

play05:55

up in the comments and reveal himself

play05:56

but he already has 8,800-plus posts

play06:00

which is just insane to think about and

play06:02

look at some of these posts choo choo

play06:04

project strawberry and on and on and on

play06:07

all about project strawberry all about

play06:10

hyping it up and we'll see if it's

play06:12

actually true if project strawberry is

play06:14

coming tonight which iruletheworldmo

play06:16

says it is or if it's coming soon you

play06:18

know OpenAI is at the point where they

play06:20

really have to drop something

play06:21

substantial very soon because Llama 3.1

play06:25

405B took a lot of the wind out of OpenAI's

play06:28

sails well it turns out right after I

play06:31

finished recording Sean Ralston actually

play06:33

told me how to get access to these

play06:34

models there is a reason I couldn't find

play06:36

them they only show up in the arena mode

play06:39

battle mode of lmsys.org and it doesn't

play06:42

tell you which model is actually being

play06:44

used until after you vote on it so the

play06:47

only way to get access to it is to

play06:48

basically try a bunch of prompts in the

play06:50

Arena Battle and then hopefully you get

play06:53

the new model and I did check this out

play06:55

so I asked the killers question there

play06:57

are three killers in the room someone

play06:58

enters the room and kills one of them

play06:59

nobody leaves the room how many killers

play07:00

are left in the room and we have two

play07:03

models

play07:04

GPT-4-0613 and here it is sus-column-r now

play07:09

let's look at GPT-4 first let's break it

play07:12

down initially there are three killers

play07:13

in the room someone else enters the room

play07:15

and kills one of the existing three

play07:16

killers nobody leaves the room here's a

play07:18

critical part of the answer the person

play07:19

who entered the room and killed one of

play07:20

the killers becomes a killer so there

play07:22

are still three killers in the room okay

play07:24

great now let's look at this new model

play07:26

so initial situation action result of

play07:29

the action additional consideration

play07:31

final count so it's definitely breaking

play07:33

it down in a much more step-by-step

play07:36

approach so we get therefore there are

play07:38

two original killers plus one new killer

play07:41

equals three killers and it actually

play07:42

gives us both answers three killers if

play07:44

you include the person who committed the

play07:46

recent killing and two killers if you

play07:47

only count the original ones now still

play07:51

it doesn't give any note about the

play07:53

killer who is now dead and it probably

play07:55

should have but let's give it one more

play07:57

now here's the hardest logic and

play07:59

reasoning problem I have in my LLM tests

play08:01

a marble is put in a glass the glass is

play08:04

then turned upside down and put on a

play08:06

table then the glass is picked up and

play08:08

put in a microwave where is the marble

play08:10

explain your reasoning step by step so

play08:12

we don't actually know which models are

play08:13

generating these responses yet so let's

play08:16

look at model A the marble is inside the

play08:18

glass which is inside the microwave that

play08:19

is dead wrong however let's look at

play08:22

model B step one the marble's put in the

play08:24

glass the glass is then turned upside

play08:26

down the glass is then picked up the

play08:28

marble still remains inside the

play08:29

glass unaffected by the change in

play08:31

location and they both say it is in the

play08:33

microwave so both are bad and yep

play08:36

neither of these are the new models

play08:37

let's try it again so I did it again and

play08:39

we have a new model this is anonymous-chatbot

play08:41

what everybody is also saying is

play08:44

potentially the next version of GPT

play08:46

however it does say the marble is at the

play08:48

bottom of the glass when the glass is

play08:50

placed in the microwave which is wrong

play08:52

that is just wrong so it did get this

play08:54

wrong and I want to thank again Sean

play08:56

Ralston for sending me this rather than

play08:58

just testing it a bunch of times and

play08:59

trying to get the right model we have

play09:01

the marble test from sus-column-r let's

play09:03

see what it says initially the marble's

play09:05

placed inside a normal cup the cup is

play09:07

then turned upside down over the marble

play09:09

due to gravity the marble would fall to

play09:11

the bottom of the inside of the cup

play09:12

positioning under the cup when the cup

play09:14

is upside down the marble would rest on

play09:15

the inner surface of the Cup's bottom

play09:16

due to gravity yes but it's confined by

play09:19

the Cup's walls and the table surface

play09:21

yes moving to the microwave the cup is

play09:23

lifted the marble will no longer be

play09:24

supported by the Cup's interior and will

play09:25

fall due to gravity it will now be lying

play09:27

on the table where the cup was yes yes

play09:29

the marble would now be located on the

play09:31

table outside of the microwave that is

play09:34

correct now of course I plan on doing

play09:36

the full suite of tests but I just wanted

play09:38

to show you a couple of the really hard

play09:39

reasoning tests before I actually do the

play09:41

full test in a separate video let me

play09:43

know if you think strawberry is coming

play09:44

soon is Sam Altman trolling we'll see if

play09:47

you enjoyed this video please consider

play09:48

giving a like and subscribe and I'll see

play09:50

you in the next one

Related Tags
AI Updates, OpenAI, GPT-5, Strawberry, Reasoning, AI Models, Tech Rumors, Future Tech, Innovation, AI Testing