Has OpenAI Secretly Released GPT 4.5? (Writing Test)
TLDR: In this video, Jason, a novelist and AI writing expert, discusses the sudden appearance of a new chatbot on the LMSYS platform, which is speculated to be an updated version of the GPT models, potentially GPT 4.5 or even GPT 5. The chatbot, labeled gpt2-chatbot, has demonstrated significantly improved reasoning and math skills, leading to widespread speculation about its true identity. Jason shares his experience testing the chatbot on writing-related tasks, noting that it outperformed GPT 4 in many respects. He provides a step-by-step guide on how viewers can access and test the chatbot themselves, either through direct chat on LMSYS or by using the Arena Battle feature. Jason also shares a document containing prompts and responses from the chatbot, highlighting its ability to generate detailed and specific story outlines and prose, which he found to be of higher quality than those from other models. He concludes by encouraging viewers to share their experiences and thoughts on the chatbot's performance.
Takeaways
- 🤖 A new chatbot, labeled gpt2-chatbot, has mysteriously appeared on LMSYS, a platform used for comparing language models.
- 🧐 This chatbot has shown significantly improved reasoning and math skills, leading to speculation that it might actually be GPT 4.5 or even GPT 5.
- 🔥 Sam Altman, known for his work with OpenAI, has tweeted positively about GPT-2, fueling the speculation about the new model's identity.
- 📈 The original GPT-2 is considered an older, less capable model, and has been largely superseded by GPT 3.5.
- 🚀 The new GPT-2 model demonstrated superior performance in writing-related activities, providing more depth and consistency in its responses.
- 💻 To access the new model, users can visit chat.lmsys.org and select 'gpt2-chatbot' under the direct chat section.
- 🕹️ An alternative way to access the model is through Arena Battle on the same website, which allows for blind testing of different models.
- 📝 The chatbot was tested with various writing prompts, including brainstorming ideas for a Sci-Fi Beach Romance and creating an outline based on the 'Save the Cat' beats.
- 📚 The responses from the chatbot were detailed and showed a better understanding of story structure and character development compared to other models.
- ✍️ When tasked with writing the first 500 words of a scene, the chatbot provided a narrative with a strong sense of setting and character perspective, although it was somewhat verbose.
- 🔍 The chatbot's performance in writing prose was found to be slightly better in terms of showing versus telling and understanding conflict depth, compared to GPT-4.
- 🔄 The final assessment of the chatbot's capabilities will have to wait until it is fully released, allowing for more comprehensive testing and comparisons.
Q & A
What is the topic of discussion in the video?
-The video discusses the possibility that a new chatbot, labeled as GPT2, might be an updated version of the GPT models, possibly GPT 4.5 or even GPT 5, and its capabilities in writing-related tasks.
What is LMSYS used for?
-LMSYS is a platform primarily used to compare different language models against each other, to see in an objective manner which is better at specific tasks.
Why is there speculation that the new GPT2 chatbot might be GPT 4.5?
-The speculation arises because the new GPT2 chatbot has demonstrated significantly better reasoning and math skills compared to the original GPT2, leading people to believe it could be an updated version.
What did Sam Altman's tweet about GPT2 suggest to people?
-Sam Altman's tweet, in which he mentioned having a soft spot for GPT2, fueled speculation that the new GPT2 chatbot might be an indication of something more, possibly an updated version of the model.
How can one access the new GPT2 chatbot for testing?
-To access the new GPT2 chatbot, one can visit the website chat.lmsys.org, go to direct chat, and select gpt2-chatbot from the list of models. Alternatively, one can use Arena Battle to blind test different models, including the new GPT2.
What was the first writing-related task the video presenter used to test the new GPT2 chatbot?
-The first writing-related task was a brainstorming prompt to generate 10 ideas for a Sci-Fi Beach Romance.
How did the new GPT2 chatbot perform in the brainstorming task?
-The new GPT2 chatbot performed well, providing ideas with inherent conflict and depth, which were more consistent and story-like compared to typical outputs from other models.
What was the presenter's chosen idea from the brainstorming session?
-The presenter chose the idea titled 'Sand Castles of Time', where a couple discovers a beach where building sand castles can alter reality, transporting them to different historical epochs and parallel universes.
How did the new GPT2 chatbot handle the outline prompt based on the 'Sand Castles of Time' idea?
-The chatbot provided a detailed outline following Blake Snyder's 'Save the Cat' beats, with specific scenes and character interactions that were more concrete and less generic than other models.
What was the task given to the new GPT2 chatbot in the first scene writing prompt?
-The task was to write the first 500 words of the first scene in a Sci-Fi Beach romance book, focusing on the protagonist Lily's point of view in the first person, with an emphasis on showing rather than telling.
What were the presenter's observations about the depth and quality of the prose generated by the new GPT2 chatbot?
-The presenter found that the prose generated by the new GPT2 chatbot had a better grasp of the depth of conflict and story, with a more intuitive understanding of what makes a good scene, including a balance of showing versus telling.
What is the general consensus on the new GPT2 chatbot's performance in writing tasks compared to GPT 4?
-The new GPT2 chatbot, suspected to be GPT 4.5, showed improvements in reasoning, math skills, and writing tasks, with more depth and specificity in its responses compared to GPT 4.
Outlines
🤖 Introduction to the New GPT Model
The video introduces a chatbot that has mysteriously appeared and is speculated to be an updated version of the GPT models, possibly GPT 4.5 or GPT 5. The host, Jason, is a novelist who teaches writers to combine AI with writing principles. He discusses LMSYS, a platform used for comparing language models. The new 'gpt2' chatbot has shown better reasoning and math skills, leading to speculation that it could be GPT 4.5. The video also mentions Sam Altman's tweet, which has fueled further speculation. Jason shares his experience trying to access the model on the LMSYS website and suggests Arena Battle as an alternative method with a better chance of reaching the model. He also shares his findings from testing the model on writing-related activities.
📚 Creative Writing Prompts and Outlines
Jason provides a detailed account of using the new chatbot for creative writing tasks. He starts with a brainstorming prompt for a Sci-Fi Beach Romance and finds that the chatbot's responses are more consistent and contain inherent conflict, making them story-like rather than generic. He selects one idea, 'Sand Castles of Time,' and asks the chatbot to expand it into a full outline using Blake Snyder's 'Save the Cat' beats. The outline provided by the chatbot is detailed and specific, showing a good understanding of story structure. Jason then asks the chatbot to write the first 500 words of the first scene from the outline, focusing on the protagonist's point of view. The resulting text is rich in detail and emotional depth, although it contains some AI-isms. Jason concludes that while the prose quality is not significantly better than GPT 4, the chatbot demonstrates a better grasp of story and conflict.
🌐 Testing and Future Prospects
The video concludes with Jason's reflections on testing the new chatbot and its potential. He notes that the chatbot's responses often had more weight and a better sense of conflict compared to other models. He acknowledges that the chatbot's performance could improve with better prompt adjustments. However, he emphasizes the need to wait for the full release of the model to conduct thorough tests. Jason invites viewers to share their thoughts and experiences with the new chatbot in the comments section and expresses hope that the video was informative.
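The three-step writing test described in these outlines (brainstorm, outline, first scene) can be sketched as a small prompt chain. This is only a minimal sketch under stated assumptions: `run_writing_test` and `ask` are hypothetical names, `ask` stands in for whatever chat interface is being tested, and the prompt wording is illustrative rather than the exact prompts used in the video.

```python
from typing import Callable, Dict


def run_writing_test(
    ask: Callable[[str], str],  # hypothetical: sends one prompt, returns the reply
    genre: str,
    chosen_idea: str,
) -> Dict[str, str]:
    """Run the brainstorm -> outline -> first-scene sequence against one model.

    In the video the presenter picks an idea by hand after reading the
    brainstorm; here the chosen idea is passed in up front to keep the
    sketch simple.
    """
    # Step 1: brainstorm ideas (illustrative wording, not the video's prompt).
    ideas = ask(f"Brainstorm 10 ideas for a {genre}.")

    # Step 2: expand the chosen idea into a beat-by-beat outline.
    outline = ask(
        f"Expand the idea '{chosen_idea}' into a full outline "
        "using Blake Snyder's Save the Cat beats."
    )

    # Step 3: draft the opening scene, feeding the outline back in.
    scene = ask(
        "Using this outline, write the first 500 words of the first scene, "
        "in first person from the protagonist's point of view. "
        "Show rather than tell.\n\n" + outline
    )

    return {"ideas": ideas, "outline": outline, "scene": scene}
```

Running the same function against two different `ask` callables gives responses to identical prompts, which is what makes a side-by-side comparison of models meaningful.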
Keywords
GPT 4.5
LMSYS
Chatbot
Arena Battle
Direct Chat
Save the Cat
Show vs Tell
Conflict
Prose
Sand Castles of Time
Highlights
A new chatbot has appeared, possibly an updated version of the GPT models, speculated to be GPT 4.5 or GPT 5.
The platform LMSYS is used for comparing language models to determine which is better at specific tasks.
The new GPT2 chatbot has demonstrated improved reasoning and math skills, outperforming GPT 4 in several benchmarks.
Sam Altman's tweet about having a soft spot for GPT2 has fueled speculation about the new model's identity.
The original GPT2 is considered an older, less capable model not in widespread use.
The new GPT2 chatbot is superior to GPT 4 in many ways, including writing-related activities.
The website chat.lmsys.org allows users to test different models, including the new GPT2 chatbot.
Due to high demand, the site was overwhelmed, making it difficult to get responses from the new model.
Arena Battle is an alternative method to access and compare the new GPT2 model against others.
The Llama 3 70B-parameter model performed well, often chosen over GPT 4 in various tests.
The new GPT2 provided detailed and imaginative story prompts for a Sci-Fi Beach Romance.
The model's responses were more consistent and contained inherent conflict, indicating a better grasp of storytelling.
An outline generated from the GPT2 model using Blake Snyder's 'Save the Cat' beats was detailed and specific.
The GPT2 model's prose writing delved deeper into the protagonist's point of view, showing more than telling.
The model demonstrated an understanding of the depth of conflict and story, providing weight to the narrative.
Despite some issues, the GPT2 model showed potential for improvement with refined prompts.
The full capabilities of the model will not be known until it is fully released for testing.
The video invites viewers to share their experiences and findings with the new GPT2 model.