GPT2 implemented in Excel (Spreadsheets-are-all-you-need) at AI Tinkerers Seattle

Spreadsheets are all you need
26 Oct 2023 · 10:13

Summary

TL;DR: In this engaging presentation, Ishan showcases a personal project that makes the inner workings of a large language model tangible. He opens with a short Python script running GPT-2, then introduces a spreadsheet that implements the entire Transformer architecture in nothing but spreadsheet functions, letting users interact with and understand the mechanics of models like GPT-2 without writing any code. The project, which he describes as both a teaching tool and a fascinating exploration of AI's inner workings, provides insights into the attention mechanism and the Chain of Thought prompting technique. Ishan also shares the challenges of implementing such a model in a spreadsheet and offers resources for further exploration.

Takeaways

  • 👨‍💻 The speaker, Ishan, works at an edge compute platform that powers 4% of the internet and is open to discussing AI and edge projects.
  • 🚀 Ishan has a personal side project called 'Spreadsheets Are All You Need', demonstrating that a spreadsheet can run a real AI model.
  • 🐍 The opening demo is roughly 40-50 lines of Python using Hugging Face Transformers to run GPT-2 small on a very short prompt at zero temperature (a minimal sketch of such a script follows this list).
  • 📊 Ishan has created a spreadsheet that implements GPT-2 without any API calls or Python, using only spreadsheet functions.
  • ⏱️ Recalculating the spreadsheet is resource-intensive and takes about a minute, and Ishan warns against running it on a Mac due to UI lockups.
  • 🛠️ The spreadsheet serves as a teaching tool, analogous to the way computer architecture courses ground programmers in how systems are built.
  • 🎥 Ishan is creating a series of videos that walk through every step of the GPT-2 implementation, focusing on the inference pass.
  • 🔍 The spreadsheet allows for hands-on exploration of the Transformer model, including the attention mechanism and the Chain of Thought prompting technique.
  • 🤖 The project provides insight into why the original Transformer paper is titled 'Attention Is All You Need': in the spreadsheet, tokens interact only once per layer, at the attention mechanism.
  • 📈 The 'weights' tab of the spreadsheet contains a massive amount of data, including all 124 million parameters of GPT-2.
  • 🚫 The project has limitations, such as a maximum of 10 tokens; expanding it would require manually rearranging the weight matrices, since the sheet was not generated programmatically.
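
As a companion to the takeaways, here is a minimal sketch of what a 40-50 line demo script like the one described might reduce to. It assumes the Hugging Face `transformers` library; the talk's actual script is not reproduced in this summary.

```python
# A minimal sketch of the opening demo: GPT-2 small via Hugging Face
# Transformers, decoded greedily ("zero temperature"). The talk's actual
# script is not shown, so the details here are assumptions.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # "gpt2" = GPT-2 small
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Mike is quick. He moves", return_tensors="pt")

# do_sample=False makes generation deterministic, the effect of temperature 0.
output = model.generate(**inputs, max_new_tokens=1, do_sample=False)
print(tokenizer.decode(output[0]))  # expected continuation: "quickly"
```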

Q & A

  • What is Ishan's day job?

    -Ishan works at an edge compute platform that powers 4% of the internet.

  • What is the purpose of Ishan's personal side project?

    -Ishan's side project, 'Spreadsheets Are All You Need', aims to demonstrate and teach how Transformer language models work, using a spreadsheet that implements GPT-2 without any API calls or Python code.

  • How does Ishan's spreadsheet project work?

    -The spreadsheet project uses Excel functions to implement GPT-2, allowing users to input prompts and receive outputs without writing any code. It's designed to be an educational tool for understanding the Transformer model and its mechanisms.

  • What are the benefits of using a spreadsheet to teach about GPT-2?

    -Using a spreadsheet to teach about GPT-2 provides a more approachable and tangible way for both non-developers and developers to understand the model's architecture and functionality. It allows users to visually track the information flow and make changes to see their effects, albeit with a roughly one-minute recalculation each time.

  • What is the 'Chain of Thought' prompting technique mentioned in the script?

    -Chain of Thought prompting is a technique where the AI is given more context or steps to reason through a problem. It helps the AI provide more detailed and accurate responses by simulating a step-by-step reasoning process similar to human thought.

  • What issues did Ishan encounter while implementing GPT-2 in a spreadsheet?

    -Ishan faced challenges such as the spreadsheet's large size causing the Mac UI to lock up randomly, and the complexity of implementing byte pair encoding within Excel, which relies on string concatenation rather than matrix multiplication.

  • How can one access Ishan's spreadsheet project?

    -The spreadsheet project can be accessed via the 'Spreadsheets Are All You Need' website, where users can watch the walkthrough videos, download the spreadsheet, and report bugs or ask questions.

  • What is the significance of the attention mechanism in Transformers?

    -The attention mechanism in Transformers allows the model to focus on different parts of the input sequence when generating each output element. It helps the model to handle long-range dependencies and understand the context better, which is crucial for tasks like text understanding and generation.

  • How does the spreadsheet demonstrate the 'attention is all you need' concept?

    -The spreadsheet visually shows how the attention mechanism works by allowing users to see how tokens interact with each other at each layer. It provides a clear demonstration of how the model attends to different parts of the input, which is a core concept in Transformer models.

  • What was Ishan's experience with running GPT-2 from source?

    -Ishan found running GPT-2 from source to be a challenging experience: the code is written in TensorFlow 1.x, which makes setting up a working environment difficult.

  • What is the size of the spreadsheet in terms of parameters and file size?

    -The spreadsheet contains all 124 million parameters of GPT-2 and is 1.5 GB in size in Excel binary format.

Outlines

00:00

🚀 Introduction to AI and Edge Computing

The speaker, Ishan, introduces himself and his work at an edge compute platform that powers 4% of the internet, inviting those with AI and edge-related projects to connect. The main focus of the talk, however, is a personal project called 'Spreadsheets Are All You Need'. He opens with a short Python script running GPT-2 small on the prompt 'Mike is quick. He moves' at zero temperature, chosen because its completion, 'quickly', is obvious and easy to verify.

05:02

📊 Spreadsheet Implementation of GPT-2

Ishan demonstrates a spreadsheet that implements GPT-2 without any API calls or Python code, using only spreadsheet functions. He explains the process of running the model within Excel, including the need to trigger recalculation manually because a full pass takes about a minute. He warns against running it on a Mac, where threading issues cause the UI to lock up, and positions the tool as an educational resource, akin to a computer architecture course, for understanding the underlying mechanisms of Transformers and LLMs. He also discusses the benefits of this approach: it is approachable for non-developers, and it offers developers practical insights into concepts like the attention mechanism.

10:09

🔍 Deep Dive into GPT-2's Architecture in Spreadsheet

Ishan continues to explore the intricacies of GPT-2's architecture within the spreadsheet. He walks through the various components, such as the attention mechanism, residual connections, and the multi-layer perceptron, explaining how they function and interact. He also uses the visible information flow to give a technical explanation for why Chain of Thought prompting is effective. The speaker shares the practical difficulties: running GPT-2 from source is painful, and the Excel implementation is limited to 10 tokens, with any expansion requiring manual rearrangement of the weight matrices.

Keywords

💡Edge Computing

Edge computing refers to the practice of processing data closer to where it is generated, rather than in a centralized data center or cloud. This can reduce latency and improve the speed of data processing. In the video, the speaker mentions working at an edge compute platform, highlighting the significance of edge computing in powering a substantial portion of the internet.

💡AI

Artificial Intelligence (AI) is the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the context of the video, AI is used in conjunction with edge computing to enhance various applications, and the speaker expresses interest in discussing how AI and edge can assist in projects.

💡Transformers

Transformers are a type of deep learning model used for natural language processing. They are known for their ability to handle sequences of data and are particularly effective for tasks such as translation, summarization, and text generation. In the video, the speaker works with GPT-2, a specific Transformer model, implemented inside a spreadsheet.

💡GPT-2

GPT-2, or Generative Pre-trained Transformer 2, is an open-source language model developed by OpenAI. It is known for its ability to generate human-like text based on a given prompt. The speaker in the video has implemented GPT-2 within a spreadsheet, showcasing the model's versatility and accessibility.

💡Spreadsheets

Spreadsheets are software applications used for organizing, analyzing, and storing data in tabular form. They are commonly used for tasks such as accounting, data analysis, and project management. In the video, the speaker has creatively used spreadsheets as a platform to implement and interact with complex AI models like GPT-2, demonstrating the potential for educational and analytical purposes.

💡Inference

In the context of machine learning and AI, inference refers to the process of using a trained model to make predictions or draw conclusions on new data. The speaker in the video has created a series of videos that walk through the inference process of GPT-2, explaining each step from input encoding to prediction.
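
To make that concrete, here is a hedged sketch of a single greedy inference step, mirroring the spreadsheet's final 'logits to predicted token' stage (the classes are from Hugging Face `transformers`; the spreadsheet itself, of course, uses no Python):

```python
# One greedy inference step: run the forward pass, take the logits for the
# last position, and argmax them into a predicted token id.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

ids = tokenizer("Mike is quick. He moves", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits            # shape: (batch, seq_len, vocab_size)
next_id = int(logits[0, -1].argmax())     # highest-scoring next token
print(tokenizer.decode([next_id]))
```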

💡Attention Mechanism

The attention mechanism is a crucial component of Transformer models, allowing the model to weigh the importance of different parts of the input data when generating a response. It helps the model focus on relevant information and is often likened to how humans pay attention to specific parts of a conversation or task. In the video, the speaker discusses the attention mechanism and its role in the Transformer architecture.
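
As a rough single-head illustration (a from-scratch sketch, not the spreadsheet's exact cell layout), causal self-attention reduces to a few matrix products plus a row-wise softmax. In the spreadsheet these appear as MMULT formulas and the 'triangle mask' named in the weights tab:

```python
# A from-scratch sketch of single-head causal self-attention.
# x is (seq_len, d_model); Wq, Wk, Wv are learned projection matrices.
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # token-to-token affinities
    mask = np.triu(np.ones_like(scores), k=1)      # 1s above the diagonal
    scores = np.where(mask == 1, -1e9, scores)     # "triangle mask": no peeking at future tokens
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)          # row-wise softmax = the attention matrix
    return w @ V                                   # each row: weighted blend of value vectors
```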

💡Chain of Thought Prompting

Chain of Thought prompting is a technique used in AI where the model is encouraged to 'think aloud' during the problem-solving process. This can lead to more logical and step-by-step responses, similar to how a human might reason through a problem. The speaker in the video relates this concept to the information flow observed in the spreadsheet implementation of GPT-2.
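
A small illustration of the idea, with prompts invented for this summary rather than taken from the talk:

```python
# Invented prompts for illustration (not from the talk). The chain-of-thought
# version feeds the model many more tokens before it must commit to an
# answer, which means more vectors to compute against and more passes
# through the attention mechanism.
direct_prompt = "Q: A pen costs 2 dollars. How much do 7 pens cost? A:"

cot_prompt = (
    "Q: A pen costs 2 dollars. How much do 7 pens cost?\n"
    "A: Let's think step by step. One pen costs 2 dollars, "
    "so 7 pens cost 7 times 2 dollars, which is"
)
```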

💡Positional Encoding

Positional encoding is a method used in Transformer models to incorporate the order of the input sequence into the model's processing. It is essential because the self-attention mechanism in Transformers does not inherently account for the order of the input elements. In the video, the speaker discusses the impact of positional encoding on the embeddings within the spreadsheet model.
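
In GPT-2 the positional embeddings are learned weights rather than fixed sinusoids, a point the speaker makes in the transcript. A sketch of the embedding step with random stand-in values, since the real matrices live in the spreadsheet's weights tab:

```python
# The embedding step with random stand-in weights; the real learned
# matrices (wte, wpe) occupy the spreadsheet's weights tab.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, n_ctx, d_model = 50257, 1024, 768    # GPT-2 small dimensions
wte = rng.normal(size=(vocab_size, d_model))     # learned token embeddings
wpe = rng.normal(size=(n_ctx, d_model))          # learned positional embeddings

token_ids = np.array([100, 200, 100])            # positions 0 and 2 repeat a token
h = wte[token_ids] + wpe[np.arange(len(token_ids))]

# As in the video's demo: rows 0 and 2 of wte[token_ids] are identical,
# but rows 0 and 2 of h differ, because their positions differ.
```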

💡Layer Normalization

Layer normalization is a technique used to improve the training of deep neural networks by normalizing the input to each layer's activation function. It helps in stabilizing the learning process and can lead to faster convergence during training. In the video, layer normalization is mentioned as part of the GPT-2 model's architecture, which the speaker has implemented in the spreadsheet.
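
A minimal sketch of the formula, with `g` and `b` standing for the learned gain and bias that each GPT-2 layer norm carries:

```python
import numpy as np

def layer_norm(x, g, b, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)           # per-row mean
    var = x.var(axis=-1, keepdims=True)           # per-row variance
    return g * (x - mu) / np.sqrt(var + eps) + b  # normalize, then rescale
```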

💡Residual Connections

Residual connections are a type of skip connection used in deep neural networks, where the output of one layer is added back to the input of the same layer or an earlier layer. This technique helps in mitigating the vanishing gradient problem and allows for the training of deeper networks. In the context of the video, the speaker discusses residual connections as part of the GPT-2 model's structure.
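
Tying the last few keywords together, each of GPT-2's 12 blocks follows a pre-layer-norm residual pattern. This is a structural sketch only; `attention`, `mlp`, and the `params` dict are hypothetical stand-ins for the sublayers and weights described above:

```python
# A structural sketch of one GPT-2 block (12 are stacked in total).
# `attention` and `mlp` are hypothetical stand-ins for the sublayers
# sketched under the keywords above; `params` holds the learned weights.
def transformer_block(x, params):
    # Residual connection around multi-head attention: the block's input
    # is added back to the sublayer's output (pre-layer-norm ordering).
    x = x + attention(layer_norm(x, *params["ln_1"]), params["attn"])
    # Residual connection around the two-layer MLP (multi-layer perceptron).
    x = x + mlp(layer_norm(x, *params["ln_2"]), params["mlp"])
    return x
```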

Highlights

Ishan works at an edge compute platform that powers 4% of the internet.

Ishan's personal side project is called 'Spreadsheets are all you need'.

The opening demo uses 40-50 lines of Python with Hugging Face Transformers and GPT-2 small at zero temperature.

Ishan demonstrates a spreadsheet that implements GPT-2 without any API calls or Python, using only spreadsheet functions.

The spreadsheet requires manual recalculation due to its computational intensity.

Running the spreadsheet on a Mac may cause the UI to lock up due to threading issues.

The project serves as a teaching tool for understanding Transformers and LLMs.

Ishan is creating videos that walk through every step of the GPT-2 implementation for the inference pass.

The spreadsheet allows for a hands-on, approachable understanding of AI models, even for non-developers.

Ishan explains the attention mechanism in Transformers and how information flows through the model.

Chain of Thought prompting is grounded technically: extra tokens give the model more vectors to compute against and more passes through the attention mechanism.

The spreadsheet contains the entire GPT-2 model, including all 124 million parameters.

The spreadsheet is 1.5 GB in the Excel binary format and is hosted as a GitHub release due to its file size.

Ishan shares his experience of running GPT-2 from source and the challenges faced.

The project aims to provide a visual and interactive learning experience for computer science concepts related to AI.

Ishan's project has the potential to demystify AI models and make them more accessible to a wider audience.

The spreadsheet includes detailed components such as token embeddings, positional embeddings, and attention values.

Ishan's project showcases the complexity and depth of AI models in a tangible and understandable way.

Transcripts

00:00

Hi everyone, I'm Ishan. My day job is actually working at an edge compute platform that powers 4% of the internet, so if you've got something that you think AI and edge can help out with, let me know; I'd like to talk to you. But today is a personal side project I call Spreadsheets Are All You Need. So let me just cut to the punchline here. Let's blow this up. Can you guys see that? Okay. This is about 40-50 lines of Python code, just Hugging Face Transformers running GPT-2 small, using a very short prompt at zero temperature. I'm going to put in a really simple prompt, once it decides to process... there we go: "Mike is quick. He moves". And I like this prompt because it's small and the completion is really obvious, right? What would you expect about Mike, knowing that he's quick? Well, that he moves quickly.

01:03

So now let me go over here. This is a spreadsheet that also implements all of GPT-2 small. No API calls, no Python, entirely in spreadsheet functions. Here's what we're going to do: we're going to push this button. You see where it is here? That's where your predicted token is going to come in. It's not doing it right now because it takes so long to run that I turned off automatic recalculation; there's a mode in Excel where you have to push a button to recalculate. So I'm going to hit this button, and then we should get "quickly" right here. You guys ready? Okay, here we go. Now you can see at the bottom, I don't know if you can see it way in the back, it's calculating, calculating, calculating. This is going to take about a minute, just to warn you.

02:00

And by the way, do not run this on a Mac. It is so big that the Mac UI will lock up on you for a minute at a time, randomly. There's no reason it should do that; somebody messed up the threading. (Audience: You're running on a Mac?) I am running on a Mac. After making this whole thing work over months, dealing with it randomly stalling on me for a minute at a time, I tried it on my wife's PC and it never locked up. So somebody messed up the implementation. And there you go: "quickly".

02:32

Okay, so: good God, why would anyone do this? Besides just being a masochist, a couple of reasons. Obviously you're not going to run production workloads on this; it's really a teaching tool. If you've had formal computer science training, there's a class usually called computer architecture or computer organization. They start with circuits, then logic gates, then an ALU, and then go all the way up to a microprocessor. Even if you're not going into chip design, it gives you a really good grounding when you're actually programming and building systems on top of those processors.

03:18

What I'm trying to do with the spreadsheet is the same thing, but for Transformers and LLMs. So I'm creating a bunch of videos (there are only two so far) where I walk through every step of the GPT-2 implementation, at least for the inference pass: from byte pair encoding all the way through the layer norms, the residual connections, the multi-headed attention, the multi-layer perceptron, and getting the logits up to the prediction, step by step.

03:45

And I think there are two benefits. One: as a non-developer, this is really approachable; there's something visceral about being able to play with it. But even as a developer I think it's useful. I'll give you one example. Personally, I now know why they called the paper "Attention Is All You Need". What else are you going to call it when you watch the information flow through the Transformer, especially in a spreadsheet? It's really striking: you type all these tokens in, and they only talk to each other once at each layer, at the attention mechanism. You can actually take the diffs: you create an additional tab in the spreadsheet and just see how the values change as you make changes, and it's amazing, there's only one place where the tokens talk to each other.

04:30

That seems a little theoretical, but a practical example would be Chain of Thought prompting, which you've probably heard of. We can think of it anthropomorphically ("as a person, if you asked me to think aloud, I might reason better"), but if you want a more technically satisfying theory, the one I subscribe to, which some people believe and which makes a lot of sense when you see the information flow, is this: what you're really doing is giving the model more space, more vectors to compute against, but you're also giving it more hits at that attention mechanism, more passes at it.

05:04

Okay, that's my one-minute timer, or even less; I'm at zero. So that's it, I'll wrap up. If you want to download the spreadsheet, go to the Spreadsheets Are All You Need site, where you can see the videos and download the spreadsheet, and let me know if you find bugs or have questions. Thank you.

05:22

(Audience: We want to see the weights tab!) Oh, you want to see the weights tab? Oh my God, okay. So first of all, you've got the prompt-to-tokens step, that's here; these are some random constants. And this is where I actually do byte pair encoding inside a spreadsheet. That's the hardest thing to do inside this thing, because it's not matrix multiplication, it's all string concatenation.
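
For comparison with the spreadsheet's concatenation-based approach, the same byte pair encoding step is a single call against the reference GPT-2 tokenizer (a sketch assuming Hugging Face `transformers`; the printed ids and tokens are indicative, not verified output):

```python
# The byte pair encoding step, done with the reference GPT-2 tokenizer;
# the spreadsheet rebuilds this from string functions.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
ids = tokenizer.encode("Mike is quick. He moves")
print(ids)                                    # one integer id per BPE token
print(tokenizer.convert_ids_to_tokens(ids))   # 'Ġ' marks a leading space
```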

05:47

But then you get here: this is where we convert the tokens to embeddings. So here are your text embeddings, and then here are your positional embeddings. One of the videos has a really great demo: you change one of the tokens so that it's duplicated between the top and the bottom, and you can see that the two are identical here, but they're not identical here, because the positional encoding has changed them.

06:04

Then here are the blocks; there are 12 of these, and this is block zero. Each of these has... well, there's a lot here. Oh, this is what happens when you're in Parallels: it randomly decides to start scrolling. (That's not it running, that's it scrolling. Welcome to running in a VM; that's also part of why it's slow.) I don't know if you can see the 16 here, but there are 16 steps you can follow along, one for each layer inside a block. So here's the residual connection, here are your attention values, here's the linear projection of those, there's the residual connection again, there's the layer normalization.

06:44

And look, I'll click on this; this is all spreadsheet. There's an MMULT right there that's massively long. This took a serious amount of time. And this is your attention matrix in here: "Mike is quick he moves". And then you can see it reference... oh no, that's not the multi-head part, this is the softmax, which is what I was going for. So "Mike is quick he moves", and this is one head, and you can see a token actually looking at another. This would be "Mike" here, and you can see "he" is looking at "Mike"; you see, that's a 0.73, so it's referencing it. But keep in mind this is one head, and there's a really good OpenAI paper where they actually ask GPT-4 to explain parts of GPT-2, and I think it'd be really interesting to put that into the spreadsheet and take a look.
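
The per-head softmax matrices Ishan scrolls through here can also be pulled from the reference implementation for cross-checking, assuming Hugging Face's `output_attentions` flag:

```python
# Pulling the per-layer, per-head attention matrices out of the reference
# model: the same triangular softmax grids shown in the spreadsheet.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tokenizer("Mike is quick. He moves", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_attentions=True)

# out.attentions is a tuple of 12 tensors, one per layer, each of shape
# (batch, heads=12, seq, seq). Row i shows how token i attends to tokens <= i.
print(out.attentions[0][0, 0])   # layer 0, head 0
```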

07:41

But anyway, you were asking about the weights. So here's your 11th block; one block feeds into the next, and this formula basically calculates, based on which block it is, which weight to use. Do you see these names right here? There is a version of this name for every single weight. So this is the layer norm, this is the predicted token, this is your IDs-to-tokens table, this is your triangle mask, right, for causal attention. This is the most [inaudible] thing I've ever... and this is your positional encoding, by the way. You know, one of the things I was doing was asking ChatGPT to tell me about the architecture of GPT while I was building this. It gets some things wrong: it told me the positional encodings were sinusoids, even though in GPT-2 they're learned. I said, "no, you're wrong," and it finally apologized. But it was really helpful most of the time.

08:31

Okay, here are your attention weights. I counted at one point, but there is weight matrix after weight matrix after weight matrix after weight matrix, all the way down. All 124 million parameters are in here. This sheet is 1.5 GB in Excel binary format, not the XML format. So I was like, where do I post this thing? It's hosted right now as a release on GitHub, because I couldn't upload it as just an Excel file; it was too big.
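
The 124 million figure is easy to cross-check against the reference model (a sketch using Hugging Face `transformers`):

```python
# Cross-checking the parameter count quoted in the talk.
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")  # GPT-2 small
print(model.num_parameters())                    # ~124 million parameters
```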

09:07

And then the other problem is that it's really limited: 10 tokens. If I want to expand it, I have to rearrange the whole matrix. (Audience: Did you build it programmatically?) Wait, what? I did not. After I'd done it, I thought: oh, I should have done this programmatically. You know in the movie The Matrix, when you're looking at the numbers scrolling by? That's what it felt like.

09:43

I really started playing around with this in June. Oh, and the other problem: I tried running GPT-2 from source. Don't do that. It's written in TensorFlow 1, and it's really hard to get a working environment. (Audience: A Colab notebook; you've got to use that.) Yeah, you've got to use one, but not on a Mac. Anyway, I should wrap up.

10:08

(Host: Okay, that was amazing. I'm just going to do a couple of quick things. First of all...)
