GPT2 implemented in Excel (Spreadsheets-are-all-you-need) at AI Tinkerers Seattle

Spreadsheets are all you need
26 Oct 2023 · 10:13

Summary

TL;DR: In this engaging presentation, Ishan showcases a personal project that makes the inner workings of a large language model tangible. He opens with a short Python script running GPT-2, then introduces a spreadsheet that implements the entire Transformer architecture in nothing but spreadsheet functions, letting users interact with and understand the mechanics of models like GPT-2 without writing any code. The project, which he describes as both a teaching tool and a fascinating exploration of AI's inner workings, provides insights into the attention mechanism and the Chain of Thought prompting technique. Ishan also shares the challenges of implementing such a model in a spreadsheet and offers resources for further exploration.

Takeaways

  • 👨‍💻 The speaker, Ishan, works at an edge compute platform that powers 4% of the internet and is open to discussing AI and edge projects.
  • 🚀 Ishan has a personal side project called 'Spreadsheets Are All You Need', demonstrating that a spreadsheet can run a real AI model.
  • 🐍 The opening demo is roughly 40-50 lines of Python using Hugging Face Transformers to run GPT-2 small on a very short prompt at zero temperature (a minimal sketch of such a script follows this list).
  • 📊 Ishan has created a spreadsheet that implements GPT-2 without any API calls or Python, using only spreadsheet functions.
  • ⏱️ Recalculating the spreadsheet is resource-intensive and takes about a minute, and Ishan warns against running it on a Mac due to UI lockups.
  • 🛠️ The spreadsheet serves as a teaching tool, analogous to the way computer architecture courses ground programmers in how systems are built.
  • 🎥 Ishan is creating a series of videos that walk through every step of the GPT-2 implementation, focusing on the inference pass.
  • 🔍 The spreadsheet allows for hands-on exploration of the Transformer model, including the attention mechanism and the Chain of Thought prompting technique.
  • 🤖 The project provides insight into why the original Transformer paper is titled 'Attention Is All You Need': in the spreadsheet, tokens interact only once per layer, at the attention mechanism.
  • 📈 The 'weights' tab of the spreadsheet contains a massive amount of data, including all 124 million parameters of GPT-2.
  • 🚫 The project has limitations, such as a maximum of 10 tokens; expanding it would require manually rearranging the weight matrices, since the sheet was not generated programmatically.
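
As a companion to the takeaways, here is a minimal sketch of what a 40-50 line demo script like the one described might reduce to. It assumes the Hugging Face `transformers` library; the talk's actual script is not reproduced in this summary.

```python
# A minimal sketch of the opening demo: GPT-2 small via Hugging Face
# Transformers, decoded greedily ("zero temperature"). The talk's actual
# script is not shown, so the details here are assumptions.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # "gpt2" = GPT-2 small
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Mike is quick. He moves", return_tensors="pt")

# do_sample=False makes generation deterministic, the effect of temperature 0.
output = model.generate(**inputs, max_new_tokens=1, do_sample=False)
print(tokenizer.decode(output[0]))  # expected continuation: "quickly"
```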

Q & A

  • What is Ishan's day job?

    -Ishan works at an edge compute platform that powers 4% of the internet.

  • What is the purpose of Ishan's personal side project?

    -Ishan's side project, 'Spreadsheets Are All You Need', aims to demonstrate and teach how Transformer language models work, using a spreadsheet that implements GPT-2 without any API calls or Python code.

  • How does Ishan's spreadsheet project work?

    -The spreadsheet project uses Excel functions to implement GPT-2, allowing users to input prompts and receive outputs without writing any code. It's designed to be an educational tool for understanding the Transformer model and its mechanisms.

  • What are the benefits of using a spreadsheet to teach about GPT-2?

    -Using a spreadsheet to teach about GPT-2 provides a more approachable and tangible way for both non-developers and developers to understand the model's architecture and functionality. It allows users to visually track the information flow and make changes to see their effects, albeit with a roughly one-minute recalculation each time.

  • What is the 'Chain of Thought' prompting technique mentioned in the script?

    -Chain of Thought prompting is a technique where the AI is given more context or steps to reason through a problem. It helps the AI provide more detailed and accurate responses by simulating a step-by-step reasoning process similar to human thought.

  • What issues did Ishan encounter while implementing GPT-2 in a spreadsheet?

    -Ishan faced challenges such as the spreadsheet's large size causing the Mac UI to lock up randomly, and the complexity of implementing byte pair encoding within Excel, which relies on string concatenation rather than matrix multiplication.

  • How can one access Ishan's spreadsheet project?

    -The spreadsheet project can be accessed via the 'Spreadsheets Are All You Need' website, where users can watch the walkthrough videos, download the spreadsheet, and report bugs or ask questions.

  • What is the significance of the attention mechanism in Transformers?

    -The attention mechanism in Transformers allows the model to focus on different parts of the input sequence when generating each output element. It helps the model to handle long-range dependencies and understand the context better, which is crucial for tasks like text understanding and generation.

  • How does the spreadsheet demonstrate the 'attention is all you need' concept?

    -The spreadsheet visually shows how the attention mechanism works by allowing users to see how tokens interact with each other at each layer. It provides a clear demonstration of how the model attends to different parts of the input, which is a core concept in Transformer models.

  • What was Ishan's experience with running GPT-2 from source?

    -Ishan found running GPT-2 from source to be a challenging experience: the code is written in TensorFlow 1.x, which makes setting up a working environment difficult.

  • What is the size of the spreadsheet in terms of parameters and file size?

    -The spreadsheet contains all 124 million parameters of GPT-2 and is 1.5 GB in size in Excel binary format.

Outlines

00:00

🚀 Introduction to AI and Edge Computing

The speaker, Ishan, introduces himself and his work at an edge compute platform that powers 4% of the internet, inviting those with AI and edge-related projects to connect. The main focus of the talk, however, is a personal project called 'Spreadsheets Are All You Need'. He opens with a short Python script running GPT-2 small on the prompt 'Mike is quick. He moves' at zero temperature, chosen because its completion, 'quickly', is obvious and easy to verify.

05:02

📊 Spreadsheet Implementation of GPT-2

Ishan demonstrates a spreadsheet that implements GPT-2 without any API calls or Python code, using only spreadsheet functions. He explains the process of running the model within Excel, including the need to trigger recalculation manually because a full pass takes about a minute. He warns against running it on a Mac, where threading issues cause the UI to lock up, and positions the tool as an educational resource, akin to a computer architecture course, for understanding the underlying mechanisms of Transformers and LLMs. He also discusses the benefits of this approach: it is approachable for non-developers, and it offers developers practical insights into concepts like the attention mechanism.

10:09

🔍 Deep Dive into GPT-2's Architecture in Spreadsheet

Ishan continues to explore the intricacies of GPT-2's architecture within the spreadsheet. He walks through the various components, such as the attention mechanism, residual connections, and the multi-layer perceptron, explaining how they function and interact. He also uses the visible information flow to give a technical explanation for why Chain of Thought prompting is effective. The speaker shares the practical difficulties: running GPT-2 from source is painful, and the Excel implementation is limited to 10 tokens, with any expansion requiring manual rearrangement of the weight matrices.

Keywords

💡Edge Computing

Edge computing refers to the practice of processing data closer to where it is generated, rather than in a centralized data center or cloud. This can reduce latency and improve the speed of data processing. In the video, the speaker mentions working at an edge compute platform, highlighting the significance of edge computing in powering a substantial portion of the internet.

💡AI

Artificial Intelligence (AI) is the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the context of the video, AI is used in conjunction with edge computing to enhance various applications, and the speaker expresses interest in discussing how AI and edge can assist in projects.

💡Transformers

Transformers are a type of deep learning model used for natural language processing. They are known for their ability to handle sequences of data and are particularly effective for tasks such as translation, summarization, and text generation. In the video, the speaker works with GPT-2, a specific Transformer model, implemented inside a spreadsheet.

💡GPT-2

GPT-2, or Generative Pre-trained Transformer 2, is an open-source language model developed by OpenAI. It is known for its ability to generate human-like text based on a given prompt. The speaker in the video has implemented GPT-2 within a spreadsheet, showcasing the model's versatility and accessibility.

💡Spreadsheets

Spreadsheets are software applications used for organizing, analyzing, and storing data in tabular form. They are commonly used for tasks such as accounting, data analysis, and project management. In the video, the speaker has creatively used spreadsheets as a platform to implement and interact with complex AI models like GPT-2, demonstrating the potential for educational and analytical purposes.

💡Inference

In the context of machine learning and AI, inference refers to the process of using a trained model to make predictions or draw conclusions on new data. The speaker in the video has created a series of videos that walk through the inference process of GPT-2, explaining each step from input encoding to prediction.
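
To make that concrete, here is a hedged sketch of a single greedy inference step, mirroring the spreadsheet's final 'logits to predicted token' stage (the classes are from Hugging Face `transformers`; the spreadsheet itself, of course, uses no Python):

```python
# One greedy inference step: run the forward pass, take the logits for the
# last position, and argmax them into a predicted token id.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

ids = tokenizer("Mike is quick. He moves", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits            # shape: (batch, seq_len, vocab_size)
next_id = int(logits[0, -1].argmax())     # highest-scoring next token
print(tokenizer.decode([next_id]))
```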

💡Attention Mechanism

The attention mechanism is a crucial component of Transformer models, allowing the model to weigh the importance of different parts of the input data when generating a response. It helps the model focus on relevant information and is often likened to how humans pay attention to specific parts of a conversation or task. In the video, the speaker discusses the attention mechanism and its role in the Transformer architecture.
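
As a rough single-head illustration (a from-scratch sketch, not the spreadsheet's exact cell layout), causal self-attention reduces to a few matrix products plus a row-wise softmax. In the spreadsheet these appear as MMULT formulas and the 'triangle mask' named in the weights tab:

```python
# A from-scratch sketch of single-head causal self-attention.
# x is (seq_len, d_model); Wq, Wk, Wv are learned projection matrices.
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # token-to-token affinities
    mask = np.triu(np.ones_like(scores), k=1)      # 1s above the diagonal
    scores = np.where(mask == 1, -1e9, scores)     # "triangle mask": no peeking at future tokens
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)          # row-wise softmax = the attention matrix
    return w @ V                                   # each row: weighted blend of value vectors
```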

💡Chain of Thought Prompting

Chain of Thought prompting is a technique used in AI where the model is encouraged to 'think aloud' during the problem-solving process. This can lead to more logical and step-by-step responses, similar to how a human might reason through a problem. The speaker in the video relates this concept to the information flow observed in the spreadsheet implementation of GPT-2.
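
A small illustration of the idea, with prompts invented for this summary rather than taken from the talk:

```python
# Invented prompts for illustration (not from the talk). The chain-of-thought
# version feeds the model many more tokens before it must commit to an
# answer, which means more vectors to compute against and more passes
# through the attention mechanism.
direct_prompt = "Q: A pen costs 2 dollars. How much do 7 pens cost? A:"

cot_prompt = (
    "Q: A pen costs 2 dollars. How much do 7 pens cost?\n"
    "A: Let's think step by step. One pen costs 2 dollars, "
    "so 7 pens cost 7 times 2 dollars, which is"
)
```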

💡Positional Encoding

Positional encoding is a method used in Transformer models to incorporate the order of the input sequence into the model's processing. It is essential because the self-attention mechanism in Transformers does not inherently account for the order of the input elements. In the video, the speaker discusses the impact of positional encoding on the embeddings within the spreadsheet model.
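
In GPT-2 the positional embeddings are learned weights rather than fixed sinusoids, a point the speaker makes in the transcript. A sketch of the embedding step with random stand-in values, since the real matrices live in the spreadsheet's weights tab:

```python
# The embedding step with random stand-in weights; the real learned
# matrices (wte, wpe) occupy the spreadsheet's weights tab.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, n_ctx, d_model = 50257, 1024, 768    # GPT-2 small dimensions
wte = rng.normal(size=(vocab_size, d_model))     # learned token embeddings
wpe = rng.normal(size=(n_ctx, d_model))          # learned positional embeddings

token_ids = np.array([100, 200, 100])            # positions 0 and 2 repeat a token
h = wte[token_ids] + wpe[np.arange(len(token_ids))]

# As in the video's demo: rows 0 and 2 of wte[token_ids] are identical,
# but rows 0 and 2 of h differ, because their positions differ.
```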

💡Layer Normalization

Layer normalization is a technique used to improve the training of deep neural networks by normalizing the input to each layer's activation function. It helps in stabilizing the learning process and can lead to faster convergence during training. In the video, layer normalization is mentioned as part of the GPT-2 model's architecture, which the speaker has implemented in the spreadsheet.
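
A minimal sketch of the formula, with `g` and `b` standing for the learned gain and bias that each GPT-2 layer norm carries:

```python
import numpy as np

def layer_norm(x, g, b, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)           # per-row mean
    var = x.var(axis=-1, keepdims=True)           # per-row variance
    return g * (x - mu) / np.sqrt(var + eps) + b  # normalize, then rescale
```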

💡Residual Connections

Residual connections are a type of skip connection used in deep neural networks, where the output of one layer is added back to the input of the same layer or an earlier layer. This technique helps in mitigating the vanishing gradient problem and allows for the training of deeper networks. In the context of the video, the speaker discusses residual connections as part of the GPT-2 model's structure.
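
Tying the last few keywords together, each of GPT-2's 12 blocks follows a pre-layer-norm residual pattern. This is a structural sketch only; `attention`, `mlp`, and the `params` dict are hypothetical stand-ins for the sublayers and weights described above:

```python
# A structural sketch of one GPT-2 block (12 are stacked in total).
# `attention` and `mlp` are hypothetical stand-ins for the sublayers
# sketched under the keywords above; `params` holds the learned weights.
def transformer_block(x, params):
    # Residual connection around multi-head attention: the block's input
    # is added back to the sublayer's output (pre-layer-norm ordering).
    x = x + attention(layer_norm(x, *params["ln_1"]), params["attn"])
    # Residual connection around the two-layer MLP (multi-layer perceptron).
    x = x + mlp(layer_norm(x, *params["ln_2"]), params["mlp"])
    return x
```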

Highlights

Ishan works at an edge compute platform that powers 4% of the internet.

Ishan's personal side project is called 'Spreadsheets are all you need'.

The opening demo uses 40-50 lines of Python with Hugging Face Transformers and GPT-2 small at zero temperature.

Ishan demonstrates a spreadsheet that implements GPT-2 without any API calls or Python, using only spreadsheet functions.

The spreadsheet requires manual recalculation due to its computational intensity.

Running the spreadsheet on a Mac may cause the UI to lock up due to threading issues.

The project serves as a teaching tool for understanding Transformers and LLMs.

Ishan is creating videos that walk through every step of the GPT-2 implementation for the inference pass.

The spreadsheet allows for a hands-on, approachable understanding of AI models, even for non-developers.

Ishan explains the attention mechanism in Transformers and how information flows through the model.

Chain of Thought prompting is grounded technically: extra tokens give the model more vectors to compute against and more passes through the attention mechanism.

The spreadsheet contains the entire GPT-2 model, including all 124 million parameters.

The spreadsheet is 1.5 GB in the Excel binary format and is hosted as a GitHub release due to its file size.

Ishan shares his experience of running GPT-2 from source and the challenges faced.

The project aims to provide a visual and interactive learning experience for computer science concepts related to AI.

Ishan's project has the potential to demystify AI models and make them more accessible to a wider audience.

The spreadsheet includes detailed components such as token embeddings, positional embeddings, and attention values.

Ishan's project showcases the complexity and depth of AI models in a tangible and understandable way.

Transcripts

00:00

Hi everyone, I'm Ishan. My day job is actually working at an edge compute platform that powers 4% of the internet, so if you've got something that you think AI and edge can help out with, let me know; I'd like to talk to you. But today is a personal side project I call Spreadsheets Are All You Need. So let me just cut to the punchline here. Let's blow this up. Can you guys see that? Okay. This is about 40-50 lines of Python code, just Hugging Face Transformers running GPT-2 small, using a very short prompt at zero temperature. I'm going to put in a really simple prompt, once it decides to process... there we go: "Mike is quick. He moves". And I like this prompt because it's small and the completion is really obvious, right? What would you expect about Mike, knowing that he's quick? Well, that he moves quickly.

01:03

So now let me go over here. This is a spreadsheet that also implements all of GPT-2 small. No API calls, no Python, entirely in spreadsheet functions. Here's what we're going to do: we're going to push this button. You see where it is here? That's where your predicted token is going to come in. It's not doing it right now because it takes so long to run that I turned off automatic recalculation; there's a mode in Excel where you have to push a button to recalculate. So I'm going to hit this button, and then we should get "quickly" right here. You guys ready? Okay, here we go. Now you can see at the bottom, I don't know if you can see it way in the back, it's calculating, calculating, calculating. This is going to take about a minute, just to warn you.

02:00

And by the way, do not run this on a Mac. It is so big that the Mac UI will lock up on you for a minute at a time, randomly. There's no reason it should do that; somebody messed up the threading. (Audience: You're running on a Mac?) I am running on a Mac. After making this whole thing work over months, dealing with it randomly stalling on me for a minute at a time, I tried it on my wife's PC and it never locked up. So somebody messed up the implementation. And there you go: "quickly".

02:32

Okay, so: good God, why would anyone do this? Besides just being a masochist, a couple of reasons. Obviously you're not going to run production workloads on this; it's really a teaching tool. If you've had formal computer science training, there's a class usually called computer architecture or computer organization. They start with circuits, then logic gates, then an ALU, and then go all the way up to a microprocessor. Even if you're not going into chip design, it gives you a really good grounding when you're actually programming and building systems on top of those processors.

03:18

What I'm trying to do with the spreadsheet is the same thing, but for Transformers and LLMs. So I'm creating a bunch of videos (there are only two so far) where I walk through every step of the GPT-2 implementation, at least for the inference pass: from byte pair encoding all the way through the layer norms, the residual connections, the multi-headed attention, the multi-layer perceptron, and getting the logits up to the prediction, step by step.

03:45

And I think there are two benefits. One: as a non-developer, this is really approachable; there's something visceral about being able to play with it. But even as a developer I think it's useful. I'll give you one example. Personally, I now know why they called the paper "Attention Is All You Need". What else are you going to call it when you watch the information flow through the Transformer, especially in a spreadsheet? It's really striking: you type all these tokens in, and they only talk to each other once at each layer, at the attention mechanism. You can actually take the diffs: you create an additional tab in the spreadsheet and just see how the values change as you make changes, and it's amazing, there's only one place where the tokens talk to each other.

04:30

That seems a little theoretical, but a practical example would be Chain of Thought prompting, which you've probably heard of. We can think of it anthropomorphically ("as a person, if you asked me to think aloud, I might reason better"), but if you want a more technically satisfying theory, the one I subscribe to, which some people believe and which makes a lot of sense when you see the information flow, is this: what you're really doing is giving the model more space, more vectors to compute against, but you're also giving it more hits at that attention mechanism, more passes at it.

05:04

Okay, that's my one-minute timer, or even less; I'm at zero. So that's it, I'll wrap up. If you want to download the spreadsheet, go to the Spreadsheets Are All You Need site, where you can see the videos and download the spreadsheet, and let me know if you find bugs or have questions. Thank you.

05:22

(Audience: We want to see the weights tab!) Oh, you want to see the weights tab? Oh my God, okay. So first of all, you've got the prompt-to-tokens step, that's here; these are some random constants. And this is where I actually do byte pair encoding inside a spreadsheet. That's the hardest thing to do inside this thing, because it's not matrix multiplication, it's all string concatenation.
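
For comparison with the spreadsheet's concatenation-based approach, the same byte pair encoding step is a single call against the reference GPT-2 tokenizer (a sketch assuming Hugging Face `transformers`; the printed ids and tokens are indicative, not verified output):

```python
# The byte pair encoding step, done with the reference GPT-2 tokenizer;
# the spreadsheet rebuilds this from string functions.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
ids = tokenizer.encode("Mike is quick. He moves")
print(ids)                                    # one integer id per BPE token
print(tokenizer.convert_ids_to_tokens(ids))   # 'Ġ' marks a leading space
```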

05:47

But then you get here: this is where we convert the tokens to embeddings. So here are your text embeddings, and then here are your positional embeddings. One of the videos has a really great demo: you change one of the tokens so that it's duplicated between the top and the bottom, and you can see that the two are identical here, but they're not identical here, because the positional encoding has changed them.

06:04

Then here are the blocks; there are 12 of these, and this is block zero. Each of these has... well, there's a lot here. Oh, this is what happens when you're in Parallels: it randomly decides to start scrolling. (That's not it running, that's it scrolling. Welcome to running in a VM; that's also part of why it's slow.) I don't know if you can see the 16 here, but there are 16 steps you can follow along, one for each layer inside a block. So here's the residual connection, here are your attention values, here's the linear projection of those, there's the residual connection again, there's the layer normalization.

06:44

And look, I'll click on this; this is all spreadsheet. There's an MMULT right there that's massively long. This took a serious amount of time. And this is your attention matrix in here: "Mike is quick he moves". And then you can see it reference... oh no, that's not the multi-head part, this is the softmax, which is what I was going for. So "Mike is quick he moves", and this is one head, and you can see a token actually looking at another. This would be "Mike" here, and you can see "he" is looking at "Mike"; you see, that's a 0.73, so it's referencing it. But keep in mind this is one head, and there's a really good OpenAI paper where they actually ask GPT-4 to explain parts of GPT-2, and I think it'd be really interesting to put that into the spreadsheet and take a look.
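
The per-head softmax matrices Ishan scrolls through here can also be pulled from the reference implementation for cross-checking, assuming Hugging Face's `output_attentions` flag:

```python
# Pulling the per-layer, per-head attention matrices out of the reference
# model: the same triangular softmax grids shown in the spreadsheet.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tokenizer("Mike is quick. He moves", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_attentions=True)

# out.attentions is a tuple of 12 tensors, one per layer, each of shape
# (batch, heads=12, seq, seq). Row i shows how token i attends to tokens <= i.
print(out.attentions[0][0, 0])   # layer 0, head 0
```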

07:41

But anyway, you were asking about the weights. So here's your 11th block; one block feeds into the next, and this formula basically calculates, based on which block it is, which weight to use. Do you see these names right here? There is a version of this name for every single weight. So this is the layer norm, this is the predicted token, this is your IDs-to-tokens table, this is your triangle mask, right, for causal attention. This is the most [inaudible] thing I've ever... and this is your positional encoding, by the way. You know, one of the things I was doing was asking ChatGPT to tell me about the architecture of GPT while I was building this. It gets some things wrong: it told me the positional encodings were sinusoids, even though in GPT-2 they're learned. I said, "no, you're wrong," and it finally apologized. But it was really helpful most of the time.

08:31

Okay, here are your attention weights. I counted at one point, but there is weight matrix after weight matrix after weight matrix after weight matrix, all the way down. All 124 million parameters are in here. This sheet is 1.5 GB in Excel binary format, not the XML format. So I was like, where do I post this thing? It's hosted right now as a release on GitHub, because I couldn't upload it as just an Excel file; it was too big.
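
The 124 million figure is easy to cross-check against the reference model (a sketch using Hugging Face `transformers`):

```python
# Cross-checking the parameter count quoted in the talk.
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")  # GPT-2 small
print(model.num_parameters())                    # ~124 million parameters
```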

09:07

And then the other problem is that it's really limited: 10 tokens. If I want to expand it, I have to rearrange the whole matrix. (Audience: Did you build it programmatically?) Wait, what? I did not. After I'd done it, I thought: oh, I should have done this programmatically. You know in the movie The Matrix, when you're looking at the numbers scrolling by? That's what it felt like.

09:43

I really started playing around with this in June. Oh, and the other problem: I tried running GPT-2 from source. Don't do that. It's written in TensorFlow 1, and it's really hard to get a working environment. (Audience: A Colab notebook; you've got to use that.) Yeah, you've got to use one, but not on a Mac. Anyway, I should wrap up.

10:08

(Host: Okay, that was amazing. I'm just going to do a couple of quick things. First of all...)
