Demystifying how GPT works: From Architecture to...Excel!?!
Summary
TL;DR: This video series shows how to implement GPT-2, a large language model and an early ancestor of ChatGPT, in a spreadsheet. Using GPT-2 small as the example, it walks through the whole pipeline with nothing but basic spreadsheet functions: splitting text into tokens, mapping each token to a list of numbers, and the model structure itself, including multi-head attention and the multilayer perceptron. The approach gives a deeper understanding of how modern AI technology works. Later videos in the series explain each of these steps in detail.
Takeaways
- The series implements GPT-2, a large language model, using only basic spreadsheet functions.
- Text is split into tokens based on a predefined dictionary.
- Tokens are mapped to token IDs by an algorithm called byte pair encoding.
- Each token is mapped to a list of 768 numbers that captures its meaning and position.
- The embeddings derived from the tokens reflect each token's meaning and its position in the prompt.
- Multi-head attention and a multilayer perceptron (a type of neural network) analyze the relationships between tokens.
- The output of each block is used as the input to the next block, and GPT-2 repeats this process across 12 different layers.
- The attention mechanism identifies the important words in a sentence and the relationships between them.
- The multilayer perceptron determines the most likely meaning of a word in the given context.
- The final language head selects the most likely next token and appends it to the sentence.
Q & A
How is text processed in the spreadsheet implementation of GPT-2?
- The text is first split into tokens. Each word is converted into tokens based on a predefined dictionary, and in the spreadsheet's "Prompt to tokens" tab the byte pair encoding algorithm maps it to its final token IDs.
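The merging idea behind byte pair encoding can be sketched in a few lines. This is a minimal illustration, not the spreadsheet's formulas or the real GPT-2 tokenizer: the merge rules and the vocabulary below (`TOY_MERGES`, `TOY_VOCAB`) are invented for the example, and the function names are hypothetical.

```python
def bpe_tokenize(word, merges):
    """Split a word into characters, then repeatedly apply the
    highest-priority merge rule until no rule matches."""
    ranks = {pair: i for i, pair in enumerate(merges)}  # lower rank = applied first
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no adjacent pair has a merge rule
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge rules and vocabulary, invented for illustration only.
TOY_MERGES = [("i", "k"), ("M", "ik"), ("Mik", "e")]
TOY_VOCAB = {"Mike": 0, "Mik": 1, "ik": 2, "M": 3, "i": 4, "k": 5, "e": 6, "s": 7}


def encode(word):
    """Map a word to token IDs using the toy vocabulary."""
    return [TOY_VOCAB[token] for token in bpe_tokenize(word, TOY_MERGES)]
```

With these toy rules, `encode("Mike")` yields a single ID, while `encode("Mikes")` yields two, because no rule merges the trailing "s".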
What is an embedding, and how is it used in GPT-2?
- Embedding is the process of mapping each token to a list of numbers. In GPT-2 small, each token is mapped to a list of 768 numbers that captures the token's meaning and position.
What is the purpose of the position embedding?
- The position embedding captures where a token sits by slightly changing the embedding values according to the token's position in the prompt. This lets the model distinguish the same word appearing at different positions, and so in different contexts.
What is the role of multi-headed attention?
- Multi-headed attention captures context by working out how the words in a sentence relate to each other and by identifying the important ones. For example, it recognizes that "he" refers to "Mike".
What does the multilayer perceptron do?
- The multilayer perceptron distinguishes between a word's multiple meanings and selects the most appropriate one based on context. This lets the model predict the following words or tokens more accurately.
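What is the role of the language head?
- The language head converts the output of the final block into a set of probabilities and completes the sentence by selecting the most likely token from the known tokens in the dictionary.
How is the next token selected in the spreadsheet implementation of GPT-2?
- The spreadsheet selects the most likely token based on the probabilities generated from the output of the final block. In this demo, the token with the highest probability is the one chosen.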
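The selection step can be sketched as: score every known token, turn the scores into probabilities with softmax, and take the highest. The scoring matrix (`unembedding`) and the function name below are hypothetical stand-ins; real values would come from the trained model.

```python
import math


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_next_token(final_vector, unembedding, vocabulary):
    """Score each vocabulary entry against the final block's output for the
    last token, convert the scores to probabilities, and return the most likely token."""
    logits = [sum(h * w for h, w in zip(final_vector, row)) for row in unembedding]
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return vocabulary[best], probabilities
```

Picking the maximum every time makes the output deterministic: the same prompt always produces the same continuation.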
What role does each block play in GPT-2's iterative process?
- Each GPT-2 block contains an attention mechanism and a perceptron. It takes an input, processes it, and produces the output that is passed to the next block. This process is repeated across 12 different layers, or blocks.
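The chaining of blocks can be sketched as a simple loop in which each block's output becomes the next block's input. In this illustration the attention and perceptron steps are passed in as plain functions, and each one adds its result back onto its input (a residual connection, which GPT-2 uses but this summary does not mention). Layer normalization is omitted for brevity, and all names are hypothetical.

```python
NUM_BLOCKS = 12  # number of blocks stated above for GPT-2


def add(a, b):
    """Elementwise sum of two lists of vectors."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def make_block(attention, mlp):
    """Build one block from an attention function (all tokens at once)
    and an MLP function (one token vector at a time)."""
    def block(vectors):
        vectors = add(vectors, attention(vectors))               # mix information across tokens
        vectors = add(vectors, [mlp(row) for row in vectors])    # refine each token separately
        return vectors
    return block


def run_blocks(vectors, blocks):
    """Feed the output of each block into the next one."""
    for block in blocks:
        vectors = block(vectors)
    return vectors
```

In a real model each of the 12 blocks has its own weights; the loop structure is the only thing this sketch is meant to show.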
Give an example of how a token is mapped to an embedding.
- For example, the word 'Mike' is mapped to a token ID and then converted into a list of 768 numbers. That list represents the word's meaning and its position.
What does a temperature of zero mean?
- A temperature of zero means the model selects only the single most likely token. This produces consistent output, but choosing from a wider set of tokens can add variety.