Code Review: Clojure Lexer

TheVimeagen
21 Jun 2023 12:54

Summary

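TLDR This video script is a discussion of the programming language Clojure. Clojure runs on the JVM and is a member of the Lisp family. The speakers walk through Clojure function definitions, what they call pattern matching, and the workings of a lexer, using the code to highlight the language's flexibility and strength. They also introduce babashka, an interpreter for running Clojure scripts. The tool is handy for scripting, and its support for running work in parallel also comes up.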

Takeaways

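  • 🤔 The speakers consider Elixir and Clojure and note that Clojure runs on the JVM.
  • 📝 Code readability is emphasized, including the importance of clean code and of clear naming conventions.
  • 📈 Function polymorphism is discussed: one function name can have several implementations, selected by the number of arguments.
  • 🔍 Pattern matching is mentioned, a technique that runs different code paths depending on the structure of the input.
  • 📚 Recursion is discussed: a function calls itself to simplify the handling of a complex task.
  • 🌈 The speakers joke about Emacs and Lisp programmers' fondness for colored parentheses, hinting at how much tooling shapes the programming experience.
  • 📖 The `token` file and the `token/create` function are mentioned, which relate to lexing the input string and producing tokens.
  • 🔢 The speakers discuss how the input string is processed, including character positions and string slicing.
  • 📦 State management comes up: in functional code, state is passed explicitly rather than kept implicitly inside a class or object.
  • 🧵 Concurrency and multithreading are discussed, along with how Clojure supports them and where they help in scripts and batch jobs.
  • 🚧 The complexity of Bash scripts is discussed with humor: past a certain size, a Bash script becomes hard to manage.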

Q & A

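  • What is the difference between Elixir and Clojure?

    -Elixir is a functional programming language that runs on the Erlang VM, while Clojure is a Lisp-family functional programming language that runs on the JVM.

  • What misunderstanding about Clojure's name is mentioned?

    -One speaker says the "j" is there because the language started on the JVM, then admits this is a myth he made up to remember the spelling.

  • How does function overloading work in Clojure?

    -With the defn macro, a function can be defined with several bodies that differ in their number of arguments.

  • What does the 'lex' function do when processing a string?

    -lex analyzes the input string and creates tokens. When the input is empty it produces an end-of-file token; otherwise it keeps lexing the rest of the string.

  • What is pattern matching in Clojure?

    -In the video, pattern matching refers to defining different behavior depending on the arguments a function is called with.

  • What does the 'unpacked' state of a string mean?

    -It means the string has been destructured into its leading characters and the rest, so that each part can be referred to by name.

  • What does 'token' mean?

    -A token is a unit of lexed data produced from the string; it represents an operator, an identifier, or a similar element.

  • How is the 'token create' function used?

    -`token/create` builds a new token from a kind, a literal, and a position.

  • What is a 'lazy sequence' in Clojure?

    -A lazy sequence is evaluated only when its elements are needed, which can save memory.

  • Why is editor coloring considered important?

    -Coloring improves readability. In Clojure, which uses many parentheses, differently colored nesting levels are much easier to tell apart.

  • What is said to be the purpose of Emacs?

    -The speakers joke that Emacs was built so that Lisp programmers could color their parentheses.

  • What approach does the code take to managing state?

    -The state is passed explicitly to the function on every call; there is no class or object that hides it.

  • What does 'predicate' mean?

    -A predicate is a function that tests whether a condition holds. Here it decides which characters belong to the token being read.

  • What does the 'cons' operation do?

    -cons creates a new list by adding an element to the front of an existing sequence.

  • What kind of tool is 'babashka'?

    -babashka is an interpreter for Clojure and a convenient tool for running scripts.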

Outlines

00:00

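😀 Introducing the Clojure programming language and walking through its features

This section covers the choice between Elixir and Clojure. One speaker recalls how Clojure is spelled and shares his personal mnemonic that the "j" comes from the language having started on the JVM. The speakers then explain how the lex function is defined and how it behaves, including its overloads with different numbers of arguments. Finally, they talk about reading and understanding the code and comment on the importance of clean code.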

05:03

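😉 Token creation and list operations in Clojure

The second section explains token creation and list operations in detail. It covers token kinds and positions, a two-character operator lookup in the token file (rendered as "two op" in the transcript), and the `token/create` function. It also explains how lex calls itself recursively, works through the string, and creates tokens. The speakers also mention editor plugins and Emacs, and discuss what these mean to Lisp programmers.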

10:04

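🤔 Introducing babashka and the benefits of parallel work in Clojure

The last section introduces babashka, an interpreter that comes with utilities for scripting. The speakers explain that it lets you write scripts in Clojure instead of Bash, which makes parallel work more flexible. They also discuss why scripting gets hard in Bash and how babashka eases it, and one speaker mentions having written some Clojure on stream.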

Keywords

💡Elixir

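Elixir is a functional programming language that runs on the Erlang VM. The video opens with a choice between Elixir and Clojure and ends up focusing on Clojure. Elixir is known for reliability and concurrency and suits distributed systems and real-time applications.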

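💡Clojure

Clojure is a functional Lisp dialect that runs on the JVM. It should not be confused with a closure, which is a function bundled together with the variables it refers to. In the video, the speakers joke about the spelling and origin of the name and then analyze actual Clojure code.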

💡JVM

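JVM stands for Java Virtual Machine, the virtual machine that executes Java bytecode. Clojure runs on the JVM, and the video says so. The joke concerns the spelling: one speaker claims the "j" is there because the language started on the JVM, then admits this is a myth he made up as a mnemonic. Later, a speaker mentions running Clojure without the JVM by using babashka.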

💡Lex

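Lex refers generally to tools and libraries for lexical analysis; in this video it is the name of a function in the code under review. The lex function splits the input string into tokens, the form that the next stage of processing consumes. The video discusses how lex works and how it calls itself as it works through the input.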

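💡Function overloading

Function overloading means defining several functions under one name with different parameter lists. In Clojure this takes the form of multi-arity functions: a single defn holds several bodies, and the one whose parameter count matches the call is the one that runs. In the video, lex is defined this way with several arities.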
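The arity dispatch described above can be sketched in Python. This is an analogue only: Python has no multi-arity `defn`, so the dispatch on argument count is written by hand, and the argument orders (position then input; predicate, position, input, kind) are taken from how the speakers read the code aloud. The returned strings are placeholders, not the reviewed lexer's behavior.

```python
def lex(*args):
    """Stand-in for a multi-arity Clojure defn: choose a body by argument count."""
    if len(args) == 1:
        # One argument (just the input): restart at position 0.
        (text,) = args
        return lex(0, text)
    if len(args) == 2:
        # Two arguments (position, input): the main lexing body would live here.
        pos, text = args
        return f"lex {text!r} from position {pos}"
    if len(args) == 4:
        # Four arguments (predicate, position, input, kind): helper for character runs.
        pred, pos, text, kind = args
        return f"read {kind or 'identifier'} at position {pos}"
    raise TypeError(f"lex takes 1, 2 or 4 arguments, got {len(args)}")


print(lex("a <= b"))                        # goes through the one-argument body first
print(lex(str.isalpha, 2, "a <= b", None))  # goes straight to the four-argument body
```

Calling with any other number of arguments raises `TypeError`, which roughly mirrors the arity error Clojure reports when no body matches.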

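💡Pattern matching

Pattern matching is a technique common in functional programming languages: a value is compared against patterns, and the code for the matching pattern runs. In the video, the speakers liken the arity-based dispatch of lex to pattern matching in Erlang, since a different code path runs depending on the arguments passed.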

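💡Token

In compiler theory, a token is a basic unit that source code is split into during lexing. The video discusses how tokens are created and what they carry (a kind such as operator or identifier, a literal, and a position), and stresses the role tokens play in analyzing code.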
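As a minimal sketch of such a token, assuming the three fields the speakers read out of the `token/create` call (kind, literal, position): the `Token` type, the `create` helper, and `end_offset` are hypothetical Python names for illustration, not the reviewed code.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    kind: str                # e.g. "operator", "identifier", "eof"
    literal: Optional[str]   # the matched text, or None at end of file
    position: int            # offset of the token's first character


def create(kind, literal, position):
    """Hypothetical counterpart of the token/create call discussed in the video."""
    return Token(kind, literal, position)


def end_offset(token):
    """Offset just past the token, recoverable from literal and position."""
    return token.position + len(token.literal or "")


le = create("operator", "<=", 4)
print(le, end_offset(le))
```

Keeping both the literal and its offset, which the speakers find slightly odd, is what makes the end of the token computable without going back to the input.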

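💡Lambda calculus

Lambda calculus is the mathematical model underlying functional programming: it defines anonymous functions and treats functions as first-class values. It is background for Lisp-family languages such as Clojure rather than a topic the speakers address directly.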

💡babashka

babashka is a Clojure interpreter aimed at scripting and is used as an alternative to Unix shell scripts. The video touches on what it offers: it runs Clojure code as scripts, comes with scripting utilities, and can be used to automate tasks.

💡emacs

emacs is an advanced, extensible text editor used for programming and many other tasks. In the video, the speakers joke that Emacs was built so that Lisp programmers could color their parentheses.

Highlights

The discussion begins with a choice between Elixir and Clojure.

Confusion over how to spell 'Clojure' leads to a humorous moment.

One speaker attributes the 'j' in 'Clojure' to the language having started on the JVM, a myth he admits to inventing.

The conversation touches on the concept of pattern matching in programming.

A function named 'lex' is introduced, defined with several arities for different operations.

The importance of clean code is emphasized, with a self-critique on the current code's cleanliness.

The concept of recursive functions and their role in processing strings is discussed.

The use of 'cons' in creating new lists is explained.

The transcript includes a debate on the importance of correct spelling and the influence of popular usage.

The participants discuss the process of token creation and the function `token/create`.

The concept of a lazy sequence and its use in programming is introduced.

The transcript includes a humorous take on the history of Emacs and its creation for the purpose of coloring parentheses.

The discussion highlights the absence of a class or state in the presented code, emphasizing functional programming.

The participants delve into the intricacies of passing state to functions in a stateless programming paradigm.

The transcript includes a moment of realization about the nature of 'predicate' and its use in the code.

The concept of casting a lazy sequence to a string is discussed.

The conversation ends with a discussion about Babashka, an interpreter for Clojure that offers scripting utilities.

The practical applications of threading in Clojure are highlighted, contrasting it with the complexities of parallel processing in Bash.

Transcripts

00:00

We could do Elixir, but I also want to see Clojure. You want to see Clojure? Yeah, we can do Clojure. Let's go like this. Uh, clo... wait, how do you spell it? It'll just be clj. Someone was being very cute. That's very cutesy for someone who doesn't know Clojure; I would just not even know what to do here. So we'll start with Clojure.

00:20

Uh, here. Marker: Clojure start. All right, Flip, you gotta go back a little bit and get the beginning of Clojure in here.

00:29

Okay, it's with a J though, because it started on the JVM. Yeah, it's Clojure. Is that really why it has a J? That is the myth that I made up for myself about how to remember why to spell it that way.

00:47

[Laughter]

00:49

News: Clojure. Oh, it's because it started on the JVM. What?

00:54

Well, if we say it enough times, then that will be what people think. It doesn't matter. Like "jif" versus "gif", you know what I mean? It doesn't really matter what the creator says. Just because the creator says it wrong doesn't mean... yeah, it doesn't mean I have to be wrong about things.

01:09

All right, so let's do... okay, I cannot read this immediately.

01:14

Dude, are you just pumped up? Is this just Lisp? Is Clojure just Lisp? A Lisp, yeah.

01:22

Okay, so there's a function called lex, which has an input. If it has length zero... no, it's going to say call lex with a length 0 on your input or something, I think. I actually don't even know how to read this part.

01:37

Are these off? Oh, no no no. So defn: you can define multiple different ways to call it. So I think this is like when we did in, uh... right, chat, anybody? Exactly, I see what you're saying. So defn means that there's more than one. It's the operator... oh, nope, there's only one. Definitely, you're overloading. Yeah. Okay, nice.

02:02

So it's like when we did it in Erlang: you know how you could define a function with a different number of arguments, and then depending on if they kind of matched up, it would do this one? Right. So you call lex from the outside with just an input, so it says "I'm going to call myself starting at spot zero", position zero.

02:19

Okay, so would that be this? So it doesn't go like this? Yeah, so then it's gonna do the next one. So that means we have a function: this is function length one, this is function length two, because it actually has both input already in scope, and this one... and this one is function length six. Uh, four, I think. Right, where did you get six? Well, because it still has input and position, but it also has, apparently, pred...

02:43

No, there's different paths, I'm pretty sure. So it's like the first one is: if you call lex with one, it's gonna call the first thing; oh, maybe two, it's gonna do the second; if you call it with four, it's gonna do the last. That's actually like pattern matching. Okay, yeah, I'm too stupid for this, to tell you the truth.

03:00

This is... I mean, I'm like, it's pk, bit mask, s. I don't know what the ch, pk actually means, but I think it's like character or something. And then there's... oh, character at some position, maybe, and then there's the rest of the string.

03:23

So it's like unpacking the string. So you can't really reference... like, you unpack it at the beginning, right? It's peek. It's peek plus rest of string, based on the input. Maybe something like that. Oh, that's because you do zero position and then the input, and this unpacks it as... yep, yep.

03:47

Okay, cool. Damn. Okay, this is... I honestly feel emotionally, uh, difficult right now. Okay, so let's see: if it's empty, we got an end of file. Nice. If it is... so the first thing you do is you get, like, your oper... you do that, like, the line above there is the first thing that you do. It turns it into a string, I think.

04:05

Okay, by the way, shy Ryan is correct: this is not clean code. Not clean code. I need clean... I know, it looks pretty clean to me. Okay, because there's a zero right here. Where's the define? Oh my goodness, don't repeat yourself. You're gonna, what, write the character zero, I don't know, in other places? What does zero mean? Pathetic.

04:32

All right, so let's see. So it gets that. So, conditional: if we're empty, then we have the end of file, right? Yep, there's no more string left to do, so we make a list out of the things that we had, or whatever. Return a list. Yep.

04:46

Okay, so this must recursively descend back into input, or into lex, by doing... yeah, skips whitespace. That's cool. Next input, I wonder what that means. It must just move it forward one character in the input, I guess. Yeah, so we're sitting right here.

05:04

Okay. Two-op. So there must be something called two-op somewhere around here. Yeah, there's another file called token. Two-op, nice. Okay, so there you go.

05:17

So if it's within this list (I'll call it a list; I don't actually know what it is), if it matches this list or something, then we will apparently check out some cons. I'm not really sure what that means. That creates a new list. Cons, that's like the kind of thing that you would do in, uh... it's called cons when you create a new list. Okay, it must be short for "condescending". Yeah, it's for people who put the parentheses on the wrong side of the identifier, here to tell me how stupid I am.

05:50

Let's see. Yep, I can't actually understand any of this at this point; I'm completely lost. First part: they're making the list. The first thing they put in there is, they say, hey, let's create a new token. So that's... oh, create is in this file; the slash is the namespace. Oh, weird. Okay.

06:12

Then we're going to do a kind, which is an operator, and then we're going to do operator, position. That must be the type of token plus its current start. I'm not sure why that all gets passed to cons.

06:27

Yeah, I'm a little bit confused, but I understand this: a lazy sequence must be just a pointer to a string that starts at the current position or more. But then we do a lex; it's calling lex again, so it's calling itself, the function, again. Oh, this one needs to move forward two spots because this is the two-style operation, right? So this is like "less than or equal to", so it's saying, hey, we gotta add 2 to our current position and then do the rest of the string.
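The shape being described here (emit one token, then lazily lex the rest from an advanced position) can be sketched in Python, with a generator standing in for Clojure's `cons` onto a lazy sequence. The operator table, the token tuples, and the "illegal" fallback are assumptions for illustration; the reviewed code is not shown in the transcript.

```python
TWO_CHAR_OPS = {"<=", ">=", "==", "!="}  # assumed stand-in for the two-character operator table


def lex(text, pos=0):
    """Lazily yield (kind, literal, position) tuples; the position is passed explicitly."""
    if pos >= len(text):
        yield ("eof", None, pos)            # nothing left: end of file
        return
    if text[pos].isspace():
        yield from lex(text, pos + 1)       # skip whitespace by moving forward one character
        return
    two = text[pos:pos + 2]
    if two in TWO_CHAR_OPS:
        yield ("operator", two, pos)        # the token comes first...
        yield from lex(text, pos + 2)       # ...then the rest, two characters further on
        return
    yield ("illegal", text[pos], pos)       # placeholder for every other case
    yield from lex(text, pos + 1)


tokens = lex("<= ==")
print(next(tokens))   # only the first token has been computed at this point
print(list(tokens))   # the remainder is produced on demand
```

Each recursive step nests another generator, so this sketch is only suitable for short inputs; a loop would be the usual Python form for long ones.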

06:55

Yeah, but I really don't know what this means, other than... I don't know what's going on in that line, because I'm not sure what op and pos is for. Oh, I think it's because... look at the last one, that's when it gets called. Oh, maybe not, because I was thinking maybe it's matching the last case of pred, pos, input and kind. So is it passing those to there or not? No, it's not, because lex is only called right here, right? It's passing all of them to token/create. Yes! I didn't even see that. Yeah, it's passing into token/create.

07:31

You don't have rainbow parens on, so it's impossible to read this. Yeah, all of the Lispers are angry at you because you don't have rainbow parens. Rainbow parens. So were you unable to be a Lisper back in the '60s when it was created? Yeah, it was impossible; they were just waiting for editor plugins. Yeah, so I just didn't see that. So token/create, we could go look at that in the token file.

07:53

TJ, do you realize that means Emacs was created such that Lispers could invent a way to have rainbow parens? That is literally the goal of Emacs: they built a whole operating system just so they could color their parentheses. Wow, it makes so much more sense now. I have to say, this is perfectly reasonable now. Before, I didn't get it at all, but now I get it.

08:18

All right, so you want me to go back to token, right? Yeah. So where's create? There should just be a function in here somewhere called create. There we go.

08:28

So literal... so second is the literal. Okay, so op must be the literal, and this is the position within the string. Yep. Okay, weird that they have both the literal and its offset into the string, but whatever, it doesn't really matter. Or if it's a single... okay, it's in the book, it has the two, but yeah. Okay, single character, whatever. Pretty straightforward; I think we know what's going on here.

08:50

Notice, in this one, one thing that's interesting, that's a little different than the other ones we've seen so far, is that there's no class or state or anything at all. There's not even something that holds a lexer, right? You just have one function, and the function does the whole thing, which is kind of cool. It's like you have to explicitly pass the state to the function every time.

09:15

Yeah, it reminds me of what I used to do in C a bunch. I mean, you still see it a lot in a lot of C-style libraries, that you actually hold on to and create the state and pass it to all the underlying libraries. Like, this is what my websocket is doing, and then it goes off. wslay, I think, does the exact same thing. It gives me that feeling all over again. I kind of get this.

09:38

Hmm. Operator, operand, operand. Well, sing me an operand. Okay, so "is letter"... so we do this. This must be lex. So here we go, this is the thing, this is this one right here. Yeah. So that must be the predicate: predicate, position, input, kind.

09:59

And so it's getting the car. Nice. That must be what's happening. Okay, so inside of those let statements, those get executed in order, and it's like, sequentially, you're adding each of these to the current scope. So you split the predicate with the input... split-with, using the predicate on the input. Right, now you have this identifier, which is the characters; you're gonna move forward this number of positions; and then you get a kind.

10:29

There's an equal sign in between each of those, right? Yeah, that's the idea. Okay, so this is casting this thing to a string, because it must be some sort of lazy sequence or something like that. So it's casting it to a string; it's creating how many positions it actually moved, which is the count of characters inside of this; and then the kind, which is a token identifier if kind isn't passed in.

10:52

And then it creates a new identifier, and that says, okay, go do the rest now. Right, because it's always calling lex at the end. Okay, because you pass it with nil. So for letter it's nil, so it does not have a kind; it just does identifier right here. Right.
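The sequence of bindings the speakers read out (split the input with a predicate, turn the matched characters into a string, count them, and default the kind) can be sketched in Python. Here `itertools.takewhile` stands in for the matching half of Clojure's `split-with`; the function name `read_run` and the tuple layout are hypothetical.

```python
from itertools import takewhile


def read_run(pred, pos, text, kind=None):
    """Read the run of characters at pos that satisfy pred; return (token, next_pos)."""
    chars = takewhile(pred, text[pos:])   # lazy iterator over the matching prefix
    literal = "".join(chars)              # "cast" the lazy sequence to a string
    advance = len(literal)                # how many positions to move forward
    kind = kind or "identifier"           # no kind passed in: treat it as an identifier
    return (kind, literal, pos), pos + advance


source = "let five = 5"
print(read_run(str.isalpha, 4, source))         # reads the name starting at offset 4
print(read_run(str.isdigit, 11, source, "int")) # reads the number with an explicit kind
```

The returned `next_pos` is what the caller would hand to the next `lex` call, which is how the state keeps flowing through arguments instead of living in an object.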

11:09

Wow, okay, I'm with it. That was difficult, I'm not gonna lie to you; that was kind of difficult. I'm glad I had you here, because I don't think we could have worked our way through this without you.

11:20

Well, I did write some Clojure on stream recently, and I didn't run it on the JVM, so boom. Okay, what does that mean? lispyclouds, in the chat right now, showed me. There's this cool, uh, you could call it kind of like a runtime. It's called babashka. I can't remember, because I always think "babushka", but it's babashka, I'm pretty sure.

11:49

It's basically like an interpreter for Clojure that you can use, and it comes with a bunch of utilities for scripting. So then, instead of running Bash, you could write Clojure, and you get all of these different things. It was really cool. Okay, that sounds... It was cool.

12:09

You could do threading and stuff, which was nice. Like, sometimes you want to kick off eight jobs and wait for them to finish, but it's kind of a nightmare in Bash, so... it's not in Clojure. Shout out. It's not a nightmare in Bash; do you not know about parallel?

12:22

I do, but what if you want to run different kinds of things? It's complicated. And then they need to go do other things. Okay, reasonable.
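The "kick off several different jobs and wait for them all" pattern mentioned here can be sketched with Python's standard thread pool. This is a Python analogue only, not babashka or Clojure code; `run_all` and the example tasks are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks):
    """Start every (function, *args) task on its own thread and wait for all results."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, *args) for fn, *args in tasks]
        # result() blocks until that job is done; results keep the submission order.
        return [future.result() for future in futures]


# Different kinds of jobs, each with its own arguments.
jobs = [(pow, 2, 10), (len, "babashka"), (sorted, [3, 1, 2])]
print(run_all(jobs))
```

Because each task is just a callable plus arguments, mixing unrelated kinds of work takes no extra machinery, which is the case the speakers describe as awkward in Bash.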

12:32

It's always not that bad in Bash until it is. That's true. Bash is kind of like a digital, what do they call that, oscilloscope, where it's like it's not bad or it's bad; there's just not really an in-between stage. It's a step function. It's a step function: you've written not enough Bash or you've written too much Bash, and there is nothing in between. Yep.

Related Tags
Clojure, code walkthrough, scripting, interpreter, babashka, parallel processing, Bash, programming, Lisp style, functional language