Phases of Compiler | Compiler Design

THE GATEHUB
1 Jan 2024 21:46

Summary

TLDR: This video script offers an in-depth exploration of the compiler's six phases: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code Generation, Code Optimization, and Target Code Generation. It explains the transformation of high-level language into assembly language, emphasizing the role of the symbol table and error handling. The script uses a simple mathematical expression to illustrate each phase's function, highlighting the importance of tokenization, parse tree construction, semantic verification, and optimization, culminating in the generation of efficient assembly code.

Takeaways

  • 📚 A compiler is a program that consists of six distinct phases, each responsible for a specific task in the process of translating high-level language code into assembly language.
  • 🔍 The Lexical Analyzer is the first phase, which breaks down the input high-level language into a stream of tokens, performing tokenization.
  • 🌐 The Syntax Analyzer, also known as the Parser, constructs a parse tree from the stream of tokens, ensuring the code adheres to the language's grammar rules.
  • 🔑 The Semantic Analyzer checks the parse tree for semantic correctness, including type checking, undeclared variables, and multiple declarations, using a symbol table to store variable information.
  • 🛠️ Intermediate Code Generation takes the semantically verified parse tree and converts it into an intermediate representation, often in the form of three-address code.
  • ⚙️ Code Optimization is an optional phase that refines the three-address code to reduce program size and improve efficiency, for example by removing redundant instructions and temporary variables.
  • 🎯 Target Code Generation is the final phase, translating the optimized intermediate code into assembly language, which is closer to machine code.
  • 🔍 The Symbol Table is a data structure used across all compiler phases to store and manage information about variables, such as their names and data types.
  • 👮‍♂️ The Error Handler is responsible for reporting errors detected during any phase of the compilation process back to the user.
  • 🔄 The compiler phases are not executed strictly one after another: they are separate functions that can be called whenever they are needed, so their work can be interleaved depending on the implementation.
  • 📈 The script provides an example of compiling a simple line of code, demonstrating the role of each phase and the transformation from high-level language to assembly language.

Q & A

  • What are the six phases of a compiler?

    -The six phases of a compiler are Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Intermediate Code Generation, Code Optimization, and Target Code Generation.

  • What is the primary input for the compiler?

    -The primary input for the compiler is pure high-level language, which it then converts into assembly language.

  • What is the role of the Lexical Analyzer in the compilation process?

    -The Lexical Analyzer takes the input stream of characters from the high-level language and converts it into a stream of tokens, a process known as tokenization.

  • What is a token in the context of a Lexical Analyzer?

    -A token is a sequence of characters that represents the smallest unit of the language, such as identifiers, keywords, operators, and constants.

  • What does the Syntax Analyzer do with the stream of tokens?

    -The Syntax Analyzer constructs a parse tree from the stream of tokens, following a set of rules known as grammar to ensure the input is syntactically correct.

  • What is the purpose of the Semantic Analyzer?

    -The Semantic Analyzer verifies the parse tree semantically, checking for type correctness, undeclared variables, and multiple declarations, ensuring the program makes sense logically.

  • How does the Intermediate Code Generation phase represent the program?

    -The Intermediate Code Generation phase represents the program in a form known as three-address code, which is a popular format for intermediate code.

  • What is the function of the Code Optimization phase?

    -The Code Optimization phase aims to optimize the three-address code, improving the efficiency of the program by reducing the number of lines or the size of the code without changing its functionality.

  • What does the Target Code Generation phase produce?

    -The Target Code Generation phase produces assembly code, which is a lower-level representation of the program that is closer to machine code.

  • What is the role of the Symbol Table in the compilation process?

    -The Symbol Table is a data structure used to store information about the variables, their types, and other identifiers used in the program. It is utilized by various phases of the compiler, especially the Semantic Analyzer and the Lexical Analyzer, to keep track of declared elements.

  • What is the significance of the Error Handler in the compiler?

    -The Error Handler is responsible for reporting errors detected during the compilation process to the user. It is used by all phases of the compiler to ensure that any issues are communicated effectively.

Outlines

00:00

🤖 Compiler Phases Overview

This paragraph introduces the six main phases of a compiler: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code Generation, Code Optimization, and Target Code Generation. It explains the input to the compiler, which is a high-level language, and the output, which is assembly language. The paragraph also touches on the concept of a compiler as a program consisting of functions, each representing a phase, and the importance of the symbol table for storing variables and values across all phases. It sets the stage for a detailed discussion of each phase with an example.

05:02

🔍 Lexical Analysis and Tokenization

The second paragraph delves into the first phase of the compiler, Lexical Analysis, focusing on its role in converting a stream of characters into a stream of tokens. It describes the process of tokenization, where identifiers, keywords, operators, and constants are identified and given token names and values. The paragraph also explains the use of the symbol table to store information about identifiers and their types, which is crucial for later phases of the compilation process.

10:04

🌐 Syntax Analysis and Parse Tree Construction

This paragraph discusses the Syntax Analysis phase, also known as parsing, where the compiler constructs a parse tree from the stream of tokens. It highlights the importance of grammar rules in validating the syntax of the code. The process of checking the syntax correctness of the input code is explained, along with the construction of the parse tree, which is a hierarchical representation of the code's structure.

15:06

📑 Semantic Analysis and Type Checking

The fourth paragraph examines the Semantic Analysis phase, emphasizing its role in verifying the semantic correctness of the parse tree. It discusses type checking, undeclared variable detection, and multiple declaration checks as key functions of the semantic analyzer. The paragraph also clarifies the use of the symbol table in storing and retrieving data type information for variables, which is essential for semantic verification.

20:11

🔄 Intermediate Code Generation and Optimization

The focus of this paragraph is on the generation of intermediate code, specifically three-address code, and the subsequent optimization phase. It explains how the semantically verified parse tree is transformed into an intermediate representation that can be further optimized to reduce program size or improve efficiency. The paragraph also mentions that code optimization is an optional phase, as programmers can also write optimized code directly.

🛠️ Target Code Generation and Assembly Language

The final paragraph in the script describes the process of generating target code, which is the assembly language code. It illustrates how the optimized three-address code is translated into assembly code using registers and operations that the computer's hardware can understand. The paragraph concludes the discussion on compiler phases by summarizing the roles of the lexical analyzer, syntax analyzer, semantic analyzer, intermediate code generation, code optimization, and target code generation, and their interplay with the symbol table.

Keywords

💡Compiler

A compiler is a special program that translates code written in a high-level programming language into assembly language or machine code. In the video, the compiler is the central theme, as it is discussed in terms of its various phases and functions, which transform high-level language into a form that a computer can execute.

💡Phases of Compiler

The phases of a compiler refer to the distinct stages the compiler goes through to process code. There are six main phases: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code Generation, Code Optimization, and Target Code Generation. Each phase plays a critical role in the compilation process, as outlined in the script.

💡Lexical Analyzer

The lexical analyzer, also known as the scanner, is the first phase of the compiler. It takes the input code as a stream of characters and converts it into a stream of tokens. Tokens are the smallest units of the language, such as keywords, identifiers, operators, and constants. The script explains the process of tokenization and its importance in the compilation process.
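
The tokenization step described above can be sketched in Python. This is a minimal illustration rather than the video's implementation: the token patterns, the `tokenize` function, and the `(name, attribute)` tuple layout are assumptions made for the example. Identifiers are numbered in order of first appearance, mirroring the ID, 1 / ID, 2 / ID, 3 tokens used in the video.

```python
import re

# Hypothetical token patterns for a tiny expression language (not from the video).
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[=+*]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a stream of characters into a list of (token_name, attribute) pairs."""
    identifier_numbers = {}  # lexeme -> identifier number (1, 2, 3, ...)
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # whitespace produces no token
        if kind == "ID":
            # Identifiers get a number so later phases can tell them apart.
            number = identifier_numbers.setdefault(lexeme, len(identifier_numbers) + 1)
            tokens.append(("id", number))
        elif kind == "INT":
            tokens.append(("int", int(lexeme)))
        else:
            tokens.append((lexeme, None))  # operators carry no attribute value
    return tokens, identifier_numbers
```

For the input `x = y + z * 60`, this sketch yields identifier tokens numbered 1, 2 and 3, operator tokens with no attribute, and an integer constant token for 60.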

💡Syntax Analyzer

The syntax analyzer, often referred to as the parser, constructs a parse tree from the stream of tokens generated by the lexical analyzer. It checks the grammatical structure of the code against a set of predefined rules, known as a grammar. The script emphasizes the critical role of the syntax analyzer in ensuring the syntactic correctness of the code.
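
As a rough sketch of this phase, the following Python function checks a token list against the small grammar used in the video (S derives id = E, E derives E + T or T, T derives T * F or F, F derives id or int). The function name and the nested-tuple tree are assumptions for illustration. The left-recursive rules are handled with loops, and the result is a compact syntax tree rather than a full parse tree with every nonterminal.

```python
def parse_statement(tokens):
    """Parse (kind, attribute) tokens for S -> id = E and return a nested-tuple tree.

    Grammar, with left recursion replaced by repetition:
        S -> id = E
        E -> T { + T }
        T -> F { * F }
        F -> id | int
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def expect(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind!r} but found {peek()!r} at token {pos}")
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        if peek() == "id":
            return expect("id")
        return expect("int")

    def term():
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = expect("id")
    expect("=")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} at token {pos}")
    return tree
```

If the tree can be built, the input is syntactically correct; otherwise the sketch raises `SyntaxError`, which stands in for reporting to the error handler. Multiplication ends up deeper in the tree than addition, so it is evaluated first.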

💡Semantic Analyzer

The semantic analyzer is responsible for verifying the semantic meaning of the code, beyond its syntactic structure. It checks for type consistency, undeclared variables, and multiple declarations, among other things. The script provides examples of how the semantic analyzer uses the symbol table to ensure the semantic correctness of the code.
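
A minimal sketch of type checking and undeclared-variable detection is shown below. It assumes a language with only `int` and `float` types, a plain dictionary as the symbol table, and nested tuples for the tree; the names `check_types` and `to_float` are hypothetical. An integer literal next to a float operand is converted to a float constant, as the video does with 60 and 60.0.

```python
def to_float(node):
    """Convert an int-typed node to float: fold literals, wrap anything else in a cast."""
    if node[0] == "int":
        return ("float", float(node[1]))
    return ("inttofloat", node)


def check_types(node, symbol_table):
    """Return (typed_tree, type_name) for an expression tree.

    Leaves are ("id", name), ("int", value) or ("float", value);
    inner nodes are (operator, left, right).
    """
    kind = node[0]
    if kind == "id":
        name = node[1]
        if name not in symbol_table:
            raise NameError(f"undeclared variable {name!r}")
        return node, symbol_table[name]
    if kind in ("int", "float"):
        return node, kind
    left, left_type = check_types(node[1], symbol_table)
    right, right_type = check_types(node[2], symbol_table)
    if left_type != right_type:
        # Mixed int/float operands: widen the int side so both sides are float.
        if left_type == "int":
            left = to_float(left)
        else:
            right = to_float(right)
        return (kind, left, right), "float"
    return (kind, left, right), left_type
```

With `x`, `y` and `z` declared as float, checking `y + z * 60` returns a tree in which the constant has become 60.0 and the overall type is float. A name that is missing from the symbol table raises `NameError` in this sketch.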

💡Intermediate Code Generation

Intermediate code generation involves converting the semantically verified parse tree into an intermediate representation of the code, often in the form of three-address code. This intermediate code is a simplified version of the original code, making it easier to optimize and translate into target code. The script explains the use of three-address code as a popular format for this phase.
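
To make the idea concrete, here is a small sketch that flattens a semantically verified tree into three-address code. The tuple-based tree, the `generate_tac` name, and the `t1`, `t2` temporary naming are assumptions for the example. Each generated instruction has at most two operands and one result.

```python
def generate_tac(tree):
    """Flatten an assignment tree into a list of three-address instructions.

    The tree has the shape ("=", target, expression); leaves are ("id", number)
    or ("float", value); inner nodes are (operator, left, right).
    """
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "float":
            return str(node[1])
        # Operator node: evaluate both sides, then store the result in a new temporary.
        left = emit(node[1])
        right = emit(node[2])
        temp = new_temp()
        code.append(f"{temp} = {left} {kind} {right}")
        return temp

    _, target, expression = tree
    result = emit(expression)
    code.append(f"{emit(target)} = {result}")
    return code
```

For the video's statement, after 60 has become 60.0, the sketch produces `t1 = id3 * 60.0`, `t2 = id2 + t1`, and `id1 = t2`.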

💡Code Optimization

Code optimization is the phase where the intermediate code is analyzed and modified to improve the efficiency of the program. This can involve reducing the number of instructions, minimizing memory usage, or enhancing execution speed. The script notes that code optimization is an optional phase, as programmers can also write optimized code directly.
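
One simple optimization can be sketched as follows: when a temporary is computed and then immediately copied into a variable, the copy is folded away. The instruction layout `(dest, arg1, op, arg2)` and the convention that temporaries start with `t` are assumptions for this example; real optimizers apply many more transformations.

```python
def remove_redundant_copies(code):
    """Fold `t = a op b` followed by `x = t` into `x = a op b`.

    Instructions are (dest, arg1, op, arg2); a plain copy has op and arg2 set to None.
    Assumes temporaries are the only names starting with "t" and each is used once.
    """
    optimized = []
    for dest, arg1, op, arg2 in code:
        previous = optimized[-1] if optimized else None
        is_copy = op is None
        if is_copy and previous is not None and previous[0] == arg1 and arg1.startswith("t"):
            # Write the previous result straight into the copy's destination.
            optimized[-1] = (dest, previous[1], previous[2], previous[3])
        else:
            optimized.append((dest, arg1, op, arg2))
    return optimized
```

Applied to `t1 = id3 * 60.0`, `t2 = id2 + t1`, `id1 = t2`, the sketch returns two instructions: `t1 = id3 * 60.0` and `id1 = id2 + t1`. The program computes the same value with one fewer line and one fewer temporary.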

💡Target Code Generation

The final phase of the compiler is target code generation, where the optimized intermediate code is translated into assembly language or machine code specific to the target platform. The script illustrates this with an example of how high-level operations are translated into assembly instructions involving registers.
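
The translation to register-based instructions can be sketched as below. The mnemonics `LDF`, `MULF`, `ADDF` and `STF`, the `R1`, `R2` register names, and the `#` prefix for constants describe a hypothetical machine chosen for illustration, not a real instruction set. The sketch also assumes an unlimited supply of registers and that every instruction is a binary operation.

```python
def generate_assembly(code):
    """Translate (dest, arg1, op, arg2) instructions into assembly-like text.

    Temporaries (names starting with "t") stay in registers; other names are
    memory operands that are loaded before use and stored after assignment.
    """
    opcodes = {"+": "ADDF", "*": "MULF"}
    in_register = {}  # temporary name -> register holding its value
    lines = []
    next_register = 1

    def operand_for(name, allow_immediate):
        nonlocal next_register
        if name in in_register:
            return in_register[name]
        is_constant = name[0].isdigit()
        if is_constant and allow_immediate:
            return f"#{name}"
        register = f"R{next_register}"
        next_register += 1
        source = f"#{name}" if is_constant else name
        lines.append(f"LDF {register}, {source}")
        return register

    for dest, arg1, op, arg2 in code:
        left = operand_for(arg1, allow_immediate=False)
        right = operand_for(arg2, allow_immediate=True)
        lines.append(f"{opcodes[op]} {left}, {left}, {right}")
        if dest.startswith("t"):
            in_register[dest] = left  # keep the temporary in its register
        else:
            lines.append(f"STF {dest}, {left}")
    return lines
```

For the optimized code `t1 = id3 * 60.0` and `id1 = id2 + t1`, the sketch emits a load of `id3`, a multiply by `#60.0`, a load of `id2`, an add, and a final store into `id1`.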

💡Symbol Table

A symbol table is a data structure used throughout the compilation process to store information about identifiers, such as variable names, their types, and scopes. The script explains that the symbol table is accessed by various phases of the compiler, particularly the semantic analyzer and the lexical analyzer, to maintain consistency and check for errors.
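
A symbol table with the three columns shown in the video (serial number, variable name, type) might look like the sketch below. The class and method names are hypothetical, and scopes are left out to keep the example short.

```python
class SymbolTable:
    """Maps each identifier to a serial number and, once declared, a type."""

    def __init__(self):
        self._entries = {}  # name -> {"number": int, "type": str or None}

    def intern(self, name):
        """Return the serial number for name, adding an entry on first sight."""
        if name not in self._entries:
            self._entries[name] = {"number": len(self._entries) + 1, "type": None}
        return self._entries[name]["number"]

    def declare(self, name, type_name):
        """Record the declared type, rejecting a second declaration."""
        self.intern(name)
        entry = self._entries[name]
        if entry["type"] is not None:
            raise NameError(f"multiple declaration of {name!r}")
        entry["type"] = type_name

    def type_of(self, name):
        """Return the declared type, or fail if the name was never declared."""
        entry = self._entries.get(name)
        if entry is None or entry["type"] is None:
            raise NameError(f"undeclared variable {name!r}")
        return entry["type"]
```

In this design the lexical analyzer would call `intern` to number identifiers, while `declare` and `type_of` serve the semantic analyzer's checks for multiple declarations and undeclared variables.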

💡Error Handling

Error handling in the context of a compiler refers to the process of detecting and reporting errors that occur during the compilation process. The script mentions that each phase of the compiler, including lexical analysis and syntax analysis, has the capability to identify errors and report them to the user.
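
One way to picture a shared error handler is a small collector that every phase reports to. The `ErrorHandler` class and its methods are assumptions for illustration only.

```python
class ErrorHandler:
    """Collects errors reported by any phase and formats them for the user."""

    def __init__(self):
        self.errors = []

    def report(self, phase, message):
        """Record an error together with the phase that detected it."""
        self.errors.append((phase, message))

    def has_errors(self):
        return bool(self.errors)

    def summary(self):
        """Return one human-readable line per reported error, in order."""
        return [f"{phase} error: {message}" for phase, message in self.errors]
```

A phase would call `report("semantic", "undeclared variable 'w'")`, and the driver would print `summary()` once compilation stops.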

💡Type Checking

Type checking is a critical function of the semantic analyzer, where it verifies that operations are performed on compatible data types. The script provides an example of type checking by explaining the need to convert an integer to a float before performing multiplication, a process known as type casting.

Highlights

The total phases of a compiler are six, including Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Intermediate Code Generation, Code Optimization, and Target Code Generation.

A compiler takes pure high-level language as input and converts it to assembly language.

The Lexical Analyzer performs tokenization, converting a stream of characters into a stream of tokens.

Syntax Analyzer constructs a parse tree using a set of grammar rules to check for syntactic correctness.

Semantic Analyzer verifies the parse tree semantically, checking for type correctness, undeclared variables, and multiple declarations.

Intermediate Code Generation produces a three-address code representation of the program.

Code Optimization is an optional phase that aims to reduce the size of the program and the number of lines.

Target Code Generation converts the optimized three-address code into assembly code.

Symbol Table is a data structure used across all phases of the compiler for storing variables and their attributes.

Error handling is an integral part of the compiler, reporting errors found during various phases to the user.

The compiler's phases need not run strictly in sequence; they are separate functions that can be called whenever they are needed.

Type casting is a process where the data type of a value is converted to match the required type for operations.

The Symbol Table stores variable names, types, and scopes for reference during compilation.

The Lexical Analyzer emits tokens for identifiers, operators, and constants; operators and constants need no Symbol Table entry, and an identifier's type is recorded only when its declaration is processed.

Semantic Analyzer uses the Symbol Table to check for type mismatches, undeclared variables, and multiple declarations.

Three-address code is a popular format for intermediate code; each instruction uses at most three addresses, typically two operands and one result.

Code optimization simplifies the code by reducing the number of temporary variables and operations.

Assembly language generation involves mapping high-level constructs to machine-specific instructions.

The compiler's error handling mechanism ensures that any errors in the code are identified and reported to the user.

Transcripts

play00:00

In this video, we will discuss the phases of a compiler.

play00:04

So how many phases does a compiler have?

play00:06

There are six phases.

play00:07

Which are?

play00:08

Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Intermediate Code Generation, Code Optimization and Target Code Generation.

play00:16

So what does this mean?

play00:18

This is the overall view of the compiler.

play00:20

So what is the input for the compiler?

play00:22

Pure high-level language.

play00:24

And the compiler converts it to assembly language.

play00:26

We have already discussed this.

play00:28

So basically what is a compiler?

play00:30

It is a program.

play00:32

So what is a program?

play00:34

It is some set of functions in a way.

play00:36

So what is in it?

play00:38

Compiler is a program which has six functions.

play00:40

And all these six functions mean six phases.

play00:42

When it is needed, we will call that function.

play00:44

So now let's see what each one is, one by one.

play00:46

After that, let's go into the details.

play00:48

So for this, the first phase is lexical analyzer.

play00:52

So for this, the input is pure high-level language.

play00:54

Which we call stream of characters.

play00:56

Lexical analyzer has three phases.

play00:58

So we will go to this character.

play01:00

And it will convert it to stream of tokens.

play01:02

So what is the work of lexical analyzer?

play01:04

It will convert whatever is your program into tokens.

play01:06

That's why it is called tokenization.

play01:08

Anyway, all these things are examples.

play01:10

We will understand all the things.

play01:12

For now we will go over it briefly.

play01:14

What is the work of each phase?

play01:16

Next, the stream of tokens is the input for what?

play01:18

For the syntax analyzer.

play01:20

What will syntax analyzer do?

play01:22

It will construct the parse tree from it.

play01:24

There are some set of rules in it.

play01:26

It is called grammar. What happens through grammar?

play01:28

The parse tree is generated.

play01:30

It goes to semantic analyzer.

play01:32

What does semantic analyzer do?

play01:34

Its output is also a parse tree.

play01:36

But it is called semantically verified parse tree.

play01:38

What is it? We will understand it now.

play01:40

After that, what is next?

play01:42

Intermediate code generation.

play01:44

So this parse tree as an input.

play01:46

Semantically verified parse tree.

play01:48

It is an input for intermediate code generation.

play01:50

And to represent intermediate code generation,

play01:52

we have a lot of formats.

play01:54

But the most popular one is your three-address code.

play01:56

So whatever program we write,

play01:58

and at the end when we write it in intermediate code,

play02:00

it will be three-address code.

play02:02

Okay.

play02:04

After that, this goes to code optimization.

play02:06

What code optimization will do?

play02:08

It will optimize this three-address code.

play02:10

It will convert it into optimized three-address code.

play02:12

Where does this optimized three-address code go?

play02:14

It will go to target code generation.

play02:16

And what will target code generation do?

play02:18

It will convert it into assembly code.

play02:20

So this is the work of all the phases.

play02:22

Now we will understand it as an example.

play02:24

Before that, let's understand that all these phases,

play02:26

when all these phases are processing,

play02:28

that is, when they are running,

play02:30

then it is possible that there is something

play02:32

that we have to store.

play02:34

Some values, some variables have to be stored.

play02:36

What is the type of that variable?

play02:38

To store all these things,

play02:40

we need some data structure.

play02:42

So for that, the data structure we use here is

play02:44

your symbol table.

play02:46

Who can use this?

play02:48

All the phases can use it.

play02:50

When someone needs to store something,

play02:52

where will he go and store it?

play02:54

In the symbol table.

play02:56

So we have to complete the diagram.

play02:58

So it is possible that at some point there will be an error.

play03:00

So that error can come in your lexical analyzer,

play03:02

syntax analyzer, and so on.

play03:04

There can be some error in all the phases.

play03:06

So whom do they report to?

play03:08

The error handler.

play03:10

What does the error handler do?

play03:12

It reports to the user.

play03:14

So all the phases use the error handler

play03:16

and all the phases also use the symbol table.

play03:18

So this is your diagram.

play03:20

Draw it.

play03:22

Which is called phases of compiler.

play03:24

Now we will understand the working of each phase by taking an example.

play03:26

So let's see how all the phases of the compiler work.

play03:30

So there are six phases.

play03:32

Now let's see it one by one.

play03:34

So I have taken a simple program of one line.

play03:36

X equal to Y plus Z multiplied by 60.

play03:38

Where X Y Z data type is float.

play03:40

Okay.

play03:42

So this is your pure high level language.

play03:44

Because before compilation,

play03:46

what is the output of pre-processing?

play03:48

Pure high level language.

play03:50

This is also called a stream of characters.

play03:52

So for lexical analyzer,

play03:54

what is its output?

play03:56

Tokens.

play03:58

What is the work of lexical analyzer?

play04:00

To convert the program into tokens.

play04:02

So what are tokens?

play04:04

Token can be your identifier.

play04:06

It can be a keyword.

play04:08

It can be an operator.

play04:10

It can be a constant.

play04:12

These are all your tokens.

play04:14

Anyway, lexical analyzer is also a separate chapter.

play04:16

We will discuss that.

play04:18

But here as of now we just understand

play04:20

how the lexical analyzer converts a program into tokens.

play04:22

So let's start the program.

play04:24

As we got X.

play04:26

So what is X?

play04:28

X is an identifier.

play04:30

So X is an identifier.

play04:32

And what will it do correspondingly?

play04:34

It will generate a token.

play04:36

What is a token?

play04:38

Its token name and its value.

play04:40

That means attribute value.

play04:42

ID comma 1.

play04:44

What is this exactly?

play04:46

This is token name and this is its value.

play04:48

Okay.

play04:50

So this means it is token number 1.

play04:52

Then what will come next?

play04:54

Y.

play04:56

Y is also your identifier.

play04:58

So ID comma 2, the second identifier.

play05:00

Okay.

play05:02

What is addition operator?

play05:04

Then what is Z?

play05:06

Your identifier is ID comma 3.

play05:08

So that we can distinguish the three identifiers.

play05:10

Okay.

play05:12

The first number assigned is for X.

play05:14

The second number assigned is for Y.

play05:16

And the third number assigned is for Z.

play05:18

Okay.

play05:20

And all these identifiers found by the lexical analyzer go into the

play05:22

symbol table.

play05:24

First it is telling that 1,2,3.

play05:26

That number.

play05:28

Identifier number 1.

play05:30

Identifier number 2.

play05:32

Identifier number 3.

play05:34

It holds the serial number, variable name and type.

play05:36

Type of X is float.

play05:38

Type of Y is also float.

play05:40

Type of Z is also float.

play05:42

So we will come to symbol table later.

play05:44

How symbol table is created?

play05:46

And who stores data in symbol table?

play05:48

Everyone can store data in symbol table.

play05:50

All the phases of the compiler use it.

play05:52

How exactly it is used?

play05:54

We will come to that later.

play05:56

Just understand that what is X?

play05:58

Identifier number 1.

play06:00

Y is identifier number 2.

play06:02

Z is identifier number 3.

play06:04

Then there is the multiplication operator.

play06:06

And 60 is an integer constant.

play06:08

Now here the question can be that

play06:10

For X identifier,

play06:12

You have written the identifier number.

play06:14

That means its value.

play06:16

So for equal to, plus, multiplication and 60,

play06:18

why is there no entry in the symbol table?

play06:20

This can be a doubt.

play06:22

It is written for identifier.

play06:24

We have to identify X.

play06:26

We have to identify Y.

play06:28

We have to identify Z.

play06:30

We have to identify numbers.

play06:32

For equal to, the other operators, or the constant,

play06:34

why is no number generated?

play06:36

Because when you install a C compiler on your system,

play06:38

by default in the C compiler

play06:40

there are already many operators and keywords defined.

play06:42

So it already knows what they are.

play06:44

The C compiler knows

play06:46

what equal to, plus and multiplication are.

play06:48

But identifier is used.

play06:50

Identifier means

play06:52

It can be your variable name.

play06:54

It can be your function name.

play06:56

It can be your array name.

play06:58

These the compiler does not know.

play07:00

Rules know that

play07:02

By using this rule we declare variable.

play07:04

We write the name of the variable.

play07:06

So that's why we did not store it in symbol table.

play07:08

But if there is any variable.

play07:10

If there is any function name.

play07:12

We have to store it in symbol table.

play07:14

So that when we go to use it later.

play07:16

We will know what X, Y, Z actually are.

play07:18

Because what happens.

play07:20

The compiler does not know

play07:22

what X, Y, Z are.

play07:24

That's why we are storing it in symbol table.

play07:26

That's why we are not giving any value to all these.

play07:28

So this is tokenization.

play07:30

This process is called tokenization.

play07:32

And this X, equal to, Y, plus, Z,

play07:34

each piece as it is written in the source,

play07:36

is called a lexeme.

play07:38

And if we associate something with it.

play07:40

What is this?

play07:42

A token.

play07:44

So now how will we write this?

play07:46

ID comma 1.

play07:48

What is ID comma 1? Identifier 1. What is this? X.

play07:50

Which is equal to Y plus Z.

play07:52

Multiplied by 60.

play07:54

So the program.

play07:56

The input of lexical analyzer.

play07:58

X equal to Y plus Z.

play08:00

Into 60.

play08:02

And the output of this is.

play08:04

This is the output.

play08:06

Now where will this go?

play08:08

This will be the input to the syntax analyzer.

play08:10

Which is the second phase of the compiler.

play08:12

Syntax analyzer.

play08:14

Which we also call parser.

play08:16

Parser is a separate chapter.

play08:18

The most important chapter of compiler design is the syntax analyzer.

play08:20

Which is parser.

play08:22

So the output of this is.

play08:24

A parse tree.

play08:26

So what is a parse tree?

play08:28

How is it constructed?

play08:30

What are its rules and regulations?

play08:32

You will see all these things later.

play08:34

First we have to understand.

play08:36

How is a parse tree constructed?

play08:38

I have already drawn this.

play08:40

What is the output of syntax analyzer?

play08:42

A parse tree.

play08:44

What is the input?

play08:46

Token.

play08:48

What is set of rules?

play08:50

As soon as you install compiler in your system.

play08:52

What is the meaning of compiler?

play08:54

Set of rules.

play08:56

Through which we will check if there is any error or not.

play08:58

So let's suppose we have taken.

play09:00

Grammar.

play09:02

So there will be a lot of grammar rules.

play09:04

For this particular statement.

play09:06

What is it?

play09:08

S derives id = E.

play09:10

E derives E + T | T.

play09:12

T derives T * F | F.

play09:14

F derives id

play09:16

or int.

play09:18

So what will we do through this grammar?

play09:20

We will take it.

play09:22

And through this grammar.

play09:24

We will generate its parse tree.

play09:26

If this parse tree is constructed.

play09:28

It means it is syntactically correct.

play09:30

So see how it is done.

play09:32

Id which is equal to e.

play09:34

E plus t.

play09:36

So what will we do?

play09:38

How will we check if it is correct or not?

play09:40

Means we have got this input.

play09:42

When we will give this to syntax analyzer.

play09:44

So how will we check if your parse tree is correct or not?

play09:46

So what will we do?

play09:48

We will check if it is equal or not.

play09:50

So see.

play09:52

Top to down.

play09:54

What is the first leaf node?

play09:56

Id.

play09:58

So what is the meaning of id?

play10:00

We are talking about id1.

play10:02

Ok.

play10:04

What is id1?

play10:06

It is x.

play10:08

Which is equal.

play10:10

So now we will go top to down.

play10:12

Top to down.

play10:14

Left to right.

play10:16

So it is plus.

play10:18

Ok fine.

play10:20

Then we will come here.

play10:22

Where did we get next?

play10:24

What is id3?

play10:26

It is z.

play10:28

What is id3?

play10:30

It is z.

play10:32

What is multiplication?

play10:34

It is 60.

play10:36

Means according to this rule.

play10:38

The input we have given.

play10:40

Set of tokens.

play10:42

Or stream of tokens.

play10:44

Means your program is syntactically correct.

play10:46

Ok.

play10:48

We just have to construct.

play10:50

If it is constructed.

play10:52

That means it is syntactically verified.

play10:54

Next.

play10:56

The most important phase in this compiler is.

play10:58

Semantic analyzer.

play11:00

We have understood this very well.

play11:02

What is the work of semantic analyzer?

play11:04

Input is a parse tree.

play11:06

Output is a parse tree.

play11:08

But what is added?

play11:10

Semantically verified parse tree.

play11:12

This is very important.

play11:14

In symbol table.

play11:16

In symbol table.

play11:18

These variables.

play11:20

X,Y,Z.

play11:22

Data types are stored.

play11:24

Who stores them?

play11:26

Semantic analyzer.

play11:28

Lexical analyzer does not store it.

play11:30

Why?

play11:32

Because now what will it do?

play11:34

It will verify semantically.

play11:36

So what is the meaning of verifying semantically?

play11:38

It is the same parse tree.

play11:40

It is the same parse tree.

play11:42

60.0

play11:44

What was there?

play11:46

60. That means it was integer value.

play11:48

Now what will it do?

play11:50

What happens is.

play11:52

What will this semantic analyzer do first?

play11:54

Note down here.

play11:56

What is first?

play11:58

Type checking.

play12:00

Then undeclared variable.

play12:02

Multiple declaration.

play12:04

Semantic analyzer checks all these things.

play12:06

So first let's see the meaning of type checking.

play12:08

So the meaning of type checking is.

play12:10

Let's suppose we have two values.

play12:12

If we have to multiply them.

play12:14

And store them somewhere.

play12:16

Then data type of both values should be same.

play12:18

If the data type of both values is different.

play12:20

If we have to store them.

play12:22

Then what will be?

play12:24

Type mismatch.

play12:26

So what is this?

play12:28

What was this?

play12:30

X.

play12:32

What was this?

play12:34

Y.

play12:36

And what is this?

play12:38

Z.

play12:40

So what is the data type of X, Y and Z?

play12:42

Float. But 60 is an integer.

play12:44

So to multiply these two.

play12:46

We have to convert integer to float.

play12:48

This is called type casting.

play12:50

So let's take an example of type casting.

play12:52

Let's suppose int A.

play12:54

Which is equal to.

play12:56

Suppose this is 60.

play12:58

Let's take another variable.

play13:00

Float B.

play13:02

Now what we have to do.

play13:04

If we have to store this value in B.

play13:06

Then what will we do?

play13:08

B which is equal to

play13:10

(float) A.

play13:12

This is called type casting.

play13:14

So here.

play13:16

What is happening in this?

play13:18

We have to multiply these two.

play13:20

So what is your ID here?

play13:22

Z.

play13:24

What is the data type of Z? Float.

play13:26

So what will be the data type of F? Float.

play13:28

What will be the data type of T?

play13:30

This is how evaluation is done.

play13:32

How to evaluate in the parse tree?

play13:34

We will discuss that later.

play13:36

Just see this.

play13:38

What is your T? Float.

play13:40

If we have to multiply.

play13:42

Then we need this float and this float.

play13:44

Only then it can be multiplied.

play13:46

The data type of both can not be different.

play13:48

Only when they are the same can we multiply.

play13:50

So the most important work of semantic analyzer.

play13:52

What is that?

play13:54

Semantically verify the parse tree which is constructed.

play13:56

That means the type should not be mismatched.

play13:58

So what happened here?

play14:00

It converted this 60 to 60.0.

play14:02

Now this is also float.

play14:04

This was already a float.

play14:06

So what does this mean?

play14:08

First point.

play14:10

So who needs this symbol table?

play14:12

Who needs this symbol table?

play14:14

Semantic analyzer.

play14:16

So this value which is stored.

play14:18

Of X, Y and Z.

play14:20

The lexical analyzer does not store this.

play14:22

And yes, one very important thing.

play14:24

There can be a confusion here.

play14:26

The lexical analyzer comes first.

play14:28

So you might think it will store this.

play14:30

But it is not like that.

play14:32

Execution in the compiler is not strictly sequential.

play14:34

All the phases in the compiler

play14:36

can be executed in an interleaved way, as if in parallel.

play14:38

There is only one program.

play14:40

It has six different functions.

play14:42

We cannot say in advance which function will call which.

play14:44

So there is a set of code.

play14:46

So there are many functions in it.

play14:48

A function can be called at any time.

play14:50

So it will jump and go there.

play14:52

It will be executed.

play14:54

So in the same way, as soon as this line is read:

play14:56

So good.

play14:58

Float X, Y, Z.

play15:00

So which phase will need it for type matching later?

play15:02

Semantic analyzer.

play15:04

So this float X, Y, Z.

play15:06

The semantic analyzer will store it.

play15:08

The lexical analyzer also stores entries.

play15:10

Let me tell you how it does it.

play15:12

So this will happen.

play15:14

So this means that the parse tree will be semantically verified.

play15:16

Now what is the second?

play15:18

Undeclared variable.

play15:20

Let's suppose what we did.

play15:22

This is not written.

play15:24

This line is not there.

play15:26

X equal to Y plus Z into 60.

play15:28

What will the lexical analyzer do?

play15:30

It will convert it into a token.

play15:32

What is X? It is an identifier.

play15:34

What is Y? It is also an identifier.

play15:36

Does it care whether the variable has been declared before use?

play15:38

No.

play15:40

It will only generate the tokens.
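
To make the point concrete, here is a small tokenizer sketch. It is an assumption-laden illustration, not the lexical analyzer of a real compiler: the token names `NUMBER`, `ID`, `OP` and the function `tokenize` are hypothetical, and only the operators used in the example are covered.

```python
import re

# Hypothetical token classes, just enough for: x = y + z * 60
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[=+*]"),
    ("SKIP",   r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))

def tokenize(source):
    """Return (token class, lexeme) pairs; whitespace is dropped.

    Characters that match no pattern are silently skipped in this sketch.
    """
    tokens = []
    for match in PATTERN.finditer(source):
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
    return tokens

tokens = tokenize("x = y + z * 60")
for token in tokens:
    print(token)
```

Notice that the tokenizer never asks whether x, y or z were declared. It only classifies them as identifiers, which is exactly why the undeclared-variable check has to happen later.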

play15:42

And as soon as it comes here to the semantic analyzer.

play15:44

So it will check.

play15:46

Okay what is this?

play15:48

Float is not recorded here, nor here, nor here.

play15:50

Who has stored these names, then?

play15:52

If we remove this first line,

play15:54

then who will store them?

play15:56

The lexical analyzer.

play15:58

The semantic analyzer stores the data type only when it is declared.

play16:00

But what does the lexical analyzer have to do?

play16:02

Convert the input into tokens.

play16:04

So it will enter X, Y and Z.

play16:06

And the corresponding data type? It will be blank.

play16:08

Nothing will be stored in it.

play16:10

As soon as it comes to the semantic analyzer.

play16:12

It will go and check.

play16:14

Okay, let's see.

play16:16

It finds that the data type is missing.

play16:18

So what will it do? It will show an error.

play16:20

That we are using variables that are not declared.

play16:22

So this is the second task.
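
The undeclared-variable check can be sketched as follows. This is a simplified illustration: the dictionary layout of `symbol_table` and the helper `undeclared` are assumptions made for this sketch, with `None` standing in for the blank data type left by the lexical analyzer.

```python
# Symbol table as it would look when the declaration `float x, y, z`
# is missing: the names are present, but the data types are blank.
symbol_table = {"x": None, "y": None, "z": None}

def undeclared(names, table):
    """Return the names that have no data type recorded."""
    return [name for name in names if table.get(name) is None]

used = ["x", "y", "z"]   # identifiers that appear in x = y + z * 60
errors = undeclared(used, symbol_table)
for name in errors:
    print(f"error: '{name}' is used but not declared")

# With the declaration present, the data types get filled in
# and the same check reports nothing.
for name in used:
    symbol_table[name] = "float"
remaining = undeclared(used, symbol_table)
print(remaining)
```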

play16:24

What is the third? Multiple declaration.

play16:26

So I don't know whether it is visible or not.

play16:28

But see, I will explain.

play16:30

Let's suppose we write int X which is equal to 60.

play16:32

int X which is equal to 60.

play16:34

This is the first declaration.

play16:36

And what is written in the second declaration?

play16:38

float

play16:40

X is equal to 60.0

play16:42

So this means

play16:44

We are declaring the same variable again.

play16:46

The second one could be an integer as well.

play16:48

The data type does not have to be different.

play16:50

So int X is equal to 60. Float X is equal to 60.

play16:52

So memory would be allocated for this one and for this one.

play16:54

So this means in the same program

play16:56

We are declaring the same variable twice.

play16:58

So this is also not allowed.

play17:00

Who will identify this?

play17:02

Semantic analyzer.
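
A rough sketch of the multiple-declaration check is given below. The function `declare` and the use of `NameError` are choices made only for this illustration, and scopes are ignored, so every name lives in one table.

```python
def declare(table, name, data_type):
    """Record a declaration; reject a name that is already declared."""
    if name in table:
        raise NameError(f"multiple declaration of '{name}'")
    table[name] = data_type

table = {}
declare(table, "x", "int")        # int x = 60;   first declaration, accepted
try:
    declare(table, "x", "float")  # float x = 60.0;   same name again
    message = None
except NameError as err:
    message = str(err)

print(message)
print(table)
```

The second declaration is rejected even if it used the same data type, because the check looks only at the name.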

play17:04

So what is the work of semantic analyzer? Type checking.

play17:08

It will check whether a variable is undeclared.

play17:12

And it will check for multiple declarations.

play17:14

It will check these and show the error.

play17:16

So the most important of all the phases

play17:18

are

play17:20

the syntax analyzer and the semantic analyzer.

play17:22

What will be the output of semantic analyzer?

play17:24

Semantically verified parse tree.

play17:26

So if the types are not mismatched

play17:28

and there is no undeclared variable

play17:30

and there is no multiple declaration,

play17:32

then what will it do?

play17:34

It will give the semantically verified parse tree as output. Now let's come to the next phase, intermediate code generation.

play17:39

The most popular representation in intermediate code generation is three-address code.

play17:44

Three-address code is also a separate chapter. Three-address code means

play17:49

that each statement can use at most three operands (addresses). For example, T2 = Y + T1.

play17:54

So one is this, the second is this, the third is this. We can't use a fourth.

play17:59

There are many other representations, but here we are writing three-address code.

play18:04

So first of all, Z and 60.0 are multiplied. The result of Z multiplied by

play18:09

60.0 is stored in a temporary variable. This is the output of intermediate code generation.

play18:14

What is the input? The semantically verified parse tree.

play18:19

We took T1 as a temporary variable, did the multiplication and stored the result in T1.

play18:24

Now what? Y plus T1. See, the expression is Y plus Z into 60, so we add.

play18:29

We added T1 to Y and stored the result in T2. Then T2 is assigned to X.

play18:34

So what is this? Three-address code. So the output of intermediate code generation is three-address code.
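
The steps above can be sketched as a small generator. This is a minimal sketch under stated assumptions: the expression is given as nested tuples `(op, left, right)` instead of a real parse tree, and `generate_tac` and the temporary names `t1`, `t2` are hypothetical.

```python
import itertools

def generate_tac(target, expr):
    """Emit three-address code for `target = expr`.

    `expr` is either a string (a name or a constant) or a tuple
    (op, left, right). Each operator produces one new temporary.
    """
    code = []
    counter = itertools.count(1)

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"t{next(counter)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = walk(expr)
    code.append(f"{target} = {result}")
    return code

# x = y + z * 60.0, after the semantic analyzer has converted 60 to 60.0
tac = generate_tac("x", ("+", "y", ("*", "z", "60.0")))
for line in tac:
    print(line)
```

The inner multiplication is emitted first because the walk finishes the operands before the operator, which matches the order used in the video: multiply, add, then assign.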

play18:39

Now which phase takes this? Code optimization. What is the work of code optimization?

play18:44

To optimize the code, that is, to reduce the size of the program.

play18:49

Or we can say, to reduce the number of lines. So see, Y plus T1

play18:54

is first stored in T2, and then T2 is stored in X. If we write X = Y + T1 directly, then we

play18:59

have reduced the number of lines. The size of the program goes down by one line.

play19:03

So this is your code optimization. And code optimization is not compulsory.

play19:08

Code optimization is an optional phase. It is possible that the code was already

play19:13

written in this direct form, in which case the output stays the same. So it means that code optimization

play19:18

is optional. Why? Because the code may already be written in optimized form.
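
The single optimization used in this example, removing the extra copy through T2, can be sketched like this. It is a deliberately narrow illustration: `fold_final_copy` is a hypothetical helper that only looks at the last two instructions and assumes the temporary is not used anywhere else.

```python
def fold_final_copy(code):
    """Merge `t = a op b` followed by `x = t` into `x = a op b`.

    Only the last two instructions are examined, and the temporary
    is assumed to have no other use.
    """
    if len(code) < 2:
        return list(code)
    temp, _, rhs = code[-2].partition(" = ")
    target, _, source = code[-1].partition(" = ")
    if source == temp:
        return code[:-2] + [f"{target} = {rhs}"]
    return list(code)

tac = ["t1 = z * 60.0", "t2 = y + t1", "x = t2"]
optimized = fold_final_copy(tac)
for line in optimized:
    print(line)

# Code that is already in the direct form comes back unchanged.
unchanged = fold_final_copy(optimized)
```

Running the helper again on its own output changes nothing, which mirrors the point that the phase is optional when the code is already in the direct form.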

play19:23

Now what is this? Optimized three-address code. So where does the optimized three-address code go?

play19:28

Which phase takes the optimized three-address code? Target code generation.

play19:34

What will it do? It will generate assembly language. So what does the assembly language look like?

play19:39

We have taken two registers, R0 and R1, and stored the value of Z in R0 and the value of Y in R1.

play19:45

So now what will we do? What is in R0? Z is in R0. So Z is here.

play19:50

Z into 60. Where do we have to store it? We multiply Z by 60 and store the result in R0.

play19:57

So R0 now holds the result of the multiplication.

play20:01

So what is the value of R0? Z into 60. Now what do we have to do? We have to add.

play20:06

We need Y plus R0, where R0 is Z into 60. And what are we doing with R1? We are adding R0 to it.

play20:11

What is the value of R1? Y. So we get Y plus Z into 60, and we store it in R1.

play20:16

So the addition is done in this way. Then what do we have in R1?

play20:21

We have this value, Y plus Z into 60. And what have we done? We have stored it in X.

play20:27

So this is your assembly language. So these were all the phases.
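
The register steps just described can be written down and checked with a tiny simulator. Everything here is an assumption for illustration: the mnemonics `LOAD`, `MUL`, `ADD`, `STORE` are made up (the exact instruction names on the slide are not reproduced here), and `run` is a hypothetical helper, not a real assembler.

```python
# Hypothetical two-register target code for: t1 = z * 60.0 ; x = y + t1
program = [
    ("LOAD",  "R0", "z"),    # R0 = z
    ("LOAD",  "R1", "y"),    # R1 = y
    ("MUL",   "R0", 60.0),   # R0 = R0 * 60.0
    ("ADD",   "R1", "R0"),   # R1 = R1 + R0
    ("STORE", "x",  "R1"),   # x  = R1
]

def run(program, memory):
    """Execute the instructions on a dictionary acting as memory."""
    regs = {}
    for op, dst, src in program:
        if op == "LOAD":
            regs[dst] = memory[src]
        elif op == "MUL":
            # src is a register name (str) or an immediate constant
            regs[dst] *= regs[src] if isinstance(src, str) else src
        elif op == "ADD":
            regs[dst] += regs[src] if isinstance(src, str) else src
        elif op == "STORE":
            memory[dst] = regs[src]
    return memory

memory = run(program, {"y": 2.0, "z": 0.5})
print(memory["x"])
```

With the sample values y = 2.0 and z = 0.5, x ends up holding y + z * 60.0, the value of the original expression.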

play20:32

Did we miss any point? No. I think we have discussed all the points.

play20:37

The most important thing was: what is the work of the lexical analyzer? To generate tokens.

play20:42

OK. After that, the work of the syntax analyzer? To construct the parse tree.

play20:46

To construct the parse tree, and through what? Through the grammar.

play20:49

The work of the semantic analyzer? It will verify the parse tree semantically.

play20:53

That means these three points. Note this down. OK.

play20:56

Type checking, undeclared variable, multiple declaration. It will check all these.

play21:00

If there is no problem with any of these, then it will produce the semantically verified parse tree.

play21:06

What will intermediate code generation do? It will generate three-address code.

play21:09

Code optimization will optimize it. That means it will reduce the number of lines.

play21:13

And after that what will target code generation do? It will generate assembly code. Ok.

play21:19

And this is your symbol table. The symbol table can be used by all the phases.

play21:24

But it is mostly used by the semantic analyzer and the lexical analyzer.

play21:28

Still, any phase can use it. So these were all the phases of the compiler.

play21:33

And we have discussed them through an example. Next we will discuss each phase separately.

play21:39

What are the things in the lexical analyzer? Then we will see the syntax analyzer.

play21:43

Then we will discuss all the chapters of semantic analysis one by one.

Related Tags
Compiler Design, Lexical Analysis, Syntax Analysis, Semantic Analysis, Code Optimization, Assembly Language, High-Level Language, Parse Tree, Tokenization, Symbol Table