Different Phases of Compiler

Neso Academy
31 Mar 2022 · 19:12

Summary

TL;DR: This lecture covers the various phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and target code generation. It illustrates how an arithmetic expression is processed through these phases, resulting in assembly language code. The lecture also introduces tools like Lex and Yacc for implementing compiler phases and mentions the LANCE C compiler platform for embedded processors.

Takeaways

  • πŸ“š The lecture provides an overview of the different phases of a compiler.
  • πŸ” The phases include the preprocessor, compiler, assembler, and linker/loader.
  • πŸ‘¨β€πŸ« The lecture focuses on converting high-level language code into machine code.
  • 🌐 The preprocessor removes preprocessor directives and embeds header files.
  • πŸ”’ The compiler converts high-level language code into assembly language code.
  • πŸ“ The lexical analysis phase involves tokenizing the input code.
  • 🌳 The syntax analysis phase constructs a parse tree using context-free grammars.
  • πŸ” The semantic analysis phase checks for type correctness and scope resolution.
  • πŸ’Ύ The intermediate code generator produces three-address code from the parse tree.
  • πŸ”„ The code optimizer reduces the length of code and improves efficiency.
  • πŸ› οΈ The target code generator produces assembly code from the optimized intermediate code.
  • πŸ› οΈ Tools like Lex and Yacc are used to implement the lexical and syntax analysis phases.
  • πŸ“˜ The LANCE C compiler platform is mentioned for implementing the front end of a C language compiler.

Q & A

  • What are the four main phases of a language translator in a compiler?

    -The four main phases of a language translator in a compiler are the preprocessor, the compiler, the assembler, and the linker and loader.

  • What is the primary function of a preprocessor in a compiler?

    -The primary function of a preprocessor is to convert high-level language code into pure high-level language code by embedding required header files and omitting preprocessor directives.

  • What does the lexical analysis phase of a compiler do?

    -The lexical analysis phase takes lexemes as input and generates tokens. It is responsible for identifying and categorizing sequences of characters into meaningful tokens that the compiler can understand.

  • What is a lexeme and how does it differ from a word?

    -A lexeme is similar to a word, with one difference: words individually carry their own meanings, whereas lexemes convey meaning only in their entirety, as a group. For instance, 'x' individually doesn't convey any meaning until it is considered within the context of an entire arithmetic expression.

  • What are tokens and how are they generated?

    -Tokens are the meaningful units of a programming language that represent identifiers, operators, literals, and other syntactic elements. They are generated by the lexical analyzer, which recognizes patterns using regular expressions.

  • Can you explain the role of regular expressions in lexical analysis?

    -Regular expressions are used by the lexical analyzer to define patterns for recognizing different types of tokens. For example, a regular expression can define what constitutes a valid identifier in a language.

  • How does the syntax analyzer use context-free grammars?

    -The syntax analyzer uses context-free grammars to analyze the stream of tokens and produce a parse tree. It follows a set of production rules to ensure that the tokens conform to the language's syntax.

  • What is the purpose of a parse tree in compiler design?

    -A parse tree is a hierarchical structure that represents the syntactic structure of a program according to its grammatical rules. It is used to verify the syntactic correctness of the source code.

  • What does the semantic analyzer check for in a parse tree?

    -The semantic analyzer checks for type correctness, array bounds, scope resolution, and logical consistency within the parse tree. It ensures that the program makes sense from a semantic point of view.

  • What is the intermediate code generator's role in the compiler?

    -The intermediate code generator takes the semantically verified parse tree and produces intermediate code, such as three-address code, which is a lower-level representation of the program that can be further optimized or translated into target code.

  • How does the code optimizer reduce the length of the code?

    -The code optimizer reduces the length of the code by eliminating unnecessary operations and variables, such as by directly assigning the result of an expression to a variable instead of using a temporary variable.

  • What tools can be used to implement the lexical analysis phase of a compiler?

    -The tool Lex can be used to implement the lexical analysis phase. It generates a lexical analyzer from a user-provided specification.

  • What is the role of YACC in compiler implementation?

    -YACC (Yet Another Compiler Compiler) is used to implement the syntax analysis phase. It generates a parser from a set of context-free grammar rules.

  • What is the significance of the Lance C compiler platform mentioned in the script?

    -The Lance C compiler platform is a software platform that can be used to implement the entire front end of a C language compiler for embedded processors.

Outlines

00:00

πŸ“˜ Introduction to Compiler Phases

The instructor begins by welcoming students to a course on compiler design. The lecture aims to provide an overview of the various phases of a compiler using an illustration. The expected outcome includes understanding the different phases such as the preprocessor, compiler, assembler, and linker/loader. The process starts with the preprocessor which removes directives and embeds header files into the source code. The compiler then translates this high-level code into assembly language code. To save time, the instructor decides to focus on a single arithmetic expression to demonstrate how it goes through the compiler phases. The first phase discussed is lexical analysis, where the lexical analyzer processes lexemes to generate tokens. The tokens represent the meaning of lexemes, and the analyzer uses regular expressions to identify them. An example of a regular expression for identifiers is provided, and a finite automata diagram is used to illustrate the process of recognizing identifiers.
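
The identifier automaton described here can be sketched as a small table-driven DFA. This is a minimal sketch in Python, assuming the lecture's states q0–q3; it additionally treats q1 and q2 as accepting so that single-character names like `x` are valid, which goes slightly beyond the lecture's diagram:

```python
# Table-driven DFA for identifiers: a letter or underscore first,
# then any number of letters or digits (the lecture's pattern).
def classify(ch):
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch == "_":
        return "underscore"
    return "other"

TRANSITIONS = {
    ("q0", "letter"): "q1",
    ("q0", "underscore"): "q2",
    ("q1", "letter"): "q3", ("q1", "digit"): "q3",
    ("q2", "letter"): "q3", ("q2", "digit"): "q3",
    ("q3", "letter"): "q3", ("q3", "digit"): "q3",
}
ACCEPTING = {"q1", "q2", "q3"}  # q1/q2 accept single-char names (assumption)

def is_identifier(text):
    state = "q0"
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:            # no transition: reject immediately
            return False
    return state in ACCEPTING

assert is_identifier("_tmp1")
assert not is_identifier("9a")       # rejected: begins with a digit
```

Because q0 has no transition on digits, any name beginning with a digit is rejected, which is exactly the logic the lecture points out.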

05:01

πŸ” Lexical Analysis and Syntax Analysis

The second paragraph delves into the syntax analysis phase, overseen by the syntax analyzer or parser. The parser uses context-free grammars to form a parse tree from the stream of tokens. The instructor illustrates the formation of the parse tree using production rules. The yield of the parse tree is the same as the input expression, indicating no syntax errors. The semantic analyzer then takes the parse tree and performs semantic analysis, checking for type correctness, array bounds, scope resolution, and other logical aspects of the code. The semantic analyzer ensures the meaningfulness of the parse tree and identifies any semantic errors.

10:02

πŸ› οΈ Intermediate Code Generation and Optimization

The third paragraph covers the intermediate code generation phase, where the semantically verified parse tree is transformed into intermediate code, specifically three-address code (3AC). The precedence of operators in the parse tree is used to determine the order of operations. The intermediate code generator assigns results to temporary variables to maintain this order. The code optimizer then takes this intermediate code and optimizes it by reducing the number of statements, for example, by assigning the result of an addition directly to the final variable instead of using a temporary variable. This optimization shortens the code and potentially improves efficiency.

15:04

πŸ’Ύ Target Code Generation and Compiler Tools

The final paragraph discusses the target code generation phase, where the optimized intermediate code is converted into assembly code. The instructor explains the assembly code segment line by line, detailing how each command moves and manipulates data in registers to perform the arithmetic operation. The assembly language code is a translation of the high-level arithmetic expression into machine-readable instructions. The instructor then discusses tools like Lex and Yacc for implementing the lexical and syntax analysis phases of a compiler. The lecture concludes with a mention of the LANCE C compiler platform for implementing the front end of a C language compiler and directs interested learners to a research paper for further study. The session ends with a preview of the next topic, the symbol table.

Keywords

πŸ’‘Compiler

A compiler is a special kind of software that translates code written in one programming language (the source code) into another language (the target code). In the context of the video, the compiler's purpose is to convert human-readable source code into machine code. The script mentions that there are four main phases in this translation process: preprocessor, compiler, assembler, and linker and loader.

πŸ’‘Phases of Compiler

The phases of a compiler refer to the distinct stages through which source code passes to be transformed into executable machine code. The script outlines these phases as preprocessor, compiler, assembler, and linker and loader. Each phase has a specific role, such as lexical analysis, syntax analysis, and code optimization.

πŸ’‘Lexical Analysis

Lexical analysis is the first phase of a compiler where the source code is broken down into meaningful pieces known as tokens. The script explains that lexical analyzers take lexemes (similar to words) as input and generate tokens, which are the meanings of these lexemes. An example from the script is the conversion of an arithmetic expression into tokens.

πŸ’‘Tokens

Tokens are the units of meaning identified during lexical analysis. They represent the smallest elements of a language that have significance, such as identifiers, operators, and literals. The script uses tokens to illustrate how an arithmetic expression is parsed into meaningful components by the lexical analyzer.
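
The tokenization described above can be sketched with regular expressions. This is an illustrative Python sketch for the lecture's expression `x = a + b * c;`; the token names and patterns are assumptions, not the lecture's exact specification:

```python
import re

# Token patterns tried left to right; whitespace is matched and discarded.
TOKEN_SPEC = [
    ("ID",     r"[A-Za-z_][A-Za-z0-9_]*"),   # identifiers
    ("ASSIGN", r"="),                        # assignment operator
    ("PLUS",   r"\+"),                       # arithmetic operators
    ("STAR",   r"\*"),
    ("SEMI",   r";"),                        # statement terminator
    ("SKIP",   r"\s+"),                      # whitespace, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return the stream of (token, lexeme) pairs for the input string."""
    return [(m.lastgroup, m.group())
            for m in MASTER.finditer(source)
            if m.lastgroup != "SKIP"]

stream = tokenize("x = a + b * c;")
# token kinds: ID ASSIGN ID PLUS ID STAR ID SEMI
```

The output is exactly the stream of tokens the lecture describes: each lexeme is paired with its category, and that stream is what gets handed to the syntax analyzer.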

πŸ’‘Syntax Analysis

Syntax analysis is the phase where the compiler checks if the tokens generated by the lexical analyzer follow the rules of the language's grammar. The script describes how a syntax analyzer, or parser, uses context-free grammars to form a parse tree from the tokens, ensuring that the code is syntactically correct.

πŸ’‘Parse Tree

A parse tree is a hierarchical structure representing the syntactic structure of a string according to some grammar. In the script, the parse tree is used to illustrate how the syntax analyzer organizes tokens into a structure that reflects the grammatical rules of the language, ensuring the code is syntactically valid.
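
The lecture's grammar (S → id = E ;, E → E + T | T, T → T * F | F, F → id) can be recognized by a small hand-written parser. In this Python sketch the left-recursive rules are rewritten as loops (E → T (+ T)*, T → F (* F)*), a standard equivalent form for hand parsing that the lecture itself does not show:

```python
# Recognizer for:  id = E ;  with E -> E + T | T, T -> T * F | F, F -> id
def parse_statement(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind!r}, got {peek()!r}")
        pos += 1

    def factor():                    # F -> id
        expect("id")

    def term():                      # T -> F (* F)*
        factor()
        while peek() == "*":
            expect("*"); factor()

    def expression():                # E -> T (+ T)*
        term()
        while peek() == "+":
            expect("+"); term()

    expect("id"); expect("="); expression(); expect(";")
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return True

# x = a + b * c ;  as a stream of token kinds
assert parse_statement(["id", "=", "id", "+", "id", "*", "id", ";"])
```

A token stream that violates the production rules raises a SyntaxError, mirroring how the parser reports a syntax error when the parse tree's yield cannot match the input.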

πŸ’‘Semantic Analysis

Semantic analysis is the phase where the compiler checks the meaning of the syntax tree to ensure that it makes logical sense. The script mentions that the semantic analyzer is responsible for type checking, array bound checking, and scope resolution to ensure the code is semantically correct.
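
A toy version of these semantic checks might look as follows. The symbol table entries and the choice of error types are illustrative assumptions, not the lecture's implementation:

```python
# Toy symbol table and checks for the assignment  x = a + b * c;
symbols = {
    "x": {"kind": "variable", "type": "int"},
    "a": {"kind": "variable", "type": "int"},
    "b": {"kind": "variable", "type": "int"},
    "c": {"kind": "constant", "type": "int"},
}

def check_assignment(target, operands):
    info = symbols.get(target)
    if info is None:
        raise NameError(f"undeclared variable: {target}")
    if info["kind"] != "variable":           # e.g. assigning to a constant
        raise TypeError(f"cannot assign to {info['kind']} {target}")
    types = set()
    for name in operands:
        if name not in symbols:
            raise NameError(f"undeclared variable: {name}")
        types.add(symbols[name]["type"])
    if len(types) > 1:                       # type mismatch among operands
        raise TypeError(f"type mismatch: {sorted(types)}")
    return info["type"]

assert check_assignment("x", ["a", "b", "c"]) == "int"
```

This captures two checks the lecture names explicitly: the assignment target must be a variable (not a constant), and every identifier must be declared with a consistent type.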

πŸ’‘Intermediate Code

Intermediate code is a form of representation that lies between the high-level source code and the target machine code. The script explains that the intermediate code generator produces this code from the semantically verified parse tree. An example used in the script is the three-address code (3AC), which is a common form of intermediate code.
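
Generating 3AC can be sketched as a post-order walk over the expression tree that allocates temporaries innermost-first, matching operator precedence. The tree shape and helper names in this Python sketch are assumptions for illustration:

```python
# Emit three-address code for an expression tree bottom-up: operands of
# an operator are generated first, then a fresh temporary holds its result.
def gen_3ac(node, code, counter=None):
    if counter is None:
        counter = [0]
    if isinstance(node, str):            # a leaf is just a variable name
        return node
    op, left, right = node
    l = gen_3ac(left, code, counter)
    r = gen_3ac(right, code, counter)
    temp = f"t{counter[0]}"
    counter[0] += 1
    code.append(f"{temp} = {l} {op} {r}")
    return temp

code = []
result = gen_3ac(("+", "a", ("*", "b", "c")), code)   # x = a + b * c
code.append(f"x = {result}")
# code is now ['t0 = b * c', 't1 = a + t0', 'x = t1']
```

Because the multiplication sits lowest in the tree, `b * c` is emitted first into `t0`, then `a + t0` into `t1`, exactly as in the lecture — and each statement carries at most three addresses.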

πŸ’‘Code Optimization

Code optimization is the phase where the compiler improves the efficiency of the intermediate code without changing its functionality. The script provides an example of how the optimizer can reduce the number of operations in the intermediate code, making the final assembly code more efficient.
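
The specific optimization described — folding a trailing temporary copy into the final assignment — can be sketched as follows. This is a deliberately narrow illustration of that one transformation, not a general optimizer:

```python
# If the last statement merely copies a temporary into the final variable,
# substitute the temporary's defining expression directly.
def eliminate_trailing_copy(code):
    *rest, last = code
    lhs, rhs = [s.strip() for s in last.split("=")]
    if rest and rhs == rest[-1].split("=")[0].strip():
        defining = rest[-1].split("=", 1)[1].strip()
        return rest[:-1] + [f"{lhs} = {defining}"]
    return code

three_ac = ["t0 = b * c", "t1 = a + t0", "x = t1"]
print(eliminate_trailing_copy(three_ac))
# ['t0 = b * c', 'x = a + t0']
```

The three-statement intermediate code becomes two statements, matching the lecture's example of assigning `a + t0` directly to `x` instead of going through `t1`.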

πŸ’‘Assembly Code

Assembly code is a low-level programming language that is specific to a particular computer architecture. The script describes how the target code generator translates the optimized intermediate code into assembly code, which is closer to machine code and can be executed by the computer's processor.

πŸ’‘Lex

Lex is a tool used for writing lexical analyzers. It is mentioned in the script as a tool that can be used to implement the lexical analysis phase of a compiler. Lex reads a description of a lexical analyzer and generates C code to implement it.

πŸ’‘Yacc

Yacc, which stands for 'Yet Another Compiler Compiler', is a parser generator used to create the syntax analysis phase of a compiler. The script explains that Yacc, in conjunction with Lex, is commonly used to implement the compiler's front end, which includes lexical and syntax analysis.

Highlights

Introduction to the phases of a compiler with an illustration.

Overview of various phases of the compiler.

Introduction to tools for implementing different compiler phases.

Explanation of the role of the preprocessor in converting source code.

The function of the compiler in producing assembly language code.

Detailed walkthrough of an arithmetic expression through the compiler phases.

Lexical analysis phase explanation using lexical analyzer.

Explanation of lexemes and tokens in lexical analysis.

Regular expressions used by the lexical analyzer.

Finite automata illustration for identifier recognition.

Syntax analysis phase controlled by the parser.

Formation of the parse tree using context-free grammars.

Yield of the parse tree and syntax error detection.

Semantic analysis phase for type checking and scope resolution.

Intermediate code generation from the semantically verified parse tree.

Introduction to three-address code (3AC) as an intermediate code.

Code optimization phase for reducing code length.

Target code generation phase producing assembly code segment.

Explanation of assembly language code segment translation.

Tools for implementing compiler phases: Lex and Yacc.

Lance C compiler platform for implementing the front end of a C compiler.

Conclusion and anticipation for the next lecture on symbol tables.

Transcripts

play00:06

hello everyone

play00:07

welcome back to the course of compiler

play00:09

design

play00:10

as promised today we will observe the

play00:13

different phases of the compiler with

play00:14

the help of an illustration

play00:16

so without any further ado let's get to

play00:19

learning

play00:22

now if we talk about the expected

play00:24

outcome of this particular lecture

play00:26

first we are going to have a brief

play00:28

overview of the various phases of the

play00:30

compiler

play00:31

thereafter we will learn about the

play00:33

various tools using which the different

play00:36

phases can be implemented

play00:38

well in the last lecture we have already

play00:40

observed that in order to convert the

play00:43

human readable source code into the

play00:45

machine code

play00:46

we need a language translator

play00:49

and the language translator has got four

play00:51

different phases

play00:52

the preprocessor then the compiler the

play00:55

assembler and finally the linker and

play00:58

loader

play00:59

now after going through the preprocessor

play01:01

the high level language code gets

play01:03

converted into the pure high level

play01:04

language code

play01:06

basically the preprocessor will embed

play01:08

the required header files with the

play01:10

source code omitting all the

play01:11

preprocessor directives from that

play01:16

now this pure high level language code

play01:18

will be given to the compiler which in

play01:21

turn will produce the equivalent

play01:23

assembly language code

play01:25

now this much we have already observed

play01:26

in the previous session

play01:28

so today we will take this pure high

play01:30

level language code and observe how it

play01:33

goes through the various phases of the

play01:35

compiler now to be really honest

play01:38

converting this entire piece of code

play01:40

will be a little time consuming

play01:42

so what we will do instead of the entire

play01:45

code

play01:45

let's consider this statement that is

play01:48

the arithmetic expression

play01:51

we will observe how this expression goes

play01:53

through the compiler

play01:54

and finally how this particular assembly

play01:57

language code segment is produced

play02:00

basically we will observe how this

play02:02

expression is treated by all the six

play02:04

phases of the compiler

play02:06

now coming to the first one that is the

play02:09

lexical analysis phase

play02:11

here the entire process is taken care of

play02:13

by the lexical analyzer

play02:15

so here the expression is given to the

play02:18

lexical analysis phase

play02:20

and the lexical analyzer takes lexemes as

play02:23

inputs

play02:24

and generates the tokens

play02:27

now lexemes are similar to words with

play02:30

only one small difference that is

play02:33

words individually have their own

play02:35

meanings whereas a group of lexemes in

play02:38

their entirety convey the meaning

play02:41

for instance this word analysis means a

play02:45

detailed examination of anything complex

play02:48

in order to understand its nature or to

play02:50

determine its essential features

play02:53

on the contrary this x individually

play02:56

doesn't convey any meaning until we

play02:58

consider the entire arithmetic

play03:00

expression

play03:01

now coming to tokens

play03:03

these are actually the meanings of the

play03:05

lexemes

play03:06

so if we traverse this statement from

play03:09

left to right

play03:10

particularly in this statement

play03:12

x is an identifier

play03:15

then the equal symbol is an operator to

play03:18

be precise it is the assignment operator

play03:21

thereafter a is again an identifier

play03:24

the plus symbol is an arithmetic

play03:26

operator and so on

play03:29

so this is the output of the lexical

play03:32

analysis phase that is a stream of

play03:34

tokens

play03:35

and the job of the lexical analyzer is

play03:37

to find out the meaning of every lexeme

play03:40

it recognizes the tokens with the help

play03:42

of regexes or regular expressions

play03:45

exempli gratia this is the regular

play03:48

expression for identifiers

play03:50

where l stands for letter

play03:53

d for digit and this special character

play03:56

is the underscore

play03:58

now allow me to illustrate this using

play04:00

its equivalent finite automata

play04:03

observe there are two regexes for

play04:05

identifier

play04:07

let's consider the first one

play04:09

so from the initial state that is q0

play04:12

seeing a letter we will go to the next

play04:15

state q1

play04:16

and from this one

play04:18

seeing any number of letters or digits

play04:21

we will end up at the final state that

play04:23

is q3

play04:25

coming to the next form

play04:27

starting from the initial state

play04:29

if we see an underscore we will move

play04:32

towards the next state that is q2

play04:34

thereafter

play04:36

seeing any number of letters or digits we

play04:38

will end up at the final state q3

play04:42

many of you may know this that we cannot

play04:45

have an identifier name starting with

play04:47

digits

play04:48

and this is the logic which rejects the

play04:51

identifier names beginning with digits

play04:55

so for examining the lexemes the

play04:57

lexical analyzer makes use of the type 3

play05:00

or regular grammars of the family of

play05:02

grammars for formal languages

play05:05

now let's move on to the next phase

play05:08

here the syntax analyzer also known as

play05:11

the parser is in control

play05:13

the stream of tokens is passed to the

play05:15

syntax analysis phase

play05:17

the syntax analyzer depends on the type

play05:19

2 or context-free grammars

play05:22

for this particular expression

play05:24

these are the cfg production rules that

play05:27

the parser will use in order to form the

play05:29

parse tree

play05:31

let me illustrate the formation of the

play05:33

parse tree

play05:34

consider the first production rule

play05:36

the start symbol s can be rewritten as

play05:40

id equals e semicolon

play05:43

it means an expression can be assigned

play05:46

to an identifier

play05:47

and the expression has to be followed by

play05:49

the statement terminator that is the

play05:51

semicolon so from s

play05:54

we can derive

play05:55

id

play05:56

equals operator e and semicolon

play06:01

coming to the next production rule

play06:03

e can be rewritten as e plus t

play06:07

that is expression can be expression

play06:10

plus term

play06:11

so from this e

play06:13

we can derive e plus t

play06:16

using the production rule

play06:18

now e can also be rewritten as t

play06:21

that is expression can also be a single

play06:24

term

play06:25

so using this production rule from e

play06:28

we can derive t

play06:30

coming to the next one

play06:32

t can be rewritten as t into f

play06:35

that means term can be a term multiplied

play06:39

by a factor

play06:41

so from this t

play06:42

we will derive t into f

play06:45

that is t

play06:46

then the multiplication operator and

play06:49

then f

play06:50

additionally t can also be written as f

play06:54

that is term can also be a single factor

play06:58

so from t

play06:59

we can derive f in these two instances

play07:04

finally consider the last production

play07:07

rule

play07:08

f can be rewritten as id

play07:10

that is factor is an identifier

play07:14

and using this we can derive id from all

play07:17

these f's

play07:18

so this is the parse tree

play07:21

now before observing the yield of the

play07:24

parse tree let me tell you a few things

play07:26

about this particular grammar here id

play07:30

the equals operator semicolon the plus

play07:33

and the multiplication operators are the

play07:35

set of terminals

play07:36

and the ones in the upper case that is s

play07:40

e

play07:41

t

play07:42

f

play07:43

these are the set of variables or

play07:45

non-terminals

play07:47

now in order to find out the yield of

play07:49

the parse tree

play07:50

we will have to traverse it top to

play07:52

bottom left to right taking notes of

play07:55

only the terminals

play07:56

so let's begin

play07:58

during traversal this id is the first

play08:01

terminal that we encounter

play08:03

remember

play08:04

top to bottom left to right

play08:07

so the next one is this equals operator

play08:10

next this id

play08:12

thereafter this plus operator

play08:15

then this id

play08:17

then this multiplication operator

play08:20

after that this id

play08:22

and finally this semicolon

play08:25

therefore the yield of the parse tree

play08:28

would be

play08:29

id equals id plus id into id semicolon

play08:34

and since the yield of the parse tree

play08:36

and the expression are the same

play08:39

the syntax analyzer will not produce any

play08:42

errors

play08:43

so in short taking the stream of tokens

play08:46

the syntax analyzer analyzes them

play08:48

following specific set of production

play08:50

rules and produces the parse tree and if

play08:53

the yield of the parse tree and the

play08:55

provided stream of tokens are the same

play08:58

then there is no error otherwise there

play09:00

is some syntax error in the statement

play09:03

now let's move on to the next phase

play09:07

in this phase the semantic analyzer

play09:09

takes the control

play09:11

the parse tree produced by the syntax

play09:13

analyzer is given to semantic analysis

play09:16

phase and the semantic analyzer in turn

play09:19

produces the semantically verified parse

play09:21

tree

play09:23

semantic analyzer is responsible for

play09:25

type checking array bound checking and

play09:28

the correctness of scope resolution

play09:31

basically it does the logical analysis

play09:33

of the parse tree

play09:35

like in this parse tree these three

play09:38

identifiers can be constants

play09:40

whereas this particular identifier

play09:43

because it is followed by this

play09:45

assignment operator cannot be a constant

play09:48

here the semantic analyzer will examine

play09:51

whether the type of this identifier is

play09:53

variable

play09:54

semantic analyzer detects type mismatch

play09:57

errors undeclared variables misuse of

play10:01

reserved words multiple declaration of a

play10:04

variable within a single scope accessing

play10:07

an out of scope variable

play10:09

mismatch between actual and formal

play10:11

parameters etc

play10:13

in simpler terms semantic analyzer looks

play10:16

for the meaningfulness of the parse tree

play10:19

and verifies that

play10:21

now the next phase is handled by

play10:24

intermediate code generator the

play10:26

semantically verified parse tree is

play10:29

given to the intermediate code

play10:30

generation phase where the intermediate

play10:33

code generator in turn produces the

play10:35

intermediate code

play10:37

so this was our parse tree

play10:40

and this is the yield of the parse tree

play10:42

which is similar to the pure high level

play10:44

language expression

play10:46

after being semantically verified

play10:49

the intermediate code generator will

play10:51

produce the intermediate code

play10:53

here we are using the very popular

play10:55

intermediate code that is the three

play10:57

address code 3ac

play11:00

now if you observe this parse tree

play11:02

carefully

play11:03

the precedence of the operators are

play11:05

visible here if we look at the tree from

play11:08

bottom to upwards

play11:11

so at the lowest level the

play11:12

multiplication operator is there and

play11:15

therefore this would be performed at

play11:17

first

play11:18

for this the intermediate code generator

play11:21

will create this temporary variable say

play11:24

t0 and assign the result of b into c to

play11:28

it

play11:29

coming to the next level the result of

play11:31

this is being added with this identifier

play11:35

which in the expression is a

play11:38

therefore in the intermediate code the

play11:40

result of a plus t0 is being assigned to

play11:43

another temporary variable t1

play11:46

finally in the last level the result of

play11:48

the expression is being assigned to this

play11:51

identifier which in the arithmetic

play11:53

expression is the variable x so in the

play11:56

intermediate code t1 is being assigned

play11:59

to x

play12:01

now this intermediate representation of

play12:03

the code is called three address code

play12:06

because if you observe all the

play12:08

statements in here at most they have

play12:11

three addresses in them

play12:13

and by saying address we are mentioning

play12:16

the addresses of these variables

play12:19

now till this phase it is called the

play12:21

front end because taking this

play12:23

intermediate code

play12:24

if we want to generate the target code

play12:26

which is specific to the platform all we

play12:29

have to do is modify the next two phases

play12:32

that is the backend according to the

play12:34

platform which we want to produce our

play12:36

target code for

play12:39

now let's move on to the next phase

play12:41

shall we

play12:43

so here the code optimizer is in control

play12:46

basically the intermediate code is

play12:49

provided to the code optimization phase

play12:51

there the code optimizer in turn

play12:53

generates the optimized code

play12:56

now code optimization can either be

play12:58

machine dependent or machine independent

play13:01

we will observe them in details in due

play13:03

time for now for the sake of

play13:06

understanding let me illustrate the

play13:08

optimization procedure so as you can

play13:10

observe

play13:11

our intermediate code that is the three

play13:14

address code had three statements

play13:16

in the first one t0 is being assigned

play13:19

with the result of b into c

play13:22

then in the second one t1 is being

play13:24

assigned with the result of a plus t0

play13:27

and finally in the third one

play13:29

we're assigning t1 to x

play13:32

now if you observe carefully the second

play13:34

and the third statements

play13:36

here

play13:37

instead of t1 if we assign a plus t0

play13:41

directly to x

play13:42

we can reduce the length of the code

play13:45

so the optimized code will have only two

play13:47

statements

play13:49

and in a nutshell

play13:51

this is exactly what the code optimizer

play13:53

does

play13:54

it optimizes the intermediate code

play13:58

next up is the target code generator

play14:01

so the optimized code is given to the

play14:03

target code generation phase which

play14:05

happens to be the last phase of the

play14:07

compiler

play14:08

now taking this optimized code the

play14:11

target code generator will finally

play14:13

produce this assembly code segment

play14:16

allow me to walk you through this code

play14:18

segment

play14:19

here in the first line we have mov that

play14:22

is the mnemonic for moving

play14:24

then we have eax

play14:26

now eax is actually the extended version

play14:28

of the ax register

play14:30

which is a combination of ah and al

play14:34

ah can store the 8 higher order bits and al

play14:37

stores the remaining 8 lower order bits of ax

play14:40

so eax can actually store 32 bits it is

play14:44

actually the accumulator register

play14:47

next we have dword ptr that is the

play14:50

doubleword pointer which is pointing at the

play14:52

register base pointer rbp with a scaling

play14:55

factor of minus 8.

play14:57

basically this pointer is pointing to

play14:59

the variable b

play15:01

and using this entire command the value

play15:03

held by variable b is being moved to the

play15:06

eax register

play15:08

now this imul is the mnemonic for sign

play15:11

multiplication

play15:13

dword ptr rbp minus 12 is pointing to

play15:16

the value stored by the variable c

play15:19

and using this command the value stored

play15:22

in variable c is being multiplied with

play15:24

the value stored in the register eax

play15:27

and the result is also stored in the eax

play15:29

register

play15:30

basically by the end of these two

play15:33

commands

play15:34

we will have the result of b into c in

play15:36

the accumulator

play15:39

coming to the next command here the

play15:41

contents of the eax register are being

play15:44

moved to another register edx which also

play15:47

is a 32-bit register

play15:49

similar to eax edx also has two

play15:52

different 8 bit segments in its lower half where the

play15:55

higher order 8 bits are to be stored in

play15:57

dh and the lower order 8 bits will be

play16:00

stored in dl

play16:03

now in the next command we are moving

play16:05

the dword pointed to by the register base

play16:08

pointer with the scaling factor minus 4

play16:10

that is the content of the variable a

play16:13

into the accumulator

play16:15

thereafter in this command we are adding

play16:18

the contents of the register eax and edx

play16:21

and also storing the result in the eax

play16:24

register that is the accumulator

play16:27

finally we are moving the content of the

play16:29

eax register to the address pointed by

play16:31

the dword ptr rbp minus 16 which

play16:35

is actually the address of the variable

play16:37

x

play16:38

so after execution of these commands we

play16:40

will have the calculated value in x

play16:45

so this is the meaning of this assembly

play16:48

language code segment

play16:50

so this is how that arithmetic

play16:53

expression of our pure high level

play16:54

language code

play16:56

finally gets translated into this

play16:58

assembly language code segment after

play17:00

going through all these phases

play17:04

now let's observe the tools using which

play17:07

we can practically implement various

play17:09

phases of the compiler

play17:11

so the compiler has six phases

play17:15

in order to implement the lexical

play17:16

analysis phase we can use the program

play17:19

called lex

play17:21

lex is the standard lexical analyzer

play17:23

generator on many unix-based systems

play17:27

it reads an input stream specifying the

play17:29

lexical analyzer and writes the source

play17:32

code which implements the lexical

play17:34

analyzer for the c programming language

play17:38

yacc which stands for yet another

play17:41

compiler compiler is a lalr parser

play17:45

generator we will study about the lalr

play17:48

parsers in the chapter 4.

play17:50

anyway using yacc we can implement the

play17:54

syntax analysis phase

play17:56

lex is commonly used with yacc

play18:00

now we already know that among these six

play18:02

phases

play18:04

the first four are collectively called

play18:06

the front end

play18:07

and the last two are known as the back

play18:09

end

play18:10

using the software platform lance c

play18:13

compiler we can implement the entire

play18:15

front end of a c language compiler for

play18:18

an embedded processor

play18:20

interested learners are advised to go

play18:22

through the research paper lance a c

play18:24

compiler platform for embedded

play18:27

processors by dr rainer leupers of

play18:30

university of dortmund germany

play18:32

the link of the paper has been provided

play18:35

in the description of this lecture

play18:37

all right so in this lecture we have

play18:40

gone through the overview of the various

play18:42

phases of compilers

play18:44

also we have learned about the tools

play18:46

using which we can implement different

play18:48

phases

play18:50

all right people that will be all for

play18:52

this session in the next session we will

play18:54

discuss about the symbol table so i hope

play18:57

to see you in the next one thank you all

play18:59

for watching



Related Tags
Compiler Design, Code Translation, Lexical Analysis, Syntax Parsing, Semantic Analysis, Intermediate Code, Code Optimization, Assembly Language, Programming Tools, Compiler Phases