Lecture-8 Introduction to Databases: Relational Algebra - Select, project, join

Wobbly Bit
4 Jul 201418:05

Summary

TLDRThis video introduces relational algebra, a formal language foundational to implemented languages like SQL. It covers basic relational algebra concepts and popular operators, including the select, project, and cross-product operators. The video explains how to filter, slice, and combine relations using these operators. It also discusses the natural join operator, which simplifies combining relations by enforcing equality on shared attributes. The script sets the stage for a deeper exploration of additional operators and alternative notations in a subsequent video.

Takeaways

  • 📚 Relational algebra is a formal language used to perform queries on relational databases, forming the basis for languages like SQL.
  • 🔍 Queries in relational algebra operate on relations and produce new relations as results, allowing for complex data manipulation.
  • 📈 The Select operator (σ) is used to filter rows in a relation based on a given condition, denoted with a subscript for the condition.
  • 📊 The Project operator (π) allows for the selection of specific columns from a relation, specified by listing the column names in the subscript.
  • 🔗 The Cross-Product operator combines every possible pair of tuples from two relations, resulting in a new relation with a schema that is the union of the original schemas.
  • 🔑 The Natural Join operator is a convenient way to combine relations by enforcing equality on shared attributes and removing duplicate columns.
  • 🔄 Duplicate values are eliminated in relational algebra results, differing from SQL which is based on multisets and retains duplicates.
  • 🎯 The Theta Join operator is an abbreviation in relational algebra that combines relations based on a specific condition, akin to the basic join operation in many database systems.
  • 📝 The script introduces the basics of relational algebra and sets the stage for learning additional operators and notations in a subsequent video.
  • 📌 Keys in a relation are attributes or sets of attributes that are guaranteed to be unique, which is crucial for operations like joins and selections.

Q & A

  • What is relational algebra?

    -Relational algebra is a formal language used for expressing queries on relational databases. It forms the underpinnings of implemented languages like SQL.

  • What are the basic operations relational algebra performs on relations?

    -Relational algebra operates on relations and produces relations as a result. It can filter, slice, and combine relations using various operators.

  • What is the simplest query in relational algebra?

    -The simplest query in relational algebra is the name of a relation itself, which returns a copy of the entire relation.

  • What does the Select operator in relational algebra do?

    -The Select operator is used to filter certain rows out of a relation based on a specified condition, denoted by a Sigma with a subscript for the condition.

  • How can you filter students with a GPA greater than 3.7 using relational algebra?

    -You can use the Select operator with the condition Sigma GPA > 3.7 over the student relation to filter out students with a GPA greater than 3.7.

  • What is the purpose of the Project operator in relational algebra?

    -The Project operator is used to select specific columns from a relation, denoted by the Greek PI symbol with a subscript listing the desired column names.

  • How does the cross-product operator work in relational algebra?

    -The cross-product operator combines every tuple from one relation with every tuple from another relation, resulting in a new relation with a schema that is the union of the two input schemas.

  • What is the difference between a cross-product and a natural join in relational algebra?

    -A natural join is a specialized form of a cross-product that automatically enforces equality on shared attributes and removes duplicate columns, whereas a cross-product simply combines tuples without any conditions.

  • How does the theta join operator differ from the natural join?

    -The theta join operator is equivalent to applying a specific condition (theta) to the cross product of two relations, whereas the natural join automatically enforces equality on all attributes with the same name.

  • Why is the natural join considered convenient in relational algebra?

    -The natural join is convenient because it simplifies the process of combining relations with shared attributes by automatically enforcing equality and removing duplicate columns without the need for explicit conditions.

  • What is the significance of relational algebra in database management systems?

    -Relational algebra is significant in database management systems because it provides a formal foundation for query languages like SQL and helps in understanding the underlying operations for combining and manipulating relations.

Outlines

00:00

📚 Introduction to Relational Algebra

This paragraph introduces relational algebra, a formal language that forms the basis of implemented languages like SQL. The video will cover the basics of relational algebra and its popular operators. It reviews that queries on relational databases operate on and produce relations. The example of a college admissions database with three relations is introduced: 'college', 'student', and 'apply'. The concept of keys as unique attributes is explained. The simplest query in relational algebra is presented as the name of a relation, such as 'student', which returns a copy of the relation. The paragraph sets the stage for exploring relational algebra operators in detail.

05:02

🔍 Select and Project Operators in Relational Algebra

The paragraph explains the Select and Project operators in relational algebra. The Select operator, denoted by Sigma (Σ), is used to filter rows based on a condition. Examples are given for selecting students with a GPA greater than 3.7 and filtering applications to Stanford for a CS major. The Project operator, denoted by Pi (Π), is used to select specific columns from a relation. It is demonstrated how to compose operators, such as applying a Project operator to the result of a Select operator. The paragraph also discusses the elimination of duplicate values in relational algebra results, contrasting it with SQL's handling of duplicates.

10:04

🔗 Cross-Product and Natural Join Operators

This paragraph delves into the Cross-Product operator, which combines every tuple from one relation with every tuple from another, creating a new relation with a schema that is the union of the two original schemas. The Natural Join operator is introduced as a convenient way to perform a cross-product followed by a selection based on equality of common attributes. The paragraph illustrates how these operators can be used to answer complex queries, such as finding the names and GPAs of students from large high schools who applied to CS and were rejected, and extends the query to include applications to colleges with enrollment over 20,000.

15:06

🔄 Theta Join and Conclusion of Relational Algebra Basics

The final paragraph introduces the Theta Join operator, which is equivalent to applying a selection condition to the cross-product of two relations. It explains that while the Natural Join simplifies notation, it does not add expressive power to relational algebra. The paragraph concludes by summarizing that relational algebra operates on sets of relations and produces relations as a result, using various operators to filter, slice, and combine relations. It sets the stage for the next video, which will cover additional operators and alternative notations for relational algebra expressions.

Mindmap

Keywords

💡Relational Algebra

Relational Algebra is a formal language used for expressing queries on relational databases. It forms the theoretical foundation for many implemented query languages, such as SQL. In the video, relational algebra is introduced as a way to manipulate and query data in a structured manner. The script explains that relational algebra operates on relations and produces relations as a result, which is central to understanding how databases work.

💡Relation

A relation in the context of the video refers to a table in a database, which is a collection of tuples (rows) with the same attributes (columns). The video script uses the term 'relation' to describe the basic unit of data that relational algebra operates on, such as the 'student', 'college', and 'apply' relations in the college admissions database example.

💡Select Operator

The Select operator in relational algebra is used to filter rows in a relation based on a given condition. It is denoted by the Greek letter Sigma (ÎŁ) with a subscript for the condition. The video script provides examples of using the Select operator to find students with a GPA greater than 3.7 or applications to Stanford for a CS major, illustrating how it can be used to retrieve specific subsets of data from a database.

💡Project Operator

The Project operator is used to select specific columns from a relation. It is denoted by the Greek letter Pi (Π) with a subscript listing the column names to be included. In the video, the Project operator is used to extract just the student ID and decision from the 'apply' relation, demonstrating how to reduce a relation to only the columns of interest.

💡Cross-Product Operator

The Cross-Product operator, also known as the Cartesian product, combines every tuple from one relation with every tuple from another relation. The result is a new relation with a schema that is the union of the schemas of the two original relations. The video script uses the cross-product to combine the 'student' and 'apply' relations, creating a large relation with all possible combinations of students and their applications.

💡Natural Join

The Natural Join operator is a shorthand for performing a cross-product followed by a selection based on equality of common attributes. It automatically eliminates duplicate columns that result from the join. In the video, the Natural Join is used to combine the 'student', 'apply', and 'college' relations in a way that enforces equality on shared attributes like student ID and college name, simplifying the process of combining relations.

💡Theta Join

The Theta Join operator is similar to the Natural Join but allows for a specific condition (theta condition) to be applied when combining relations. It is equivalent to performing a cross-product followed by a selection based on the theta condition. The video script mentions that most database management systems implement the theta join as their basic operation for combining relations, which is why understanding this operator is crucial.

💡Duplicate Values

In relational algebra, duplicate tuples in the result of a query are eliminated, adhering to the set semantics where each element is unique. This is in contrast to multisets or bags, which allow duplicates. The video script explains that relational algebra queries, by default, remove duplicates, which is an important distinction when comparing relational algebra to other data models.

💡Keys

A key in a relation is an attribute or a set of attributes whose values are guaranteed to be unique. The video script underlines the keys for the 'college', 'student', and 'apply' relations as a reminder of their uniqueness. Keys are crucial for identifying tuples and for performing operations like joins in relational databases.

💡Schema

The schema of a relation refers to its structure, including the names and types of its attributes. In the video, the schema is used to describe how the 'student', 'college', and 'apply' relations are organized. Understanding the schema is essential for performing operations like joins and projections, as it dictates which attributes can be combined or selected.

Highlights

Relational algebra is a formal language that underpins implemented languages like SQL.

Queries in relational algebra operate on and produce relations.

The simplest query in relational algebra is the name of a relation, producing a copy of that relation.

The Select operator (σ) is used to filter rows based on a condition.

Multiple conditions in the Select operator are combined using the logical AND.

The Project operator (π) is used to select specific columns from a relation.

Operators can be composed to filter rows and select columns simultaneously.

Relational algebra eliminates duplicates in query results, unlike SQL which is based on multisets.

The cross-product operator combines every possible pair of tuples from two relations.

The natural join operator combines relations by enforcing equality on shared attributes and removes duplicate columns.

Natural join is a convenient notation but does not add expressive power to relational algebra.

The theta join operator is equivalent to applying a selection condition to the cross product of two relations.

Theta join is the basic operation for combining relations in most database management systems.

Relational algebra is used to express queries in a formal way, using a set of operators to manipulate relations.

The video series will cover additional operators and alternative notations for relational algebra in the next part.

Transcripts

play00:00

this is the first of two videos where we

play00:02

learn about relational algebra

play00:04

relational algebra is a formal language

play00:07

it's an algebra that forms the

play00:09

underpinnings of implemented languages

play00:11

like sequel in this video we're going to

play00:14

learn the basics of the relational

play00:15

algebra query language and a few of the

play00:17

most popular operators in the second

play00:20

video we'll learn some additional

play00:21

operators and some alternate notation

play00:23

notations for relational algebra now

play00:26

let's just review first from our

play00:28

previous video on relational querying

play00:30

that queries over relational databases

play00:32

operate on relations and they also

play00:34

produce relations as a result so if we

play00:37

write a query that operates say on the

play00:39

three relations depicted here the result

play00:42

of that query is going to be a new

play00:44

relation and in fact we can pose queries

play00:46

on that new relation or combine that new

play00:49

relation with our previous relations so

play00:52

let's start out with relational algebra

play00:54

for the examples in this video we're

play00:56

going to be using a simple college

play00:58

admissions database with three relations

play01:00

the first relation the college relation

play01:02

contains information about the college

play01:04

name state and enrollment of the college

play01:06

the second relation the student relation

play01:08

contains an ID for each student

play01:10

the students name GPA and the size of

play01:13

the high school they attended and

play01:14

finally the third relation contains

play01:16

information about students applying to

play01:17

colleges specifically the students I

play01:20

need the college name where they're

play01:21

applying the major they're applying for

play01:23

and the decision of that application

play01:25

I've underlined the keys for these three

play01:28

relations as a reminder a key is an

play01:30

attribute or a set of attributes whose

play01:32

value is guarantee be guaranteed to be

play01:34

unique so for our examples we're going

play01:36

to assume that college names are unique

play01:38

student IDs are unique and that students

play01:40

will only apply to each college for a

play01:42

particular major one time so we're going

play01:46

to have a picture of these three

play01:48

relations at the bottom of the slides

play01:49

throughout the video the simplest query

play01:53

in relational algebra is a query that is

play01:55

simply the name of a relation so for

play01:58

example we can write a query student and

play02:00

that's a valid expression in relational

play02:02

algebra if we run that query on our

play02:05

database we'll get as a result a copy of

play02:08

the student relation pretty

play02:10

straightforward

play02:11

now what happens next

play02:12

is that we're going to use operators of

play02:14

the relational algebra to filter

play02:16

relations slice relations and combine

play02:18

relations so let's go through those

play02:20

operators the first operator is the

play02:24

Select operator so the Select operator

play02:27

is used to pick certain rows out of a

play02:29

relation the Select operator is denoted

play02:31

by a Sigma with a subscript that's the

play02:34

condition that's used to filter the rows

play02:35

that we extract from the relations so

play02:37

we're just going to go through three

play02:39

examples here the first example says

play02:41

that we want to find the students whose

play02:42

GPA is greater than 3.7 so to write that

play02:45

expression in relational algebra we

play02:47

write the Sigma which is the selection

play02:49

operator as a subscript the condition

play02:51

that we're filtering for GPA greater

play02:53

than 3.7 and the relation over which

play02:56

we're applying that selection predicate

play02:58

so this expression will return a subset

play03:01

of the student table containing those

play03:03

rows where the GPA is greater than 3.7

play03:06

if we want to filter for two conditions

play03:08

we just do an end of the conditions in

play03:10

the subscript of the Sigma so if we want

play03:13

say students whose GPA is greater than

play03:14

3.7 and it's high school size is less

play03:17

than a thousand we'll write select GPA

play03:21

greater than 3.7 we'll use the logical

play03:25

and operator a caret high school size is

play03:27

less than a thousand and again we'll

play03:29

apply that to the student relation and

play03:31

once again the result of that will be a

play03:34

subset of the student relation

play03:35

containing the rows that satisfy the

play03:36

condition if we want to find the

play03:39

applications to Stanford for a CS major

play03:41

then we'll be applying a selection

play03:43

condition to the apply relation again we

play03:46

write the Sigma and now the subscript is

play03:48

going to say that the college name is

play03:50

Stanford and the major is CS again the

play03:55

and operator and that will be applied to

play03:59

the apply relation and they will return

play04:02

as a result a subset of the apply

play04:04

relation

play04:05

so the general case of the select

play04:08

operator is that we have this Sigma we

play04:10

have a condition as a subscript and then

play04:13

we have a relation name and we return as

play04:15

a result the subset of the relation our

play04:19

next operator is the project operator so

play04:21

the select operator picks certain rows

play04:23

and the project operator picks

play04:25

and columns so let's say we're

play04:27

interested in the applications but all

play04:29

we wanted to know is the list of ID's

play04:31

and the decisions for those applications

play04:33

the project operator is written using

play04:36

the Greek PI symbol and now the

play04:38

subscript is a list of the column names

play04:40

that we would like to extract so we

play04:42

write ID

play04:43

sorry student ID and decision and we

play04:47

apply that to the apply relation again

play04:49

and now what we get back is a relation

play04:52

that has just two rows it's going to

play04:54

have all the tuples of apply but it's

play04:56

only going to have the student ID and

play04:58

the decision columns so the general case

play05:01

of a project operator is the projection

play05:04

and then a list of attributes can be any

play05:07

number and then a relation name now what

play05:13

if we're interested in picking both rows

play05:15

and columns at the same time so you want

play05:17

only some of the rows and we want only

play05:19

some of the columns now we're going to

play05:21

compose operators remember that

play05:23

relational queries produce relations so

play05:26

we can write a query say with a select

play05:29

operator of the students whose GPA is

play05:32

greater than three point seven this is

play05:34

how we do that and now we can take that

play05:37

whole expression which produces a

play05:39

relation and we can apply the project

play05:41

operator to that and we can get out the

play05:43

student ID and the student name okay so

play05:50

what we actually see now is that the

play05:53

general case of the selection and

play05:55

projection operators weren't quite what

play05:57

I told you at first I was deceiving you

play05:59

slightly when we write the Select

play06:01

operator it's a select with a condition

play06:04

on any expression of the relational

play06:07

algebra and if it's a big one we might

play06:09

want to put parens on it and similarly

play06:11

the project operator is a list of

play06:14

attributes from any expression of the

play06:17

relational algebra and we can compose

play06:19

these as much as we want we can have

play06:21

select over projective or select select

play06:23

project and so on now let's talk about

play06:27

duplicate values in the results of

play06:29

relational algebra queries let's suppose

play06:31

we ask for a list of the app of the

play06:33

majors that people have applied for and

play06:35

the decision for those majors so we

play06:37

write that as the project

play06:39

the major and the decision on the apply

play06:42

relation you may think that when we get

play06:46

the results of this query we're going to

play06:48

have a lot of duplicate values so we'll

play06:49

have CS yes CS yes CS no EES EE no and

play06:54

so on you can imagine in a large

play06:55

realistic database of applications

play06:57

there's going to be hundreds of people

play06:59

applying for majors and having a yes or

play07:01

a no decision the semantics of

play07:03

relational algebra says that the

play07:05

duplicates are always eliminated so if

play07:08

you run a query that would logically

play07:10

have a lot of duplicate values you just

play07:12

get one value for each result that's

play07:14

actually a bit of a difference with the

play07:15

sequel language so sequel is based on

play07:18

what's known as multi sets or bags and

play07:21

that means that we don't eliminate

play07:23

duplicates whereas relational algebra is

play07:25

based on sets themselves and duplicates

play07:28

are eliminated

play07:29

there is a multicenter bag relational

play07:31

algebra defined as well but will be fine

play07:33

by just considering the set relational

play07:35

algebra in these videos

play07:37

our first operator that combines two

play07:39

relations is the cross-product operator

play07:41

also known as the Cartesian product what

play07:44

this operator does is it takes two

play07:45

relations it kind of glues them together

play07:47

so that their schema of the result is

play07:49

the union of the schemas of the two

play07:51

relations and the contents of the result

play07:53

are every combination of tuples from

play07:55

those relations this is in fact the

play07:57

normal set cross-product that you might

play07:59

have learned way back in elementary

play08:00

school so let's talk about say doing the

play08:03

cross-product of student and apply so if

play08:07

we do this cross-product just to save

play08:09

drawing I'm gonna just kind of glue

play08:11

these two relations together here so if

play08:14

we do the cross-product we'll get as a

play08:16

result a big relation here which is

play08:18

going to have eight attributes the eight

play08:20

attributes across the student and apply

play08:22

now the only small little trick is that

play08:25

when we glue two relations together

play08:26

sometimes they'll have the same

play08:28

attribute name we can see we have si D

play08:30

on both sides so just as a notational

play08:32

convention when cross-product is done

play08:34

and there's two attributes that are

play08:36

named they're prefaced with the name of

play08:38

the relation they came from so this one

play08:39

would be referred to in the

play08:41

cross-product as the student dot si D

play08:43

where this one over here would be

play08:45

referred to as the apply dot si D so

play08:48

again we blew together in the Cartesian

play08:50

product the two relations with

play08:52

four attributes each we get a result

play08:54

with eight attributes now let's talk

play08:56

about the contents of these so let's

play08:57

suppose that the student relation had

play09:00

estoppels in it and that's how many

play09:03

tuples while the apply had a tuples in

play09:05

it the result of the Cartesian product

play09:08

is going to have s times a tuples it's

play09:10

going to have one tupple for every

play09:13

combination of tuples from the student

play09:15

relation and the apply relation now the

play09:19

cross product seems like it might not be

play09:21

that helpful but what is interesting is

play09:23

when we use the cross product together

play09:25

with other operators and let's see a big

play09:27

example of that let's suppose that we

play09:30

want to get the names of GPAs of

play09:32

students with a high school size greater

play09:34

than a thousand who applied to see us

play09:36

and were rejected okay so let's take a

play09:39

look we're going to have to access the

play09:41

students and the apply records in order

play09:44

to run this query so what we'll do is

play09:46

we'll take student cross apply as our

play09:48

starting point so now we have a big

play09:51

relation that contains eight attributes

play09:54

and all of those tuples that we

play09:55

described previously but now we're going

play09:58

to start making things more interesting

play09:59

because what we're going to do is a big

play10:01

selection over this relation and that

play10:04

selection is first of all going to make

play10:05

sure that it only combines student and

play10:07

apply tuples that are referring to the

play10:10

same student so to do that we write

play10:12

student dot s ID equals apply dot s ID

play10:16

so now we've filtered the result of that

play10:19

cross-product to only include

play10:21

combinations of student and apply tuples

play10:22

that makes sense now we have to do a

play10:25

little bit of additional filtering we

play10:27

said that we want the high school size

play10:28

to be greater than a thousand so we do a

play10:31

little and operator in the high school

play10:33

we want them to have applied to CS so

play10:36

that's an major equals CS we're getting

play10:38

a nice big query here and finally we

play10:40

want them to have been rejected so and

play10:42

decision equals we'll just use an R for

play10:45

reject so now we've got that gigantic

play10:49

query but that gives us exactly what we

play10:51

want except for one more thing which as

play10:53

I said all we want is their names and

play10:55

GPAs so finally we take a big

play10:57

parentheses around here and we apply to

play10:59

that the projection operator getting at

play11:02

the student name

play11:04

the GPA and that is the relational

play11:07

algebra expression that produces the

play11:09

query that we've written in English now

play11:12

we've seen how the cross product allows

play11:14

us to combine tuples and then apply

play11:17

selection conditions to get meaningful

play11:19

combinations of tuples it turns out that

play11:21

relational algebra includes an operator

play11:24

called the natural join that is used

play11:26

pretty much for the exact purpose what

play11:27

the natural join does is it performs a

play11:30

cross-product but then it enforces

play11:32

equality on all of the attributes with

play11:34

the same name so if we set up our schema

play11:37

properly for example we have student ID

play11:39

and student ID here meaning the same

play11:42

thing then when the cross product is

play11:44

created it's only going to combine

play11:46

tuples where the student ID is the same

play11:47

and furthermore if we add collagen we

play11:51

can see that we have the college name

play11:52

here in the college name here if we

play11:54

combine college and apply tuples we'll

play11:56

only combine tuples that are talking

play11:58

about the same College

play11:59

now in addition one more thing that it

play12:01

does is it gets rid of these pesky

play12:04

attributes that have the same names so

play12:06

since when we combine for example

play12:08

student and apply with the natural join

play12:10

we're only combining case tuples where

play12:13

the student s ID is the same as the

play12:15

apply s ID then we don't need to keep

play12:17

two columns through copies of that

play12:19

column because the values are always

play12:20

going to be equal so the natural join

play12:24

operator is written using a bowtie

play12:28

that's just the convention you will find

play12:31

that in your text editing programs if

play12:33

you look carefully so let's do some

play12:35

examples now let's go back to our Sam

play12:38

query where we were finding the names

play12:40

and GPAs of students in from large high

play12:42

schools who applied to CS and were

play12:44

rejected so now instead of using the

play12:46

cross-product we're going to use the

play12:48

natural join which as I said was written

play12:50

with a bowtie

play12:51

what that allows us to do once we do

play12:54

that natural join is we don't have to

play12:56

write that condition that enforced

play12:57

equality on those two attributes because

play12:59

it's going to do it itself and once

play13:02

we've done that then all we need to do

play13:04

is apply the rest of our conditions

play13:05

which were that the high school is

play13:06

greater than a thousand and the major is

play13:09

CS and the decision is reject again

play13:13

we'll call that R and then since we're

play13:16

only getting the names and

play13:17

pas we write the student name and the

play13:21

GPA okay and that's the result of the

play13:25

query using a natural join so as you can

play13:27

see that's a little bit simpler than the

play13:28

original with the cross-product and by

play13:30

setting up schemas correctly natural

play13:32

join can be very useful now let's add

play13:34

one more complication to our query let's

play13:37

suppose that we're only interested in

play13:38

applications to colleges where the

play13:40

enrollment is greater than 20,000 so so

play13:43

far in our expression we've referred to

play13:44

the student relation in the apply

play13:46

relation but we haven't used the college

play13:47

relation but if we want to have a filter

play13:50

on enrollment we're gonna have to bring

play13:52

the college up the college relation into

play13:54

the picture this turns out to perhaps be

play13:56

easier than you think

play13:57

let's just erase a couple of our

play13:59

parentheses here and what we're going to

play14:02

do is we're going to join in the college

play14:04

relation with the two relations we have

play14:06

already now technically the natural join

play14:11

is a binary operator people often use it

play14:14

without parentheses because it's

play14:15

associative but if we get pedantic about

play14:17

it we could add that and then we're in

play14:19

good shape now we've joined all three

play14:21

relations together and remember

play14:22

automatically the natural join enforces

play14:25

equality on the shared attributes very

play14:27

specifically the college name here is

play14:30

going to be set equal to the apply

play14:31

College name as well now once we've done

play14:34

that we've got all the information we

play14:36

need we just need to add one more

play14:37

filtering condition which is that the

play14:39

college enrollment is greater than

play14:42

20,000 and with that we've solved our

play14:45

query so to summarize the natural join

play14:50

we the natural join combines relations

play14:52

it automatically sets values equal and

play14:55

attribute names are the same and then it

play14:56

removes the duplicate columns the

play14:59

natural join actually does not add any

play15:01

expressive power tooth to relational

play15:04

algebra we can rewrite the national

play15:05

natural join without it using the

play15:07

cross-product so let me just show that

play15:09

rewrite here if we have and now I'm

play15:12

going to use the general case of two

play15:13

expressions one expression natural join

play15:17

with another expression that is actually

play15:19

equivalent to doing a projection on the

play15:24

schema of the first expression I'll just

play15:27

call it e 1 now Union the schema of the

play15:30

second X

play15:30

and that's a real union so that means if

play15:32

we have two copies we just keep one of

play15:34

them over the selection now we're going

play15:39

to set all the shared attributes of the

play15:41

first expression to be equal to the

play15:43

shared attributes of the second so I'll

play15:44

just write e1 a1 equals e2 a 1 and E 1 a

play15:51

2 equals e2 a2 now these are the cases

play15:55

where again the attributes have the same

play15:58

names and so on so we're setting all

play16:01

those equal and that is applied over

play16:04

expression one cross-product expression

play16:07

2 so again the natural join is not

play16:11

giving us additional expressive power

play16:13

but it is very convenient notationally

play16:18

the last operator that I'm going to

play16:20

cover in this video is the theta join

play16:22

operator like natural join theta join is

play16:24

actually an abbreviation that doesn't

play16:26

add expressive power to the language let

play16:28

me just write it the F theta join

play16:29

operator takes two expressions and

play16:31

combines them with the bow tie looking

play16:34

operator but with a subscript theta that

play16:37

theta is a condition it's a condition in

play16:40

the style of the condition in the

play16:42

selection operator and what this

play16:44

actually says it's it's pretty simple is

play16:47

it's equivalent to applying the theta

play16:50

condition to the cross product of the

play16:52

two expressions so you might wonder why

play16:54

even mention the theta join operator and

play16:57

the reason I mention it is that most

play16:59

database management systems implement

play17:01

the theta join as their basic operation

play17:03

for combining relations

play17:05

so the basic operation is take two

play17:07

relations combine all tuples but then

play17:09

only keep the combinations that pass the

play17:11

theta condition often when you talk to

play17:13

people who build database systems or use

play17:15

databases when they use the word join

play17:17

they really mean the theta join so in

play17:21

conclusion

play17:22

relational algebra is a formal language

play17:24

it operates on sets of relations and

play17:27

produces relations as a result the

play17:29

simplest query is just the name of a

play17:31

relation and then operators are used to

play17:33

filter relations slice them and combine

play17:35

them so far we've learned the Select

play17:38

operator for selecting rows the

play17:40

projector operator for selecting columns

play17:42

the cross-product operator for combining

play17:44

every possible pair of tuples from two

play17:46

relations and then two abbreviations the

play17:49

natural joint which is a very useful way

play17:51

to combine relations by enforcing

play17:53

equality on certain columns and the

play17:55

theta join operator in the next video

play17:57

we'll learn some additional operators of

play17:59

relational algebra and also some

play18:01

alternative notations for relational

play18:03

algebra expressions

Rate This
★
★
★
★
★

5.0 / 5 (0 votes)

Étiquettes Connexes
Relational AlgebraDatabase QueriesData ManagementSQL LanguageProgrammingData StructuresEducational VideoData AnalysisDatabase SystemsComputer Science
Besoin d'un résumé en anglais ?