ETL - Extract Transform Load | Summary of all the key concepts in building ETL Pipeline
Summary
TL;DR: This video delves into the crucial concept of ETL (Extract, Transform, Load) pipelines, essential for data warehousing. It covers the extraction of data from various sources, transformation processes involving mapping, enrichment, and aggregation, and the final loading into data warehouses. The video is a valuable resource for both SQL beginners and experienced professionals.
Takeaways
- ETL stands for Extract, Transform, and Load, which are the three main phases of a data pipeline used in data warehousing.
- In the Extract phase, data is gathered from various sources like databases, flat files, or real-time streaming platforms like Kafka.
- Avoid complex logic during the extraction phase; simple transformations like calculating age from the date of birth are acceptable.
- Ensure data format consistency across multiple sources to maintain uniformity in the data warehouse.
- Apply data quality rules during extraction to ensure the integrity and relevance of the incoming data, such as filtering out records from before the business started.
- The staging area is a temporary holding place for data where basic transformations and quality checks occur before the data moves to the data warehouse.
- Common load strategies include full loads for small tables and delta loads for larger tables to manage changes efficiently.
- The Transform phase involves converting raw data into meaningful information through mapping, enrichment, joining, filtering, and aggregation.
- Mapping in the Transform phase can include direct column mappings, renaming, or deriving new columns from existing data.
- Fact tables in the data warehouse contain measures like total sales and are often linked to dimension tables via foreign keys.
- The Enterprise Data Warehouse (EDW) serves as the main business layer, storing processed data for reporting and analysis, and can feed downstream applications or data marts.
Q & A
What does ETL stand for in the context of data warehousing?
- ETL stands for Extract, Transform, and Load, which are the three main steps involved in the process of integrating data from different sources into a data warehouse.
Why is understanding ETL important for SQL beginners?
- Understanding ETL is important for SQL beginners because it is a fundamental concept in data warehousing and data integration, which are essential skills for working with databases and managing data flows.
What are the different sources from which data can be extracted?
- Data can be extracted from various sources such as OLTP systems, flat files, hand-filled surveys, and real-time streaming sources like Kafka.
What is the purpose of the extract phase in ETL?
- The purpose of the extract phase is to get data from the source as quickly as possible and prepare it for the subsequent transformation phase.
What is the significance of data format consistency in the extraction phase?
- Data format consistency ensures that the same data is represented in the same manner across different sources, simplifying the integration process and reducing errors during data transformation.
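As a minimal sketch of this kind of standardization (the staging and target table names and the incoming gender codes here are assumptions, not from the video), a CASE expression can map each source's convention onto a single target convention during the load:

```sql
-- Hypothetical staging table where each source uses its own gender convention;
-- everything is standardized to 'M' / 'F' / 'O' on the way into the target.
INSERT INTO target_customer (customer_id, gender)
SELECT
    customer_id,
    CASE
        WHEN gender_raw IN ('Male', 'M', '0')   THEN 'M'
        WHEN gender_raw IN ('Female', 'F', '1') THEN 'F'
        ELSE 'O'                                -- 'Others', 'O', '2', or unknown
    END AS gender
FROM stg_customer;
```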
What are some examples of data quality rules that can be applied during the extraction phase?
- Examples of data quality rules include checking that sales data is from the correct time period (e.g., after the business started), ensuring that related columns have corresponding values, and limiting the length of description columns to save storage space.
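The following is a rough SQL sketch of such rules, assuming hypothetical stg_sales, err_sales, and clean_sales tables: out-of-range or inconsistent records are routed to an error table, and the unused description column is truncated to 500 characters.

```sql
-- Route records that violate the quality rules to an error table for reporting.
INSERT INTO err_sales
SELECT *
FROM   stg_sales
WHERE  sales_date <  DATE '2015-01-01'                 -- business only started in 2015
   OR (sales_date IS NOT NULL AND sales_id IS NULL);   -- sale date without a sale id

-- Load only the clean records, keeping just the first 500 characters of the description.
INSERT INTO clean_sales (sales_id, sales_date, description)
SELECT sales_id,
       sales_date,
       SUBSTRING(description FROM 1 FOR 500)           -- ANSI-style SUBSTRING
FROM   stg_sales
WHERE  sales_date >= DATE '2015-01-01'
  AND  sales_id IS NOT NULL;
```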
What are the two popular load strategies for the extract phase?
- The two popular load strategies for the extract phase are full load, where the entire table is sent every time, and delta load, where only changes to the table are sent.
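A minimal sketch of the delta pattern when the source does not send change flags (table and column names are hypothetical, and MERGE syntax varies slightly between databases): the staging table is compared to the target on the primary key to decide between an update and an insert.

```sql
-- Delta load without source-supplied flags: compare staging to target on the key.
MERGE INTO dim_customer AS tgt
USING stg_customer      AS src
   ON tgt.customer_id = src.customer_id
WHEN MATCHED THEN
    UPDATE SET tgt.customer_name = src.customer_name,
               tgt.city          = src.city
WHEN NOT MATCHED THEN
    INSERT (customer_id, customer_name, city)
    VALUES (src.customer_id, src.customer_name, src.city);
```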
What is the main purpose of the transform phase in ETL?
- The main purpose of the transform phase is to apply various data transformations and mappings to convert raw data into meaningful information that can be used for business analysis and reporting.
What are some common transformation steps involved in the transform phase?
- Common transformation steps include mapping, enrichment, joining, filtering, removing duplicates, and aggregation.
What is the difference between a dimension table and a fact table in a data warehouse?
- A dimension table typically contains descriptive information about the data (e.g., employee details) and has a primary key, while a fact table contains quantitative measures (e.g., sales figures) and includes foreign keys that reference the primary keys of dimension tables.
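As an illustration (the schema and column names are assumptions for this sketch), a dimension carries a surrogate primary key plus descriptive attributes, while the fact carries foreign keys to the dimensions plus additive measures:

```sql
-- Dimension: surrogate key, natural key, and descriptive attributes.
CREATE TABLE dim_employee (
    employee_key   INTEGER       NOT NULL,   -- surrogate key (identity / sequence)
    employee_id    VARCHAR(20)   NOT NULL,   -- natural / functional identifier
    employee_name  VARCHAR(100),
    city           VARCHAR(50),
    PRIMARY KEY (employee_key)
);

-- Fact: foreign keys sourced from the dimensions plus measures for aggregation.
CREATE TABLE fact_sales (
    sales_key      INTEGER       NOT NULL,
    employee_key   INTEGER       NOT NULL REFERENCES dim_employee (employee_key),
    date_key       INTEGER       NOT NULL,   -- would reference a date dimension
    sales_amount   DECIMAL(12,2),            -- additive measure
    PRIMARY KEY (sales_key)
);
```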
What is the role of the load phase in the ETL process?
- The load phase is responsible for loading the transformed data into the appropriate tables in the data warehouse, such as dimension tables, fact tables, and enterprise data warehouse (EDW) tables, and making it available for business intelligence and reporting.
What is the purpose of data marts in the context of ETL?
- Data marts are subject-specific areas derived from the enterprise data warehouse (EDW), used for focused analysis and reporting. They contain data that is specific to a particular business area or department.
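For example (the EDW table names and the region filter are hypothetical), a data mart can be derived from the EDW by materializing only the subject-specific slice:

```sql
-- A subject-specific data mart: only North America sales, derived from the EDW.
-- CREATE TABLE ... AS SELECT is widely supported; some databases use SELECT INTO instead.
CREATE TABLE mart_na_sales AS
SELECT f.date_key,
       f.store_key,
       f.sales_amount
FROM   edw_fact_sales f
JOIN   edw_dim_store  s ON s.store_key = f.store_key
WHERE  s.region = 'North America';
```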
Outlines
Introduction to ETL Pipelines
The video script introduces the concept of Extract, Transform, Load (ETL) pipelines, emphasizing their importance for building data warehouses. It targets both SQL beginners and experienced professionals, with the aim of providing a comprehensive understanding of ETL. The speaker outlines the key points of ETL that every developer should know, mentioning that these are not exhaustive but are crucial based on the presenter's understanding. The first topic discussed is the extraction phase, which involves obtaining data from various sources in different formats. Methods for extraction include flat files, JDBC/ODBC connections, SFTP for security reasons, and real-time streaming solutions like Kafka. The frequency of extraction and ingestion is determined by business needs and is not always real-time. The script also touches on simple logic application and data format consistency during the extraction phase.
Data Quality and Load Strategies in ETL
This paragraph delves into maintaining data quality during the ETL process, focusing on the extraction phase. It discusses applying data quality rules, such as ensuring data pertains to the period a business has been operational, and handling records that do not meet these criteria by either ignoring them or flagging them for review. The importance of data format consistency across multiple sources is highlighted, with examples given for gender representation and date formats. The paragraph also covers the strategies for loading data into the staging area, such as full loads for small tables and delta loads for larger, frequently updated datasets. The use of flags for inserts and updates simplifies the loading process, but when not provided, it necessitates a comparison between the staging and target tables. The paragraph concludes with the typical truncate and load approach for the extraction phase, which is not intended for running business queries.
Transform Phase: Enhancing Data for Business Insights
The transform phase of the ETL process is the focus of this paragraph, where raw data from staging tables is converted into meaningful information through various data transformations. The speaker outlines several key transformation steps, including mapping (source to target mapping, renaming, and deriving new columns), enriching data through lookups, joining multiple tables into a single one for clearer reporting, filtering data for specific business needs, and removing duplicates to ensure data quality. Aggregation is also discussed as a critical step, where measures are calculated to support business decisions, such as total sales or revenue. The paragraph emphasizes the importance of these transformation steps in preparing data for the load phase, making it ready for business consumption and analytics.
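A compact sketch of several of these steps in one statement (all table and column names are assumptions): a rename mapping, a derived full-name column, an enrichment lookup from zip code to city, and a region filter.

```sql
-- Mapping, a derived column, an enrichment lookup, and a filter in a single pass.
INSERT INTO int_customer_na (customer_id, full_name, city, signup_date)
SELECT
    s.cust_id                            AS customer_id,   -- column rename mapping
    s.first_name || ' ' || s.last_name   AS full_name,     -- derived column (ANSI concat)
    z.city                               AS city,          -- enrichment via lookup
    s.signup_date
FROM stg_customer  s
JOIN lkp_zip_city  z ON z.zip_code = s.zip_code            -- join for enrichment
WHERE z.region = 'North America';                          -- filter to one region
```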
Load Phase: Dimension and Fact Tables in Data Warehousing
The load phase of the ETL process is detailed in this paragraph, which involves loading data into intermediate, dimension, and fact tables. Dimension tables are highlighted as having primary keys (surrogate keys) and functional identifiers, with attributes that define the information stored. The paragraph discusses different load strategies for dimension tables, including Type 1, Type 2, and hybrid models, and the importance of defining granularity for accurate reporting. Fact tables are explained as having primary keys, foreign keys from dimension tables, and measures that can be aggregated. The nature of these tables is such that they generally support additive measures, although semi-additive and non-additive measures also exist. The paragraph concludes with a brief mention of the Enterprise Data Warehouse (EDW) and data marts, which are used for storing processed data and subject-specific analysis, respectively.
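A minimal sketch of the Type 2 load mentioned above, using the employee-moves-city example (the keys, dates, and housekeeping columns are assumptions): the current row is closed out and a new version is inserted, so the surrogate key stays unique while the natural key repeats.

```sql
-- SCD Type 2: expire the current version of the changed employee ...
UPDATE dim_employee
SET    current_flag = 'N',
       end_date     = CURRENT_DATE
WHERE  employee_id  = 'E1001'
  AND  current_flag = 'Y';

-- ... then insert a new version with a fresh surrogate key.
INSERT INTO dim_employee (employee_key, employee_id, employee_name, city,
                          start_date, end_date, current_flag)
VALUES (1002, 'E1001', 'Sample Name', 'Hyderabad',
        CURRENT_DATE, DATE '9999-12-31', 'Y');
```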
Wrapping Up the ETL Process
In the final paragraph, the speaker summarizes the ETL process, from extracting data from various sources to transforming and loading it into the appropriate tables for business intelligence and reporting. The video aims to be educational for beginners and a refresher for experienced professionals. The speaker invites viewers to comment if any points were missed and expresses gratitude for watching. The paragraph reinforces the importance of understanding ETL for anyone involved in data warehousing and business analytics, ensuring that the audience is equipped with the knowledge to handle ETL pipelines effectively.
Keywords
ETL
Data Warehouse
OLTP
Flat File
JDBC and ODBC
Data Governance
Real-time Streaming
Data Quality Rules
Staging Area
Batch Processing
Dimension Table
Fact Table
Semi-Additive and Non-Additive Facts
EDW (Enterprise Data Warehouse)
Data Mart
Highlights
ETL (Extract, Transform, Load) is essential for building data warehouses and is a crucial topic for both SQL beginners and experienced professionals.
The Extract phase involves getting data from various sources as quickly as possible, including databases, flat files, and real-time streaming solutions.
Data can be extracted using flat files, JDBC/ODBC connections, or by pushing files via SFTP to a landing area.
Real-time streaming, supported by solutions like Kafka, is crucial for businesses requiring low latency, such as credit card inquiries on websites.
The Transform phase is where raw data is converted into meaningful information through various data transformations and mappings.
Mapping in the Transform phase involves source to target mapping, which can be direct, column-level, or involve creating new derived columns.
Data enrichment in the Transform phase can involve looking up additional information to make the data more meaningful for business reports.
Join transformations are used to combine data from multiple tables into a single table for easier reporting and analysis.
Filter transformations allow for the selection of specific data subsets, such as filtering data for a specific region like North America.
Duplicate records can be a common issue, and the ETL process should be designed to handle them effectively.
Aggregation is a key transformation step, often involving group by clauses and aggregation functions like SUM, MAX, MIN to calculate business measures.
The Load phase involves loading data into intermediate tables, dimensions, facts, and creating data marts for business intelligence.
Dimension tables should have a primary key (surrogate key) and a functional identifier or natural key for uniqueness and business relevance.
Fact tables, in addition to a primary key, also contain foreign keys sourced from dimension tables and include measures for aggregation.
The EDW (Enterprise Data Warehouse) is the main business layer that supports BI and contains processed data for decision-making.
Data marts can be derived from the EDW for subject-specific analysis and are used for reporting and visualization.
ETL pipelines should be designed to handle various data formats and quality rules, ensuring data consistency and accuracy.
Load strategies for ETL can include full loads for small tables or delta loads for larger tables to manage changes efficiently.
Transcripts
hello everyone my name is nitin welcome
back to my channel in today's video
we'll talk about extract transform load
or in short etl
so if you are building any data
warehouse you must be aware of etl
pipelines
if you are a sql beginner i must tell
you that this is one of the most
important topic which you should
understand thoroughly
if you are an experienced professional i
am pretty sure this video will work as a
very good refresher for you
as part of this video i have covered the
important points which everybody must
know as part of etl
also want to mention that these are not
the only points so don't think that it
is a complete exhaustive list but from
my side as per my understanding whatever
i feel is very important and are a must
for any etl developer i have listed that
in this video
so without wasting any time let's start
with the very first topic which is
extract
extract based basically means that you
have to get data from your source so
whatever the source could be it could be
an oltp sources running on some database
solutions it could be a flat file it
could be a hand filled surveys it could
be anything so for your data warehouse
there can be n number of multiple
sources and those sources can send you
data in different formats extract phase
means you have to get data as quickly as
possible from your source and how can we
do that
so there are multiple ways by which you
can achieve that the most common way is
using the flat files so the source will
extract data from their environment and
they will send you it in flat file
formats if the source is running some
typical database solutions as oltp
sources they may even let you connect
via jdbc and odbc and then you can
connect to those sources and extract the
data yourself
however in some of the cases because of
data governance and security issues the
source may not allow you to connect
directly to their servers so in that
case they may extract the files they
will push files by sftp to your
environment and we call it a landing
area
and from there you can consume in in
your tables
now with or the near real time streaming
or the real time streaming has come into
the picture so now solutions like kafka
kinesis all these support real-time
streaming so if you are supporting a
business where latency or the delay in
consuming the data or
say for example if you're working for a
credit card business and there is a
person who is visiting your website and
who is inquiring about your different
credit cards so as a business partner
right what you may want to do is reach
out to that customer immediately as
as and when he's surfing your website
because in a typical data warehouse if
there is a 24 hour delay
and say next day you reach out to that
person that yesterday you were on our
website and you were interested in
different credit cards we have some
offer for you would you like to take it
now
because of this delay you may end up
losing a customer but at that moment
when he's actually on your website and
you have real time streaming enabled and
you are streaming the click stream data
into your systems and and supporting
your business in real time that is one
of the use cases i can think of in that
case
turning your potential customers into
actual customers it's very high so
depending on the business you are
supporting you may go for the real-time
streaming solutions as well for
extractions
and ingestion and then there is a
typical pipeline batch where
on decided frequency once a day or twice
a day once a week
the data will come to your environment
and you will ingest it into your tables
so extraction basically means you you
have to get data done from source and
you have to do it as quickly as possible
that does not mean that it should be
real time streaming all the time there
are other factors which determines that
what will be the frequency of extraction
and ingestion of the data
now moving on to the next topic which is
no complex logic so in extraction phase
which typically happens in the staging
area we do not apply any complex logic
so simple logic like if you have date of
birth from there if you want to
calculate the age of the customer maybe
that kind of logic you can implement in
in this staging area
which comes in the extraction phase
where you can apply very very basic
logic like determining the age so no
complex logic in the extraction phase
next point is data format consistency so
what i mean by that is you may have
multiple sources sending you data and
most of the time you may feel that there
is a relationship between data sent by
different sources so here say example
i have a gender column and source one is
sending me male female and others right
source two might be sending the same
information which is gender but they may
be using the convention mfo
however source three could be sending it
like 0 1 and 2.
now i cannot ingest all these data as is
in my data warehouse right i have to
have a data format consistency so that
everywhere the same data is represented
in the same manner so i may choose in my
target that i'll represent for gender
column whenever any source is sending me
gender values
i will apply some logic and i will keep
it as mfo the other example could be a
it's a very basic example where you
apply some consistency into your date
formats so source one may be sending you
data in this format yyyy hyphen mm
hyphen dd similarly source two could be
sending data in their own format and
source 3 and source 4. so now in your
target you can decide that okay for any
incoming date column by default i'll
have this format so you can apply some
basic
data format transformations on your
incoming data in this extraction phase
the next point is the data quality rules
so in the extraction phase you can also
apply some data quality rules
like say your business has started from
2015 onward so any sales you have made
you know that it should be after 2015
right it should not be 2014 or before
that because your business was not even
in the existence at that time so you can
apply this data quality rule that while
ingesting any data the source should be
sending you data from 2015 onwards so if
it is not following this data quality
rule you can either push it to an error
table so that you can report it to the
source or you can simply ignore such
records
so that is one example the other example
could be that there are sometimes
description columns and we never use
this description column in our warehouse
data warehouse for any business or
analytical purpose we are just storing
it so in that case probably just to save
some storage space right and speed up
the process you may restrict it to the
first 500 characters only so you can
apply that kind of rule also say if you
if you are getting a value for column
one
then column two should also have a value
it cannot be null so that data quality
rules also you can apply so for example
if the source is sending you some data
and if the source has sales date in it
then probably you can check for the
sales id column also that if sales has
been made and you are getting a sales
date value from the source you should
also get the sales id for that else in
that case it's an error record and you
can move it to their tables these are
some of the basic data quality rules
also you can apply in this phase
generally the tables who are involved in
the extraction phase typical the staging
tables are truncate and load so today
file will come you will truncate the
table you will load the today file you
will process the data tomorrow when the
new file will come you will delete the
previous table completely and you will
load the fresh file and then you will
proceed with your etl pipeline so
generally these are truncate and load
there it is not supposed to run any
business queries so in the extraction
phase right the staging tables which are
involved you are not supposed to run any
business queries on top of that it is
strictly for technical purpose only it
is primarily built to support your etl
pipelines and you are not supposed to
run any business queries in fact the
business team should not be even given
access to the staging area
now what can be the popular load
strategies right for the extract so it
could be a full or it could be a delta
so if a table is pretty small say a few
hundred or even thousands of rows the
source may send you full data every time
so this actually reduces the overhead or
the operations overhead of maintaining
the incremental flow so in that case
it's a small table the source may end up
sending you full table all the time so
in that case you will simply truncate
and load your final table
the other popular strategy is the delta
this is specially very good for the
bigger tables where on daily basis you
are getting hundreds and thousands of
rows and and with time it becomes a very
big table so in such cases what happens
source generally identify
what what are the updated records
deleted records or the new records and
they send only the changes happening to
their table to you and then you have to
load this data into your staging area
and then you have to flag the records in
some cases the source may also send you
the flag that i for insert and d for
delete u for update like that so this
makes your life easy but if the source
is not sending you any flag then in that
case you have to load this data into
staging table and compare it with your
target table on the basis of primary key
to determine whether it's a new record
or it's an old record with some updates
and you have to apply update on the
table
so those are two very popular load
strategies in the staging areas and the
load approach obviously when you are
starting it for the first time right you
will go for the historical load so in
that case you have to bring in all the
data till date and then you will load it
into your tables and after that you will
run the incremental loads on daily basis
or whatever is the frequency
so that is about the extract phase guys
i hope you are clear that when you are
working on the extract phase primarily
you are getting data from source and you
are ingesting data into your staging
areas these are some of the very
critical points you should be aware of
and these are the steps generally taken
in the etl pipeline for the extract one
now moving on to the transform phase
right transform phase what does this
mean actually so in transform phase
whatever data you have ingested in your
staging tables you will apply
different
data transformations or different data
mapping rules on it to make it more
meaningful so in this phase we will
start converting a raw data into
meaningful information
and in transform phase you can either
have predefined set of steps that for
any incoming data you will follow these
steps one by one in sequence and at the
end of those steps you will have a
enriched data
which makes more more sense to your
business so some of the common
transformation steps i have mentioned in
this video it is not an exhaustive list
but as per my understanding these are
the most important transformation steps
which should be there
typically in all the etl pipeline
the very first step is the mapping so
mapping basically means source to target
mapping so when you're getting data from
the source right they may have they may
be sending you data for 10 columns and
for every 10 incoming 10 columns there
will be a mapping for those columns in
the target table so it could be like it
could be as is mapping where whatever
the source is sending you data you just
move that data as is into your table
without any transformation or it could
be a column level mapping that or it
could be a new derived columns also so
say source is sending you first name and
last name and now in your business you
want full name of the customer so you
may create a new derived column
concatenating first name and last name
and coming up with the full name
similarly you can apply you can rename
the columns like source may be sending
you emp underscore id for the employee
id column and in your case you want
the full name so you can change it to
employee id so you can rename the
columns
so that's a typical example of mapping
transformations you can enrich the data
so what will happen sometimes the source
will send you some data which may not be
very meaningful
from your business perspective so in
that case you can do a quick look up
into some other tables and you can add
some more information to it so that the
output is more meaningful for your
reports right so one example could be
the source could be sending you zip code
but the promotions your marketing team
is running is at the city level so what
you may want to do is rather than just
storing the zip code you may want to
convert that zip code into a city level
by doing a lookup into the table and
then storing city value into your tables
right similarly the third type of
transformation is the joint where you
have data into multiple tables and then
you
join those multiple tables and you load
a single table by joining these tables
so this makes more sense because
sometimes it happens that
source are sending you data source may
have different
tables at their own
environment also and they are extracting
it one by one and they are just sending
you data but for you it makes more sense
to club those data into a single table
and then use it for reporting so you may
go for the join transformations
similarly you can apply a filter
transformation also so say uh it could
your source could be sending you a
global data like it it spread across
multiple countries and the source is
sending you data for all the countries
in one file now you are going to
support a promotion or a marketing
scheme which is applicable only for
north america so in that case if you're
creating a data mart or if you're
creating if you are bringing data
specifically for your north america
continent you may want to apply a filter
in incoming data so that you select only
the north america data and then you move
it to your tables right so filter is
another very important transformation
step
during transform phase the other is a
very common problem actually but i am
pointing it here as a step because if
required right if you are frequently
facing this problem which is of remove
duplicates then probably you should add
this as a typical step for all incoming
data so you should your ideally your etl
process should handle duplicate records
although it's not mandatory for you but
i think it is
a good practice to have this check now
how duplicates can end up in your
environment
one very common reason is the source has
sent you the same file again right so
like today is say 20th
and
source has sent you a 19th file again
right so you will process the same file
assuming that it is of 20th but when you
process the same data and if your etl
pipelines are not properly built you may
process the same data again so if you
consume the same data again you will end
up having a duplicate records in your
tables right which may lead to data
quality issues
uh the other reason could be that you
might have run the same job again so for
19th you might have run the same job
again and now you have duplicate data in
your tables so your etl pipeline should
be designed in a way that you should be
able to handle your duplicate records
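A rough SQL sketch of the deduplication step described above, with hypothetical staging table and column names; only the latest copy of each business key is kept:

```sql
-- Keep one row per sales_id when the same file has been processed twice.
INSERT INTO clean_sales (sales_id, sales_date, sales_amount)
SELECT sales_id, sales_date, sales_amount
FROM (
    SELECT s.sales_id,
           s.sales_date,
           s.sales_amount,
           ROW_NUMBER() OVER (PARTITION BY sales_id
                              ORDER BY load_timestamp DESC) AS rn
    FROM stg_sales s
) t
WHERE rn = 1;   -- latest occurrence of each record only
```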
now aggregation when you're building
your data warehouse there will be times
when you are calculating the measures
and these measures actually support your
businesses that okay a business may want
to know that how many sales have
happened in a particular store in a
given quarter you may want to do some
aggregation which is equivalent to
basically a group by clause in your sql
and then you may want to come up with
some numbers by adding those coming up
with some or average max and min
and you may want to make your data more
meaningful right so aggregation is one
another very important transformation
which you must be very well aware of
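A rough sketch of the aggregation described here, with hypothetical table and column names (quarter extraction is dialect-dependent): total, maximum, and minimum sales per store per quarter.

```sql
-- Total sales per store per quarter, as a group-by aggregation.
SELECT store_id,
       EXTRACT(YEAR    FROM sales_date) AS sales_year,
       EXTRACT(QUARTER FROM sales_date) AS sales_quarter,
       SUM(sales_amount)                AS total_sales,
       MAX(sales_amount)                AS largest_sale,
       MIN(sales_amount)                AS smallest_sale
FROM   sales
GROUP BY store_id,
         EXTRACT(YEAR    FROM sales_date),
         EXTRACT(QUARTER FROM sales_date);
```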
right so this is about the transform i
hope
i have covered all the important
critical steps which happens during the
transformation phase
if you feel i missed any
feel free to leave a comment below and
probably i'll try to add that as well
now moving on to the last step which is
the load phase so in the load phase what
happens is you have the data you have
extracted data from source ingested into
your staging tables
you have read data from your staging
tables apply a lot of transformation
steps into it and load it into
intermediate tables now what happens is
once you have data into your
intermediate tables these intermediate
tables typically what happens is in many
scenarios enterprises don't want to have
a separate data layers every time so
these intermediate
steps right many times it happens it
will happen in the temporary table or
the volatile table which directly loads
the dimensions and fact right so that's
this is also a possibility now coming to
the load phase basically you will be
loading your dimension tables you will
be loading your fact tables you will be
loading your ew tables and you will be
creating data marts out of your data
warehouse as well
so what are dimension tables so this is
not a specific dimension table video but
i'll tell you the key points as per me
that when you're loading a dimension
table these are the key points you
should consider
so one is obviously the dimension table
typically should have a primary key and
which which are generally
created using auto increment column or
the identity columns in some rdbms
solutions or you can call it as a
surrogate key so these are meaningless
these are just like the number one two
three four five
and you basically it is used to create
uniqueness for each record right however
business wise it does not make much
sense because these are purely technical
columns and these are like a sequence
now a dimension table must also have a
functional identifier or a natural key
so to explain it more let me use some
examples right so it does not mean that
a functional identifier is always
one row in each dimension table
depending on the load type of the
dimension right like in scd2 where you
will maintain the history
the primary key or the auto increment or
the surrogate key will always be unique
but the functional identifiers may
repeat right so say
employee who have moved his city from
bangalore to hyderabad the employee will
remain the same in your employee table
right though that information will
remain the same
like his date of joining
right his name
and his pan card all these details will
remain the same only his
city office city will change from
bangalore to hyderabad so the functional
identifier may exist multiple times
however the surrogate key associated
with those function identifier will
still be unique
right and this is unique for each row
but it may have multiple occurrences
depending on whether you are maintaining
scd2 table or an scd1 table
the other thing dimension tables has the
attributes attributes basically tells
you that what information this dimension
actually is storing so it's like for
employee is the dimension table
if you have an employee dimension table
it may have employee name employee city
employee date of birth so these are the
basically the attributes of your
employees
as i said earlier dimension has the load
strategy it could be scd 1 2 3 or the
hybrid scd model which is used to load
any dimension table whether you want to
maintain history or you don't want to
maintain history
and if you if your dimension is very
small and kind of a static not much
changes are coming into that dimension
table you may even want to go for a
truncate and load option for that
dimension table
and granularity whenever you're
designing a data model you have to make
sure that your dimension granularity is
properly defined
because if that is not well defined then
in that case reporting when you are
creating a reporting and visualization
on top of those dimension tables you may
not be able to reflect data in a proper
manner to your business right so grain
is one thing you should consider while
loading any dimension tables
next to dimension is the fact tables so
fact tables
if you know that it should have a
primary key which is similar to the
surrogate keys which we see in the
dimension so even the fact tables have
the primary key or the auto increment
identity columns or the surrogate keys
however in addition to that they also
have a foreign key so foreign key
basically are the the primary keys which
are sourced from the source from the
source dimension table into the facts so
typically the dimension tables are
always loaded first
and the fact table source the primary
key of dimension table into it
as and reference them as a foreign key
so there is a foreign key relationship
between primary key foreign key
relationship between dimensions and fact
and fact table generally
have the measures like total sales total
revenue where you will apply some group
by you will apply some aggregation
functions like sum max min to calculate
some measures that being said all facts
does not have the additive property
there are some semi-additive and
non-additive facts also but in general fact tables
generally have uh the measures and we do
aggregation those measures could be
additive uh across all the attributes or
across some of the attributes
now that's about fact table
next is the edw so this is your main
business layer where you are storing all
the edw tables so your edw tables has the
process data which is very important for
your business team to make any decisions
so these tables are generally exposed to
that business or the reporting team or
the bi engineers in your team who will
read data from these
process data from these tables and will
create reports on top of that so edw
basically means it's a main data layer
which is so which supports the bi
and it has all the process data which is
kind of end of the etl pipeline so all
the process data is also shared with the
downstream applications which could be
reporting or it could be some other team
where you will export data from these
tables into flat files and push it into
there so now rather than acting as the
consumer or the target you will act as a
source and you will export that process
data and you will push it to some other
team which will now start their etl
pipeline or their consumption process
similar to edw you can have data marts
also so sometimes you may want a very
subject specific area where you can do
your analysis so in that case you can
derive data marts from your edw you will
create separate tables in data marts and
you will source only subject specific
data from your edw and create data marts
and
data marts are also used for reporting
and visualization purpose
so that's it guys
i wanted to cover this in today's video
that what is etl extract transform and
load and what all the key steps in each
phase i hope this video was helpful you
if you are a fresher i'm pretty sure you
might have learned some new things today
and if you are a experienced
professional as i said earlier i am i
believe that this video was a good
refresher for you
if you feel i missed any point here feel
free to drop a comment and i'll try to
correct it thank you very much for
watching the video thanks