• Overview of ETL
• Planning Data Extraction
• Planning Data Transformation
• Planning Data Loads
Lesson 1: Overview of ETL
• ETL in a BI Project
• Common ETL Data Flow Architectures
• Documenting High-Level Data Flows
• Creating Source To Target Mappings
ETL in a BI Project
[Diagram: phases of a BI project]
• Business Requirements
• Technical Architecture and Infrastructure Design
• Data Warehouse and ETL Design
• Reporting and Analysis Design
Common ETL Data Flow Architectures
• Single-stage ETL (Source → DW)
  • Data is transferred directly from source to data warehouse
  • Transformations and validations occur in-flight or on extraction
• Two-stage ETL (Source → Staging → DW)
  • Data is staged for a coordinated load
  • Transformations and validations occur in-flight, or on staged data
• Three-stage ETL (Source → Landing Zone → Staging → DW)
  • Data is extracted quickly to a landing zone, and then staged prior to loading
  • Transformations and validations can occur throughout the data flow
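The three architectures above can be sketched as plain Python functions. This is only an illustration under stated assumptions: the `extract`, `validate`, and `transform` helpers, the row layout, and the in-memory lists standing in for the landing zone, staging area, and data warehouse are all hypothetical and do not come from the course material.

```python
def extract(source):
    """Copy rows out of the source unchanged, as quickly as possible."""
    return [dict(row) for row in source]


def validate(rows):
    """Hypothetical validation rule: drop rows that have no id."""
    return [row for row in rows if row["id"] is not None]


def transform(rows):
    """Hypothetical transformation: tidy up the name column."""
    return [{**row, "name": row["name"].strip().title()} for row in rows]


def single_stage(source):
    # Source -> DW: validation and transformation happen in-flight.
    return transform(validate(extract(source)))


def two_stage(source):
    # Source -> Staging -> DW: rows are staged, then loaded together.
    staging = transform(validate(extract(source)))
    dw = list(staging)
    return staging, dw


def three_stage(source):
    # Source -> Landing Zone -> Staging -> DW: the landing zone keeps a raw
    # copy, so the source is touched only briefly.
    landing = extract(source)
    staging = transform(validate(landing))
    dw = list(staging)
    return landing, staging, dw


# Hypothetical sample rows; the second one fails validation.
source = [{"id": 1, "name": " alice "}, {"id": None, "name": "bob"}]
```

All three functions load the same rows into the warehouse; what differs is how many intermediate copies exist and where validation and transformation could be moved.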
Documenting High-Level Data Flows
[Diagram: high-level data flow from ProductDB to Staging]
• Audit Start
• Filter on LastModified
• Lookup Subcategory
• Lookup Category
• Handle NULLs*
• Concatenate Size (Size + ' ' + MeasureUnit)
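The steps named in the data flow diagram (audit start, filter on LastModified, subcategory and category lookups, NULL handling, size concatenation) could look roughly like the sketch below. The column names other than `LastModified`, `Size`, and `MeasureUnit`, the `"Unknown"` placeholder, the lookup dictionaries, and the sample rows are assumptions made for illustration; the slide does not say how NULLs are actually handled.

```python
from datetime import datetime

UNKNOWN = "Unknown"  # assumed placeholder for missing values


def filter_modified(rows, last_extract):
    """Filter on LastModified: keep rows changed since the previous run."""
    return [r for r in rows if r["LastModified"] > last_extract]


def transform(row, subcategories, categories):
    """Lookup Subcategory, Lookup Category, handle NULLs, concatenate size."""
    sub = subcategories.get(row["SubcategoryID"])
    category = categories.get(sub["CategoryID"]) if sub else None
    size, unit = row["Size"], row["MeasureUnit"]
    if size is None:
        size_text = UNKNOWN
    elif unit is None:
        size_text = str(size)
    else:
        size_text = f"{size} {unit}"  # Size + ' ' + MeasureUnit
    return {
        "ProductID": row["ProductID"],
        "Subcategory": sub["Name"] if sub else UNKNOWN,
        "Category": category if category is not None else UNKNOWN,
        "Size": size_text,
    }


def run_data_flow(rows, subcategories, categories, last_extract):
    """Run the whole flow and return the staged rows plus an audit record."""
    audit = {"start": datetime.now(), "rows_staged": 0}  # Audit Start
    staged = [transform(r, subcategories, categories)
              for r in filter_modified(rows, last_extract)]
    audit["rows_staged"] = len(staged)
    return staged, audit


# Hypothetical lookup data and source rows.
subcategories = {1: {"Name": "Road Bikes", "CategoryID": 10}}
categories = {10: "Bikes"}
products = [
    {"ProductID": 1, "SubcategoryID": 1, "Size": 58, "MeasureUnit": "CM",
     "LastModified": datetime(2024, 1, 5)},
    {"ProductID": 2, "SubcategoryID": None, "Size": None, "MeasureUnit": None,
     "LastModified": datetime(2024, 1, 6)},
    {"ProductID": 3, "SubcategoryID": 1, "Size": 44, "MeasureUnit": "CM",
     "LastModified": datetime(2023, 12, 1)},
]
staged, audit = run_data_flow(products, subcategories, categories,
                              datetime(2024, 1, 1))
```

With these sample rows, product 3 is filtered out because it was last modified before the assumed previous extraction date, and product 2 is staged with placeholder values.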
Creating Source To Target Mappings
[Mapping table columns: Table | Column | Data type | Validation | Transformation]
• On extraction
  • From source
  • From landing zone
  • From staging
• In data flow
• In-place
  • In landing zone
  • In staging
[Diagram labels: Source, Landing Zone, Data Warehouse]
Transact-SQL vs. Data Flow Transformations
• Use Transact-SQL
SELECT CAST(c.CustomerID AS nvarchar(5)) AS CustomerAltKey,
CONVERT(nvarchar(50), c.FirstName + ' ' + c.LastName) AS CustomerName,
ISNULL(m.MembershipLevelName, 'Unknown') AS MembershipLevel
FROM src.Customers AS c
LEFT OUTER JOIN src.MembershipLevels AS m
ON c.MembershipLevel = m.MembershipLevelID;
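For comparison, the same logic can be expressed as a row-by-row data flow transformation instead of a single query. The sketch below mirrors the Transact-SQL statement in Python; it is not SSIS code, and the function name, the dictionary-based rows, and the sample data are assumptions. It also assumes `MembershipLevelID` is unique, so the left outer join never multiplies rows.

```python
def transform_customers(customers, membership_levels):
    """Mimic the query: cast the key, build the name, look up the level."""
    # Lookup table standing in for the LEFT OUTER JOIN.
    levels = {m["MembershipLevelID"]: m["MembershipLevelName"]
              for m in membership_levels}
    result = []
    for c in customers:
        first, last = c["FirstName"], c["LastName"]
        # In T-SQL, NULL + ' ' + value is NULL under the default settings.
        if first is None or last is None:
            name = None
        else:
            name = f"{first} {last}"[:50]  # CONVERT(nvarchar(50), ...)
        level = levels.get(c["MembershipLevel"])
        result.append({
            "CustomerAltKey": str(c["CustomerID"]),
            "CustomerName": name,
            # ISNULL(..., 'Unknown') covers unmatched and NULL names.
            "MembershipLevel": level if level is not None else "Unknown",
        })
    return result


# Hypothetical sample rows.
customers = [
    {"CustomerID": 1, "FirstName": "Ana", "LastName": "Diaz", "MembershipLevel": 2},
    {"CustomerID": 2, "FirstName": "Bo", "LastName": None, "MembershipLevel": 9},
]
membership_levels = [{"MembershipLevelID": 2, "MembershipLevelName": "Gold"}]
rows = transform_customers(customers, membership_levels)
```

The query does all of this in one set-based pass at the source, whereas a data flow applies each step per row in the ETL process; which fits better depends on where the data lives and how much load the source can take.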
• Minimizing Logging
• Loading Indexed Tables
• Loading Partitioned Fact Tables
Minimizing Logging
Logon Information
Start 20467D-MIA-DC and 20467D-MIA-SQL, and then log on to 20467D-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
• Review Question(s)