Data Flow (ETL / ELT)

Stages: Ingestion, Data Warehouse / Data Lake, SQL Virtualization Engine, Mart
Data Source: Operational Database; Unstructured / Semi-Structured data; External Data
Cloudera Data Platform, CDH Cluster (n-node): Impala, Kudu, Hue
  ELT: Impala Job
  ETLT: Talend
Presentation Layer: Business Intelligence Tools; Visualization / Reporting
Data Scientist Playground: Advanced Analytics; CDSW (Cloudera Data Science Workbench); Ad-hoc Data Exploration
Data Governance: Cloudera Navigator


The same architecture (Data Source; Cloudera Data Platform, CDH Cluster (n-node) with Impala, Kudu and Hue; ELT: Impala Job; ETLT: Talend; Advanced Analytics; Cloudera Navigator; Presentation Layer with Business Intelligence Tools), divided into color-coded sectors by responsible role:

Business User: handles the Blue Sector
IT Ops: handles the Green Sector
Data Steward: handles the Purple Sector
Data Engineer: handles the Orange Sector
Data Scientist: handles the Yellow Sector
Minimum Configuration: 1M3W1E (+1U), i.e. 1 Master, 3 Worker, 1 Edge (+1 optional Utility)

Master Node: 8 cores, 64 GB RAM, 256 GB SSD; physical host preferable
Worker Node (x3): 16 cores, 128 GB RAM, 256 GB SSD for OS & apps, 1 TB HDD for data; must be physical
Edge Node: 16 cores, 128 GB RAM, 512 GB SSD for OS & apps; physical host preferable
Edge Node (if using Talend): 2 slots x 16 cores, 256 GB RAM, 512 GB SSD for OS & apps, 500 GB HDD for temp
Utility Node: 8 cores, 64 GB RAM, 256 GB SSD; physical host preferable

Master Node: coordinates all Worker Nodes (including load balancing).
Worker Node: runs the transformation process and stores the data.
Edge Node: hosts all client tools; they interact with the Master Node but get results directly from the Worker Nodes.
Utility Node: hosts the Cloudera Manager and Management Service backend (optional).
Talend Big Data Integration Server: the server for the Talend ETL tools; it can be placed on the Edge Node.
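
As a rough sizing aid, the 1M3W1E (+1U) minimum configuration can be totalled with a few lines of Python. This is only a sketch: the per-node figures are the ones listed on the slide (with the base Edge Node spec, not the Talend variant), while the names NodeSpec, SPECS and cluster_totals are hypothetical, and the replication factor of 3 is an assumption (it is the HDFS default) that the slide itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    cores: int
    ram_gb: int
    data_hdd_tb: int = 0  # only Worker Nodes carry a data disk on the slide


# Per-node figures taken from the slide (Edge Node without Talend).
SPECS = {
    "master": NodeSpec(cores=8, ram_gb=64),
    "worker": NodeSpec(cores=16, ram_gb=128, data_hdd_tb=1),
    "edge": NodeSpec(cores=16, ram_gb=128),
    "utility": NodeSpec(cores=8, ram_gb=64),
}


def cluster_totals(counts, replication=3):
    """Sum cores, RAM and data disk over a {role: node_count} mapping.

    `replication` is an assumed storage replication factor, used only
    to estimate usable capacity from raw capacity.
    """
    cores = sum(SPECS[role].cores * n for role, n in counts.items())
    ram_gb = sum(SPECS[role].ram_gb * n for role, n in counts.items())
    raw_tb = sum(SPECS[role].data_hdd_tb * n for role, n in counts.items())
    return {
        "cores": cores,
        "ram_gb": ram_gb,
        "raw_data_tb": raw_tb,
        "usable_data_tb": raw_tb / replication,
    }


MINIMUM = {"master": 1, "worker": 3, "edge": 1}   # 1M3W1E
WITH_UTILITY = {**MINIMUM, "utility": 1}          # 1M3W1E +1U

if __name__ == "__main__":
    print(cluster_totals(MINIMUM))
    print(cluster_totals(WITH_UTILITY))
```

Under these assumptions, the three 1 TB data disks give 3 TB raw but only about 1 TB of usable capacity, which is worth keeping in mind when estimating how much data the minimum cluster can hold.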