Group 1:
- Nguyen Anh Vu
- Pham Quoc Tuan
- Tran Duy Trung
LOGICAL MODEL
Internal order fact (Order from supermarket to company)
SUPPLIER_DIM: {
SUPPLIER_ID (PK)
SUPPLIER_NAME }
AREA_DIM: {
SUP_ID (PK)
SUP_NAME
ADDR
DISTRICT
CITY }
PROD_DIM: {
PROD_ID (PK)
PROD_NAME
PROD_GROUP (FK)
SUPPLIER_ID (FK) }
PROD_GROUP: {
PROD_GROUP (PK)
GROUP_NAME }
INTERNAL_ORDER: {
SUP_ID (FK)
PROD_ID (FK)
DATE_ID (FK)
Quantity
PK: (SUP_ID, PROD_ID, DATE_ID) }
D_IN_ORDER_DIM: {
DATE_ID
DATE
MONTH
YEAR }
1. Mapping rules to DW conceptual schema:
In this step, we analyze which attributes have a common meaning between DB1 and
DB2 (the DB1 and DB2 columns). Then, based on the DW logical design, we choose which
attributes will be used and extracted.
DB1 column      DB2 column
SALE            BILLNO
TOTAL_VALUE     S_AMOUNT
B_DATE          S_DATE
NAME_CASHIER
QTY             S_QTY
UNIT_PRICE      U_PRICE
UNIT_PRICE
QTY_IN_STOCK    QTY_IN_STOCK
DAMAGED_QTY     DAMAGED_QTY
ORDER_DATE/MONTH D_IN_ORDER_DIM/MONTH
ORDER_DATE/YEAR D_IN_ORDER_DIM/YEAR
SUPERMARKET/SM_STREET AREA_DIM/STREET
SUPERMARKET/SM_DISTRICT AREA_DIM/DISTRICT
SUPERMARKET/SM_CITY AREA_DIM/CITY
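The five source/target pairs listed above could be held in a lookup table that drives the column renaming during loading. The following is only a minimal sketch: the names `COLUMN_MAP` and `map_record` are hypothetical and do not come from the project, and records are assumed to be plain dictionaries keyed by column name.

```python
# Hypothetical lookup table: (source table, source column) -> (DW table, DW column).
# The pairs are copied from the mapping list; the structure itself is an assumption.
COLUMN_MAP = {
    ("ORDER_DATE", "MONTH"): ("D_IN_ORDER_DIM", "MONTH"),
    ("ORDER_DATE", "YEAR"): ("D_IN_ORDER_DIM", "YEAR"),
    ("SUPERMARKET", "SM_STREET"): ("AREA_DIM", "STREET"),
    ("SUPERMARKET", "SM_DISTRICT"): ("AREA_DIM", "DISTRICT"),
    ("SUPERMARKET", "SM_CITY"): ("AREA_DIM", "CITY"),
}


def map_record(source_table, record):
    """Return {target_table: {target_column: value}} for the mapped columns.

    Columns without an entry in COLUMN_MAP are skipped.
    """
    mapped = {}
    for column, value in record.items():
        target = COLUMN_MAP.get((source_table, column))
        if target is None:
            continue
        target_table, target_column = target
        mapped.setdefault(target_table, {})[target_column] = value
    return mapped
```

Keeping the mapping as data rather than as code means a new pair can be added without touching the loading logic.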
SUPERMARKET_TABLE: {
PROD_ID: Integer
ORDER_DATE: String
PROD_NAME: String
GROUP_NAME: String
REQUESTED_QTY: Integer
SM_ADDRESS: String}
Example:
SUPERMARKET_TABLE:
PROD_ID* PROD_NAME* GROUP_NAME REQUESTED_QTY ORDER_DATE SM_ADDRESS
SUPERMARKET: {
SM_ID: Integer
SM_NAME: String
SM_STREET: String
SM_DISTRICT: String
SM_CITY: String }
ORDER_DATE: {
DATE_ID: Integer
DATE: Integer (from 0 to 31)
MONTH: Integer (from 1 to 12)
YEAR: Integer }
SUPERMARKET_ORDER: {
PROD_ID: Integer
DATE_ID: Integer
PROD_NAME: String
PROD_GROUP: String
QUANTITY: Integer
SM_ID: Integer }
Example:
SUPERMARKET:
SM_ID SM_NAME SM_STREET SM_DISTRICT SM_CITY
0 No information No Information No Information No Information
ORDER_DATE
DATE_ID DATE MONTH YEAR
1 25 11 2016
2 23 6 2016
3 23 12 2016
4 22 9 2016
SUPERMARKET_ORDER
PROD_ID  DATE_ID  PROD_NAME       PROD_GROUP   QUANTITY  SM_ID
5        1        Dieu Hong Fish  Fish         10        1
134      2        Ba Co Gai       canned food  100       12
34       3        OMO             detergent    100       5
45       4        Neptune         Food         100       0
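The step from the flat SUPERMARKET_TABLE to the three tables SUPERMARKET, ORDER_DATE and SUPERMARKET_ORDER can be sketched as follows. This is an illustration under assumptions, not the project's actual loader: the function name `build_order_tables` is hypothetical, the order date is assumed to be already split into a (day, month, year) tuple, IDs are assumed to be assigned sequentially in order of first appearance, and SM_ID 0 is used for a missing address, following the "No information" row in the example.

```python
def build_order_tables(flat_rows):
    """Split flat order rows into date, supermarket, and order records.

    Each input row is a dict with PROD_ID, PROD_NAME, GROUP_NAME,
    REQUESTED_QTY, ORDER_DATE as a (day, month, year) tuple, and
    SM_ADDRESS (None or "" when missing).
    """
    date_ids = {}         # (day, month, year) -> DATE_ID
    supermarket_ids = {}  # address string -> SM_ID
    orders = []
    for row in flat_rows:
        # Reuse the DATE_ID if this date was seen before, else take the next one.
        date_id = date_ids.setdefault(row["ORDER_DATE"], len(date_ids) + 1)
        address = row["SM_ADDRESS"]
        if not address:
            sm_id = 0  # assumed: ID 0 is the "No information" supermarket
        else:
            sm_id = supermarket_ids.setdefault(address, len(supermarket_ids) + 1)
        orders.append({
            "PROD_ID": row["PROD_ID"],
            "DATE_ID": date_id,
            "PROD_NAME": row["PROD_NAME"],
            "PROD_GROUP": row["GROUP_NAME"],
            "QUANTITY": row["REQUESTED_QTY"],
            "SM_ID": sm_id,
        })
    return date_ids, supermarket_ids, orders
```

The two dictionaries returned alongside the order rows hold the contents of ORDER_DATE and SUPERMARKET, keyed by the natural value and mapped to the assigned ID.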
2. Transformation:
2.1 Transformation in Extraction step:
The transformation tasks in this step are converting the different data types in the sources to a
common data type in the Stage DS database and filling in missing values.
Changing data types from the Excel file to the Stage DS database:
- Data types in the Excel file are not standardized; they depend on how Microsoft Excel
infers them or on how a person sets them manually. As a result, the data types in the Excel
file are inconsistent, and we have to convert the values read from the Excel file to a
common data type in Stage DS.
For example:
Normalization rule: split the original date string on "/", " " or "-". The third part is the
year. Of the first and second parts, whichever is greater than 12 is the day (DATE) and the other is the month.
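The date rule can be sketched in a few lines. The function name `normalize_date` is hypothetical, and one point is an assumption of this sketch rather than part of the rule: when both the first and second parts are 12 or less, the rule does not decide which is the day, so the sketch falls back to day-first order.

```python
import re


def normalize_date(raw):
    """Split a raw date string and return (day, month, year) as integers."""
    # Split on "/", whitespace, or "-" as stated in the normalization rule.
    parts = [p for p in re.split(r"[/\s-]+", raw.strip()) if p]
    if len(parts) != 3:
        raise ValueError(f"expected three date parts, got {raw!r}")
    first, second, year = (int(p) for p in parts)
    if first > 12:
        day, month = first, second
    elif second > 12:
        day, month = second, first
    else:
        # Ambiguous case, not covered by the rule: assume day-first.
        day, month = first, second
    return day, month, year
```

For instance, `normalize_date("25/11/2016")` and `normalize_date("11-25-2016")` both yield day 25, month 11, year 2016, because 25 can only be a day.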
SM_ADDRESS:
Example 1
Original: Big C Thao Diem, số 12 đường Quốc Hương, phường Thảo Điền, quận 2, HCM
=> Normalization: Name: Big C Thao Diem; Street: số 12 đường Quốc Hương; District: quận 2; City: HCM
=> Standardization: Name: Big C Thao Diem; Street: 12 Quoc Huong; District: 2; City: Ho Chi Minh
=> Correction: Name: Big C Thao Diem; Street: 12 Quốc Hương; District: 2; City: Ho Chi Minh
Example 2
Original: Big C Go Vap, 792 Nguyễn Kiệm, P.3, Q.Gò Vấp, Ho Chi Minh
=> Normalization: Name: Big C Go Vap; Street: 792 Nguyễn Kiệm; District: Q.Gò Vấp; City: Ho Chi Minh
=> Standardization: Name: Big C Go Vap; Street: 792 Nguyen Kiem; District: Go Vap; City: Ho Chi Minh
=> Correction: D
Example 3
Original: BIG C Hoang Van Thu, 202B Hoàng Văn Thụ, P.9, Q. Phú Nhuận, TP.HCM.
=> Normalization: Name: BIG C Hoang Van Thu; Street: 202B Hoàng Văn Thu; District: Q. Phú Nhuận; City: TP.HCM.
=> Standardization: Name: BIG C Hoang Van Thu; Street: 202B Hoang Van Thu; District: Phu Nhuan; City: Ho Chi Minh
Normalization rule: split the original address string on ",", ". " or "-". The first part is the
supermarket name, the second part is the street address, the fourth part is the district, and the fifth part is the city.
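A rough sketch of the address normalization, plus one piece of the standardization step, is given here. The names `normalize_address` and `strip_diacritics` are hypothetical. As a simplification, the sketch splits on commas only: every example address is comma-separated, and ". " also occurs inside abbreviations such as "Q. Phú Nhuận", so splitting on it would shift the part positions. Removing prefixes such as "số", "đường" or "Q." and mapping city aliases to "Ho Chi Minh" are not covered.

```python
import re
import unicodedata


def normalize_address(raw):
    """Split an address into name, street, district, and city by position."""
    # Assumption of this sketch: comma is the only separator used.
    parts = [p.strip() for p in re.split(r",", raw) if p.strip()]
    if len(parts) < 5:
        raise ValueError(f"expected at least five address parts, got {raw!r}")
    return {
        "name": parts[0],      # first part: supermarket name
        "street": parts[1],    # second part: street address
        "district": parts[3],  # fourth part: district (third part is skipped)
        "city": parts[4],      # fifth part: city
    }


def strip_diacritics(text):
    """Return text with Vietnamese diacritics removed."""
    # "đ"/"Đ" have no Unicode decomposition, so they are replaced explicitly.
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Applied to the second example address, `normalize_address` returns the street "792 Nguyễn Kiệm" and the district "Q.Gò Vấp", and `strip_diacritics` turns that street into "792 Nguyen Kiem".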