
Data Mining in Sustainability

EGCE 575

Hyunjoo Kim PhD, LEED A.P. Civil and Environmental Engineering

California State University at Fullerton

Today's lecture
- Expert systems
- Intelligent systems
- AI (artificial intelligence)
- Neural Networks
- Machine Learning
- Knowledge process
- Data Mining
- Example of data mining in construction industry
- Data mining in Green Construction
- Green Building Design/Construction

Expert Systems
All our intellectual capabilities would be exceeded by computers.
Definition (Dym and Levitt, 1991): An expert system is a computer program that performs a task normally done by an expert or consultant and which, in so doing, uses captured, heuristic knowledge.

Computers are merely lumps of machinery that simply do what they are programmed to do. They cannot emulate human thought, creativity or feeling.

Example of an Expert System


Rule 0 IF Workload=large AND Laborsupply=nil THEN bid_decision=decline
Rule 1 IF Workload=small AND Laborsupply=adequate THEN bid_decision=yes
Rule 2 IF Workload=large AND Laborsupply=adequate THEN bid_decision=consider
Rule 3 IF bid_decision=yes AND Historical_data=available THEN bid_decision=database
Rule 4 IF bid_decision=consider AND Historical_data=unavailable THEN bid_decision=estimate
Rule 5 IF bid_decision=consider AND client=important THEN final_decision=yes
Rule 6 IF bid_decision=consider AND client=unimportant THEN final_decision=no
Rule 7 IF bid_decision=yes AND client=* THEN final_decision=yes
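A minimal sketch of how rules like these could be evaluated in code. It assumes the rules are applied once, in the order listed, and that `*` matches any value; the slide does not say how the original system resolves rule order or conflicts, so the `infer` function and the example facts are illustrative only.

```python
# Each rule is (conditions, (attribute, value)). "*" is treated as a wildcard (assumption).
RULES = [
    ({"Workload": "large", "Laborsupply": "nil"}, ("bid_decision", "decline")),
    ({"Workload": "small", "Laborsupply": "adequate"}, ("bid_decision", "yes")),
    ({"Workload": "large", "Laborsupply": "adequate"}, ("bid_decision", "consider")),
    ({"bid_decision": "yes", "Historical_data": "available"}, ("bid_decision", "database")),
    ({"bid_decision": "consider", "Historical_data": "unavailable"}, ("bid_decision", "estimate")),
    ({"bid_decision": "consider", "client": "important"}, ("final_decision", "yes")),
    ({"bid_decision": "consider", "client": "unimportant"}, ("final_decision", "no")),
    ({"bid_decision": "yes", "client": "*"}, ("final_decision", "yes")),
]


def infer(facts):
    """Apply every rule once, in order, and return the updated facts
    together with the numbers of the rules that fired."""
    state = dict(facts)
    fired = []
    for number, (conditions, (attribute, value)) in enumerate(RULES):
        matches = all(
            wanted == "*" or state.get(key) == wanted
            for key, wanted in conditions.items()
        )
        if matches:
            state[attribute] = value
            fired.append(number)
    return state, fired


# Hypothetical project: heavy workload, enough labor, an important client.
state, fired = infer({
    "Workload": "large",
    "Laborsupply": "adequate",
    "Historical_data": "available",
    "client": "important",
})
print(fired, state["bid_decision"], state["final_decision"])
```

For this input only Rule 2 and Rule 5 fire. Because Rules 3 and 4 overwrite bid_decision, the outcome depends on the order in which rules are tried, which is why the sketch states its ordering assumption explicitly.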

Intelligent Systems
- Intelligent systems can help experts to solve difficult analysis problems
- Intelligent systems can help experts to design new devices
- Intelligent systems can learn from examples
- Artificial intelligence is becoming less conspicuous, yet more essential.

What is AI?
- A computer system that can perceive, reason, and act
- Three main academic disciplines
  - Psychology (cognitive modeling)
  - Philosophy (philosophy of mind)
  - Computer science or engineering
- Through AI research, many representations and methods that people seem to use unconsciously have been crystallized and made easier for people to deploy deliberately.

Schematic diagram of neural networks


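[Figure: error back-propagation network. Inputs x1, x2, ..., xN-1, xN enter the input layer, pass through the hidden units, and leave the output layer as outputs y1, ..., yN. The outputs are compared with the targets to obtain the error. Labels: Signal Propagation Direction, Error Back Propagation.]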
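A minimal sketch of error back-propagation for a network of the kind shown in the schematic (input layer, one hidden layer, one output unit). The class name `TinyNetwork`, the sigmoid activation, the learning rate, and the toy logical-OR data are all assumptions chosen for illustration; none of them come from the lecture.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """Input layer -> one hidden layer -> single output unit, all sigmoid."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each unit stores one weight per incoming signal plus a trailing bias.
        self.hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        """Signal propagation: inputs -> hidden activations -> output."""
        h = [sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1])
             for ws in self.hidden]
        y = sigmoid(sum(w * v for w, v in zip(self.output, h)) + self.output[-1])
        return h, y

    def train_step(self, x, target, rate=0.5):
        """Error back-propagation for one training example."""
        h, y = self.forward(x)
        # Output error term: (target - y) times the sigmoid derivative y(1 - y).
        delta_out = (target - y) * y * (1 - y)
        # Pass the error back to each hidden unit through its output weight.
        delta_hidden = [delta_out * self.output[j] * h[j] * (1 - h[j])
                        for j in range(len(h))]
        for j in range(len(h)):
            self.output[j] += rate * delta_out * h[j]
        self.output[-1] += rate * delta_out
        for j, ws in enumerate(self.hidden):
            for i in range(len(x)):
                ws[i] += rate * delta_hidden[j] * x[i]
            ws[-1] += rate * delta_hidden[j]

    def mean_squared_error(self, samples):
        return sum((t - self.forward(x)[1]) ** 2 for x, t in samples) / len(samples)


# Toy data (logical OR), used only to show that the error shrinks with training.
SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]


def train(epochs=2000, seed=0):
    net = TinyNetwork(n_inputs=2, n_hidden=3, seed=seed)
    before = net.mean_squared_error(SAMPLES)
    for _ in range(epochs):
        for x, t in SAMPLES:
            net.train_step(x, t)
    return net, before, net.mean_squared_error(SAMPLES)


net, error_before, error_after = train()
print(error_before, error_after)
```

The forward pass corresponds to the signal propagation direction in the diagram, and `train_step` corresponds to the backward flow of the error from the output unit to the hidden units.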

Machine Learning - I
A computer program that makes decisions based on the accumulated experience contained in successfully solved cases

Machine Learning - II

- Classification: classify data based on the values in a classifying attribute, e.g., classify cars based on gas mileage.
- Prediction: find and characterize trends and sequential patterns.
- Association rules: find rules like buys(x, milk) → buys(x, bread).
- Clustering: cluster data to form new classes, e.g., cluster houses to find distribution patterns.
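As an illustration of the association-rule idea, the sketch below scores the rule buys(x, milk) → buys(x, bread) with the two standard measures, support and confidence. The shopping baskets in `BASKETS` are invented for the example and the function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Invented data: five shopping baskets.
BASKETS = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"eggs"},
]

rule_support = support(BASKETS, {"milk", "bread"})
rule_confidence = confidence(BASKETS, {"milk"}, {"bread"})
print(rule_support, rule_confidence)
```

A rule is usually reported only when both measures exceed thresholds chosen by the analyst, so that rare or unreliable patterns are filtered out.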


Example of Classification


Result of Classification


Data Mining Tool: See5.0


Knowledge?
- Data: numbers / words stored in a particular medium
- Information: data + meaning
- Knowledge: internalized information + ability to utilize the information

[Figure: Data, Information, Knowledge]


Data Mining


Data Mining (DM) or Knowledge Discovery in Databases (KDD)

Data Mining or KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.

In construction, the aim is to provide the construction engineer with valuable information to control and manage the project efficiently.


Why Data Mining in construction?


Most data in a construction project are not analyzed.
- People in construction do not allocate enough time to data analysis.
- Complexity of the analysis process is beyond the simple application.
- Construction managers often rely on past experiences to be able to perform their daily tasks.
- How can we build a knowledge base from our project data?


Data Analysis
- Legal Data
- Geospatial Data
- Financial Data
- Designer Data
- Specifier Data
- Owner / Occupier Data
- Construction Data
- Environmentalist Data


Data Mining and KDD


Inter-disciplinary field
- Database technology: efficient ways of storing, retrieving and manipulating data
- Machine learning and statistics: learning knowledge from data
- Visualization: interface between humans and the stored data

[Figure: KDD surrounded by its contributing fields: Database Query, Statistics, On-line Analytical Processing, Machine Learning, Visualization, Mathematics]


Case Study of Data Mining (DM) in Construction Industry


Data Mining Process


- Understand & define the problem
- Data collection
- Data exploring
- Data analysis
- Evaluation and validation
- Cost/benefit


DM: Understand and Define the Problem


Fort Wayne, IN: Flood Control Project
- Phase I: CTRL-EAST, $4,488,450.21, 11/1/95-10/23/98
- Phase II: East-North, $12,107,880.46, 1/6/97-11/5/98
- Phase III: CTRL, $6,018,981.54, 9/14/98-8/6/99

Activity: 6"-42" Drainage Pipeline Installation
- Has been significantly delayed (54%).
- Main sub-activities
  - Excavating the ground
  - Installing pipelines
  - Backfilling compacted material
  - Erosion protection


DM: Data Collection


RMS (Resident Management System)
- Manages Civil Works projects.
- Was developed by the Army Corps of Engineers (1996).
- Consists of about 80 database tables, each with more than 20 attributes.
- Contains data on construction project planning, contract administration, quality assurance, payments, correspondence, submittal management, safety and accident administration, modification processing, and management reporting.


DM: Data Exploring


Errors in RMS data:
- Mislabeling field names
- Unmatched Time
- Special Data Representation


DM: Data Analysis, Informative Model (Decision Tree, See5)


DM: Data Analysis, Predictive Model (Neural Networks)

- Number of instances: 224
- Cycles: 2,000
- Layers: 3
- Input features: season, incomplete drawings, incomplete site survey, shortage of equipment, working on weekends, crew size
- Output feature: duration
- Error rate: 11%

[Figure: error rate (0 to 0.3) versus number of cycles (100 to 4,000) for the training and test sets]


DM: Preliminary Evaluation


- Validation with a construction project manual, RSMeans
- Validation with Monte Carlo simulation
- Cost/Benefit


DM: Validation using a project manual, RSMeans


Activity:
Drainage pipe, 320 units, 10 workers

RSMeans
Output: 10 units per worker per day
Duration: 320 / (10 × 10) = 3.2 days

NN
Duration: 3.12 to 3.80 days

Factors affecting the duration are not considered in RSMeans.
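The manual-based figure can be checked with a short calculation. The sketch below assumes the productivity figure means 10 units per worker per day, which is the reading that reproduces the 3.2 days on the slide; the function name `duration_days` is hypothetical.

```python
def duration_days(quantity, crew_size, units_per_worker_day):
    """Duration of an activity from a fixed productivity rate.
    No site-specific factors (season, survey quality, equipment) enter here."""
    return quantity / (crew_size * units_per_worker_day)


# Figures from the slide: 320 units of drainage pipe, 10 workers,
# 10 units per worker per day (assumed interpretation of the rate).
manual_estimate = duration_days(320, 10, 10)
print(manual_estimate)
```

Because the rate is a single constant, the estimate cannot respond to the input features used by the neural network model, which is the limitation the slide points out.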


DM: Validation with Monte Carlo simulation

Monte Carlo simulation based on the inverse Gaussian distribution for the installation of 320 units of 6"-42" pipeline with 10 workers a day
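A minimal sketch of such a simulation, using only the Python standard library. The sampler uses the transformation method of Michael, Schucany and Haas for the inverse Gaussian distribution. The mean of 3.2 days is taken from the manual-based estimate for this activity; the shape parameter of 40 and the number of runs are placeholders, since the lecture does not give the fitted parameters.

```python
import math
import random
import statistics


def sample_inverse_gaussian(mean, shape, rng):
    """Draw one value from an inverse Gaussian distribution with the given
    mean and shape, via the Michael, Schucany and Haas transformation."""
    nu = rng.gauss(0.0, 1.0)
    y = nu * nu
    x = (mean
         + (mean * mean * y) / (2 * shape)
         - (mean / (2 * shape)) * math.sqrt(4 * mean * shape * y + mean * mean * y * y))
    # Choose between the two roots with the probability required by the method.
    if rng.random() <= mean / (mean + x):
        return x
    return mean * mean / x


def simulate_durations(mean_days, shape, runs=20000, seed=1):
    """Monte Carlo run: repeatedly sample an activity duration in days."""
    rng = random.Random(seed)
    return [sample_inverse_gaussian(mean_days, shape, rng) for _ in range(runs)]


# mean_days from the manual-based estimate; shape=40.0 is a placeholder value.
durations = simulate_durations(mean_days=3.2, shape=40.0)
summary = {
    "mean": statistics.fmean(durations),
    "p90": statistics.quantiles(durations, n=10)[-1],  # 90th percentile
}
print(summary)
```

The resulting distribution of durations can then be compared with the range predicted by the neural network to judge whether the prediction is plausible.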


DM: Cost/Benefit
- According to the results of this case study, the main cause of schedule delays was Inaccurate Site Survey rather than the weather-related problems initially assumed by site managers.
- Discussions with site managers confirmed the importance of equipment such as Ground Penetrating Radar (GPR), which requires an investment of $5,000.
- Potential savings at the 4th stage: $587,391 = $9,436 (daily construction cost) × 75 (expected number of instances) × 0.83 (number of days saved by using GPR)
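The savings figure can be reproduced with a few lines of arithmetic. The function name `potential_savings` is hypothetical, and the final subtraction of the $5,000 equipment cost is an illustrative extra step, not a figure reported in the case study.

```python
def potential_savings(daily_cost, instances, days_saved_per_instance):
    """Savings = daily construction cost x expected instances x days saved each."""
    return daily_cost * instances * days_saved_per_instance


# Figures from the slide.
savings = potential_savings(daily_cost=9436, instances=75, days_saved_per_instance=0.83)

# Illustrative only: savings left after paying for the GPR equipment.
net_of_equipment = savings - 5000

print(round(savings), round(net_of_equipment))
```

Even after the equipment cost is deducted, the potential savings are two orders of magnitude larger than the investment, which is the point of the cost/benefit step.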


Data Mining for Green Construction


Motivation
- Sustainable design and construction have become an important issue in the industry.
- Traditionally, most building energy analyses have been conducted late in design.
- Building performance is typically not modeled early in the design process, because modeling the building and its energy systems is difficult and expensive.
- Today's 3D-CAD/BIMs (Building Information Models) give the user an opportunity to explore different energy-saving alternatives in early design.


Building Information Modeling: Autodesk Revit


Community Emergency Services Station in Fort Bragg - I


BIM Approach for Energy Modeling Solutions


[Figure: building model with labels Day-lighting, Geometry (height), Opening (window), Opening (door)]

Need for Data Mining in Energy Modeling


Energy Modeling Process


[Figure: energy modeling workflow, with labels Process and Output]


Comparison of Different Energy Estimation

Total energy savings: 42%
LEED Platinum certified


Green Construction

U.S. Building Impacts:

- 12% Water Use
- 30% Greenhouse Gas Emissions
- 65% Waste Output
- 70% Electricity Consumption


Average Savings of Green Buildings

- ENERGY SAVINGS: 30%
- CARBON SAVINGS: 35%
- WATER USE SAVINGS: 30-50%
- WASTE COST SAVINGS: 50-90%

Source: Capital E


Improved Bottom Line.

30-70% ENERGY SAVINGS

VERIFIED PERFORMANCE

REDUCED ABSENTEEISM

INCREASED VALUE

PRODUCTIVITY

ENHANCED RECRUITMENT

IMPROVED EMPLOYEE MORALE

REDUCED LIABILITY & IMPROVED RISK MANAGEMENT


Increased Productivity.

- SCHOOLS: 20% BETTER TEST PERFORMANCE
- HOSPITALS: EARLIER DISCHARGE
- RETAIL: INCREASE IN SALES PER SQUARE FOOT
- FACTORIES: INCREASED PRODUCTION
- OFFICES: 2-16% PRODUCTIVITY INCREASE




Questions