DELAY
PREDICTION
IN SOEKARNO-HATTA
INTERNATIONAL AIRPORT
Alva Thomson -- IGN Putra Sattvika -- M. Reza Qorib
Project Background
Data Science Questions
╺ How can flight delays at
Soekarno-Hatta International Airport be predicted
using weather and flight information?
Supporting questions:
Data Acquisition
Data acquisition, integration, and wrangling
Data Sources
Flight Data
Acquired from FlightRadar24 website every 6
hours
Features
╺ Day of the week
╺ Departure Hour
╺ Domestic/International
╺ Airline Code
╺ Previous Flight
╺ Weather
╺ Visibility
╺ Temperature
╺ Dew Point
╺ Wind Direction
╺ Wind Speed
╺ Air Pressure
╺ Cloud Elevation
╺ Cloud Condition
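The two calendar features in this list (day of the week and departure hour) can be derived from a scheduled departure timestamp. The sketch below shows one way to do that with the standard library. The function name `time_features` and the dictionary keys are hypothetical, and the deck does not say how the authors encoded these features.

```python
from datetime import datetime


def time_features(scheduled_departure):
    """Derive calendar features from a scheduled departure time.

    Hypothetical helper: the encoding (0 = Monday ... 6 = Sunday,
    hour in 0..23) is an assumption, not taken from the slides.
    """
    return {
        "day_of_week": scheduled_departure.weekday(),
        "departure_hour": scheduled_departure.hour,
    }


# Example: a flight scheduled for 1 January 2018 at 14:30
features = time_features(datetime(2018, 1, 1, 14, 30))
print(features)
```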
Data Analysis
Descriptive and Exploratory Analysis
[Figure: Airlines with Worst Average Delay (in minutes)]
Experiments &
Results
Data preprocessing & Regression
Tools
╺ Python 3.6
╺ scikit-learn
Data preprocessing
╺ Numeric → Min-max scaler
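A minimal sketch of min-max scaling with scikit-learn's `MinMaxScaler`, which maps each numeric column to the range [0, 1] via (x - min) / (max - min). The feature values below are made up for illustration, and the column choice (temperature, wind speed) is an assumption.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up numeric features: [temperature, wind speed]
X_numeric = np.array([
    [24.0, 3.0],
    [30.0, 10.0],
    [27.0, 6.5],
])

scaler = MinMaxScaler()                     # default feature_range is (0, 1)
X_scaled = scaler.fit_transform(X_numeric)  # learns per-column min and max, then rescales
print(X_scaled)
```

When the scaler is combined with cross-validation, fitting it on the training folds only (for example inside a scikit-learn `Pipeline`) avoids leaking information from the held-out fold.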
Regression
╺ Linear Regression
╺ Kernel Ridge Regression
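The two regressors can be fitted with scikit-learn as sketched below. The data is synthetic, and the hyperparameters (`degree=2`, `alpha=1e-3`) are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression

# Synthetic data with a non-linear term, standing in for the real features
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = 30 * X[:, 0] ** 2 + 10 * X[:, 1] + rng.normal(scale=1.0, size=200)

linear = LinearRegression().fit(X, y)
kernel_ridge = KernelRidge(kernel="polynomial", degree=2, alpha=1e-3).fit(X, y)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


lin_rmse = rmse(y, linear.predict(X))
kr_rmse = rmse(y, kernel_ridge.predict(X))
print("training RMSE, linear: %.4f, kernel ridge: %.4f" % (lin_rmse, kr_rmse))
```

On this synthetic target the polynomial kernel can represent the squared term that a plain linear model cannot, which is the motivation for trying kernel methods after a linear baseline.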
Performance Measure
╺ 10-fold Cross Validation
╺ Error measure: root mean squared error (RMSE)
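The evaluation protocol can be sketched as follows: 10-fold cross-validation, with RMSE computed per fold and reported as mean ± standard deviation. The data is synthetic, and treating the reported "±" as the standard deviation across folds is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data standing in for the flight data set
rng = np.random.RandomState(0)
X = rng.rand(300, 4)
y = X @ np.array([20.0, -5.0, 0.0, 8.0]) + rng.normal(scale=2.0, size=300)

cv = KFold(n_splits=10, shuffle=True, random_state=0)

# scikit-learn maximizes scores, so MSE is returned negated
neg_mse = cross_val_score(
    LinearRegression(), X, y, cv=cv, scoring="neg_mean_squared_error"
)
fold_rmse = np.sqrt(-neg_mse)  # one RMSE value per fold

print("CV RMSE: %.4f ± %.4f" % (fold_rmse.mean(), fold_rmse.std()))
```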
Linear Regression
╺ CV RMSE: 20.9768 ± 1.8526
╺ Deemed to be inadequate
╺ More experiments are needed
with different algorithms
Ridge Regression
with polynomial kernel
Degree   Training RMSE   CV RMSE   CV StDev
⋮
14       18.4433         20.0864   2.2570
15       18.2534         20.0745   2.2395
16       18.0641         20.0712   2.2172
17       17.8757         20.0770   2.1907
18       17.6891         20.0922   2.1605
⋮
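A table of this shape can be produced by sweeping the polynomial degree and recording the training and cross-validation RMSE for each value. The sketch below does this on synthetic data. The degree range, `alpha=1.0`, and the data itself are assumptions for illustration, so the numbers it prints are unrelated to the results above.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import KFold, cross_validate

# Synthetic data standing in for the flight data set
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = 30 * X[:, 0] ** 2 + 10 * X[:, 1] + rng.normal(scale=2.0, size=200)

cv = KFold(n_splits=10, shuffle=True, random_state=0)
rows = []
for degree in (1, 2, 3, 4):
    model = KernelRidge(kernel="polynomial", degree=degree, alpha=1.0)
    scores = cross_validate(
        model, X, y, cv=cv,
        scoring="neg_mean_squared_error",
        return_train_score=True,
    )
    train_rmse = np.sqrt(-scores["train_score"])  # per-fold training RMSE
    cv_rmse = np.sqrt(-scores["test_score"])      # per-fold held-out RMSE
    rows.append((degree, train_rmse.mean(), cv_rmse.mean(), cv_rmse.std()))

print("Degree  Training RMSE  CV RMSE  CV StDev")
for degree, train_mean, cv_mean, cv_std in rows:
    print("%-6d  %-13.4f  %-7.4f  %.4f" % (degree, train_mean, cv_mean, cv_std))
```

Training RMSE that keeps falling while CV RMSE flattens or rises, as in the table between degrees 16 and 18, is the usual sign that extra model capacity is starting to fit noise.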
Conclusion: Data Analysis
╺ Some international airlines
have the worst average delays, namely
Korean Air and Qantas
╺ Correlation between each feature
and delay time
╶ Best: temperature
╶ Worst: destination
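A feature-to-delay correlation of this kind can be computed as sketched below with the Pearson coefficient. The deck does not state which correlation measure was used, so Pearson is an assumption, and the data here is synthetic: `strong_feature` is built to drive the delay while `weak_feature` is independent of it. A categorical feature such as destination would first need a numeric encoding or a different association measure.

```python
import numpy as np

# Synthetic data: one feature that drives the target, one that does not
rng = np.random.RandomState(0)
strong_feature = rng.uniform(24, 34, size=500)
weak_feature = rng.uniform(0, 1, size=500)
delay = 2.0 * strong_feature + rng.normal(scale=5.0, size=500)


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


correlations = {
    "strong_feature": pearson(strong_feature, delay),
    "weak_feature": pearson(weak_feature, delay),
}

# Rank features by absolute correlation with the target, strongest first
for name, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print("%-15s r = %+.3f" % (name, r))
```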
Conclusion: Regression
╺ Best model: Kernel Ridge Regression (polynomial kernel, degree 16)
╺ CV RMSE: 20.0712 minutes
╺ Only a small improvement over Linear Regression (CV RMSE: 20.9768)
Conclusion: Future Work
╺ Introduce new features
╶ Real-time airport congestion
╶ Weather along the flight path
╶ Conditions at the destination airport
╺ Meta-algorithms (e.g. AdaBoost)
╺ Add more data
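As a starting point for the meta-algorithm idea, the sketch below evaluates scikit-learn's `AdaBoostRegressor` with the same 10-fold RMSE protocol. The data is synthetic and `n_estimators=50` is an illustrative assumption. No AdaBoost results are reported in this deck, so this only shows how such an experiment could be set up.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data standing in for the flight data set
rng = np.random.RandomState(0)
X = rng.rand(300, 4)
y = 30 * X[:, 0] ** 2 + 10 * X[:, 1] + rng.normal(scale=2.0, size=300)

# Default weak learner is a shallow decision tree
model = AdaBoostRegressor(n_estimators=50, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

neg_mse = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
fold_rmse = np.sqrt(-neg_mse)  # one RMSE value per fold

print("AdaBoost CV RMSE: %.4f ± %.4f" % (fold_rmse.mean(), fold_rmse.std()))
```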
Thank you