GRD Journals- Global Research and Development Journal for Engineering | Volume 5 | Issue 11 | October 2020

ISSN- 2455-5703

Analysis and Generation of Multiple Linear Regression for Residential Area of Vesu, Surat

Payal Zaveri
Assistant Professor
Department of Civil Engineering
SCET, Gujarat Technological University

Simerdeep Kaur Sood
Bachelor of Engineering
Department of Civil Engineering
SCET, Gujarat Technological University

Abstract
Regression analysis is a statistical approach for finding the relationship between a dependent variable and one or more independent variables. It rests on the correlation among the variables, and correlation analysis is a statistical method for evaluating the strength of those relationships: the higher the magnitude of the correlation coefficient, the stronger the relationship. This paper studies trip generation, the factors affecting it, and the degree to which each individual factor affects the number of trips. Multiple linear regression relates one dependent variable to several independent variables. The data collection technique considered is the ‘Household survey’, a procedure for collecting and analysing the general situation and specific characteristics of individual households or residential areas. The precision of the result depends on the size and the range of the sample; a larger sample gives a more reliable estimate of the correlation.
Keywords- Correlation Analysis, Household Survey, Microsoft Excel, Multiple Regression Analysis, Trip Generation

I. INTRODUCTION
To develop a city economically, it is important to build good transportation infrastructure and to provide good communication facilities. This includes planned development of the road connections within the city and railway connections for interstate travel. The first phase of developing a good transportation facility is planning, which includes data collection and an inventory of the study area. The second phase is to understand the pattern of trip generation, that is, how the independent variables relate to the dependent variable. This analysis is done by building a model, which makes the results easier to understand and more accurate.
A trip is a one-way movement of a person by a mechanised mode of transport. It has two ends: an origin, which is the starting point of the trip, and a destination, which is its end point. Trips are divided into two types: ‘Home based’ trips, where one end is the home, and ‘Non home based’ trips, where neither end is the home. [1] Before the start of the first phase, it is important to select the factors affecting trip generation, because the data collection survey is designed around them. These factors are called the independent variables; trip generation depends on them. Factors recorded in the survey may include the number of family members, income of the family, age group, number of vehicles owned by the family, working hours and number of trips per day. [2]
The second phase of planning a transportation system or facility is modelling. Correlation analysis is the first step in building the model. It measures the strength of the relationship among the variables and identifies the independent variables most likely to affect the dependent variable. [3] The higher the magnitude of the correlation coefficient, the stronger the relationship. The coefficient lies between -1 and 1: values near 1 indicate a strong positive relationship, values near -1 a strong negative relationship, and values near 0 little or no linear relationship.
After finding the variables with suitable correlation coefficients, the final step in the analysis is to perform a regression analysis. Multiple linear regression analysis is a statistical method of fitting a linear mathematical relationship between the dependent variable and the independent variables. [4] [5] The model yields equations that estimate the number of trips generated from a selected set of factors. Multiple linear regression analysis is simple to apply and can be used by novices or students as well as experts.
This research paper is an original study based on field data collected in residential areas and analysed with computer software. The data were collected in the prevailing year through a household survey. Questionnaires were prepared to obtain the data most relevant to the regression analysis.

II. STUDY AREA


Surat is the ninth largest urban agglomeration and eighth largest city in India. Besides being famous for its diamond and textile industries, Surat is the commercial and economic hub of South Gujarat. It is known as the ‘Diamond city’, ‘Silk city’ and ‘Green city’ of India, and it is the administrative capital of Surat district.
Surat is 284 km south of Gandhinagar, the capital of Gujarat, 265 km south of Ahmadabad and 289 km north of Mumbai. The Tapi is the major river passing through the city and is responsible for its economic growth. The city has a coastline on the Arabian Sea. Surat has had a GDP growth of 11.5% over seven fiscal years.

All rights reserved by www.grdjournals.com 20


Analysis and Generation of Multiple Linear Regression for Residential Area of Vesu, Surat
(GRDJE/ Volume 5 / Issue 11 / 002)

The major intercity transportation infrastructure in Surat includes a railway station under the administrative control of the Western Railway Zone of the Indian Railways. Surat BRTS is a bus rapid transit system, an intracity network with 245 buses connecting major locations. Surat Airport is the second busiest airport in Gujarat. There are 83 bridges and four flyovers in operation, and many more are under construction.

Fig. 1: Location of the study area on Google Maps [6]

Vesu is a suburb in the South West Zone of Surat, 5 km from the airport and 10 km from the railway station. It is the newest and a fast-developing area in terms of public transport infrastructure, residential areas, complexes and business parks. Vesu lies at latitude 21.1447217° and longitude 72.7717735°. The study area selected is a residential area which is well connected to commercial areas and is very close to the airport and BRTS facilities. It is surrounded by other residential apartments.
Vesu is a relatively newly developed area of Surat which is continuously transformed by infrastructure development. This study will help future development to proceed in a systematic and organized manner, so that less reconstruction and rehabilitation is required later on. It will also increase the efficiency of the area in terms of transportation.

III. DATA COLLECTION


As discussed, the first step of planning is data collection. Among the many data collection methods, the ‘Household Survey’ is selected here. A household survey collects data on the general characteristics of an area, usually a residential one. [2] It is one of the easiest and a moderately efficient type of data collection survey. Respondents fill in forms that record age, occupation, address or zone of residence, number of family members, number and type of vehicles owned, mode of public transportation if chosen, working hours, frequently visited places and income of the family. [7] [8] These variables are the independent variables and affect trip generation (the dependent variable). A data sheet is prepared from the completed questionnaires and is then used in the regression analysis. The data sheet for this research paper is given in the appendix at the end of the paper.

IV. METHODOLOGY

Fig. 2: Flowchart of the Methodology

To start with the regression analysis the following steps should be followed: -
Step 1: Prepare a Data table based on household survey
Step 2: Install the ‘Analysis ToolPak’ in Microsoft Excel
Step 3: Correlation analysis in Excel
Step 4: Selection of ‘Independent Variables’
Step 5: Regression Analysis in Excel
Step 6: Equation generation from the regression tables
Step 7: Selection of the best and most effective equations
The regression equations take the following forms: -
Y = A + B1 X1
Y = A + B1 X1 + B2 X2
Y = A + B1 X1 + B2 X2 + B3 X3
Here,
Y = Dependent variable = Trip generation
A = Intercept from the multiple linear regression table
B1, B2, B3 = Regression coefficients from the table
X1, X2, X3 = Independent variables = Earning people, Working hours, Vehicles owned
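In general form, the three equations above are instances of one model. The block below is a sketch that assumes the ordinary least squares fit performed by a standard regression tool; k (number of independent variables), n (number of surveyed households) and the index notation X_{ij} (value of variable i for household j) are introduced here only for illustration.

```latex
Y = A + \sum_{i=1}^{k} B_i X_i ,
\qquad
(A, B_1, \dots, B_k) = \arg\min_{A,\,B_1,\dots,B_k}
\sum_{j=1}^{n} \Bigl( Y_j - A - \sum_{i=1}^{k} B_i X_{ij} \Bigr)^{2}
```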
The first step is the correlation analysis. The correlation coefficient for different independent variables is given below: -

A. Number of Vehicles and Trips


Table 1: Vehicles v/s Trips
Vehicles Trips
Vehicles 1 -
Trips 0.89723 1

B. Number of Earning People and Trips


Table 2: Earning people v/s Trips
Earning people No. of trips
Earning people 1 -
No. of trips 0.9583148 1

C. Number of Age Groups and Trips


Table 3: Age Group v/s Trips
Age group Trips
Age group 1 -
Trips 0.2677264 1

D. Trip duration (hours) and Trips


Table 4: Trip duration and Trips
Trip duration Trips
Trip duration 1 -
Trips -0.1137283 1

E. Working hours and Trips


Table 5: Working Hours and Trips
Working hours Trips
Working hours 1 -
Trips 0.8701299 1

F. Number of Family Members and Trips


Table 6: Family members v/s Trips
Family members Trips
Family members 1 -
Trips 0.8660714 1
In the above correlation analysis, the first variable in each table is the independent variable and the second is the dependent variable, which is trip generation. Based on the correlation coefficients, the best three independent variables selected for the further procedure are as follows: -
Table 7: Selected independent variables
Sr. no. Independent Variable Coefficient
1 Number of Earning people 0.95
2 Number of vehicles owned 0.89
3 Number of Working hours 0.87
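The coefficients in Tables 1 to 6 can also be reproduced outside Excel. The following is a minimal Python sketch, assuming the columns below are transcribed correctly from the data sheet in the appendix; the variable and function names are illustrative and are not part of the original study.

```python
from math import sqrt

# Columns transcribed from the household survey data sheet (appendix).
trips = [6, 4, 6, 4, 5, 3, 2, 6, 6, 6]
candidates = {
    "earning_people": [3, 2, 3, 2, 3, 2, 1, 3, 3, 3],
    "vehicles_owned": [3, 2, 3, 2, 2, 1, 1, 2, 3, 3],
    "working_hours": [9, 7, 8, 8, 7, 6, 6, 9, 8, 9],
    "family_members": [6, 4, 6, 4, 5, 2, 1, 4, 4, 6],
    "age_groups": [3, 3, 4, 4, 3, 3, 4, 4, 4, 5],
}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Correlate every candidate variable with the number of trips.
correlations = {name: pearson_r(values, trips) for name, values in candidates.items()}

# Rank the candidates from the strongest to the weakest correlation.
ranked = sorted(correlations, key=correlations.get, reverse=True)
top_three = ranked[:3]

for name in ranked:
    print(f"{name:15s} r = {correlations[name]:.4f}")
```

On the data as transcribed, the three largest coefficients belong to earning people, vehicles owned and working hours, in agreement with Table 7.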

V. RESULT AND ANALYSIS


After selecting the independent variables, regression analyses are carried out with one, two and three independent variables. [6]

A. Number of Earning People and Trips


A larger number of earning people may increase the number of trips, as there will be more trips between home and office and for other necessities. This analysis considers the following: -
Independent variables = Number of earning people
Dependent variable = Number of trips
Table 8: Regression Analysis
Multiple R 0.958314847
R Square 0.918367347
Adjusted R Square 0.908163265
Standard Error 0.447213595
Observations 10
Table 9: Coefficients for equations
Coefficients Standard Error
Intercept (A) 0.295918367 0.242011
Earning people (B1) 0.459183673 0.0484022

B. Number of Working Hours and Trips


Working hours may range from 6 to 11 hours a day, depending on the number of people working and their occupation. This analysis considers the following: -
Independent variables = Number of Working hours
Dependent variable = Number of trips
Table 10: Regression analysis
Multiple R 0.87012987
R Square 0.757125991
Adjusted R Square 0.72676674
Standard Error 0.771389216
Observations 10
Table 11: Coefficients for equations
Coefficients Standard Error
Intercept (A) -3.727272727 1.7248787
Working hours (B2) 1.107438017 0.2217588

C. Number of Vehicles owned and Trips


The number of vehicles owned may increase or decrease the number of trips. A person who owns more vehicles is less likely to use public transport, but may sometimes use other means of transportation to save money on fuel. This analysis considers the following: -
Independent variables = Number of Vehicles
Dependent variable = Number of trips
Table 12: Regression analysis
Multiple R 0.897234169
R Square 0.805029155
Adjusted R Square 0.780657799
Standard Error 0.691142946
Observations 10
Table 13: Coefficients for equations
Coefficients Standard Error
Intercept (A) 1.107142857 0.6786889
Vehicles (B3) 1.678571429 0.2920612
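The single-variable fits can be checked with the closed-form least-squares formulas. The following is a minimal Python sketch, assuming the columns are transcribed correctly from the data sheet in the appendix; `fit_line` is an illustrative helper, not a function used in the original study.

```python
# Columns transcribed from the data sheet in the appendix.
trips = [6, 4, 6, 4, 5, 3, 2, 6, 6, 6]
earning_people = [3, 2, 3, 2, 3, 2, 1, 3, 3, 3]
working_hours = [9, 7, 8, 8, 7, 6, 6, 9, 8, 9]
vehicles_owned = [3, 2, 3, 2, 2, 1, 1, 2, 3, 3]


def fit_line(x, y):
    """Least-squares intercept A and slope B of the line y = A + B * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Fit trips against each selected independent variable in turn.
for label, column in [
    ("earning people", earning_people),
    ("working hours", working_hours),
    ("vehicles owned", vehicles_owned),
]:
    intercept, slope = fit_line(column, trips)
    print(f"trips vs {label}: A = {intercept:.4f}, B = {slope:.4f}")

# The same helper with the two columns exchanged (earning people as y).
reverse_intercept, reverse_slope = fit_line(trips, earning_people)
print(f"earning people vs trips: A = {reverse_intercept:.4f}, B = {reverse_slope:.4f}")
```

On the data as transcribed here, the fits for working hours and vehicles owned agree with Tables 11 and 13. For earning people, fitting trips against earning people gives approximately A = -0.2 and B = 2.0, whereas the coefficients listed in Table 9 are obtained when the two columns are exchanged; this may be worth checking against the original spreadsheet.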

D. Number of Earning people, Working hours and Trips


The number of earning people in the family and the working hours are directly related to each other, and both tend to increase trip generation. This analysis considers the following: -
Independent variables = Earning people, working hours
Dependent variable = Number of trips
Table 14: Regression analysis
Multiple R 0.98641472
R Square 0.973013999
Adjusted R Square 0.965303713
Standard Error 0.274883253
Observations 10
Table 15: Coefficients for equations


Coefficients Standard Error
Intercept (A) -2.272727273 0.6446584
Earning people (B1) 1.454545455 0.1943718
Working Hours (B2) 0.446280992 0.1185351

E. Number of Earning people, Vehicles owned and Trips


If more people in the family are earning, the family is more likely to own more vehicles; the number of individual trips may then increase while the number of group trips decreases. This analysis considers the following: -
Independent variables = Earning people, vehicles owned
Dependent variable = Number of trips
Table 16: Regression analysis
Multiple R 0.983504138
R Square 0.96728039
Adjusted R Square 0.957931931
Standard Error 0.302679545
Observations 10
Table 17: Coefficients for equations
Coefficients Standard Error
Intercept (A) -0.184782609 0.3693596
Earning people (B1) 1.391304348 0.2361474
Vehicle (B3) 0.684782609 0.2116876

F. Number of Working hours, Vehicles owned and Trips


Working hours and vehicles owned do not have much of a correlation with each other, but both affect trip generation. This analysis considers the following: -
Independent variables = Working hours, vehicles owned
Dependent variable = Number of trips
Table 20: Regression analysis
Multiple R 0.932016134
R Square 0.868654073
Adjusted R Square 0.831126666
Standard Error 0.606439276
Observations 10
Table 21: Coefficients for equations
Coefficients Standard Error
Intercept (A) -1.636363636 1.6044875
Vehicles (B3) 1.045454545 0.4288173
Working Hours (B2) 0.537190083 0.291725

G. Number of Earning People, Working Hours, Vehicles Owned and Trips


This analysis includes all three independent variables and the dependent variable. This analysis considers the following: -
Independent variables = Earning people, Vehicles and Working hours
Dependent variable = Number of trips
Table 22: Regression analysis
Multiple R 0.993372087
R Square 0.986788104
Adjusted R Square 0.980182156
Standard Error 0.207747109
Observations 10
Table 23: Coefficients for equations
Coefficients Standard Error
Intercept (A) -1.636363636 0.5496472
Working hours (B2) 0.311294766 0.1045865
Earning people (B1) 1.242424242 0.1696248
Vehicles (B3) 0.424242424 0.1696248
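The three-variable fit can be reproduced by solving the normal equations of least squares. The following is a minimal pure-Python sketch, assuming the columns are transcribed correctly from the data sheet in the appendix; the helper names are illustrative, and a numerical library would normally be used instead of hand-written elimination.

```python
# Columns transcribed from the data sheet in the appendix.
trips = [6, 4, 6, 4, 5, 3, 2, 6, 6, 6]
earning_people = [3, 2, 3, 2, 3, 2, 1, 3, 3, 3]
working_hours = [9, 7, 8, 8, 7, 6, 6, 9, 8, 9]
vehicles_owned = [3, 2, 3, 2, 2, 1, 1, 2, 3, 3]


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple_regression(columns, y):
    """Least-squares fit of y = A + B1*x1 + ... ; returns [A, B1, ...]."""
    # Design matrix with a leading 1 for the intercept.
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * target for r, target in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty), rows


coefficients, design = fit_multiple_regression(
    [earning_people, working_hours, vehicles_owned], trips
)

# Coefficient of determination: 1 - residual sum of squares / total sum of squares.
predicted = [sum(c * v for c, v in zip(coefficients, row)) for row in design]
mean_trips = sum(trips) / len(trips)
ss_res = sum((obs - fit) ** 2 for obs, fit in zip(trips, predicted))
ss_tot = sum((obs - mean_trips) ** 2 for obs in trips)
r_squared = 1 - ss_res / ss_tot

print("A, B1, B2, B3 =", [round(c, 4) for c in coefficients])
print("R Square =", round(r_squared, 4))
```

On the data as transcribed, the coefficients come out in the order intercept, earning people (B1), working hours (B2) and vehicles (B3), and agree with the labelled rows of Table 23.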

VI. RESULT AND DISCUSSION


After the regression analysis in Microsoft Excel, an equation is written for each combination of independent variables. [9] [10] In the following equations,
Y = Number of Trips
X1 = Number of Earning people
X2 = Number of Working Hours
X3 = Number of Vehicles owned
Table 24: Generated equations
Sr. No. Conditions Equation r2
1 Number of Earning people v/s Trips Y = 0.296 + 0.459X1 0.918
2 Number of Working hours v/s Trips Y = -3.727 + 1.11X2 0.757
3 Number of Vehicles owned v/s Trips Y = 1.10 + 1.678X3 0.805
4 Number of Earning people and Vehicles owned v/s Trips Y = -0.1847 + 1.39X1 + 0.685X3 0.967
5 Number of Earning people and Working hours v/s Trips Y = -2.272 + 1.45X1 + 0.446X2 0.973
6 Number of Vehicles and Working hours v/s Trips Y = -1.636 + 1.04X3 + 0.537X2 0.869
7 Number of Working hours, earning people and Vehicles owned v/s Trips Y = -1.636 + 1.24X1 + 0.311X2 + 0.424X3 0.987
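As an illustration of how a generated equation is used, the three-variable model can be evaluated for individual households. The following is a minimal Python sketch, assuming the coefficients as labelled in Table 23 (B1 for earning people, B2 for working hours, B3 for vehicles) and two households taken from the data sheet in the appendix; `estimate_trips` is an illustrative name.

```python
# Coefficients as labelled in Table 23.
INTERCEPT = -1.636363636
B_EARNING_PEOPLE = 1.242424242
B_WORKING_HOURS = 0.311294766
B_VEHICLES = 0.424242424


def estimate_trips(earning_people, working_hours, vehicles_owned):
    """Estimated number of trips from the three-variable regression equation."""
    return (
        INTERCEPT
        + B_EARNING_PEOPLE * earning_people
        + B_WORKING_HOURS * working_hours
        + B_VEHICLES * vehicles_owned
    )


# Household 1 of the data sheet: 3 earning people, 9 working hours, 3 vehicles.
print(round(estimate_trips(3, 9, 3), 2))
# Household 7 of the data sheet: 1 earning person, 6 working hours, 1 vehicle.
print(round(estimate_trips(1, 6, 1), 2))
```

The estimates, about 6.17 and 1.90 trips, are close to the 6 and 2 trips recorded for these two households.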
Two criteria are considered in the final selection of the model best suited to estimating future trip generation: -
The coefficient of determination r2 should be greater than 0 and as close to 1 as possible.
It is preferable not to have negative constants in the equation.
Based on the first criterion, the model selected is (7) Number of Working hours, earning people and Vehicles owned v/s Trips
Y = -1.636 + 1.24X1 + 0.311X2 + 0.424X3
r2 = 0.987
However, here the constant is negative. Hence the best suited model is (1) Earning people v/s Trips
Y = 0.296 + 0.459X1
r2 = 0.918
The number of earning people is therefore taken as the most influential independent variable affecting trip generation.
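The two selection criteria can be applied mechanically. The following is a minimal Python sketch, assuming the constants and r2 values exactly as reported in Table 24; the tuple layout and names are illustrative.

```python
# (model number, independent variables, constant, r2) as reported in Table 24.
models = [
    (1, "earning people", 0.296, 0.918),
    (2, "working hours", -3.727, 0.757),
    (3, "vehicles owned", 1.10, 0.805),
    (4, "earning people + vehicles owned", -0.1847, 0.967),
    (5, "earning people + working hours", -2.272, 0.973),
    (6, "vehicles owned + working hours", -1.636, 0.869),
    (7, "earning people + working hours + vehicles owned", -1.636, 0.987),
]

# Criterion 1: r2 as close to 1 as possible.
best_by_r2 = max(models, key=lambda model: model[3])

# Criterion 2: prefer equations without a negative constant,
# then take the highest r2 among those that remain.
non_negative = [model for model in models if model[2] >= 0]
selected = max(non_negative, key=lambda model: model[3])

print("Highest r2: model", best_by_r2[0])
print("Selected with non-negative constant: model", selected[0])
```

With the reported values, the first criterion alone picks model 7, and adding the second criterion picks model 1, which mirrors the selection described above.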

VII. LIMITATIONS
One of the most important factors for a successful analysis is the reliability of the data collected through the household survey. Because the data are reported by the respondents, a certain percentage of human error must be expected. The factors affecting trips are numerous and may change with individual preference, so different studies may give different results. [11] [12] The sample size is also a limitation, because it is physically difficult to survey a large area. However, census data can be used for such a study if they are up to date.

VIII. CONCLUSION
The conclusions of the survey work can be summarised in a few points.
– Sample size affects the precision and accuracy of the coefficients and the constants.
– The range of the data should not be too wide, as this may lead to a faulty correlation analysis.
– The best suited model should satisfy the two criteria; if the equation is not satisfactory, another analysis and planning method can be selected.
The best model selected here has a correlation coefficient of 0.958 (r2 = 0.918) and no negative constant. The number of earning people strongly affects home-based trips, because people travel between home and office individually and the number of trips increases.

APPENDIX
The data collected from the questionnaires are compiled into a table called a data sheet, which is then used for the correlation analysis and the multiple linear regression analysis. Grouped data can be converted into numerical data by dividing the range into smaller groups and counting the number of groups that apply to each entry. For instance, ages are divided here into the groups 0-16, 17-30, 31-45, 45-60, and 61 and above, depending on work efficiency and contribution.
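One possible reading of this conversion is to count how many distinct age groups are represented among the members of a household. The following is a minimal Python sketch under that assumption; the individual ages are hypothetical (the data sheet records only the age range of each household), and placing age 45 in the 31-45 group is also an assumption, since the listed groups overlap at 45.

```python
# Upper bounds of the first four age groups; ages of 61 and above form the last group.
# Assumption: age 45 is placed in the 31-45 group.
GROUP_UPPER_BOUNDS = [16, 30, 45, 60]


def age_group_index(age):
    """Index of the age group (0 to 4) that contains the given age."""
    for index, upper_bound in enumerate(GROUP_UPPER_BOUNDS):
        if age <= upper_bound:
            return index
    return len(GROUP_UPPER_BOUNDS)


def count_age_groups(ages):
    """Number of distinct age groups represented in one household."""
    return len({age_group_index(age) for age in ages})


# Hypothetical household of six members aged between 18 and 50.
hypothetical_ages = [18, 22, 35, 40, 48, 50]
print(count_age_groups(hypothetical_ages))
```

For this hypothetical household the function returns 3, since the members fall into the 17-30, 31-45 and 45-60 groups.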
Table 25: Data Collection from Household Survey
Sr. No. | Family Members (Number) | Income Range (Lakhs) | Earning People (Number) | Vehicle Ownership (Number) | Trips (Number) | Trip Duration (Hours) | Age Group | Working Hours (Hours) | No. of Age Group
1 | 6 | 2.5 - 10 | 3 | 3 | 6 | 8am - 6pm | 18-50 | 9 | 3
2 | 4 | 2.5 - 10 | 2 | 2 | 4 | 9am - 6pm | 14-42 | 7 | 3
3 | 6 | > 10 | 3 | 3 | 6 | 7am - 5pm | 17-85 | 8 | 4
4 | 4 | 1 - 2.5 | 2 | 2 | 4 | 8am - 11pm | 13-50 | 8 | 4
5 | 5 | 2.5 - 5 | 3 | 2 | 5 | 8am - 11pm | 19-59 | 7 | 3
6 | 2 | > 2.5 | 2 | 1 | 3 | 7am - 5pm | 19-59 | 6 | 3
7 | 1 | 1 - 2.5 | 1 | 1 | 2 | 8am - 6pm | 13-50 | 6 | 4
8 | 4 | 2.5 - 5 | 3 | 2 | 6 | 7am - 5pm | 15-65 | 9 | 4
9 | 4 | 2.5 - 10 | 3 | 3 | 6 | 8am - 6pm | 15-54 | 8 | 4
10 | 6 | 2.5 - 5 | 3 | 3 | 6 | 9am - 6pm | 13-70 | 9 | 5

ACKNOWLEDGMENT
I would like to express my gratitude to my mentor and guide, Prof. Payal Zaveri, for her guidance and encouragement throughout the research. I am extremely grateful for her words of wisdom and her support. I also thank her for her counseling and her remarks on earlier versions of the manuscript; any remaining errors are my own and should not tarnish the reputation of the esteemed professor. Finally, I thank my family for their belief, support and assistance in finishing the paper within a limited time frame.

REFERENCES
[1] K. V. K. R. Tom V. Mathew, “Trip generation. Introduction to Transportation Engineering,” NPTEL India, 2007.
[2] L. Kadiyali, Traffic Engineering and Transport Planning, Ninth edition, India: Khanna Publishers, 1999.
[3] S. Senthilnathan, “Usefulness of correlation analysis,” New Zealand, 2019.
[4] T. B. &. S. Fidell, “Using multivariate statistics (Third Edition),” Harper Collins College Publishers, New York, 1996.
[5] S. K. Mahak Dawr, “Developing Trip Generation Model Utilizing Multiple Regression Analysis Case Stud.,” International Journal of Innovative Research in Science, Engineering and Technology, vol. 6, no. 2, 2017.
[6] “Google Maps,” [Online]. Available: https://goo.gl/maps/rhopX18JaPahGk236.
[7] G. P. H. Wootton, “A Model for Trips Generated by Household,” Journal of Transport Economics and Policy, pp. 137-153, 1967.
[8] M. J. K. R. R. a. G. T. Ravi Gadepalli, “Multiple Classification Analysis for Trip Production Models Using Household Data: Case Study of Patna, India,” 2014.
[9] B. N. T. a. C. D. Ramesh B. Ranpise, “Assessment and MLR Modeling of Traffic Noise at Major Urban Roads of Residential and Commercial Areas of Surat City,” Springer Nature Singapore, 2020.
[10] F. A. W. A. C. Cameron, “An R-squared measure of goodness of fit for some common nonlinear regression models,” pp. 16-32, 1995.
[11] A. B. Asokan Mulayath Variyath, “Variable selection in multivariate multiple regression,” 2020.
[12] B. A. Sathiyaraj Rajendran, “Short term traffic prediction model for urban transportation using structure pattern and regression: an Indian context,” 2020.
