ISSN- 2455-5703
Abstract
Regression analysis is a statistical approach for finding the relationship between a dependent variable and one or more independent
variables. The basis of regression analysis is correlation among the variables. Correlation is a statistical method for evaluating the
strength of the relationship between variables: the closer the magnitude of the correlation coefficient is to 1, the stronger the
relationship. This paper studies trip generation, the factors affecting it, and the degree to which each individual factor affects the
number of trips. Linear regression here relates one dependent variable to a few independent variables through their correlation.
The data collection technique used is the ‘Household survey’, a procedure for collecting and analysing the general situation and
specific characteristics of individual households or residential areas. The precision of the result depends on the sample size and
the range of the sample; a larger sample gives more reliable correlation estimates.
Keywords- Correlation Analysis, Household Survey, Microsoft Excel, Multiple Regression Analysis, Trip Generation
I. INTRODUCTION
To develop a city economically, it is important to build good transportation infrastructure and to provide good communication
facilities. This includes planned development of road connections within the city and railway connections for interstate travel. The
first phase of developing a good transportation facility is planning, which includes data collection and an inventory of the study
area. The second phase is to understand the pattern of trip generation, that is, the relation between trip generation and the various
independent variables affecting it. This analysis is done by building a model, which aids understanding and gives accurate results.
A trip is a one-way movement of a person by a mechanised mode of transport. It has two ends: an origin, which is the starting
point of the trip, and a destination, which is its end point. Trips are divided into two types: ‘home-based’ trips, where one
end is the home, and ‘non-home-based’ trips, where neither end is the home. [1] Before the start of the first phase it is important to
select a few factors affecting trip generation, on which the data collection survey will be based. These are called the
independent variables. They are treated as independent of one another, while trip generation depends on them. Factors
affecting trip generation may include the number of family members, family income, age group, number of vehicles owned
by the family, working hours, and number of trips per day. [2]
The second phase of planning a transportation system or facility begins with modeling. Correlation analysis is the
first step in generating the model. It is the process of finding the independent variables most likely to affect the dependent variable
by measuring the strength of the relationship among the variables. [3] The higher the magnitude of the correlation coefficient, the
stronger the relationship between the variables. The coefficient lies between -1 and 1: values near 1 indicate a strong positive
relationship, values near -1 a strong negative relationship, and values near 0 little or no linear relationship.
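As an illustration of how the coefficient behaves, the sketch below computes it in Python. It assumes the Pearson product-moment coefficient, which is what Excel's correlation tool reports; the function name `pearson_r` and the sample values are hypothetical and are not survey data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical values chosen only to show the range of the coefficient.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # perfectly increasing with x
falling = [10, 8, 6, 4, 2]   # perfectly decreasing with x
noisy = [2, 1, 4, 3, 5]      # increasing on the whole, with scatter

print(round(pearson_r(x, rising), 3))   # 1.0
print(round(pearson_r(x, falling), 3))  # -1.0
print(round(pearson_r(x, noisy), 3))    # 0.8
```

A coefficient of -1 is as strong a relationship as +1; only the direction differs, which is why the magnitude is what matters when screening variables.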
After finding the variables with suitable correlation coefficients, the final step in the analysis is to perform a regression analysis.
Multiple linear regression analysis is a statistical method of fitting a mathematical relationship between the dependent and
independent variables. [4] [5] The model yields equations that estimate the number of trips generated from a selected set of
influencing factors. Multiple linear regression analysis is a well-established method that can be used easily by novices and
students as well as experts.
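In general form, and as a sketch rather than the paper's own notation, the fitted relationship and its goodness of fit can be written as follows. The symbols $b_0, \dots, b_k$ (constant and coefficients), $\varepsilon$ (error term), $n$ (number of households) and $k$ (number of independent variables) are assumptions introduced here for illustration.

```latex
% Multiple linear regression model with k independent variables
\[
  Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + \varepsilon
\]

% Least squares: choose the b_j that minimise the sum of squared residuals
\[
  \min_{b_0, \dots, b_k} \; \sum_{i=1}^{n}
  \Bigl( Y_i - b_0 - \sum_{j=1}^{k} b_j X_{ij} \Bigr)^{2}
\]

% Coefficient of determination, with fitted values \hat{Y}_i and mean \bar{Y}
\[
  r^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}
                 {\sum_{i=1}^{n} (Y_i - \bar{Y})^2}
\]
```

For a model with a single independent variable, $r^2$ is the square of the correlation coefficient between that variable and $Y$.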
This research paper is an original study conducted through field work, with data collected in residential areas and analysed
with computer software. The study is based on data for the prevailing year collected through a household survey.
Questionnaires were prepared to obtain the most relevant data for the regression analysis.
The major intercity transportation infrastructure in Surat includes a railway station under the administrative
control of the Western Railway Zone of Indian Railways. Surat BRTS is a bus rapid transit system, an intracity network, with
245 buses connecting major locations. Surat has the second busiest airport in Gujarat. There are 83 bridges and four flyovers
in operation, and many more are under construction.
Vesu is 5 km from the airport and 10 km from the railway station. It is a suburb in the South West Zone of
Surat and is the city's newest and a fast-developing area in terms of public transport infrastructure, residential areas, complexes and
business parks. Vesu is located at latitude 21.1447217° and longitude 72.7717735°. The study area selected is a residential
area that is well connected to commercial areas and is very near the airport and BRTS facilities. It lies in the midst
of other residential apartments.
Vesu is a relatively newly developed area of Surat that is continuously being transformed by
infrastructural development. This study will help future development occur in a systematic and organized manner, which
later on will require less reconstruction and rehabilitation. It will also increase the transportation efficiency of the area.
IV. METHODOLOGY
The regression analysis proceeds through the following steps:
Step 1: Prepare a Data table based on household survey
Step 2: Install the ‘Analysis ToolPak’ in Microsoft Excel
Step 3: Correlation analysis in Excel
Step 4: Selection of ‘Independent Variables’
Y = Number of Trips
X1 = Number of Earning people
X2 = Number of Working Hours
X3 = Number of Vehicles owned
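Outside Excel, the least-squares fit for Y on X1, X2 and X3 can be sketched in plain Python. This is a minimal illustration, not the authors' procedure: it solves the normal equations directly, the helper names `solve_linear_system` and `fit_ols` are hypothetical, and it uses only the seven households listed in Table 25, so its coefficients are not expected to match those in Table 24.

```python
# Seven households excerpted from Table 25; the full survey data is not shown.
earning_people = [3, 2, 3, 2, 3, 2, 1]   # X1
working_hours = [9, 7, 8, 8, 7, 6, 6]    # X2
vehicles_owned = [3, 2, 3, 2, 2, 1, 1]   # X3
trips = [6, 4, 6, 4, 5, 3, 2]            # Y


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], so row operations act on both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, r2)."""
    rows = [[1.0] + [float(col[i]) for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coeffs = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(coeffs, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coeffs, 1.0 - ss_res / ss_tot


coeffs_x1, r2_x1 = fit_ols([earning_people], trips)
coeffs_all, r2_all = fit_ols([earning_people, working_hours, vehicles_owned], trips)
print("X1 only:   ", [round(c, 3) for c in coeffs_x1], "r2 =", round(r2_x1, 3))
print("X1, X2, X3:", [round(c, 3) for c in coeffs_all], "r2 =", round(r2_all, 3))
```

Adding independent variables to a least-squares model never lowers r2 on the data used for fitting, so a higher r2 for the three-variable model does not by itself show that it will predict future trips better.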
Table 24: Generated equations
Sr. No. | Conditions | Equation | r2
1 | Number of Earning people v/s Trips | Y = 0.296 + 0.459X1 | 0.918
2 | Number of Working hours v/s Trips | Y = -3.727 + 1.11X2 | 0.757
3 | Number of Vehicles owned v/s Trips | Y = 1.10 + 1.678X3 | 0.805
4 | Number of Earning people and Vehicles owned v/s Trips | Y = -0.1847 + 1.39X1 + 0.685X3 | 0.967
5 | Number of Earning people and Working hours v/s Trips | Y = -2.272 + 1.45X1 + 0.446X2 | 0.973
6 | Number of Vehicles and Working hours v/s Trips | Y = -1.636 + 1.04X3 + 0.537X2 | 0.869
7 | Number of Working hours, earning people and Vehicles owned v/s Trips | Y = -1.636 + 0.311X1 + 1.24X2 + 0.424X3 | 0.987
Two criteria are considered in the final selection of the model best suited to estimate future trip generation:
– The coefficient of determination r2 should be as close to 1 as possible.
– It is preferable not to have a negative constant in the equation.
Based on the first criterion, the model selected is (7) Number of Working hours, earning people and Vehicles owned v/s Trips:
Y = -1.636 + 0.311X1 + 1.24X2 + 0.424X3
r2 = 0.987
However, the constant in this equation is negative. Of the models with a non-negative constant, (1) and (3), model (1) Earning
people v/s Trips has the higher r2 and is therefore the best suited:
Y = 0.296 + 0.459X1
r2 = 0.918
The number of earning people is accordingly taken as the most influential independent variable affecting trip generation.
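The selection logic can be checked with a short Python sketch. The constants and r2 values are transcribed from Table 24; reading "constant" as the intercept term, and the names `models` and `predict_trips_model_1`, are assumptions made for illustration.

```python
# (model label, constant term, r2) transcribed from Table 24.
models = [
    ("1: X1", 0.296, 0.918),
    ("2: X2", -3.727, 0.757),
    ("3: X3", 1.10, 0.805),
    ("4: X1, X3", -0.1847, 0.967),
    ("5: X1, X2", -2.272, 0.973),
    ("6: X3, X2", -1.636, 0.869),
    ("7: X1, X2, X3", -1.636, 0.987),
]

# Criterion 1 alone: the highest r2.
best_by_r2 = max(models, key=lambda m: m[2])

# Criteria 1 and 2 together: the highest r2 among non-negative constants.
non_negative = [m for m in models if m[1] >= 0]
selected = max(non_negative, key=lambda m: m[2])


def predict_trips_model_1(earning_people):
    """Estimated trips from model (1): Y = 0.296 + 0.459 * X1."""
    return 0.296 + 0.459 * earning_people


print("Highest r2:", best_by_r2[0])
print("Selected:  ", selected[0])
print("Trips for 3 earning people:", round(predict_trips_model_1(3), 3))
```

Applied to a household with three earning people, model (1) gives 0.296 + 0.459 × 3 = 1.673 trips.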
VII. LIMITATIONS
One of the most important factors for a successful analysis is the reliability of the data collected through the household
survey. Because this data is provided by respondents, a certain percentage of human error must be expected. The factors affecting
trips, and the number of factors considered, may be large and may vary with individual preference, which will give different
results for different studies. [11] [12] The sample size is also a limitation, as covering a large area is physically
difficult. However, census data can be used in such a study provided it is up to date.
VIII. CONCLUSION
The conclusions of the survey work can be summarised in a few points.
– Sample size affects the precision and accuracy of the coefficients and the constants.
– The range of the data should not be too wide, as this may result in a faulty correlation analysis.
– The best-suited model should satisfy the two criteria; if the equation is not satisfactory, another analysis and planning
method can be selected.
The best model selected here has a correlation coefficient of 0.958 (r2 = 0.918) and no negative constant. The number of earning
people strongly affects home-based trips, because each earning member travels between home and office individually, which
increases the number of trips.
APPENDIX
The data collected from the questionnaire is compiled in tables called data sheets, which are then used for correlation analysis and
multiple linear regression analysis. Grouped data can be converted into numerical data by dividing it into smaller groups and
counting the number of groups that apply to each individual entry. For instance, ages are divided here into the groups 0-16,
17-30, 31-45, 45-60, and 61 and above, depending on work efficiency and contribution.
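The conversion for age groups can be sketched as follows. It assumes that the "No. of Age Group" value counts the groups overlapped by a household's age range, a reading that agrees with the seven rows of Table 25 but is not stated explicitly; the group boundaries are kept as printed, and the names `AGE_GROUPS` and `count_age_groups` are hypothetical.

```python
# Age groups as printed in the text; the last group is open-ended.
AGE_GROUPS = [(0, 16), (17, 30), (31, 45), (45, 60), (61, float("inf"))]


def count_age_groups(youngest, oldest):
    """Count the age groups that overlap the range youngest..oldest."""
    return sum(
        1 for low, high in AGE_GROUPS
        if youngest <= high and oldest >= low
    )


# Age ranges from the seven households listed in Table 25.
age_ranges = [(18, 50), (14, 42), (17, 85), (13, 50), (19, 59), (19, 59), (13, 50)]
counts = [count_age_groups(lo, hi) for lo, hi in age_ranges]
print(counts)  # [3, 3, 4, 4, 3, 3, 4]
```

The printed counts match the "No. of Age Group" column for these seven households.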
Table 25: Data Collection from Household Survey
Sr. No. | Family Members (Number) | Income Range (Lakhs) | Earning People (Number) | Vehicle Ownership (Number) | Trips (Number) | Trip Duration (Hours) | Age Group | Working Hours (Hours) | No. of Age Group
1 | 6 | 2.5 - 10 | 3 | 3 | 6 | 8am - 6pm | 18-50 | 9 | 3
2 | 4 | 2.5 - 10 | 2 | 2 | 4 | 9am - 6pm | 14-42 | 7 | 3
3 | 6 | > 10 | 3 | 3 | 6 | 7am - 5pm | 17-85 | 8 | 4
4 | 4 | 1 - 2.5 | 2 | 2 | 4 | 8am - 11pm | 13-50 | 8 | 4
5 | 5 | 2.5 - 5 | 3 | 2 | 5 | 8am - 11pm | 19-59 | 7 | 3
6 | 2 | > 2.5 | 2 | 1 | 3 | 7am - 5pm | 19-59 | 6 | 3
7 | 1 | 1 - 2.5 | 1 | 1 | 2 | 8am - 6pm | 13-50 | 6 | 4
ACKNOWLEDGMENT
I would like to express my gratitude to my mentor and guide, Prof. Payal Zaveri, for her guidance and encouragement throughout
the research. I am extremely grateful for the words of wisdom and the support she provided during the research.
I take this opportunity to thank her for her counsel and her remarks on earlier versions of the manuscript; any
errors are my own and should not tarnish the reputation of the esteemed professor. I would also like to thank my family for their
belief, support, and assistance in finishing the paper within a limited time frame.
REFERENCES
[1] Tom V. Mathew and K. V. Krishna Rao, “Trip generation,” Introduction to Transportation Engineering, NPTEL India, 2007.
[2] L. Kadiyali, Traffic Engineering and Transport Planning, Ninth edition, India: Khanna Publishers, 1999.
[3] S. Senthilnathan, “Usefulness of correlation analysis,” New Zealand, 2019.
[4] B. G. Tabachnick and L. S. Fidell, Using Multivariate Statistics, Third edition, New York: Harper Collins College Publishers, 1996.
[5] S. K. Mahak Dawr, “Developing Trip Generation Model Utilizing Multiple Regression Analysis Case Study,” International Journal of Innovative Research
in Science, Engineering and Technology, vol. 6, no. 2, 2017.
[6] “Google Maps,” [Online]. Available: https://goo.gl/maps/rhopX18JaPahGk236.
[7] G. P. H. Wootton, “A Model for Trips Generated by Household,” Journal of Transport Economics and Policy, pp. 137-153, 1967.
[8] M. J. K. R. R. a. G. T. Ravi Gadepalli, “Multiple Classification Analysis for Trip Production Models Using Household Data: Case Study of Patna, India,”
2014.
[9] B. N. T. a. C. D. Ramesh B. Ranpise, “Assessment and MLR Modeling of Traffic Noise at Major Urban Roads of Residential and Commercial Areas of Surat
City,” Springer Nature Singapore, 2020.
[10] F. A. W. A. C. Cameron, “An R-squared measure of goodness of fit for some common nonlinear regression models,” pp. 16-32, 1995.
[11] A. B. Asokan Mulayath Variyath, “Variable selection in multivariate multiple regression,” 2020.
[12] B. A. Sathiyaraj Rajendran, “Short term traffic prediction model for urban transportation using structure pattern and regression: an Indian context,” 2020.