
NCHRP

Statistical Methods in Highway Safety Analysis

SYNTHESIS 295

NATIONAL COOPERATIVE HIGHWAY RESEARCH PROGRAM

A Synthesis of Highway Practice

TRANSPORTATION RESEARCH BOARD

NATIONAL RESEARCH COUNCIL

TRANSPORTATION RESEARCH BOARD EXECUTIVE COMMITTEE 2001

Officers


Chair: JOHN M. SAMUELS, Senior Vice President-Operations Planning & Support, Norfolk Southern Corporation, Norfolk, VA
Vice Chairman: E. DEAN CARLSON, Secretary of Transportation, Kansas DOT
Executive Director: ROBERT E. SKINNER, JR., Transportation Research Board

Members
WILLIAM D. ANKNER, Director, Rhode Island DOT
THOMAS F. BARRY, JR., Secretary of Transportation, Florida DOT
JACK E. BUFFINGTON, Research Professor, Mack-Blackwell National Rural Transportation Study Center, University of Arkansas
SARAH C. CAMPBELL, President, TransManagement, Inc., Washington, D.C.
JOANNE F. CASEY, President, Intermodal Association of North America, Greenbelt, MD
JAMES C. CODELL III, Secretary, Kentucky Transportation Cabinet
JOHN L. CRAIG, Director, Nebraska Department of Roads
ROBERT A. FROSCH, Senior Research Fellow, John F. Kennedy School of Government, Harvard University
GORMAN GILBERT, Director, Oklahoma Transportation Center, Oklahoma State University
GENEVIEVE GIULIANO, Professor, School of Policy, Planning, and Development, University of Southern California
LESTER A. HOEL, L.A. Lacy Distinguished Professor, Department of Civil Engineering, University of Virginia
H. THOMAS KORNEGAY, Executive Director, Port of Houston Authority
BRADLEY L. MALLORY, Secretary of Transportation, Pennsylvania DOT
MICHAEL D. MEYER, Professor, School of Civil and Environmental Engineering, Georgia Institute of Technology
JEFF P. MORALES, Director of Transportation, California DOT
JEFFREY R. MORELAND, Executive Vice President-Law and Chief of Staff, Burlington Northern Santa Fe Corporation, Fort Worth, TX
JOHN P. POORMAN, Staff Director, Capital District Transportation Committee, Albany, NY
CATHERINE L. ROSS, Executive Director, Georgia Regional Transportation Agency
WAYNE SHACKELFORD, Senior Vice President, Gresham Smith & Partners, Alpharetta, GA
PAUL P. SKOUTELAS, CEO, Port Authority of Allegheny County, Pittsburgh, PA
MICHAEL S. TOWNES, Executive Director, Transportation District Commission of Hampton Roads, Hampton, VA
MARTIN WACHS, Director, Institute of Transportation Studies, University of California at Berkeley
MICHAEL W. WICKHAM, Chairman and CEO, Roadway Express, Inc., Akron, OH
JAMES A. WILDING, President and CEO, Metropolitan Washington Airports Authority
M. GORDON WOLMAN, Professor of Geography and Environmental Engineering, The Johns Hopkins University
MIKE ACOTT, President, National Asphalt Pavement Association (ex officio)
EDWARD A. BRIGHAM, Acting Deputy Administrator, Research and Special Programs Administration, U.S. DOT (ex officio)
BRUCE J. CARLTON, Acting Deputy Administrator, Maritime Administration, U.S. DOT (ex officio)
JULIE A. CIRILLO, Assistant Administrator and Chief Safety Officer, Federal Motor Carrier Safety Administration, U.S. DOT (ex officio)
SUSAN M. COUGHLIN, Director and COO, The American Trucking Associations Foundation, Inc. (ex officio)
JENNIFER L. DORN, Federal Transit Administrator, U.S. DOT (ex officio)
ROBERT B. FLOWERS (Lt. Gen., U.S. Army), Chief of Engineers and Commander, U.S. Army Corps of Engineers (ex officio)
HAROLD K. FORSEN, Foreign Secretary, National Academy of Engineering (ex officio)
JANE F. GARVEY, Administrator, Federal Aviation Administration, U.S. DOT (ex officio)
THOMAS J. GROSS, Deputy Assistant Secretary, Office of Transportation Technologies, U.S. Department of Energy (ex officio)
EDWARD R. HAMBERGER, President and CEO, Association of American Railroads (ex officio)
JOHN C. HORSLEY, Executive Director, American Association of State Highway and Transportation Officials (ex officio)
MICHAEL P. JACKSON, Deputy Secretary of Transportation, U.S. DOT (ex officio)
JAMES M. LOY (Adm., U.S. Coast Guard), Commandant, U.S. Coast Guard (ex officio)
WILLIAM W. MILLAR, President, American Public Transit Association (ex officio)
MARGO T. OGE, Director, Office of Transportation and Air Quality, U.S. EPA (ex officio)
VALENTIN J. RIVA, President and CEO, American Concrete Paving Association (ex officio)
JON A. RUTTER, Federal Railroad Administrator, U.S. DOT (ex officio)
VINCENT F. SCHIMMOLLER, Deputy Executive Director, Federal Highway Administration, U.S. DOT (ex officio)
ASHISH K. SEN, Director, Bureau of Transportation Statistics, U.S. DOT (ex officio)
L. ROBERT SHELTON III, Executive Director, National Highway Traffic Safety Administration, U.S. DOT (ex officio)
MICHAEL R. THOMAS, Applications Division Director, Office of Earth Sciences Enterprise, National Aeronautics and Space Administration (ex officio)

NATIONAL COOPERATIVE HIGHWAY RESEARCH PROGRAM


Transportation Research Board Executive Committee Subcommittee for NCHRP
JOHN M. SAMUELS, Norfolk Southern Corporation (Chair)
E. DEAN CARLSON, Kansas DOT
LESTER A. HOEL, University of Virginia
JOHN C. HORSLEY, American Association of State Highway and Transportation Officials
VINCENT F. SCHIMMOLLER, Federal Highway Administration
ROBERT E. SKINNER, JR., Transportation Research Board
MARTIN WACHS, Institute of Transportation Studies, University of California, Berkeley

Field of Special Projects Project Committee SP 20-5
C. IAN MACGILLIVRAY, Iowa DOT (Chair)
KENNETH C. AFFERTON, New Jersey DOT (Retired)
SUSAN BINDER, Federal Highway Administration
THOMAS R. BOHUSLAV, Texas DOT
NICHOLAS J. GARBER, University of Virginia
DWIGHT HORNE, Federal Highway Administration
YSELA LLORT, Florida DOT
WESLEY S.C. LUM, California DOT
GARY TAYLOR, Michigan DOT
J. RICHARD YOUNG, JR., Post Buckley Schuh & Jernigan, Inc.
MARK R. NORMAN, Transportation Research Board (Liaison)
WILLIAM ZACCAGNINO, Federal Highway Administration (Liaison)

TRB Staff for NCHRP Project 20-5
STEPHEN R. GODWIN, Director for Studies and Information Services
DONNA L. VLASAK, Senior Program Officer
DON TIPPMAN, Editor
STEPHEN F. MAHER, Manager, Synthesis Studies
CHERYL Y. KEITH, Senior Secretary

Program Staff
ROBERT J. REILLY, Director, Cooperative Research Programs
CRAWFORD F. JENCKS, Manager, NCHRP
DAVID B. BEAL, Senior Program Officer
HARVEY BERLIN, Senior Program Officer
B. RAY DERR, Senior Program Officer
AMIR N. HANNA, Senior Program Officer
EDWARD T. HARRIGAN, Senior Program Officer
CHRISTOPHER HEDGES, Senior Program Officer
TIMOTHY G. HESS, Senior Program Officer
RONALD D. MCCREADY, Senior Program Officer
CHARLES W. NIESSNER, Senior Program Officer
EILEEN P. DELANEY, Editor
HILARY FREER, Associate Editor

NATIONAL COOPERATIVE HIGHWAY RESEARCH PROGRAM

NCHRP SYNTHESIS 295


Statistical Methods in Highway Safety Analysis
A Synthesis of Highway Practice

CONSULTANT BHAGWANT N. PERSAUD Department of Civil Engineering Ryerson University

TOPIC PANEL MICHAEL S. GRIFFITH, FEDERAL HIGHWAY ADMINISTRATION CARLTON M. HAYDEN, FEDERAL HIGHWAY ADMINISTRATION JAKE KONONOV, COLORADO DEPARTMENT OF TRANSPORTATION CHARLES R. LEWIS II, WEST VIRGINIA DEPARTMENT OF TRANSPORTATION RICHARD F. PAIN, TRANSPORTATION RESEARCH BOARD PHILIP M. SALZBERG, WASHINGTON TRAFFIC SAFETY COMMISSION ROBERT A. SCOPATZ, DATA NEXUS, INC. DELBERT E. STEWART, ROAD SAFETY TRANSPORT CANADA SIMON P. WASHINGTON, GEORGIA INSTITUTE OF TECHNOLOGY

SUBJECT AREAS: Planning and Administration; Highway Operations, Capacity, and Traffic Control

Research Sponsored by the American Association of State Highway and Transportation Officials in Cooperation with the Federal Highway Administration

TRANSPORTATION RESEARCH BOARD NATIONAL RESEARCH COUNCIL


NATIONAL ACADEMY PRESS WASHINGTON, D.C. 2001

NATIONAL COOPERATIVE HIGHWAY RESEARCH PROGRAM

NCHRP SYNTHESIS 295
Project 20-5 FY 1999 (Topic 31-02)
ISSN 0547-5570
ISBN 0-309-06905-X
Library of Congress Control No. 2001-132366
© 2001 Transportation Research Board

Systematic, well designed research provides the most effective approach to the solution of many problems facing highway administrators and engineers. Often, highway problems are of local interest and can best be studied by highway departments individually or in cooperation with their state universities and others. However, the accelerating growth of highway transportation develops increasingly complex problems of wide interest to highway authorities. These problems are best studied through a coordinated program of cooperative research. In recognition of these needs, the highway administrators of the American Association of State Highway and Transportation Officials initiated in 1962 an objective national highway research program employing modern scientific techniques. This program is supported on a continuing basis by funds from participating member states of the Association and it receives the full cooperation and support of the Federal Highway Administration, United States Department of Transportation. The Transportation Research Board of the National Research Council was requested by the Association to administer the research program because of the Board's recognized objectivity and understanding of modern research practices. The Board is uniquely suited for this purpose as it maintains an extensive committee structure from which authorities on any highway transportation subject may be drawn; it possesses avenues of communication and cooperation with federal, state, and local governmental agencies, universities, and industry; its relationship to the National Research Council is an insurance of objectivity; it maintains a full-time research correlation staff of specialists in highway transportation matters to bring the findings of research directly to those who are in a position to use them. The program is developed on the basis of research needs identified by chief administrators of the highway and transportation departments and by committees of AASHTO. Each year, specific areas of research needs to be included in the program are proposed to the National Research Council and the Board by the American Association of State Highway and Transportation Officials. Research projects to fulfill these needs are defined by the Board, and qualified research agencies are selected from those that have submitted proposals. Administration and surveillance of research contracts are the responsibilities of the National Research Council and the Transportation Research Board. The needs for highway research are many, and the National Cooperative Highway Research Program can make significant contributions to the solution of highway transportation problems of mutual concern to many responsible groups. The program, however, is intended to complement rather than to substitute for or duplicate other highway research programs.

Price $29.00

NOTICE The project that is the subject of this report was a part of the National Cooperative Highway Research Program conducted by the Transportation Research Board with the approval of the Governing Board of the National Research Council. Such approval reflects the Governing Board's judgment that the program concerned is of national importance and appropriate with respect to both the purposes and resources of the National Research Council. The members of the technical committee selected to monitor this project and to review this report were chosen for recognized scholarly competence and with due consideration for the balance of disciplines appropriate to the project. The opinions and conclusions expressed or implied are those of the research agency that performed the research, and, while they have been accepted as appropriate by the technical committee, they are not necessarily those of the Transportation Research Board, the National Research Council, the American Association of State Highway and Transportation Officials, or the Federal Highway Administration of the U.S. Department of Transportation. Each report is reviewed and accepted for publication by the technical committee according to procedures established and monitored by the Transportation Research Board Executive Committee and the Governing Board of the National Research Council. The National Research Council was established by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and of advising the Federal Government. The Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in the conduct of their services to the government, the public, and the scientific and engineering communities. It is administered jointly by both Academies and the Institute of Medicine. The National Academy of Engineering and the Institute of Medicine were established in 1964 and 1970, respectively, under the charter of the National Academy of Sciences. The Transportation Research Board evolved in 1974 from the Highway Research Board, which was established in 1920. The TRB incorporates all former HRB activities and also performs additional functions under a broader scope involving all modes of transportation and the interactions of transportation with society.

Published reports of the


NATIONAL COOPERATIVE HIGHWAY RESEARCH PROGRAM

are available from:

Transportation Research Board
National Research Council
2101 Constitution Avenue, N.W.
Washington, D.C. 20418

and can be ordered through the Internet at: http://www.nationalacademies.org/trb/bookstore

NOTE: The Transportation Research Board, the National Research Council, the Federal Highway Administration, the American Association of State Highway and Transportation Officials, and the individual states participating in the National Cooperative Highway Research Program do not endorse products or manufacturers. Trade or manufacturers' names appear herein solely because they are considered essential to the object of this report.

Printed in the United States of America

PREFACE

A vast storehouse of information exists on nearly every subject of concern to highway administrators and engineers. Much of this information has resulted from both research and the successful application of solutions to the problems faced by practitioners in their daily work. Because previously there has been no systematic means for compiling such useful information and making it available to the entire community, the American Association of State Highway and Transportation Officials has, through the mechanism of the National Cooperative Highway Research Program, authorized the Transportation Research Board to undertake a continuing project to search out and synthesize useful knowledge from all available sources and to prepare documented reports on current practices in the subject areas of concern. This synthesis series reports on various practices, making specific recommendations where appropriate but without the detailed directions usually found in handbooks or design manuals. Nonetheless, these documents can serve similar purposes, for each is a compendium of the best knowledge available on those measures found to be the most successful in resolving specific problems. The extent to which these reports are useful will be tempered by the user's knowledge and experience in the particular problem area.

FOREWORD
By Staff
Transportation Research Board

This synthesis report will be of interest to individuals with state transportation departments and with district and local agencies involved directly or indirectly with safety analysis in highway jurisdictions, as well as to contractors undertaking safety analysis and associated work for them. Highway safety analysts in many countries around the world might also find this synthesis of interest. The focus of this report is on the type of safety analysis required to support traditional engineering functions, such as the identification of hazardous locations and the development and evaluation of countermeasures. Analyses related specifically to driver and vehicle safety are not covered, but some statistical methods used in these areas are of relevance and are summarized where appropriate. This synthesis may benefit analysts working in these other areas as well.

Administrators, engineers, and researchers are continually faced with highway problems on which much information exists, either in the form of reports or in terms of undocumented experience and practice. Unfortunately, this information often is scattered and unevaluated and, as a consequence, in seeking solutions, full information on what has been learned about a problem frequently is not assembled. Costly research findings may go unused, valuable experience may be overlooked, and full consideration may not be given to available practices for solving or alleviating the problem. In an effort to correct this situation, a continuing NCHRP project has the objective of reporting on common highway problems and synthesizing available information. The synthesis reports from this endeavor constitute an NCHRP publication series in which various forms of relevant information are assembled into single, concise documents pertaining to specific highway problems or sets of closely related problems.

This report of the Transportation Research Board is also being coordinated with NCHRP Project 20-45, which is developing a website manual aimed at providing guidance on the application of basic statistical tools in transportation research. Thus, although such guidance is outside the scope of this synthesis, information about the website manual is provided.

To develop this synthesis in a comprehensive manner and to ensure inclusion of significant knowledge, the available information was assembled from numerous sources, including a large number of state highway and transportation departments. A topic panel of experts in the subject area was established to guide the author's research in organizing and evaluating the collected data, and to review the final synthesis report. This synthesis is an immediately useful document that records the practices that were acceptable within the limitations of the knowledge available at the time of its preparation. As the processes of advancement continue, new knowledge can be expected to be added to that now at hand.

CONTENTS

SUMMARY

CHAPTER ONE  INTRODUCTION
Problem Statement and Synthesis Objectives
Matters of Organization and Style
Background

CHAPTER TWO  STATE OF RESEARCH
Published Research
Major Recent and On-Going Research Initiatives
Simplified Illustration of the IHSDM Accident Prediction Algorithm

CHAPTER THREE  STATE OF PRACTICE
Survey Results Part I: General Information
Survey Results Part II: Details of Safety Analyses Undertaken in the Past 5 Years
Survey Results Part III: Problems/Issues in Safety Analyses

CHAPTER FOUR  CONCLUSIONS AND RECOMMENDATIONS

REFERENCES

GLOSSARY

APPENDIX A  SURVEY QUESTIONNAIRE

APPENDIX B  SUMMARY OF SURVEY RESPONSES

APPENDIX C  SOME ELECTRONIC RESOURCES RELEVANT TO HIGHWAY SAFETY ANALYSIS

APPENDIX D  A PRIMER ON THE APPLICATION OF SOME BASIC STATISTICAL TOOLS TO HIGHWAY SAFETY ANALYSES

APPENDIX E  REVIEW OF A SAMPLE OF RELEVANT METHODOLOGY FROM NON-MAINSTREAM TYPES OF SAFETY ANALYSES

ACKNOWLEDGMENTS

Bhagwant N. Persaud, Ph.D., Department of Civil Engineering, Ryerson University, Toronto, was responsible for collection of the data and preparation of the report.

Valuable assistance in the preparation of this synthesis was provided by the Topic Panel, consisting of Michael S. Griffith, Mathematical Statistician, Turner Fairbank Highway Research Center, Federal Highway Administration; Carlton M. Hayden, Highway Engineer, Office of Highway Safety, Federal Highway Administration; Jake Kononov, Senior Engineer, Traffic Safety and Engineering Branch, Colorado Department of Transportation; Charles R. Lewis II, Planning and Research Engineer, Division of Highways, West Virginia Department of Transportation; Richard F. Pain, Transportation Safety Coordinator, Transportation Research Board; Philip M. Salzberg, Ph.D., Research Director, Washington Traffic Safety Commission; Robert A. Scopatz, Ph.D., Research Scientist, Data Nexus, Inc.; Delbert E. Stewart, Senior Statistician, Road Safety, Transport Canada; and Simon P. Washington, Ph.D., Assistant Professor, School of Civil Engineering, Georgia Institute of Technology.

This study was managed by Donna L. Vlasak, Senior Program Officer, who worked with the consultant, the Topic Panel, and the Project 20-5 Committee in the development and review of the report. Assistance in project scope development was provided by Stephen F. Maher, P.E., Manager, Synthesis Studies. Don Tippman was responsible for editing and production. Cheryl Keith assisted in meeting logistics and distribution of the questionnaire and draft reports. Crawford F. Jencks, Manager, National Cooperative Highway Research Program, assisted the NCHRP 20-5 Committee and the Synthesis staff.

Information on current practice was provided by many highway and transportation agencies. Their cooperation and assistance are appreciated.

STATISTICAL METHODS IN HIGHWAY SAFETY ANALYSIS

SUMMARY

The purpose of this synthesis is to summarize the current practice and research on statistical methods in highway safety analysis. The focus is on highway engineering functions such as establishing relationships between crashes and associated factors, identifying locations for treatment, and evaluating the safety effect of engineering improvements. However, useful insights were gained by reviewing relevant research and practice related to statistical methods in driver and vehicle safety analyses, and the synthesis can also be useful to those working in these areas.

The synthesis attempts to identify gaps between available knowledge and practice and to provide insights into bridging these gaps. To this end, the following basic methodology was employed:

•  A survey of jurisdictions with highway engineering functions was conducted to assess current practices in highway safety analysis, highlight examples of good practice, and identify deficiencies that may be addressed in the synthesis.
•  A literature review was conducted to supplement the survey and to gather knowledge on the best statistical tools that may be available to safety analysts.
•  Leading researchers were contacted to gain knowledge on more recent and on-going research of relevance to highway safety analysis.

The survey was sent to all 50 state departments of transportation (DOTs) in the United States and to the 11 provincial transportation departments in Canada. Twenty-seven states and five provinces responded, with more than one response coming from one state. Six states provided examples of highway safety analyses conducted. Although gaps between the state of research and the state of practice were evident, it is encouraging that several transportation jurisdictions are up to speed on the complexity of highway safety analysis. Particularly encouraging is that many jurisdictions recognize the peculiarities of highway safety data and the special analytical methods needed to accommodate them. These peculiarities include the poor quality of accident and traffic volume data and accident reporting differences across jurisdictions and over time. Also encouraging is that jurisdictions are conscious of the need for maintaining quality accident data and are constantly making efforts to improve the data collection process. To this end, a few jurisdictions have in place, or are developing, a facility to easily link accident, traffic, and inventory data to create databases that would facilitate the most advanced methods of highway safety analysis. Despite these positive aspects, engineers have at their disposal relatively little sound knowledge on the safety implications of their design and operational decisions and much remains to be done to improve both the state of research and the state of practice. These

needs relate largely to the development and evaluation of treatments, the identification of sites that require safety investigation, and the tools such as accident modification factors and safety performance functions that support these analyses. A major obstacle to fulfilling these needs is that although several jurisdictions are aware of regression to the mean, this phenomenon is generally not well understood and the implications tend to be underestimated. As a result, the safety effect of treatments can be exaggerated and resources can be inefficiently allocated, because relatively safe sites with a randomly high count of accidents in a recent period can be wrongly identified for treatment and unsafe sites can go untreated. (See Appendix D for more details, including an illustration of regression to the mean and methodology to account for its effects.)

Improving the state of practice requires the availability of reliable data and a commitment to provide safety analysts with the knowledge and resources to use the best available methods, particularly those that account for the effects of regression to the mean. It is hoped that this synthesis in itself will go a long way towards providing jurisdictions with a feel for what is required to bridge any identifiable existing gaps. Improving the state of research requires the continued commitment of not only researchers but also agencies/programs such as the FHWA and NCHRP. To this end, research continues to advance the state of knowledge to refine and simplify the tools used in highway safety analysis.

Since this synthesis was commissioned, there have been three significant initiatives aimed at providing highway agencies with the most advanced tools available to conduct highway safety analyses. These are NCHRP's Highway Safety Manual (HSM), FHWA's Comprehensive Highway Safety Improvement Model (CHSIM), and ongoing research on accident prediction methodology for FHWA's Interactive Highway Safety Design Model (IHSDM). The expectation is that these significant research efforts will not only improve the state of research but will greatly facilitate the bridging of the gaps between research and practice in highway safety analysis. It should be noted that a major component of the CHSIM project is the accommodation of the training needs that it will of necessity create.

Delivery of the products of these initiatives is still some time in the future. In the meantime it would be beneficial for agencies to undertake preparatory work to develop the infrastructure for implementing the tools that would become available. Such work might include the assembly of reliable databases and the planning of human and financial resources.

CHAPTER ONE

INTRODUCTION
PROBLEM STATEMENT AND SYNTHESIS OBJECTIVES

Over the past decade, considerable progress has been made in the development and application of appropriate statistical methods in highway safety analysis to accommodate nonideal conditions that often arise and which cannot be handled by conventional statistical methods. Although the application of these newer methods is increasing, there is concern that the required tools and information to adequately undertake these analyses are not readily accessible to practicing highway safety analysts. On the other hand, widespread availability of statistical software has increased the risk of misapplying statistical techniques in safety investigations. The announcement for the 2001 TRB Annual Meeting Human Factors Workshop speculates that this danger is largely due to the incorporation of a vast number and variety of advanced statistical and econometric techniques into standard software packages. This, according to that announcement, has made selection of the appropriate statistical technique(s) for a specific research problem a complex decision, a decision whereby researchers are left in search of simple, unifying, and comprehensive guidance. These difficulties lead to analyses that could yield incorrect results and could also be a deterrent to conducting such analyses. The result is the likelihood that highway safety improvement programs may not be optimized for maximum cost-effectiveness.

The fundamental objective of this synthesis is to identify gaps between available knowledge and practice in highway safety analysis and to provide insights into bridging these gaps. To this end, a survey of current practices in jurisdictions was conducted. The survey sought details on how safety analysis is conducted in highway agencies in order to identify crucial issues and needs. Supplementing the survey was a review of published and unpublished literature as well as on-going research of relevance.

The focus of this report is on the type of safety analyses required to support traditional highway engineering functions, such as the identification of hazardous locations and the development and evaluation of countermeasures. Analyses related specifically to driver and vehicle safety are not covered by this synthesis, but some statistical methods used in these areas are of relevance and are summarized where appropriate. The synthesis may also benefit analysts working in these other areas. Also reviewed are some other methods used in other aspects of transportation data analysis that may be of relevance.

This synthesis is being coordinated with NCHRP Project 20-45, which is developing a website manual aimed at providing guidance on the application of basic statistical tools in transportation research. Therefore, such guidance is generally outside the scope of this synthesis. Information about the website manual is, however, provided.

The target audience of this synthesis is those individuals or groups involved directly or indirectly with safety analysis in highway jurisdictions. This includes in-house analysts; managers responsible for planning, implementing, and evaluating safety improvement programs; those involved with the collection and assembly of traffic records and related data; and contractors undertaking safety analysis and associated work for the jurisdictions. Judging from the survey results, the audience will have a wide range of statistical expertise. Although the synthesis will be of more value to those with some background in statistics, the level of the narrative is intended to be fundamental. For those less adept in statistical methods, however, a list of web-based primers that provide background in basic statistical concepts is given in Appendix C.

Statistical knowledge is a fundamental skill required by those conducting highway safety analysis. This knowledge is just as important for highway safety engineers appraising the literature to obtain information, e.g., on the safety effect of a particular treatment. The synthesis is therefore targeted at those charged with making such appraisals as well as those conducting highway safety analysis. Although the survey was, for convenience, confined to state and provincial jurisdictions, the vast majority of the synthesis is also of relevance to those who undertake safety analysis in municipal and county jurisdictions. Finally, even though the primary target audience is U.S. jurisdictions, it is expected that the synthesis will be of interest to highway safety analysts in many countries around the world.
MATTERS OF ORGANIZATION AND STYLE

This report consists of four chapters and five appendixes. Introductory and background material is presented in this, the first chapter. Chapter 2 discusses the state of completed and on-going research of relevance. The intent is to provide an overview, while being as comprehensive as possible.

Chapter 3 focuses on the state of practice by means of a discussion of the results of the survey of jurisdictions. Chapter 4 presents the conclusions and recommendations for bridging the gaps between the state of research and the state of practice. The appendixes are devoted to providing background material for the main body of the text as well as a synopsis of reference material for those conducting or reviewing highway safety analyses. Appendix A presents the survey questionnaire, whereas Appendix B provides detailed tabulations of the survey responses. Appendix C summarizes valuable electronic resources for statistical analysis in highway safety, with documentation of useful websites and software. Appendix D is a primer on the application of some basic statistical tools for conducting highway safety analyses. This appendix addresses some of the more common difficulties, identified on the basis of the survey and literature search, in applying statistical methods to highway safety analysis. The intent is to provide some insight into how to diagnose these problems and how to resolve them. Finally, Appendix E reviews a sample of relevant methodology from non-mainstream types of safety analyses, including driver/vehicle-related research.

Two matters of style should be mentioned. First, the terms accident, collision, and crash are used synonymously, although the consultant's preference for accident may be apparent. Second, there are a number of verbatim extracts from published and unpublished sources, many from documents written by the consultant. These extracts are prominently acknowledged. This form of presentation should be seen in the light of the general objective of the report to synthesize information.

BACKGROUND

It is useful to first provide some background on the types of safety analyses typically conducted by highway agencies and on traditional and new methods used in these analyses. This background is intentionally brief, because further detail as required is provided elsewhere in the synthesis. Readers unfamiliar with any of these types of analysis are advised to first review the relevant material in Appendix D.

Types of Statistical Analyses

From the survey, it was learned that there are several common types of analyses undertaken in transportation jurisdictions. These types of analyses, which are the focus of this report, are listed here followed by a brief description. Details of the methods used in these analyses and the associated difficulties are not covered at this point, because the intent here is merely to introduce the types of safety analysis typically conducted by highway agencies. The methods are covered in detail in subsequent chapters and in Appendix D. In this list, the percentages of responding jurisdictions reporting that they conduct that type of analysis are shown in parentheses.

1. Before and after evaluations (97%)
2. Identification of hazardous locations (100%)
3. Cost-benefit analysis in development of countermeasures (88%)
4. Analysis of collision trends (81%)
5. Collision rate comparisons of locations with different features (72%)
6. Cross-sectional evaluations (25%)
7. Comparison group evaluations (31%)
8. Risk estimation/analyses/evaluations (16%)

Before and after evaluations are conducted to assess the safety effectiveness of a given type of improvement or an improvement program as a whole. The information obtained provides feedback to the process of planning future safety improvements. These studies range from simple before and after comparisons of accident counts to the more complicated empirical Bayes (EB) approaches [see, e.g., Griffith (1999), who conducted a simple before and after study using comparison groups to study the safety effect of rumble strips on freeways, and Persaud et al. (2001), who used the EB approach to evaluate the conversion of conventional intersections to roundabouts].

Identification of hazardous locations is the starting point of the process by which locations are selected for safety improvement. Typically, the safety record of a location, along with other information, is used to identify and rank sites that should be investigated for safety deficiencies and possible treatment of these deficiencies. The process is sometimes known as blackspot identification. More recently, the term identification of sites with promise has been used.

Cost-benefit analysis in development of countermeasures involves the estimation and comparison of the costs and the safety and other benefits of the alternative ways of remedying safety problems diagnosed at a location. This process not only ensures that only cost-effective measures are implemented, but also facilitates the ranking of measures at a location and the ranking of all possible improvements in a jurisdiction, given the usual budgetary and other resource constraints.

Analysis of collision trends has multiple objectives. This analysis could bring out patterns in collision experience that may indicate that specific highway features should undergo safety investigation, or that particular types of collisions should be targeted for countermeasure development.

One can also look at time trends to detect deterioration in safety related to specific features and collision types or to detect patterns indicating whether or not investments in collision reduction are paying off in general or for specific types of collisions.

Collision rate comparisons of locations with different features are frequently done with a view to attributing differences in collision rates to differences in features. Safety effect estimates for various improvements are often obtained in this way. A collision rate is used to normalize for differences in exposure to risk, e.g., in traffic volumes, between locations (see the sketch at the end of this subsection). These studies are primarily done where a study of collision experience before and after an improvement is deemed to be impractical.

Cross-sectional evaluations are also undertaken to obtain safety effect estimates for various possible improvements using cross-sectional as opposed to before and after data. There are basically two varieties. One is the simple comparison of collision rates as outlined previously. For example, Sebastian (1999) examined the collision rates of signalized intersections in Wisconsin with various types of left-turn treatments and concluded that fully protected left turns are the safest and that protected/permissive phasing is less safe than permissive only. Cross-sectional evaluations can also take the form of complex modeling in which collisions are first related in a regression equation to a variety of highway features, including traffic volume. The safety effect of making a change in one or more variables can then be estimated using the equation to calculate the resulting change in collisions [see, e.g., Council et al. (1999), who evaluated the safety effects of converting rural two-lane roadways to four lanes based on regression equations relating accidents to average annual daily traffic (AADT) for roads with these two types of cross sections].

Comparison group evaluations involve assessments of the suitability of untreated sites for use in a comparison group in before and after studies (see, e.g., Pendleton 1996). The comparison group is essentially used to control for other factors that may cause a change in safety when a treatment is implemented. The intent is to separate the change in safety due to the treatment from the changes due to other factors [see, e.g., Griffith (1999), who studied the safety effect of rumble strips on freeways].

Risk estimation/analyses/evaluation is a process for measuring, monitoring, comparing, and evaluating levels of risk (Stewart 1998). It is done through an integrated series of steps that include combining accident data with exposure (to risk) data in order to compute road travel risk performance measure indicators, assessing the accuracy associated with the estimated travel risk indicators, interpreting the various travel risk indicators, computing effectiveness estimates for countermeasures, and defining and applying methodology for measuring safety and economic benefits. The risk estimation methods can be used to measure the road travel risk levels for any road user, vehicle, road/infrastructure, environment, or temporal characteristic. For example, one can do risk analysis for pedestrians (Hunter et al. 1996), or for specific accident types such as run-off-road accidents.
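The exposure normalization mentioned above is a simple calculation. As a minimal sketch, with entirely hypothetical traffic and crash figures, a segment collision rate is commonly expressed per million vehicle-miles of travel:

    crashes = 18           # hypothetical crashes on the segment over the study period
    aadt = 12000           # average annual daily traffic (vehicles per day)
    length_mi = 2.5        # segment length, miles
    years = 3              # length of the study period, years

    mvmt = aadt * 365 * years * length_mi / 1e6   # exposure, million vehicle-miles of travel
    rate = crashes / mvmt                         # crashes per million vehicle-miles
    print(round(rate, 2))                         # about 0.55 for these figures

For intersections, the analogous rate is typically expressed per million entering vehicles, with exposure computed from the sum of entering traffic volumes rather than vehicle-miles.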

Overview of Issues in Highway Safety Analysis

Statistical analyses in highway safety involve the use of data and the use of methods. The difficulties with methodology are not unrelated to data problems, because it is the failure to recognize and properly account for the peculiarities of highway safety data that often causes difficulties with methodology. Thus, in identifying the basic issues in highway safety analysis, it is necessary to first focus on some fundamentals of data.

Overview of Data Issues

Data are fundamental to all types of safety analyses. The responses from the survey suggest that, unfortunately, data difficulties are a major obstacle to the proper conduct of highway safety analyses. The most fundamental data item, accident information, is typically collected by the police, and it is often the case that agencies conducting highway safety analyses have little influence over this process. The result is that data not material to the police investigation are often of poor quality.

Compounding the problem of quality are the issues related to quantity. Most basic are the problems caused by differences in reporting practice over time and across jurisdictions. These relate, for example, to the reporting threshold for property damage accidents and the definition of injury accidents. These variations in reporting practice make it difficult to transfer research results, and differences in reporting practice over time create a formidable challenge in the conduct of any time-series studies, such as before and after evaluations.

The other major issue of quantity is related to the reality that, despite the fact that highway safety is a major concern, the counts of accidents for individual intersections or short road sections (the desired units for most highway safety analyses) tend to be small in statistical terms and subject to large random fluctuations, as the sketch below illustrates. This creates difficulties in trying to model or explain accident occurrence or in trying to detect differences in safety over time or between locations. The upshot is that results of safety analyses are often statistically insignificant not because of the lack of an underlying safety effect but because of data limitations.
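As a minimal sketch of how large these random fluctuations can be for counts of this size, consider a site whose long-run accident frequency is assumed, purely for illustration, to be two accidents per year:

    import numpy as np

    rng = np.random.default_rng(7)
    mean_per_year = 2.0                                    # hypothetical long-run accident frequency
    counts_3yr = rng.poisson(3 * mean_per_year, size=10)   # ten simulated 3-year counts
    print(counts_3yr)                                      # values scatter widely around the expected 6
    # A site that happens to be observed during a high-count period can look "hazardous"
    # purely by chance; this is the regression-to-the-mean concern discussed later in the report.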

Often overlooked is the value of additional data elements used in highway safety analyses: the traffic and the physical characteristics of locations, vital information for the effective management of safety programs. Yet, traffic volume data are often not available, particularly for intersections. And while information on physical characteristics usually exists, much of it is not in the electronic form desired for modern analytical methods. More importantly, the facility to efficiently link traffic, accident, and location characteristics data, so vital to the conduct of meaningful safety analyses, is often lacking.

A useful summary of accident data quality issues is contained in NCHRP Synthesis of Highway Practice 192 (O'Day 1993), which, though slightly dated, presents issues that are still very relevant today, judging by the results of the survey for the current synthesis. O'Day points to fundamental differences between accident data and that collected from scientific surveys. These pose special challenges for analyses that use accident data, because traditional statistical methods tend to be geared to data collected from scientific surveys. Fundamentally, there is usually only a modest effort to get complete and accurate accident data, which, in turn, is further compromised by the fact that most safety studies are retrospective; they are not planned in advance of the experiment. The result is that usually there are many missing cases in accident data, and the missing data are typically, if not always, biased relative to the rest of the data, because reporting quality and completeness often varies with time and/or location.

Overview of Methodological Issues

It is apparent from this brief overview of data issues that highway safety analysis is not well suited to conventional statistical methodology. From the literature review and the results obtained from the synthesis survey, there are several issues of concern. These issues are summarized here, with more discussion provided in chapter 2 and in Appendix D.

Conventional before and after studies involving a simple comparison of accident experience before and after an improvement can overestimate treatment benefits if locations with unusually high accident counts in recent years tend to be selected for treatment. To guard against this possibility, an EB approach has been developed. Using a comparison group in a simple before and after study can also provide a remedy, but the selection of a proper comparison group can be challenging.

Conventional procedures for identifying sites for safety investigation tend to select sites with high accident counts and/or accident rates. However, accident counts could be high or low in a given period solely due to random fluctuations, leading to many sites either incorrectly identified or overlooked and, correspondingly, to an inefficient allocation of safety improvement resources. In addition, selection on the basis of accident rates tends to wrongly identify sites with low volumes. EB approaches have been proposed of late to overcome these difficulties.

Information on safety effectiveness of potential treatments [Accident Modification Factors (AMFs)] is vital to effective safety management and should properly come from before and after evaluations. However, it is often the case that sufficient data are unavailable for such evaluations. This explains the increasing tendency to use cross-sectional analysis to derive AMFs. In the most fundamental of cross-section analyses, the AMF for an element is estimated as the difference in safety for locations with and without that element. In most cases, locations are different in other elements that could also account for any observed difference in safety. In addition, interactions among multiple elements may be at work; therefore, attributing the difference to a specific element is problematic. Deriving AMFs from regression models with several explanatory variables mitigates this difficulty but does not overcome it, because it is never possible for all of the factors that affect safety to be measured and accounted for in such models. The use of more advanced modeling techniques could partly overcome this difficulty. At the same time, there is increasing recognition that AMFs from cross-sectional studies should always be corroborated by before and after data (Hauer 1997; Council and Stewart 1999).

Accounting for differences in accident reporting across jurisdictions poses a formidable challenge to analysts. Fundamental tools such as accident modification factors and safety performance functions require considerable resources to develop, so the ability to transfer these across jurisdictions is important. There have been recent research developments in this regard (Harwood et al. 2000).

Accounting for general time trends in accident experience creates a difficulty for highway safety analysis that is compounded by the need to account for traffic volume changes in analyzing time-series data. Of late, effective methods for doing so have been developed (Hauer 1997).

Accounting for uncertainty in estimates is a frequently overlooked aspect of highway safety analysis. In examining differences in accident experience between elements or before and after an improvement, it is especially important to explicitly account for uncertainty in estimates and to properly interpret the results, since sample sizes and differences of interest are typically small. Often, estimates are provided without a measure of uncertainty such as the variance, and either incorrect methods are applied to calculate the measure or incorrect tests are used to interpret the results.

An appreciation of these issues is also required by those analysts who peruse the literature to obtain information from safety evaluations conducted elsewhere. For example, many statistically insignificant reductions in accidents in diverse studies following a specific treatment could be viewed as a general trend and be amalgamated into an assessment that indicates that the treatment is safety effective (Hauer 1997). See Appendix D for further discussion and the website for NCHRP 20-45 for additional guidance on this issue.
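As a minimal sketch of why individual results are so often statistically inconclusive, the before and after counts can be treated as Poisson counts, so that the log of their ratio has a standard error of roughly sqrt(1/A + 1/B). The counts below are hypothetical, and the calculation deliberately ignores regression to the mean and the other complications discussed above:

    import math

    before, after = 22, 15                    # hypothetical accident counts
    ratio = after / before                    # naive index of effectiveness
    se_log = math.sqrt(1 / after + 1 / before)
    low = ratio * math.exp(-1.96 * se_log)    # approximate 95% confidence limits
    high = ratio * math.exp(1.96 * se_log)
    print(round(ratio, 2), round(low, 2), round(high, 2))
    # For counts this small the interval straddles 1.0, so the apparent reduction
    # cannot be distinguished statistically from no change.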

CHAPTER TWO

STATE OF RESEARCH
This chapter synthesizes the state of research related to the statistical methods in highway safety analysis. The research is split into two categories: the current state of published research, and major recent and on-going research initiatives that are intended to improve the state of practice in the near future. The review is by no means comprehensive. Given the limited scope of the synthesis, the intent is to provide a synopsis of what appears to be most relevant on the basis of the survey results relating to the types of safety analyses conducted. Published and completed research is first presented, followed by information on major on-going initiatives of relevance.

PUBLISHED RESEARCH

Methodology for Evaluating Treatment Effects

The evaluation of treatment effects is vital to the process of countermeasure development. What is sought, in essence, is reassurance that treatments are working and, more importantly, information for the development and refinement of accident modification factors used in planning countermeasures. There are two fundamental approaches to developing these factors: before and after studies and cross-section studies. As discussed in chapter 1 and elsewhere, each has its difficulties, but the before and after method is generally preferred where appropriate data are available. Some of the more credible and recent research efforts for the two types of studies are reviewed here.

Cross-Sectional Evaluations

The vast majority of studies from which current knowledge on treatment effects is derived are cross-sectional evaluations. Therefore, a comprehensive review of all of these studies would be too voluminous. Instead, a sample of recent research that brings out the essential features of these studies is covered. These are studies that recognize the difficulties of making inferences about safety effects from cross-sectional studies but nevertheless realize the practical difficulties of determining these effects from before and after studies.

Council and Stewart (1999) lament the absence of adequate samples of before and after data in using cross-sectional analysis to develop what they consider an initial estimate of the safety effect of conversion of two- to four-lane rural roads and to determine whether such an effect would be similar across multiple Highway Safety Information System (HSIS) states (California, Washington, Michigan, and North Carolina). For each state, regression models were calibrated for two-lane roads and for four-lane divided roads; models for four-lane undivided roads were calibrated for California. For example, the models for two-lane roads were of the form
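The calibrated equations themselves are given in Council and Stewart (1999). Purely as an illustration of the general multiplicative form that such models take, with AADT entering as a power term and X2 and X3 standing in as placeholder roadway covariates assumed here (not the published variables):

    accidents per mile per year = a · AADT^b1 · exp(b2·X2 + b3·X3)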

where a, b1, b2, and b3 are parameters calibrated from data. Most of the four-lane divided models also included median width as a variable. The complexity of the model fitting process, and the reality that it does require a fair amount of statistical knowledge to undertake this type of analysis, is evident in the reported statistical details in Council and Stewart (1999). These indicate as follows:

Over-dispersed Poisson models were fit using the SAS PROC GENMOD software facility. Variables not significant (P > 0.05) were omitted and the model was re-estimated without the omitted variables. Standard errors were inflated to account for overdispersion using a scale factor estimated as the square root of the chi-squared statistic divided by the degrees of freedom.
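For analysts working outside SAS, a minimal sketch of the same quasi-likelihood idea, fitting a Poisson regression and rescaling standard errors by the Pearson chi-square divided by its degrees of freedom, might look as follows in Python with statsmodels (the data file and variable names are hypothetical):

    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical segment-level data set: crash counts plus AADT and roadway covariates
    df = pd.read_csv("two_lane_segments.csv")
    X = sm.add_constant(df[["log_aadt", "lane_width", "shoulder_width"]])
    y = df["crashes"]

    # Poisson regression; scale="X2" inflates standard errors by the Pearson chi-square
    # divided by its degrees of freedom, the over-dispersion adjustment described above
    result = sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="X2")
    print(result.summary())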

Application of the models to estimate AMFs (illustrated in the sketch at the end of this subsection) indicates that the effects of conversion of two- to four-lane divided sections were in accord with intuition, with reductions in total accidents ranging from 40 to 60 percent. However, the reduction for conversion to a four-lane undivided configuration is much less well defined, ranging from no effect to 20 percent. Prominent in the list of recommendations for further work are the requirements that these results be corroborated by before and after data and for models to be calibrated to determine AMFs separately for injury crashes. Because injuries are typically present in only about one-third of all crashes, obtaining injury AMFs from cross-sectional models could be problematic.

The modeling to estimate the effect of conversion from two- to four-lane roads was conducted by researchers at the University of North Carolina Highway Safety Research Center. That research team has been involved in using HSIS data to establish AMFs for a variety of treatments. These efforts include the effects of spiral transitions on two-lane rural roads (Council and Stewart 1999), the effects of cross-section design features (Zegeer and Council 1994), and the effects of safety upgrading of horizontal curves (Zegeer et al. 1991). Much of this research is intended to facilitate the incorporation of AMFs in FHWA's Interactive Highway Safety Design Model (IHSDM). Similar cross-section evaluations for this same purpose have been recently conducted by others [e.g., Vogt and Bared (1998); Vogt (1999)], for two-lane rural roads and intersections.

Other recent, related efforts of significance include Tarko et al. (1999), who set out to develop crash reduction factors for improvement projects on urban and rural road sections in Indiana using regression models. The software package LIMDEP was used in a step-wise regression analysis in which explanatory variables were added to the model in order of significance. The final models, which included only those factors that were significant at the 20 percent level, facilitated the development of crash reduction factors for lane width, access control, median width, continuous left-turn lanes, short left-turn lanes, number of lanes, pavement serviceability, and surface type and shoulder type. It is recognized that missing or correlated variables constitute a serious limitation of the method in that conclusions regarding safety measures may then be entirely incorrect. However, they argue that such cases are easily detectable where conclusions derived from regression results contradict common sense. This approach seems reasonable in some cases, but in other cases there is no conventional wisdom on the direction of the effect, and in most cases knowledge on the magnitude of the effect is not part of common sense.
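As a sketch of how an AMF can be read off a pair of fitted cross-sectional models, the estimate is essentially the ratio of the crashes predicted by the model for the changed configuration to those predicted by the model for the existing configuration at the same traffic volume. The coefficients below are hypothetical, not those of the published models:

    import math

    # Hypothetical fitted models of the form: crashes per mile per year = exp(a) * AADT**b1
    def two_lane(aadt):
        return math.exp(-7.2) * aadt ** 0.95     # assumed coefficients, for illustration only

    def four_lane_divided(aadt):
        return math.exp(-8.4) * aadt ** 1.00     # assumed coefficients, for illustration only

    aadt = 8000
    amf = four_lane_divided(aadt) / two_lane(aadt)   # ratio of predicted crash frequencies
    print(round(amf, 2))                             # about 0.47 here, i.e., roughly a 53% reduction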

of cross-section design features (Zegeer and Council 1994), and the effects of safety upgrading of horizontal curves (Zegeer et al. 1991). Much of this research is intended to facilitate the incorporation of AMFs in FHWA's Interactive Highway Safety Design Model (IHSDM). Similar cross-section evaluations for this same purpose have been conducted recently by others [e.g., Vogt and Bared (1998); Vogt (1999)] for two-lane rural roads and intersections. Other recent, related efforts of significance include Tarko et al. (1999), who set out to develop crash reduction factors for improvement projects on urban and rural road sections in Indiana using regression models. The software package LIMDEP was used in a step-wise regression analysis in which explanatory variables were added to the model in order of significance. The final models, which included only those factors that were significant at the 20 percent level, facilitated the development of crash reduction factors for lane width, access control, median width, continuous left-turn lanes, short left-turn lanes, number of lanes, pavement serviceability, surface type, and shoulder type. The authors recognize that missing or correlated variables constitute a serious limitation of the method, in that conclusions regarding safety measures may then be entirely incorrect. However, they argue that such cases are easily detectable because conclusions derived from the regression results would contradict common sense. This argument seems reasonable in some cases, but in other cases there is no conventional wisdom on the direction of an effect, and in most cases knowledge of the magnitude of an effect is not part of common sense.

Before and After Evaluations

The state of research in before and after evaluation methodology is well covered in a recent and, according to the survey results, well-known book by Hauer (1997), who has been responsible for much of the methodological development in this area of safety analysis. The book identifies the special problems created by the peculiarities of accident and related data, and presents the latest methods for accommodating these problems in the proper conduct of observational before and after studies. Fundamental to the concepts presented is a recognition that some or all of the observed changes in safety following a treatment can be due to factors other than the treatment and need to be separated from the treatment effect. These factors include traffic volume changes, secular trends in accident occurrence, and random fluctuation in accident counts. Two distinct methodologies are presented: conventional before and after comparisons and the EB procedure. These are summarized in the following sections.

Conventional Before and After Comparison

Hauer's book provides guidance on the proper design of a comparison group to account for secular changes in accident occurrence, and for the before and after analysis using the comparison group data and considering changes in traffic volumes. Of special importance are the methods for estimating uncertainty in the results. A more elaborate statistical treatise on conventional before and after methodology is provided in a draft FHWA report by Griffin and Flowers (1997), which has been proposed for publication on the FHWA website. According to the abstract, the report is a manual that documents and discusses six different evaluation designs (and supporting statistical procedures) that may be used to determine if, and to what degree, selected highway projects are reducing crashes. The six evaluation designs covered in the report are: simple before and after design; multiple before and after design; simple before and after design with yoked comparison; multiple before and after design with yoked comparison; simple before and after design with yoked comparison and check for comparability; and multiple before and after design with yoked comparison and check for comparability. The difference between simple and multiple designs is that in multiple designs information from a series of treatments is combined to produce a more stable estimate of treatment effect. Designs with a yoked comparison are characterized by four measures in time per treatment site (before and after at the treatment site, and before and after at a comparison site) to control for extraneous factors such as changes in traffic conditions, reporting thresholds, and other factors known and unknown. The comparability check in certain designs is to ensure that accident trends in the comparison group mirror those in the treatment group in each before and after period. For example, if crashes are rising at 5 percent per year in the treatment group during the before period, then one should expect accidents to rise by 5 percent per year in the comparison group during the after period.

Hauer's book, the report by Griffin and Flowers, and other prominent sources, such as Pendleton (1996), emphasize the problem of regression to the mean (RTM) that is created when a record of high accident counts at a site is used in the decision to treat it. A decrease in accidents occurs even if nothing is done to sites so selected. (See Appendix D for more details and an illustration of the RTM phenomenon.) Therefore, attributing that decrease to the treatment would overestimate its safety effect; conversely, the safety effect of treatment at sites with a randomly low count of accidents can be underestimated. As long as improvement projects are motivated, at least in part, by safety concerns, RTM is likely to be at play and its effects must be accounted for. The point is made by Hauer that using a comparison group to control for RTM is problematic, because sites must be matched on accident frequency to control for changes in safety due to a random up or down fluctuation in accident counts. For example, if a treatment site had, in the before period, five accidents of the type being evaluated, the matched comparison site should also have had five accidents in the same period to control for the effects of RTM. Given this substantial data requirement, the EB approach is preferred over conventional before and after designs, as acknowledged by Griffin and Flowers (1997), for situations where RTM might be at play.

The Empirical Bayes (EB) Approach

The EB approach accounts for RTM effects but does not require the matching of comparison sites on the number of accidents. It also facilitates the proper accommodation of traffic volume changes and time trends in accident experience in a jurisdiction. The objective, as in the conventional before and after comparison, is to estimate the number of accidents that would have been expected in the after period had there been no treatment. The treatment effect is the difference between this estimate and the number of accidents actually recorded after treatment. The number of accidents that would have been expected in the after period had there been no treatment is a weighted average of information from two sources: the number of accidents in the before period and the number of accidents expected on sites with similar traffic and physical characteristics. To estimate the weights and the number of accidents expected on sites with similar traffic and physical characteristics, a reference group of sites similar to the treated ones is used, as described in Pendleton (1996). Where sufficient data are available, a multivariate model, or safety performance function, that relates accident experience to traffic and physical characteristics of sites in the reference group is calibrated and used to estimate the weights and the number of accidents expected on sites similar to the treated ones. Hauer (1997) refers to this as the multivariate EB method, whereas Pendleton (1996) calls it the EB method with covariates. This approach is preferred over conventional approaches that directly estimate the reference group accident experience. However, there are two drawbacks: suitable reference population data for calibrating the models are rare in practice, and the task of calibrating a multivariate model can be challenging even for those with substantial statistical knowledge. To overcome these drawbacks, some analysts seek to adapt models developed by others for reference populations similar to those of interest. To this end, there is considerable research underway to develop a comprehensive suite of models for a variety of reference populations. The state of that research is summarized in Appendix D by way of a synthesis of available models.
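To make the weighting concrete, the following minimal Python sketch is offered. It is an illustration only (the function, variable names, and numbers are assumptions of this rewrite, not material from the studies cited), showing how a model prediction and an observed count are combined once the overdispersion parameter of the calibrated model is known.

```python
def eb_estimate(predicted, observed, k):
    """Empirical Bayes estimate of the expected accident count at one site.

    predicted -- accidents expected over the period from a safety performance
                 function calibrated on a reference group of similar sites
    observed  -- accidents actually recorded at the site over the same period
    k         -- overdispersion parameter of the negative binomial model
    """
    w = 1.0 / (1.0 + k * predicted)        # weight given to the model prediction
    return w * predicted + (1.0 - w) * observed


# Illustrative numbers only: the model predicts 3.2 accidents, 5 were observed.
expected = eb_estimate(predicted=3.2, observed=5, k=0.24)

# In a before and after study, this estimate (adjusted for traffic volume and
# time trends) is projected to the after period; the treatment effect is the
# difference between the projection and the count recorded after treatment.
print(round(expected, 2))
```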
The main obstacle to applying the EB approach is that the methodology, though conceptually simple, can be cumbersome to apply, especially for analysts without the required background in statistics. Even the provision of software packages such as FHWA's BEATS (Bayesian Estimation of Accidents in Transportation Studies) (Pendleton 1991) has done little to help in this regard. As Pendleton notes, the issue of who should use this complex methodology requires careful consideration, and the version of BEATS existing at the time required additional effort to be usable even by the statistically sophisticated researcher (Pendleton 1996). Nevertheless, real-world applications of the EB methodology are on the increase. The Insurance Corporation of British Columbia, which supports engineering improvement projects in the province, uses the EB methodology as standard practice to evaluate these improvements. In addition, the California DOT recently used the EB methodology to evaluate five types of improvement projects. Details of this and other recently documented applications of the EB approach are as follows:

Hanley et al. (2000) evaluated the effects of five treatments applied on California highways: rumble strips, shoulder widening, superelevation correction, curve correction, and wet pavement treatments. The software package BEATS, referred to previously, was used.

Wang (1994) conducted an empirical Bayes evaluation of 13 intersections in Minnesota where new traffic signals were installed. The reference group of untreated intersections included 79 intersections that were comparable to the treatment group with respect to daily traffic volumes, intersection configuration, etc. The EB method estimated an accident reduction of 25 percent at the treatment sites, compared with an estimate of 30 percent from a conventional before and after comparison. That the conventional estimate is higher is evidence that RTM was at play, resulting in an overestimation of the treatment effect by 5 percentage points in the conventional before and after comparison.

Pendleton (1996) demonstrated the EB approach for two real-world evaluations: 17 locations in Michigan where raised pavement markers were installed, and a total of 54 sites in Michigan where speed limits were either raised or lowered.

- For the 17 locations where raised pavement markers were installed, 42 untreated locations were used as the reference group. For raised pavement markers, daytime accidents were used as a control group. Neither the conventional before and after comparison nor the EB method found a significant treatment effect, but the point estimates of safety effect were larger for the simple before and after comparison than for the EB method, again evidence that RTM effects needed to be accounted for.

- For the 38 sites where speed limits were lowered, the reference group consisted of 47 sites, whereas 22
sites were used as the reference group for the 16 locations where speed limits were raised. Overall, when a comparison group was used to control for time trends, both the EB and the conventional before and after methods revealed that there was no statistically significant change in accidents when speed limits were raised or lowered. Interestingly, when a comparison group was not used, a method that is not recommended, the simple before and after comparison showed significant increases in accidents at sites where speed limits were lowered and significant decreases where speed limits were raised.

Yuan and Ivan (2001) used a simplified EB approach to estimate the safety benefits of intersection realignment on two-lane highways in Connecticut. Instead of using a multivariate model, the weights were calculated directly from the mean and variance of accident rates in a reference group, assuming that there was no time trend in accident occurrence and that the relationship between accidents and exposure was linear. The authors recognize that the effect of these assumptions will need to be considered in planned future applications of the EB methodology. Nevertheless, despite the admitted limitations, the results showed that the improvements appeared to reduce the total number of crashes, with varying effects for different crash types. At 11 of the 12 sites, the effects were smaller than would have been obtained with a simple before and after comparison, indicating the strong possibility of RTM at these sites.

Sayed et al. (1998) conducted a before and after evaluation of installing larger signal heads at British Columbia intersections. These signal heads tended to be installed at intersections where the safety record was poor. Indeed, the simple before and after comparison was found to overestimate the safety benefits by approximately one-third when compared with those obtained with the EB method.

Kulmala (1995) developed accident prediction models for three- and four-arm junctions in Finland and used these in an EB approach to estimate the safety effects of a range of engineering improvements. Overall, the reduction due to RTM was 17 percent for all accidents and 10 percent for injury accidents at three-arm junctions. Corresponding numbers for four-arm junctions were 20 percent and 34 percent, respectively. The largest RTM effects were found for measures recognized as efficient and rapidly implementable, for which there is a strong likelihood that a high accident count would have triggered the implementation decision and also dominated the before-period data. The greatest RTM effect was for the installation of stop signs at four-arm junctions that were previously yield-controlled or uncontrolled. Interestingly, after controlling for RTM, the remaining effects were so small that only the effect of road lighting was found to be statistically significant (16 different measures were evaluated). Ironically, a simple before and after comparison would not only have considerably overestimated the safety effects, but would likely also have found the overestimated effects to be statistically significant.

Elvik (2001) evaluated the safety effects of 20 bypass road projects in Norway. Effects were evaluated by means of an observational before and after study, controlling for RTM and general area-wide trends in the number of accidents. On average, a statistically significant reduction of 19 percent in the number of police-reported injury accidents was attributed to the bypass roads. In this case, the net effect of the two confounding factors controlled for in the study was small because the effects were in opposite directions. Similar to Yuan and Ivan (2001), the weights for the EB calculations were calculated directly from the mean and variance of accident rates in a comparison group.
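The simplified weighting that Yuan and Ivan (2001) and Elvik (2001) describe, using only the mean and variance of a reference or comparison group, is not spelled out in detail in this synthesis; the sketch below shows one common method-of-moments version of that idea, under the same assumptions noted above (no time trend, comparable exposure across sites). The code and numbers are illustrative, not taken from either study.

```python
from statistics import mean, variance

def eb_estimate_from_reference(site_count, reference_counts):
    """EB estimate for one site using only a reference group of accident
    counts at similar sites, observed over the same period and exposure.

    With counts treated as gamma-Poisson (negative binomial), the weight
    placed on the reference-group mean is m / s^2, where m and s^2 are the
    group's sample mean and variance; the weight is capped at 1 when the
    counts show no overdispersion."""
    m = mean(reference_counts)
    s2 = variance(reference_counts)
    w = min(1.0, m / s2) if s2 > 0 else 1.0
    return w * m + (1.0 - w) * site_count

# Illustrative only: a treated site with 9 accidents and ten similar sites.
reference = [2, 0, 5, 3, 1, 4, 7, 2, 3, 6]
print(round(eb_estimate_from_reference(9, reference), 2))
```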
Persaud et al. (2001) evaluated the safety effect of roundabout installation in the United States. This application is used as an illustration of the methodology in Appendix D.

Examples of recent studies that used conventional before and after comparisons rather than the EB approach, but which nevertheless have a certain amount of statistical rigor, include:

Yuan et al. (1999) set out to update the procedure for using current data to develop accident reduction factors for highway countermeasures in Connecticut. This phase of a longer-term study focused on the ease of data collection, processing requirements, methodology, and the procedure for conducting a crash reduction study, recognizing that these elements are all linked. Two methods were demonstrated. Both were, in essence, simple before and after comparisons; one expressed uncertainty in accident reduction factors using traditional confidence intervals, the other used likelihood functions to express this uncertainty (see Appendix D). It was felt that including, in the future, a comparison group of sites with accident frequencies similar to those of the treatment group would improve the statistical reliability of the results. This view was based on a recognition of the RTM problem and the potentially prohibitive requirements of the EB approach for resolving it. [See Yuan and Ivan (2001), reviewed previously, for subsequent work by this research team.]

Griffith (1999) used two approaches to evaluate shoulder rumble strips installed on freeways: a before and after evaluation with yoked comparisons and a before and after evaluation with a comparison group. The EB approach was considered but not used because it was assumed that there was no selection bias of treatment sites based on accident history.

The basis for this assumption was the apparent similarity between the accident experience of the comparison and treatment groups before the rumble strips were installed. While this approach is sound, such a convenient situation is infrequent in practice, and establishing similarity in accident experience between the treatment and comparison groups can be problematic, particularly in evaluating treatments whose effects are likely to be small.

Several real-world evaluations are used as illustrations in the report by Griffin and Flowers (1997). These include evaluations of 3R (Resurfacing, Rehabilitation, and Reconstruction) projects in New York State, continuous shoulder rumble strips in North Carolina, and raised pavement markers in Texas.

Methodology for Identifying Hazardous Locations

It is important that the process for identifying sites requiring safety investigation be efficient, because resources can be wasted on sites that are incorrectly identified as potentially unsafe, and sites that are truly unsafe can go untreated if not identified in this process. Techniques that simply flag sites that have a high accident count and/or rate are now known to have difficulties in identifying deviant sites because of the potential bias due to the RTM phenomenon, in which sites with a randomly high accident count can be wrongly identified as being hazardous, and vice versa. Many jurisdictions attempt to overcome this difficulty by using a statistical quality control framework in which a count or rate is deemed to be unusually high only if it is larger than an upper control limit (UCL). The UCL is based on the mean and standard deviation of accident experience on similar sites, usually assuming that counts are Poisson distributed about the mean. One problem is that defining similar sites can be a challenge. Also, recent research has shown that the assumption of a Poisson distribution is often incorrect, leading to a UCL that can be too low (Sung et al. 2001). To overcome the difficulties with the conventional techniques, the EB approach has been suggested and has been explored by several researchers (see, e.g., Pendleton 1991). In the most fundamental variation of the EB approach, the EB estimate of the expected number of accidents at a site, rather than the accident count, is used in the conventional statistical quality control methods. Another variation ranks sites for safety investigation by their potential for safety improvement (Persaud et al. 1999), which is the difference between a site's EB estimate and the expected accident frequency at a normal site. The latter estimate is obtained from a multivariate model calibrated on data from sites deemed to have desirable design standards from a safety perspective. Defining such sites can, however, be a challenge. Along similar lines is the current FHWA research project in Colorado that is reviewed in the next section. This research is examining the possibility of ranking sites by the potential cost-effectiveness of improving them. At the simplest level, both cost and potential safety effectiveness are approximated on the basis of EB estimates. The common thread in recent research on identifying and prioritizing sites for safety investigation is a departure from the use of accident rates in this process. There is recognition that, because of the non-linear relationship between accidents and traffic volume, accident rates usually decrease with traffic volume, and therefore sites with low volumes tend to be selected if accident rate is used as a selection criterion by itself. Current procedures in place in many jurisdictions try to overcome this difficulty by requiring a minimum accident count for a site to be flagged. The extent to which this refinement overcomes the problem is unclear, because counts (and rates) are subject to random fluctuation. Despite these research efforts, the application of the EB approach by highway agencies is rare. Part of the reason is that the necessary data resources may not yet be in place, particularly for the more sophisticated versions of the EB approach. According to conversations with highway agencies, the limited validation and testing of this approach has been another deterrent to its implementation. Of course, in many cases, the level of understanding of this relatively new approach may be too low. The FHWA is seeking to remove these obstacles in a current collaborative project with the Colorado DOT. This initiative is reviewed later in this chapter.
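As a concrete illustration of the conventional quality control screening described above, the sketch below flags sites whose counts exceed a Poisson-based upper control limit. The particular limit used (mean plus 1.645 standard deviations, with variance set equal to the mean) is one common formulation assumed for this example rather than a procedure quoted from any agency, and, as the text notes, it may be too low when counts are overdispersed.

```python
import math

def upper_control_limit(similar_site_mean, z=1.645):
    """Poisson-based UCL: mean accident count of similar sites plus z standard
    deviations, using the Poisson assumption that the variance equals the mean."""
    return similar_site_mean + z * math.sqrt(similar_site_mean)

def flag_sites(site_counts, similar_site_mean):
    """Return the sites whose observed counts exceed the UCL for their group."""
    ucl = upper_control_limit(similar_site_mean)
    return [site for site, count in site_counts.items() if count > ucl]

# Illustrative numbers only: three sites compared against a group mean of 5.
counts = {"site A": 4, "site B": 11, "site C": 6}
print(flag_sites(counts, similar_site_mean=5.0))   # UCL is about 8.7, so only site B is flagged
```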

Multivariate Accident Models

Regression equations that relate accident experience to the traffic and other characteristics of locations have been referred to in some of the safety literature as multivariate models (Hauer 1997). The use of these models is becoming widespread in modern highway safety analysis. The second part of this chapter and Appendix D provide some details of those applications. Because the development of these models in itself constitutes one form of highway safety analysis, it is in order to devote some coverage here to the state of the art and issues in their development, recognizing that a full treatment of this subject is outside the scope of this synthesis. The literature on multivariate accident models can be divided into two classes: accident prediction models and accident causation models. In causation models, accidents are related to factors that explain accident causation. If such models are successful, the coefficients of the various factors can be used to estimate the change in safety that would result from a change in that factor. Recent attempts to calibrate such models for rural roads and intersections
(Vogt and Bared 1998; Vogt 1999) serve to illustrate the difficulties in calibrating these models. For these research projects, data were collected for a wide array of variables thought to influence safety. However, because of a small dataset, a lack of variation in many factors, and strong correlations among many variables of interest, the resulting models contained very few variables. Accident prediction models, on the other hand, are intended to estimate the safety of a location as a function of variables found to be the best predictors (see, e.g., Bonneson and McCoy 1993; Lord 2000). Recently these models have been used in the EB procedure to estimate the safety of locations for identifying blackspots or conducting before and after studies. These models need not be built only from causal variables, but can also include variables associated with accidents for which data are readily available. For example, categorical variables such as traffic control, divided/undivided, and functional class typically have a strong association with safety in that they can account for the effects of a wide array of geometric variables. Many modelers therefore simply group locations by these categorical variables and, for each group, calibrate models relating accidents to traffic volume, the variable that explains most of the variation in accident occurrence. These models are usually better for accident prediction than accident causation models because there is more freedom of choice in the variables, in that one could, with care, use correlated variables and variables that may be marginally insignificant, particularly if these have theoretical support (Washington 1999). In calibrating models of both types, a modeler must guard against possible deficiencies resulting from omitted variables, incorrect functional forms, over-fit models, and a lack of causal variables. To partly accomplish this, it is necessary to undertake exploratory data analysis using techniques such as CART (Breiman et al. 1984; Washington 2000) and the ID method (Hauer and Bamfo 1997) to determine which variables should be used, whether and how variables should be grouped, how they should be defined, and how they should enter the model, i.e., the best model form. Typical model forms considered in accident modeling are exponential, piecewise linear, and quadratic forms. Less common possibilities include the use of gamma functions and more general polynomial forms. It is currently common to use generalized linear modeling (McCullagh and Nelder 1989) to estimate the parameters of the models. Software packages such as Genstat, S-Plus, LIMDEP, GLIM, or SAS are used for this purpose (see Appendix D for a brief description of these and related software packages). Such packages allow for the specification of different error distributions, including the negative binomial, which is commonly regarded as more appropriate for describing the count of crashes in a population of entities than the Poisson or normal distributions assumed in conventional regression modeling (Miaou 1996; Poch and Mannering 1996; Hauer and Bamfo 1997). In specifying a negative binomial error structure, an overdispersion parameter, which relates the mean and variance and which is used in the EB procedure, can be iteratively estimated from the model and the data.
This parameter can also be used as an indication of the relative quality of competing models, which is convenient, since the traditional R2 measure is not relevant for generalized linear accident prediction models (Kulmala 1995; Miaou 1996).
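As a hedged illustration of this kind of calibration, the sketch below fits a negative binomial model of the form crashes = exp(b0 + b1 ln AADT) to simulated data using the Python statsmodels package, standing in for Genstat, GLIM, or SAS mentioned in the text. The data, variable names, and model form are assumptions made for the example; the estimated dispersion parameter (reported by statsmodels as alpha) plays the role of the overdispersion parameter used in the EB procedure.

```python
import numpy as np
import statsmodels.api as sm

# Simulated data standing in for a reference group of road sections.
rng = np.random.default_rng(0)
aadt = rng.uniform(1000, 20000, size=200)                    # traffic volumes
mu = np.exp(-6.0 + 0.8 * np.log(aadt))                       # assumed "true" mean crashes
crashes = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))   # overdispersed counts

# Negative binomial (NB2) regression: Var = mu + alpha * mu^2.
X = sm.add_constant(np.log(aadt))                            # exp(b0 + b1 * ln(AADT)) form
results = sm.NegativeBinomial(crashes, X).fit(disp=0)

print(results.params)   # intercept, coefficient on ln(AADT), and alpha
# alpha is the overdispersion parameter that feeds the EB weight
# w = 1 / (1 + alpha * predicted) discussed earlier in this chapter.
```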

Other Methods Relevant to Highway Safety Analysis

The focus of this Synthesis is on what, on the basis of the survey results, can be regarded as mainstream highway safety analysis. This relates mainly to safety estimation for the purpose of identifying hazardous locations, and to the development and evaluation of engineering treatments. The survey responses and the literature review suggest that other types of safety analysis are in fact conducted and that some of the promising techniques used may be of interest to safety analysis in general, because several of these are relevant to and have been used in research related to drivers and vehicles. These techniques, which are covered in Appendix E along with a review of a sample of research using them, are: log-linear analysis, contingency table analysis, induced exposure/risk estimation, logit models, ordered probit models, logistic models, meta-analysis, factor analysis, and data imputation.

MAJOR RECENT AND ON-GOING RESEARCH INITIATIVES

This section provides information on current major research-oriented initiatives that are intended to improve the application of statistical methods in highway safety analysis. Five such initiatives are covered.

NCHRP 20-45 Scientific Approaches for Transportation Research

As mentioned earlier, this Synthesis effort was intended to complement the deliverables of NCHRP 20-45. The main product of that project is a manual that is in the form of an Internet resource on statistical methods for those undertaking transportation research in general. A National Highway Institute (NHI) course based on this project was
piloted early in 2001. The information on the manual provided below is taken almost verbatim from the draft version of the website.
The purpose of the manual is to improve the quality of transportation research. It was written in response to a perceived need for a single, comprehensive source of information on the conduct of research. Emphasis has been placed on applied, physical research because this constitutes the dominant research activity of most transportation agencies. Traffic, accident, and safety research are covered in less detail. The manual includes state-of-the-art techniques for problem statement development; literature searching; development of the research work plan; execution of the experiment; data collection, management, quality control, analysis, and interpretation; reporting of results; and evaluation of the effectiveness of the research, as well as the requirements for the systematic, professional, and ethical conduct of transportation research. The recommended practices are based largely on the procedures of the Transportation Research Board. The contents of the manual have been organized into seven chapters and nine appendices. The manual has been written for transportation agency personnel who perform or supervise research, though it will also be useful to people conducting transportation research in other work environments such as universities and consulting. The contents are directed primarily to individuals with a college or university education, but with no formal training in research. The presentation presumes that the reader has completed a basic course in statistics, and is comfortable working with a personal computer.

Overview of Volume I

Chapter One introduces the principles of scientific research and explains the protocols that have evolved for the professional and ethical conduct of research. Chapter Two explains terminology and the principles of scientific investigation. It also discusses barriers to good science, including reasoning that is not logical, lack of proper controls, insufficient repetitions, bias, and the sustaining of unsuccessful projects. Chapter Three describes the research process at the project level and includes sections on problem statement development, project selection, requests for qualifications or proposals, reviewing proposals, development and execution of the work plan, dissemination and implementation of the findings, and evaluation of the research project. Chapter Four discusses the statistical considerations in the design and analysis of research studies. The chapter takes common research problems and suggests the statistical technique suitable for each, the underlying assumptions, and the interpretation of the output from generic statistical computer programs. Issues involved in the collection and management of data are contained in Chapter Five. Answers are provided to issues that arise during the chronological life of a project, i.e., before, during, and after data collection; after data analysis; and after the study is complete. The chapter includes an explanation of the organization of data, records, and files, and a discussion of how to determine the integrity and validity of data. A research study is not complete until a written report has been completed. Chapter Six provides an overview of the organization and content of reports and technical papers. Chapter Seven concludes that a definition of measurable objectives, a formal work plan, rigorous application of established techniques of analysis and interpretation, and the preparation of a complete report of the findings are fundamental steps in any successful research project. The chapter also includes a bookshelf of publications intended to complement the manual.

Overview of Volume II

Volume II is a seamless continuation of Volume I, Principles and Processes to Transportation Research. Volume II consists of six chapters and several appendices. Chapter 1 helps the researcher to identify the empirical setting for which his or her research is being carried out. Chapter 2 enables the researcher to select the appropriate analysis techniques, while Chapters 3-6 present details of various statistical analysis techniques.

Comprehensive Highway Safety Improvement Model (CHSIM)

The FHWA is currently undertaking a major effort to develop a Comprehensive Highway Safety Improvement Model (CHSIM). The description here is taken, often verbatim, from text in the July 2000 Request for Proposals for the project. The broad aim of the project is to improve the safety of existing highways. Specifically, the goal is to assist state and local highway agencies by upgrading the highway safety improvement programs they manage through the development and implementation of a set of innovative, analytical tools designed to guide the process of allocating resources. The focus is on the remedial applications of highway safety improvement programs, which are typically run as a sequential process. There are four main phases in the process: (1) identification, (2) investigation, (3) program implementation, and (4) evaluation. The particular steps of the process are: (1) identifying hazardous locations, (2) diagnosing problems at these locations, (3) selecting countermeasures, (4) ranking priorities/economic appraisal, (5) programming and implementing projects, and (6) evaluating projects. This effort will develop
analytical tools to improve the process for steps 1 through 4, for which the fundamental objective is to allocate resources to achieve the greatest safety benefits. Engineering countermeasures are the primary focus of the project. The promise of this effort is that the CHSIM will achieve significant safety gains, because highway agencies will be relying on better analytical tools than they are currently using to guide their safety investment strategy. The plan is for the CHSIM to be available around 2005 in the form of a software product that can readily be used by state and local highway agencies. It is expected that the aspect of CHSIM that facilitates the identification of hazardous locations (sites with promise) will be based on current research in Colorado. Since 1998, the FHWA and the Colorado DOT have been conducting a cooperative research project titled Implementation of a New Methodology for Identifying and Ranking of Locations with Potential for Accident Reduction. The main focus is on developing and implementing advanced statistical methods for identifying sites with promise (locations that hold promise for accident reduction). How much promise they hold is established during a detailed engineering investigation phase of CHSIM. The Colorado research effort recognizes that the overriding aim of the highway safety improvement process is to spend money where it achieves the greatest effect in terms of accident frequency and severity reduction. The implication is that money will tend to go to sites where there are many severe accidents or where the potential accident reduction is large, and not to sites where accidents are few but the accident rate is high because of low traffic volumes. Given these considerations, the research is exploring the practicality of ranking locations for investigation using the prospective cost-effectiveness of potential safety treatments. The most important product of the Colorado effort that will feed the development of the CHSIM will be a software package that implements advanced statistical methods for identifying sites with promise. The expected delivery date of this software package is October 2001.

IHSDM Research on Accident Prediction Methodology

The FHWA is developing an extensive tool known as IHSDM (Interactive Highway Safety Design Model) (Paniati and True 1996) for designing or redesigning highways. An integral part of IHSDM is a safety analysis module that allows the analyst to examine the safety implications of design decisions. Considerable background research has been conducted in recent years on developing accident prediction models and accident modification factors that are the fundamental ingredients of the safety analysis module in IHSDM. How these ingredients will be used is contained in a recent FHWA report (Harwood et al. 2000). The report provides an algorithm for predicting the expected number of accidents for road segments and intersections on two-lane rural roads. The expected number of accidents is first estimated for a set of base conditions using crash models developed for two-lane road sections, signalized intersections, and for three- and four-legged Stop controlled intersections. AMFs are then applied for elements that vary from the base condition to estimate the expected number of accidents at an intersection or road section of interest. The resulting estimate can be used in the EB procedure for situations where the accident history is known. The following is a simplified example that is intended to demonstrate the potential of the algorithm. For specific instructions on applying the methodology, readers should refer to Harwood et al. (2000) and to the IHSDM software documentation when that is released.

SIMPLIFIED ILLUSTRATION OF THE IHSDM ACCIDENT PREDICTION ALGORITHM

Consider a four-legged Stop controlled intersection, for which the full crash model is:

Accidents/year = exp(-9.34 + 0.60 ln(Major Road ADT) + 0.61 ln(Minor Road ADT) + 0.13(ND) - 0.0054(SKEW))

where ND is the number of driveways within 76 m of the intersection on the major road and SKEW is the intersection skew angle (= 0 for right-angle intersections). The base condition is no driveways, adequate sight distance, no turn lanes, and no skew. For this condition, the base model is

Accidents/year = exp(-9.34 + 0.60 ln(Major Road ADT) + 0.61 ln(Minor Road ADT))

AMFs assembled by a team of experts are then used to adjust the base model prediction to account for the effects of skew angle, traffic control, exclusive left- and right-turn lanes, and sight distance at a specific intersection. For example, assuming the simplification of no agency-specific adjustment, an intersection with a major road AADT of 8,000 and a minor road AADT of 1,000 has a predicted safety performance for the base condition of

Accidents/year = exp(-9.34 + 0.60 ln(8000) + 0.61 ln(1000)) = 1.34

Suppose an intersection differs from the base condition as follows: A left-turn lane is present on one of the major
road approaches; for this an AMF of 0.76 has been prescribed by the panel of experts. In addition, sight distance is limited in one quadrant of the intersection; for this, the prescribed AMF is 1.05, that is, accidents are increased by 5 percent. All other characteristics are in accordance with the base conditions: there are no exclusive right-turn lanes and there is no skew. The predicted safety performance for the actual conditions is obtained by simply multiplying the base condition estimate by the AMFs:

Accidents/year = 1.34 × 0.76 × 1.05 = 1.06

This would be our best estimate of the safety performance of this intersection in the absence of any accident history data. If we had such data, we could do better by using the EB procedure. Suppose the intersection actually recorded 5 accidents in the past 3 years. The expected accident frequency (E), considering both the model prediction and the observed frequency, is given by

E = w(Np) + (1 - w)O

where Np is the predicted frequency over a period of length equal to that of the observed accident count; O is the observed count; the weight w is 1/(1 + kNp); and k is an overdispersion parameter derived in the model calibration. For four-legged Stop controlled intersections, the value of k is 0.24. Thus, Np = 3 × 1.06 = 3.18, w = 0.57, and

E = 0.57(3.18) + 0.43(5) = 3.96 accidents in 3 years

Note that this value is between the 5 accidents observed and the 3.18 in 3 years (1.06/year) predicted without a consideration of the actual accident experience. This is because the refined estimate of 3.96 accidents in 3 years is a weighted average of the 5 accidents observed and the 3.18 accidents predicted strictly on the basis of the traffic and design characteristics. Now suppose that the left-turn lane is being considered for removal to accommodate a redesign. Recall that the AMF for installing a left-turn lane is 0.76. The AMF for removing a left-turn lane is logically the inverse of that value, or 1.32. The safety consequence of removing the left-turn lane can then be estimated as

(1.32 × E) - E = (1.32 × 3.96) - 3.96 = 1.27 accidents in 3 years, or 0.42 accidents per year

It is stressed that this example is simplified. Of course, consideration will have to be given to accident severity and to traffic volume changes; estimates of uncertainty should be provided as well. Also, adjustment of the base model for application in a specific jurisdiction will usually be necessary. Guidance on these aspects is provided in Harwood et al. (2000).
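The worked example lends itself to a few lines of code. The Python sketch below simply re-implements the calculation above, using the coefficients, AMFs, and overdispersion parameter quoted for four-legged Stop controlled intersections; the function names are illustrative, and small differences from the rounded figures in the text (1.06, 3.96, 1.27) are to be expected because the sketch does not round intermediate results.

```python
import math

def base_prediction(major_adt, minor_adt):
    """Base-condition model for four-legged Stop controlled intersections."""
    return math.exp(-9.34 + 0.60 * math.log(major_adt) + 0.61 * math.log(minor_adt))

def predicted_per_year(major_adt, minor_adt, amfs=()):
    """Adjust the base prediction by multiplying the applicable AMFs."""
    n = base_prediction(major_adt, minor_adt)
    for amf in amfs:
        n *= amf
    return n

def eb_expected(per_year, years, observed, k=0.24):
    """EB estimate combining the model prediction with the observed count."""
    np_ = per_year * years
    w = 1.0 / (1.0 + k * np_)
    return w * np_ + (1.0 - w) * observed

# Left-turn lane on one major approach (AMF 0.76), limited sight distance (AMF 1.05).
per_year = predicted_per_year(8000, 1000, amfs=(0.76, 1.05))
expected_3yr = eb_expected(per_year, years=3, observed=5)

# Safety consequence of removing the left-turn lane (AMF taken as 1 / 0.76).
change_3yr = (1.0 / 0.76) * expected_3yr - expected_3yr
print(round(per_year, 2), round(expected_3yr, 2), round(change_3yr, 2))
```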
Table 1 identifies the variables in the base models for intersections, along with values (in parentheses) to be assumed for the base conditions. Information about AMFs and base models for the two intersection types shown in the last two columns is preliminary and was derived from a 2000 draft FHWA report, Accident Modification Factors for Two Lane and Multi-lane Facilities. No AMFs are provided for the number of driveways, grade rate, or roadside hazard rating. Presumably, the effects of these variables can, at least initially, be estimated from the models. The models are also used to estimate AMFs for skew angles at three- and four-legged Stop controlled intersections of two-lane roads. (For the three-legged Stop controlled intersections, an alternative model was used because this variable was not significant in the final base model; for signalized intersections, skew angle is thought to have an insignificant effect; for intersections with multilane major roads, the AMF for skew angle is under development.) Similarly, the models would be used to estimate the effect of changing the number of legs or changing from Stop to Signal control.

TABLE 1 VARIABLES (AND BASE CONDITION VALUES) FOR FIVE INTERSECTION CRASH MODELS

Highway Safety Manual NCHRP 17-18(4)

NCHRP is currently undertaking the development of a Highway Safety Manual (HSM), which will be similar in principle to the Highway Capacity Manual. The purpose of the HSM, according to the July 2000 problem statement, will be to provide the best factual information and tools, in a useful and widely accepted form, to facilitate roadway design and operational decisions based upon explicit consideration of their safety consequences. Other verbatim excerpts from the July 2000 Problem Statement for NCHRP 17-18(4) follow.
There is a significant opportunity for improving the explicit role of highway safety in making decisions on roadway design and operations. Improved, low-cost technologies have encouraged many state Departments of Transportation and other agencies to develop systems to deliver better safety information. In addition, there has been a parallel advancement in the science of safety impact prediction. Better understanding of the statistical nature of crashes, coupled with new analytical tools, makes it possible to produce more valid estimates of the effect of geometric and operational changes on the frequency and severity of crashes. The American Association of State Highway and Transportation Officials has developed a strategic highway safety plan that includes 22 emphasis areas containing a number of countermeasures designed to quickly reduce fatalities on our nation's roads. Two of the initiatives address safety information and management of the highway safety system. A key strategy for these initiatives involves improving safety information systems for better decision support. Furthermore, the move toward context-sensitive design approaches has put additional pressure on state and other agencies to develop the means and tools for making design decisions that may involve exceptions to existing criteria. The safety impacts of such decisions should be explicitly considered. Recent legislative requirements for improving safety data and the use of safety as an explicit criterion in planning and designing transport facilities have created needs within many agencies for improved tools and techniques for safety analysis.


Although there have been substantial investments in research and development on highway safety related to the roadway environment [e.g., Federal Highway Administration's (FHWA) program to develop the Interactive Highway Safety Design Model], there is no commonly accepted, fully integrated approach for safety analysis of designs. Hence, safety may not be incorporated in the most effective manner. In December 1999, a workshop was held, under sponsorship of eight Transportation Research Board (TRB) committees funded by FHWA, for the purpose of determining the need for, nature of, and feasibility of producing a Highway Safety Manual (HSM). A group of about 25 researchers and practitioners participated in the workshop and concluded that there was definitely a need for such a technology transfer activity and that work should begin as soon as possible on the development of an HSM. The results of the workshop will be documented in a TRB Research Circular.

Crash Outcome Data Evaluation System (CODES)

The following information is taken from a recent NHTSA report (Finison 2000) and from the CODES website: http://www.nhtsa.dot.gov/people/ncsa/codes/CODESindex.htm. CODES is a collaborative approach, led by NHTSA, designed to generate medical and financial outcome information relating to motor vehicle crashes and to use these outcome-based data as the basis for decisions related to highway traffic safety. This effort is facilitated by the linking of information collected by police on crash reports to databases that contain more detailed medical information than police are qualified to report. Since 1993, CODES has expanded to include about one-half of the states in the United States. In recent years, efforts to standardize the reporting effort have intensified. Standardized reporting is expected to better facilitate between-state comparisons, to simplify and foster dissemination of data within states, to target specific areas for planning and research, and to promote a national report of CODES outcome data. The relevance of CODES to this Synthesis is that the availability of these data affects both the types of highway safety analysis that can be conducted and the methods used in these analyses. For example, one of the first uses of the linked data was to compare those using and not using safety belts or motorcycle helmets by identifying and contrasting the characteristics of the injured and uninjured persons within each of the restraint use groups. More recently, Finison and DuBrow (1998) used the Maine CODES data to study ran-off-road crashes. The study was confined to crashes occurring on dry roads because a previous analysis of CODES data had shown that ran-off-road crashes under dry conditions accounted for 79 percent of hospital charges for ran-off-road crashes, but only 35 percent of the drivers involved in these crashes.


CHAPTER THREE

STATE OF PRACTICE
A major task in this synthesis effort was a survey of state and provincial jurisdictions in the United States and Canada. The purpose was to seek out details on how safety analyses are conducted in highway agencies in order to identify crucial issues and needs. The survey questionnaire is shown in Appendix A. A detailed summary of the survey responses is presented in Appendix B. This chapter discusses these results, presenting examples of the types of safety analyses conducted. The survey results, seen in the context of the state of research presented in chapter 2, provide insights into the needs for improving the state of practice. These needs are addressed in chapter 4. The survey was sent to all 50 state agencies in the United States and to the 11 provincial jurisdictions in Canada. Twenty-seven states and five provinces responded, with more than one response coming from one state. Six states provided examples of highway safety analyses conducted. The survey consisted of three parts. Part I sought general information on the jurisdiction size and the types of safety analyses conducted in the past 5 years using collision data/models. Part II dealt with the details of these safety analyses, seeking information on how before and after evaluations are conducted, how collision modification factors are developed, and how high hazard locations are identified. Respondents were also invited to provide details of research projects they have recently undertaken. Part III sought information on the problems encountered in highway safety analyses and how these are dealt with. The intent was to identify the barriers to the successful conduct of safety analysis. The survey results are discussed below for each of the three parts.

SURVEY RESULTS PART I: GENERAL INFORMATION

The ranges and averages of population size and road mileage for those jurisdictions that provided this information are:

Average population = 4,639,306
Median population = 3,387,035
Range of population = 7,471 to 24,000,000
Average road mileage = 34,652
Median road mileage = 17,985
Range of road mileage = 3,879 to 296,614

Seven respondents did not provide population information and three did not provide road mileage information. From the survey, it was learned that there are several common types of analyses undertaken in jurisdictions. These are listed below, followed by the percentage of jurisdictions reporting that they conduct that type of analysis:

Before and after evaluations (94%)
Identification of hazardous locations (100%)
Cost-benefit analyses in development of countermeasures (85%)
Analysis of collision trends (85%)
Collision rate comparisons of locations with different features (76%)
Cross-sectional evaluations (27%)
Comparison group evaluations (27%)
Risk estimation/analyses/evaluations (18%)

Twenty-six of the 32 respondents reported that safety analysis was mostly conducted in-house. The remaining six respondents indicated that outside consultants were frequently used in addition to in-house resources.

SURVEY RESULTS PART II: DETAILS OF SAFETY ANALYSES UNDERTAKEN IN THE PAST 5 YEARS

Before and After Evaluations of Countermeasures

All respondents reported the use of historical collision frequency and/or collision rates to perform simple before and after studies. These studies may be done for accidents as a whole or for collisions with a particular characteristic, i.e., contributing factors, weather conditions, etc. In addition, most jurisdictions use collision diagrams in the analysis. The North Carolina, California, and New York DOTs and Saskatchewan Highways and Transportation also reported using EB methods. Few respondents reported the use of comparison groups to account for jurisdiction-wide changes in accident experience. Some respondents reported the use of significance tests, including the t-test, chi-square test, and F-test (13 respondents); collision prediction regression models (8); logistic regression models (2); log-odds ratio methods (2); time-series analysis (7); sampling techniques (4); and traffic conflict techniques (5). The length of before and after periods typically ranges from 1 to 10 years. Most jurisdictions use a minimum of 2 years of data for the before period and 3 years for the after period.
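For readers unfamiliar with these studies, the sketch below illustrates, in Python, the two simplest calculations respondents describe: a naive before and after percent change, and the same comparison adjusted by a comparison group. It is a generic illustration assumed for this synthesis rather than any jurisdiction's documented procedure, and, as discussed in chapter 2, neither calculation accounts for regression to the mean.

```python
def percent_change(before, after):
    """Simple before and after comparison: percent change in accident counts."""
    return 100.0 * (after - before) / before

def comparison_adjusted_change(t_before, t_after, c_before, c_after):
    """Comparison-group adjustment: the comparison sites' before-to-after ratio
    estimates what the treatment sites would have experienced without treatment."""
    expected_after = t_before * (c_after / c_before)
    return 100.0 * (t_after - expected_after) / expected_after

# Illustrative counts only.
print(round(percent_change(before=120, after=96), 1))              # -20.0
print(round(comparison_adjusted_change(120, 96, 400, 380), 1))     # about -15.8
```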

Development of Collision Reduction Factors for Countermeasures

The most frequently reported sources of information on the effectiveness of countermeasures were literature reviews and before and after evaluations completed within the respondent's jurisdiction. Eleven jurisdictions reported comparing the collision experience at locations with and without the feature of interest. Two respondents reported performing value engineering exercises. The Colorado and North Carolina DOTs are using or planning to apply regression models to determine reduction factors. The California DOT reported recently updating accident reduction factors for five improvement types. The research is documented in a recent publication (Hanley et al. 2000) that describes an EB before and after study to evaluate the effects of five treatments: rumble strips, shoulder widening, superelevation correction, curve correction, and wet pavement treatments. The analysis was facilitated by a seldom-used software package, BEATS (Bayesian Estimation of Accidents in Transportation Studies), developed by FHWA (Pendleton 1991).

Identification of High Collision Locations

Jurisdictions typically identify hazardous locations on an annual basis. All jurisdictions indicated the use of collision frequency, collision rate, or a combination of the two for identifying hazardous locations. Eleven jurisdictions reported the use of a hazard index that is typically based on collision frequency/rate and sometimes severity. The use of a combined ranking index seems to implicitly recognize the problems in using collision frequency or rate alone. Most approaches have some statistical basis, recognizing the random nature of accident counts. Some specifics are included in the following section. The California DOT reported using the EB technique and a technique that lists locations with an accident frequency higher than the Poisson distributed mean for that class of road. The Maine DOT reported the use of a hazard index that uses statewide crash rates, calculated for the various road classifications on road sections and at intersections. The Maine DOT uses a Critical Rate, calculated by applying a standard deviation, and this rate is divided by the appropriate statewide average crash rate, yielding a Critical Rate Factor. Locations with a Critical Rate Factor of 1.0 or greater and that have experienced a minimum of eight crashes in the most recent 3-year period are considered High Crash Locations (HCLs). The Tennessee DOT uses a hazard index that is calculated by dividing the number of fatal and injury collisions by the total number of collisions. This index is used to rank the hazardous sites that were identified using a combination of collision rate and frequency. The Nova Scotia Department of Transportation and Public Works reported the screening of highway sections by comparing the 5-year moving average accident rate to the 5-year moving average for all highway sections in the province for the same highway class. Ohio has an interesting approach that explicitly recognizes that all ranking methods have pros and cons depending on one's objectives and that, therefore, no single ranking method should be used; Ohio uses six. The following text describes this approach and is taken from documentation provided by the Ohio DOT; a generic sketch of the weighted ranking calculation follows the quoted description.

The High Hazard Location System (HSP) is a flexible software based system for identifying high hazard locations. It allows the user to specify minimum section length, crash count thresholds, time period, crash types, as well as many other input selection criteria. It also permits the user to control in detail the rules for selecting and ranking the list of high hazard locations and sections. Three years of crash data are merged with current signal, volume and road inventory data files, thereby associating each location with its operational characteristics. Intersection and intersection-related crashes are examined to ensure each crash is identified with the correct priority roadway, cross-road name and logpoint. HSP first reduces the number of locations by comparing the number of crashes occurring at both intersection and section locations with user-prescribed threshold values for frequency, creating pre-candidate locations. HSP calculates the following values for each pre-candidate location: crash, crash rate, delta-change (change in the number of crashes over time), equivalent property damage only (EPDO), equivalent property damage only rate (EPDO rate), relative severity index (RSI) and density. At least one of these calculated values must meet or exceed the threshold applicable for its matching criteria in order to remain as a candidate location. HSP then determines each location's rank with respect to each categorical value. HSP uses the hazard index method to determine overall ranking. It calculates a priority index for each location. The user can specify any of the six ranking methods to be included as factors for the priority index and give each selected method any weighted value. The rank at each location for each method selected is multiplied by its corresponding weight. Those products are then summed, giving the priority index value for that location. The resulting priority index values of all locations are then sorted in ascending order, giving HSP's hazard index rank for all location candidates.
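The weighted ranking step in the quoted description can be written generically as follows. This sketch is an assumption-laden illustration of the calculation (the method names, weights, and ranks are invented for the example), not code from the Ohio DOT's HSP software.

```python
def priority_index(ranks_by_location, weights):
    """For each location, multiply its rank under each selected method by that
    method's weight and sum the products; lower index means higher priority."""
    return {
        loc: sum(weights[method] * rank for method, rank in ranks.items())
        for loc, ranks in ranks_by_location.items()
    }

# Illustrative ranks for three candidate locations under three of the methods.
ranks = {
    "loc 1": {"frequency": 1, "crash rate": 4, "EPDO": 2},
    "loc 2": {"frequency": 2, "crash rate": 1, "EPDO": 1},
    "loc 3": {"frequency": 3, "crash rate": 2, "EPDO": 3},
}
weights = {"frequency": 2.0, "crash rate": 1.0, "EPDO": 1.0}

index = priority_index(ranks, weights)
for loc in sorted(index, key=index.get):      # ascending order, as in HSP
    print(loc, index[loc])
```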

Research Projects

Respondents were asked to rank subject areas of safety research for the level of effort expended by their jurisdiction. The results are tabulated in Table 2. It should be noted that not all respondents rated each topic; thus, some table cells contain zeros and the summations of columns do not equal the total number of questionnaire responses received. The survey indicates that jurisdictions place most emphasis on research on the safety effect of countermeasures and the development of procedures for safety analysis, followed by the identification of high risk patterns and issues. Developing state-of-the-art reports on safety knowledge and multivariate models receive the least research emphasis.

TABLE 2 RANKING OF RESEARCH ACTIVITIES

Research Activity                                               Number of rankings*
                                                                No. 1  No. 2  No. 3  No. 4  No. 5
Safety effect of countermeasures                                   9      7      7      3      1
Development of procedures for safety analysis                     10      3      7      2      0
Identification of high risk road travel patterns and issues        7      5      2      4      3
State-of-the-art reports on safety knowledge                        3      6      7      3      2
Development of multivariate models                                  1      1      2      5      7

*High level of effort is ranked 1 and low level of effort is ranked 5.

Few jurisdictions reported on the statistical tools used in research projects. Those tools that were reported included: Morin's Upper Control Limit (Maryland DOT), Analysis of variance, Sampling techniques, Poisson and negative binomial regression, Logistic regression, Ordinary least-squares regression, Weighted least-squares regression, Likelihood functions, and EB procedure.

Respondents frequently reported FHWA and NCHRP documents as safety resources, although few were specific. Resources cited are listed here:

Observational Before-After Studies in Road Safety (Hauer 1997).
FHWA report The Cost of Highway Crashes (Muller et al. 1991), Office of Safety and Traffic Operations.
FHWA-RD-99-53, Assessment of Techniques for Cost Effectiveness of Highway Accident Countermeasures.
NCHRP Report 162, Methods for Evaluating Highway Safety Improvements.
Institute of Transportation Engineers, Traffic Safety Toolbox (ITE 1999).
Safety Cost Effectiveness of Incremental Change in Cross-section Design and Crash Models for Rural Intersections (Montana).
Transportation Association of Canada, Safety Analysis of Roadway Geometry and Ancillary Features, 1997.
Interactive Highway Safety Design Model preliminary materials.
SURVEY RESULTS PART III: PROBLEMS/ISSUES IN SAFETY ANALYSES

Underreporting of Collisions

The minimum criterion for reporting an accident is that an injury occurs or that the property damage exceeds a threshold value. The threshold values reported by respondents ranged from $150 in Ohio to $1,400 in Delaware. The threshold value of damage was reported to have increased in the early 1990s in several jurisdictions. Of the responses, 19 did not allow for self-reporting of accidents and 14 did allow self-reporting for property-damage-only accidents. Most jurisdictions reported minimal difficulties arising from the underreporting of collisions and assume reporting levels to be constant. The Texas DOT reported using only injury and fatal accidents to make temporal comparisons. The Virginia and California DOTs and the Province of Ontario report that they frequently follow this practice.

Time Trends in Collision Experience

Few respondents identified time trends that affect collision experience. Those trends that were identified included speed limit legislation, improved vehicle safety, the increased use of seat belts, graduated licensing, enforcement practices, emergency response, and physical changes between urban/suburban and rural environments. The Ministry of Transportation of Ontario reported that the transfer of provincial highways to local governments creates issues with highway classification levels and how network screening is undertaken. Most respondents recognize that time trends do affect before and after evaluations, identification of hazardous locations, and the development of collision reduction factors, but they typically do not take this into account in the analyses other than by using as large a dataset as possible. This is a good illustration of the gap between the state of research and the state of practice.
Changes in Traffic Volumes in Before and After Studies

Traffic volumes for the before and after periods in before and after studies are typically available, and most jurisdictions use accident rates (accidents/traffic volume) to account for changes in traffic volumes between the two time periods. The North Carolina, Colorado, and New Jersey DOTs reported making adjustments based on collision prediction models but did not elaborate.
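A minimal sketch of the rate calculation most respondents describe is given below; the per-million-vehicle-miles formulation is one standard convention assumed for the example, and the counts and volumes are invented.

```python
def crashes_per_million_vehicle_miles(crashes, aadt, length_miles, years):
    """Segment crash rate: crashes per million vehicle miles of travel."""
    vehicle_miles = aadt * 365.0 * years * length_miles
    return crashes * 1.0e6 / vehicle_miles

# Before and after rates for a 2-mile segment whose traffic volume grew.
before = crashes_per_million_vehicle_miles(18, aadt=9000, length_miles=2.0, years=3)
after = crashes_per_million_vehicle_miles(15, aadt=11000, length_miles=2.0, years=3)
print(round(before, 2), round(after, 2))
```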
Regression to the Mean

Twenty-eight of 33 respondents were aware of regression to the mean (RTM). Nineteen reported conducting studies in which RTM is a factor. Of those aware of this phenomenon, several attempt to account for it by using many years of data and/or a comparison group. The Nebraska Department of Roads reported the use of significance tests. Interestingly, several respondents who are aware of RTM reported that accounting for this bias is not relevant to the analyses that they undertake. These responses serve to emphasize the point that the phenomenon, and methods for accounting for it, are perhaps not well understood. It is encouraging, however, that the California DOT and the Quebec Ministry of Transport reported using the EB procedure that is detailed in Observational Before-After Studies in Road Safety (Hauer 1997).
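
For readers unfamiliar with the procedure, the EB estimate referred to here is, in its usual form, a weighted average of the accident count observed at a site and the count predicted for similar sites by a safety performance function. The sketch below assumes a negative binomial model with overdispersion parameter $k$ (variance $\mu + \mu^{2}/k$); the symbols are introduced for illustration only:

\[
\hat{\kappa} = w\,\mu + (1 - w)\,x,
\qquad
w = \frac{1}{1 + \mu/k},
\]

where $x$ is the accident count observed at the site over the study period, $\mu$ is the count predicted by the safety performance function for that period, and $\hat{\kappa}$ is the EB estimate of the site's expected accident frequency. Because $0 < w < 1$, an unusually high count is pulled back toward the mean of similar sites, which is precisely the correction for RTM that simple before and after comparisons and accident-count rankings lack.
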
Adequacy of Traffic Volume/Exposure Data

Respondents reported that traffic volume is typically used in before and after evaluations of countermeasures, identification of hazardous locations, and for calculating accident rates. The New York, North Carolina, and Virginia DOTs reported using traffic volumes for risk estimation as well. The North Carolina DOT, Saskatchewan Highways and Transportation, and the Ministry of Transportation of Ontario reported using traffic volume for the development and application of accident prediction models. On a scale of 1 to 5, with 1 representing high quality, traffic volume data were typically rated from 1 to 3. In addition to traffic volumes, 10 jurisdictions reported using the number of vehicles and drivers registered as exposure data. Saskatchewan Highways and Transportation and the Quebec Ministry of Transport reported the use of truck permits as an exposure variable.

Identifying Comparison Sites in Before and After Studies

Typically, a guess is made as to how many sites are needed, or as many comparison sites as possible are used. The Virginia DOT reported using a statistical method for determining the required number of comparison sites; this method compares the traffic volume, crash history, roadway characteristics, and geographic location of the sites. The North Carolina and California DOTs reported using statistical tests such as chi-square and the odds ratio to test the comparability of comparison groups.
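
As an illustration of the kind of comparability check these agencies describe, a chi-square test can be applied to before-period accident counts cross-classified by severity (or by year) for the treatment and comparison groups. The sketch below is a minimal example with invented counts; it is not drawn from any respondent's data or documented procedure.

# Hypothetical comparability check: do treatment and comparison sites
# have a similar severity mix of accidents in the before period?
from scipy.stats import chi2_contingency

# Rows: treatment group, comparison group (illustrative counts only)
# Columns: injury/fatal accidents, property-damage-only accidents
counts = [[34, 120],
          [41, 155]]

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
# A small p-value signals that the groups differ before treatment;
# a large p-value is consistent with, but does not prove, comparability.

A parallel check based on the odds ratio asks whether the ratio of the two groups' severity odds is close to 1.
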
Information on Safety Effectiveness in Developing Countermeasures

Most respondents reported having an established list of collision reduction factors for use. Some jurisdictions use solely outside sources, some use only information from before and after studies conducted within their jurisdiction, whereas others use a combination of the two. The confidence level was subjectively rated by respondents on a scale of 1 to 5, with 1 being very confident in the reduction factors. Most respondents rated their information as 2 or 3.

Appropriate Skills and Resources

Most respondents reported that their safety analyst personnel have a Bachelor's degree with training in statistics or a related mathematical field. Presumably this training would come from the undergraduate statistics course that is typically part of Bachelor of Engineering programs. Asked whether a lack of appropriately skilled personnel has hampered their ability to conduct safety studies, 18 respondents indicated that it has and 16 indicated no additional needs. Asked whether a lack of readily available information on the proper conduct of these studies hampers their analyses, 16 indicated that there is an information gap and 17 indicated otherwise.

Ability to Link Collision and Related Databases

Jurisdictions were questioned on their ability to link collision, traffic volume, and geometric databases at interchanges, intersections, and road sections. Most respondents reported some linking of data, although often not for all location types and all three data types. Maine, Minnesota, Nebraska, New York, Oklahoma, Maryland, and West Virginia reported being a part of a CODES project using hospital/Emergency Medical Services data as well.

Ranking of Issues

Respondents were asked to examine the nine issues in safety analysis and rank the top three in terms of how critical they are for enhancing highway safety analysis in their jurisdiction. The results are tabulated in Table 3. The most frequently cited issues were appropriate skills and resources, linking of collision and related databases, and underreporting of collisions.

TABLE 3
RANKING OF CRITICAL ISSUES IN SAFETY ANALYSIS

Issue (No. 1 / No. 2 / No. 3 rankings):
Appropriate skills, resources to conduct highway safety analysis: 12 / 4 / 7
Ability to link collision and related databases: 9 / 3 / 7
Underreporting of collisions: 7 / 2 / 1
Information on safety effectiveness in developing countermeasures: 4 / 8 / 5
Adequacy of traffic volume/exposure data: 0 / 8 / 4
Regression to the mean: 1 / 0 / 4
Identifying comparison sites in before and after studies: 0 / 3 / 0
Time trends in collision experience: 0 / 2 / 1
Changes in traffic volume in before and after studies: 0 / 1 / 2


CHAPTER FOUR

CONCLUSIONS AND RECOMMENDATIONS


The importance of highway safety analysis is evidenced by the ever-growing body of literature on safety studies and the array of analytical tasks performed in state and local jurisdictions. Indeed, all jurisdictions responding to the survey have in place formal and informal highway safety improvement programs that require analysis of accident and related data. However, the level of knowledge of the special methodologies that are required to accommodate the peculiarities of highway safety data is not as high as might be desired. This is mainly because many of the difficulties in highway safety analysis have only been brought to light in the past 20 years or so. That research on methodology for addressing these difficulties is ongoing is evidence of the complexity of highway safety analysis and emphasizes the need for analysts to constantly refresh their knowledge.

The novelty of the special methods of analysis generated by relatively recent safety research has of necessity created a gap between research and practice, the narrowing of which requires a strong commitment on the part of those responsible for administering and conducting highway safety analysis.

The most fundamental safety analyses conducted by jurisdictions relate to the identification of sites with promise, the development and prioritization of improvements, and the evaluation of treatments. Although there are identifiable gaps between research and practice in these aspects, it is encouraging that several jurisdictions are up to speed, so to speak, on the complexity of highway safety analysis. Particularly encouraging are the following positive features:

- Most jurisdictions are aware of the regression to the mean (RTM) problem and the associated difficulties caused by random fluctuation in accident counts.
- Most jurisdictions recognize the peculiarities of highway safety data and the need for special analytical methods to accommodate them. These peculiarities include poor quality of accident and traffic volume data, random fluctuation in accident data, the regression to the mean phenomenon, and accident reporting differences across time and space.
- A few jurisdictions are starting to use advanced methods such as the empirical Bayes (EB) procedure and other analyses requiring the use of safety performance functions.
- Most jurisdictions have in place procedures for identifying hazardous locations that recognize the difficulties caused by the RTM phenomenon in using accident counts or accident rates alone for this purpose.
- Jurisdictions are conscious of the need for maintaining quality accident data and are constantly making efforts to improve the data collection process. To this end, several jurisdictions have in place or are developing a facility to easily link accident, traffic, and inventory data to create databases that would enable the application of the most advanced methods of highway safety analysis.

Despite these positive aspects, much remains to be done to improve the state of practice through the use of the best available statistical methods. Areas where improvements could be made include the following.

Before and after evaluations:
- The selection of appropriate comparison groups;
- The specification and interpretation of uncertainty in results;
- The separation of effects due to the measure being evaluated from those due to other measures, traffic volume changes, changes in accident reporting practice, and other temporal changes; and
- The use of techniques such as the EB methodology to account for RTM.

Other analyses:
- The discontinuation of accident rate-based procedures and the adoption of more efficient techniques to minimize false positives and false negatives in the identification of sites for safety investigation,
- The use of accident modification factors based on sound evaluations and the application of safety performance functions in estimating the safety consequences of countermeasures and design decisions, and
- The careful use of information from cross-sectional studies to establish the safety consequences of countermeasures and design changes.

Accomplishing these improvements requires the availability of more reliable data, a commitment to provide analysts with the knowledge and resources to use the best available statistical methodology, and additional research. It is hoped that this Synthesis will go a long way toward providing jurisdictions with the knowledge of what it takes to bridge the gaps that exist between the current states of research and practice. Additional research needs to be conducted in the following areas:

- The development of accident modification factors from both before and after evaluations and cross-sectional studies;
- Simplification of advanced methodology for highway safety analysis, particularly the methods for conducting before and after studies;
- Methods for deciding when and where specific safety improvements are warranted;
- The development of safety performance functions, including methods for transferring them across jurisdictions;
- The development of the most efficient methods for identifying sites for safety investigation; and
- The development of user-friendly software that would facilitate the application of the best available methods for highway safety analysis.

There are currently significant initiatives aimed at fulfilling these outstanding research needs and at bridging the gaps between research and practice. Three new initiatives, which will provide highway agencies with the best available tools to conduct highway safety analyses, are NCHRP's Highway Safety Manual (HSM), FHWA's Comprehensive Highway Safety Improvement Model (CHSIM), and ongoing Interactive Highway Safety Design Model (IHSDM) research on accident prediction methodology. It should be noted that a major component of the CHSIM initiative is the accommodation of the training needs that it will of necessity create. Delivery of the products of these current initiatives is still some time in the future. In the meantime, it would be beneficial for agencies to undertake preparatory work for implementing the tools that would become available. Otherwise, the gap is likely to widen.

In conclusion, the following are some specific recommendations arising from this synthesis effort:
- Jurisdictions should continue to emphasize the link between the quality of statistical analysis and the quality of data to all those responsible for data collection, from managers to field personnel.
- Formal training and refresher courses in statistical methods should be made available to all those charged with undertaking highway safety analyses. This training should be with specific reference to the special considerations required by the peculiarities of safety-related data.
- Jurisdictions should make a special effort to keep abreast of the considerable current research aimed at facilitating highway safety analysis. These initiatives include FHWA's CHSIM and IHSDM, NCHRP's HSM, and the website being created under NCHRP 20-45: Statistical Approaches to Transportation Research.
- Jurisdictions should undertake institutional activities to create the environment that would facilitate the implementation of the tools that would become available. These activities include training of personnel and addressing any deficiencies regarding data collection and accessibility. Where sufficiently trained personnel are unavailable to undertake valid statistical analyses, qualified consultants should be hired for this task. In addition, the acquisition of knowledge from the safety literature should only be undertaken by those with sufficient background in statistical methods, particularly the pitfalls in applying these methods.


REFERENCES

Bauer, K. and D. Harwood, Statistical Models of At-Grade Intersection Accidents, FHWA-RD-96-125, Federal Highway Administration, Washington, D.C., 1996. Bauer, K. and D. Harwood, Statistical Models of Accidents on Interchange Ramps and Speed Change Lanes, FHWA-RD-97-106, Federal Highway Administration, Washington, D.C., 1998. Bonneson, J. and P. McCoy, Estimation of Safety at TwoWay STOP-Controlled Intersections on Rural Highways, Transportation Research Record 1401, Transportation Research Board, National Research Council, Washington, D.C., 1993, pp. 8389. Breiman, L., J. Freidman, R. Olshen, and C. Stone, Classification and Decision Trees, Wadsworth International Group, Belmont, Calif., 1984. Council, F., Safety Benefits of Spiral Transitions on Horizontal Curves on Two-Lane Rural Roads, Transportation Research Record 1635, Transportation Research Board, National Research Council, Washington, D.C., 1998, pp. 1017. Council, F. and J. Stewart, Safety Effects of Conversion of Rural Two-Lane to Four-Lane Roadways Based on Cross-Sectional Models, Transportation Research Record 1665, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 3543. Davis, G., Accident Reduction Factors and Causal Inference in Traffic Safety Studies: A Review, Accident Analysis and Prevention, Vol. 32, No. 1, pp 95109, 2000. Elvik, R., F. Amundsen, and F. Hofset, Road Safety Effects of Bypasses, Presented at the 80th Annual Meeting of the Transportation Research Board, Washington, D.C., 2001. Finison, K. and R. DuBrow, Analysis of Maine Crashes Involving Vehicles that Ran Off the Road, Maine Health Information Center Report, Manchester, November 1998. Finison, K., Standardized Reporting Using Codes (Crash Outcome Data Evaluation System), NHTSA Report DOT HS 809 048, National Highway Traffic Safety Administration, Washington, D.C., April 2000. Griffin, L. and R. Flowers, A Discussion of Six Procedures for Evaluating Highway Safety Projects, Draft Report for the Federal Highway Administration, Washington, D.C., 1997. Griffith, M., Safety Evaluation of Continuous Rolled-In Rumble Strips Installed on Freeways, Transportation Research Record 1665, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 2834. Hanley, K., A. Gibby, and T. Ferrara, Analysis of Accident Reduction Factors on California State Highways,

Transportation Research Record 1717, Transportation Research Board, National Research Council, Washington, D.C., 2000, pp. 3745. Harwood, D., F. Council, E. Hauer, W. Hughes, and A. Vogt, Prediction of the Expected Safety Performance of Rural Two-Lane Highways, Report FHWA-RD-99-207, Federal Highway Administration, Washington, D.C., 2000. [Online]. Available: http://www.tfhrc.gov/safety/ 99207.htm. Hauer, E., Observational Before-After Studies in Road Safety: Estimating the Effect of Highway and Traffic Engineering Measures on Road Safety, Pergamon Press, Elsevier Science Ltd., Oxford, U.K., 1997. Hauer, E. and B. Persaud, A Common Bias in Before-andAfter Comparisons and Its Elimination, Transportation Research Record 905, Transportation Research Board, National Research Council, Washington, D.C., 1983, pp. 164174. Hauer, E. and B. Persaud, Safety Analysis of Roadway Geometry and Ancillary Features, Transportation Association of Canada Research Report, Ottawa, 1996. Hauer, E. and J. Bamfo, Two Tools for Finding What Function Links the Dependent Variable to the Explanatory Variables, Proceedings of the ICTCT 1997 Conference, Lund, Sweden, 1997. Hingson, R., T. Hereen, and M. Winter, Lowering State Legal Blood Alcohol Limits to 0.08%: The Effect on Fatal Motor Vehicle Crashes, American Journal of Public Health, Vol. 86, No. 9, 1996, pp. 12971299. Hunter, W., J. Stutts, W. Pein, and C. Cox, Pedestrian and Bicycle Crash Types of the Early 1990's. FHWA-RD-95163, Federal Highway Administration, Washington, D.C., 1996. Institute of Transportation Engineers, Traffic Safety Toolbox, ITE, Washington, D.C., 1999. Kulmala, R., Safety at Three- and Four-Arm Junctions: Development and Applications of Accident Prediction Models, VTT Publication 233, Technical Research Center of Finland, Espoo, 1995. Lord, D., The Prediction of Accidents on Digital Networks: Characteristics and Issues Related to the Application of Accident Prediction Models, Ph.D. thesis, Department of Civil Engineering, University of Toronto, 2000. McCullagh, P. and J. Nelder, Generalized Linear Models, 2nd Ed., Chapman and Hall, London, 1989. Miaou, S., Measuring the Goodness of Fit of Accident Prediction Models, Report FHWA-RD-96-040, Federal Highway Administration, Washington, D.C., 1996. Miller, T.R., et al., The Costs of Highway Crashes, Urban Institute and Federal Highway Administration, Washington, D.C., 1991.

25 O'Day, J., Synthesis of Highway Practice 192: Accident Data Quality, Transportation Research Board, National Research Council, Washington, D.C., 1993, 48 pp. Paniati, J. and J. True, Interactive Highway Safety Design Model (IHSDM): Designing Highways with Safety in Mind, Transportation Research Circular 212, Transportation Research Board, National Research Council, Washington, D.C., 1996, pp. 5560. Pendleton, O., Application of New Accident Analysis Methodologies: Volume IGeneral Methodology, Report FHWA-RD-90-091; Volume IIA Users Manual for BEATS, Report FHWA-RD-91-014; Volume III Theoretical Development of New Accident Analysis Methodology, Report FHWA-RD-91-015, Federal Highway Administration, Washington, D.C., 1991. Pendleton, O., Evaluation of Accident Analysis Methodology, Report FHWA-RD-96-039, Federal Highway Administration, Washington, D.C., 1996. Persaud, B., Accident Prediction Models for Rural Roads, Canadian Journal of Civil Engineering, Vol. 21, No. 4, August 1994. Persaud, B., E. Hauer, R. Retting, R. Vallurupalli, and K. Mucsi, Crash Reductions Following Traffic Signal Removal in Philadelphia, Accident Analysis and Prevention, Vol. 29, No. 6, 1997, pp. 803810. Persaud, B.N. and T. Nguyen, Disaggregate Safety Performance Models for Signalized Intersections on Ontario Provincial Roads, Transportation Research Record 1635, Transportation Research Board, National Research Council, Washington, D.C., pp. 113120, 1998. Persaud, B., C. Lyon, and T. Nguyen, Empirical Bayes Procedure for Ranking Sites for Safety Investigation by Potential for Safety Improvement, Transportation Research Record 1665, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 712. Persaud, B. and G. Bahar, Applications of Safety Performance Functions in the Management of Highway Safety, Proceedings, Transportation Specialty Conference, Canadian Society of Civil Engineers, London, Ontario, June 2000. Persaud, B.N., R. Retting, P. Garder, and D. Lord, Observational Before-After Study of the Safety Effect of U.S. Roundabout Conversions Using the Empirical Bayes Method, Presented at the 80th Annual Meeting of the Transportation Research Board, Washington, D.C., 2001. Poch, M. and F. Mannering, Negative Binomial Analysis of Intersection Accident Frequencies, Presented at the 75th Annual Meeting of the Transportation Research Board, Washington, D.C., 1996. Sawalha, Z., T. Sayed, and M. Johnson, Factors Affecting the Safety of Urban Arterial Roadways, Presented at the 78th Annual Meeting of the Transportation Research Board, Washington, D.C., 1999. Sayed, T. and F. Rodriguez, Accident Prediction Models for Urban Unsignalized Intersections in British Columbia, Transportation Research Record 1665, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 9399. Sayed, T., W. Abdelwahab, and J. Nepomuceno, Safety Evaluation of Alternative Signal Head Design, Transportation Research Record 1635, Transportation Research Board, National Research Council, Washington, D.C., 1998, pp. 140146. Scopatz, R., Methodological Study of Between-States Comparisons, with Particular Application to .08% BAC Law Evaluation, Presented at the 77th Annual Meeting of the Transportation Research Board, Washington, D.C., 1998. Sebastian, K., Collision Analysis of Left-Turn Maneuvers at Signalized Intersections, Presented at the 78th Annual Meeting of the Transportation Research Board, Washington, D.C., 1999. 
Stewart, D.E., Methodological Approaches for the Estimation, Evaluation, Interpretation and Accuracy Assessment of Road Travel Basic Risk, Relative Risk and Relative Risk Odds-Ratio Performance Measure Indicators: A Risk Analysis and Evaluation System Model for Measuring, Monitoring, Comparing and Evaluating the Level(s) of Safety on Canada's Roads and Highways, Report TP 13238, Transport Canada, Ottawa, May 1998. Sung, N., W. Taylor, and V. Melfi, Another Look at Identifying Hazardous Sites (Based on the Negative Binomial Distribution), Presented at the 80th Annual Meeting of the Transportation Research Board, Washington, D.C., 2001. Tarko, A., S. Aranky, and K. Sinha, Methodological Considerations in the Development and Use of Crash Reduction Factors, Presented at the 77th Annual Meeting of the Transportation Research Board, Washington, D.C., 1998. Tarko, A., S. Aranky, K. Sinha, and R. Scienteie, Crash Reduction Factors for Improvement Projects on Road Sections in Indiana, Presented at the 78th Annual Meeting of the Transportation Research Board, Washington, D.C., 1999. Vogt, A. and J. Bared, Accident Models for Two-Lane Rural Roads: Segments and Intersections, Report FHWARD-98-133, Federal Highway Administration, Washington, D.C., 1998. [Online]. Available: http://www. tfhrc.gov/safety/pubs.htm. Vogt, A., Crash Models for Rural Intersections: Four-Lane by Two-Lane Stop-Controlled and Two-Lane by TwoLane Signalized, Report FHWA-RD-99-128, Federal Highway Administration, Washington, D.C., 1999. [Online]. Available: http://www.tfhrc.gov/safety/pubs.htm. Wang, J., The Application of an Improved Accident Analysis for Highway Safety Evaluations, Report FHWA-RD-94082, Federal Highway Administration, Washington, D.C., 1994.

26 Wang, J., W. Hughes, and J. Stewart, Safety Effects of Cross-Section Design on Rural Multilane Highways, HSIS Summary Report, FHWA Publication FHWA-RD97-027, Federal Highway Administration, Washington, D.C., 1997. Washington, S., Conducting Statistical Tests of Hypotheses: Five Common Misconceptions Found in Transportation Research, Transportation Research Record 1665, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 16. Washington, S., Iteratively Specified Tree-Based Regression Models: Theoretical Development and Example Applied to Trip Generation, Journal of Transportation Engineering, Vol. 126, No. 6, Nov./Dec., 2000, pp. 482491. Yuan, F. and J. Ivan, Safety Benefits of Intersection Approach Realignment on Rural Two-Lane Highways, Presented at the 80th Annual Meeting of the Transportation Research Board, Washington, D.C., 2001. Yuan, F., J. Ivan, C. Davis, and N. Garrick, Estimating Benefits from Specific Highway Safety Improvements: Phase 1: Feasibility Study, Presented at the 78th Annual Meeting of the Transportation Research Board, Washington, D.C., 1999. Zegeer, C.V., J. Hummer, L. Herf, D. Reinfurt, and W. Hunger, Safety Cost-Effectiveness of Incremental Changes in Cross-Section DesignInformational Guide, FHWA/RD-87/094, Washington, D.C., 1987. Zegeer, C. and F. Council, Safety Effects Associated with Cross-Sectional Elements, Transportation Research Record 1512, Transportation Research Board, National Research Council, Washington, D.C., 1994. Zegeer, C., R. Stewart, D. Reinfurt, F. Council, T. Neuman, E. Hamilton, T. Miller, and W. Hunter, Cost Effective Geometric Improvements for Safety Upgrading of Horizontal Curves, Report FHWA-RD-90-021, Federal Highway Administration, Washington, D.C., 1991.


GLOSSARY

Accident modification factor (AMF) – An index of how much accident experience is changed following a change in design or traffic control. It is the ratio of accidents per unit of time expected after the change to that expected without the change.

Accident prediction model – A mathematical equation that predicts (estimates) the number of accidents, usually per year, at a site (intersection or road section), based on the site's traffic volume and design characteristics.

Accident rate (collision rate) – The number of accidents (collisions) per unit of exposure. For an intersection this is typically the number of accidents divided by the total entering AADT. For road sections this is typically the number of accidents per million vehicle-kilometers or vehicle-miles traveled on a section.

Annual Average Daily Traffic (AADT) – The estimated total traffic volume in 1 year divided by 365.

Before and after study – A study in which the accident experience and other factors before a site or group of sites is changed is compared to the accident experience after the change in order to estimate the safety effect of the change.

Blackspot – A site that is identified on the basis of its characteristics and accident experience as potentially in need of safety improvements.

Comparison group – A group of sites used in before and after studies, which are untreated but are similar to the treated sites. The comparison group is used to control for changes in safety other than those due to a treatment.

Cross-sectional study – A study in which the accident experience of different sites is examined and differences in accident experience among sites are attributed to differences in specific site characteristics.

Empirical Bayes (EB) methodology – A procedure that is used to estimate the long-term annual number of accidents at a site using a weighted average of the site's short-term accident count and the average accident experience of similar sites.
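
For concreteness, the accident rates defined above are commonly computed as follows; the symbols are introduced here only for illustration:

\[
R_{\mathrm{intersection}} = \frac{A \times 10^{6}}{\mathrm{AADT} \times 365 \times Y},
\qquad
R_{\mathrm{section}} = \frac{A \times 10^{6}}{\mathrm{AADT} \times 365 \times Y \times L},
\]

where $A$ is the number of accidents observed over $Y$ years, AADT is the total entering (or section) traffic volume, and $L$ is the section length, giving accidents per million entering vehicles and accidents per million vehicle-kilometers (or vehicle-miles) of travel, respectively.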

Generalized linear modeling – Regression analysis used in situations where the data for the dependent variable (usually accident counts) do not follow a Normal distribution or where a transformation needs to be applied before a linear model can be fitted.

Multivariate model – A term used in recent safety literature to describe an accident prediction model (or safety performance function) that relates accident experience to several independent variables.

Negative binomial regression – The process of developing regression models of accident experience in which accident counts are assumed to follow a negative binomial distribution.

Network screening – The process by which a road network is screened to identify sites that require safety investigation.

Poisson regression – The process of developing regression models of accident experience in which accident counts are assumed to follow a Poisson distribution.

Reference population (group) – The population of sites to which a treated site is assumed to belong. These sites are used in before and after studies for establishing the number of accidents expected at sites similar to a treated one.

Regression to the mean (RTM) – A phenomenon whereby sites with an unusually large accident count in one time period will, on average, experience a reduction in accidents in a subsequent period, and vice versa.

Safety performance function – Essentially what some analysts call an accident prediction model. This is a mathematical equation that predicts (estimates) the number of accidents, usually per year, at a site (intersection or road section) based on the site's traffic volume and design characteristics.

Sites with promise – A term used in recent safety literature to describe sites sometimes referred to as blackspots, i.e., sites identified on the basis of their characteristics and accident experience as potentially in need of safety improvements.
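
To connect several of the entries above, the sketch below shows how a safety performance function of the commonly assumed form accidents per year $= \alpha \cdot \mathrm{AADT}^{\beta}$ might be fitted by negative binomial regression. It is a minimal illustration using a generic statistical library rather than any of the packages named in this synthesis; the data and the fixed overdispersion value are hypothetical.

# Minimal sketch: fit a safety performance function mu = alpha * AADT^beta
# by negative binomial regression with a log link, using hypothetical data.
import numpy as np
import statsmodels.api as sm

aadt = np.array([3500, 5200, 8100, 12000, 15500, 21000])  # hypothetical volumes
crashes = np.array([2, 3, 4, 7, 9, 12])                   # hypothetical 3-year counts

# Log link: log(mu) = b0 + b1*log(AADT), i.e., mu = exp(b0) * AADT^b1
X = sm.add_constant(np.log(aadt))
model = sm.GLM(crashes, X, family=sm.families.NegativeBinomial(alpha=0.5))
result = model.fit()

b0, b1 = result.params
print(f"alpha = {np.exp(b0):.4g}, beta = {b1:.3f}")
# In practice the overdispersion parameter would be estimated from the data
# (e.g., by maximum likelihood) rather than fixed at 0.5 as it is here.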


APPENDIX A
Survey Questionnaire

NATIONAL COOPERATIVE HIGHWAY RESEARCH PROGRAM


Project 20-5, Topic 31-02 STATISTICAL METHODS IN HIGHWAY SAFETY ANALYSIS

QUESTIONNAIRE

Attached is a questionnaire seeking information on current practices regarding highway safety analysis conducted in your jurisdiction, and the statistical methods used in these studies. The focus is on the type of safety analyses required to support traditional highway engineering functions such as the identification of hazardous locations, and the development and evaluation of countermeasures. Analyses related specifically to driver and vehicle safety are not covered by this survey. Over the past decade, considerable progress has been made in the development and application of appropriate statistical methods in highway safety analysis. While application of these methods is growing, difficulties are posed by the non-ideal conditions that often arise. There is a concern that the required software and information to adequately address these difficulties is not readily available to practicing highway safety analysts. The synthesis effort will seek to bridge any gaps that exist between the state of knowledge and the state of practice in highway safety analysis. To this end, a survey of current practices in jurisdictions is vital. The survey seeks details on how safety analysis is conducted in your jurisdiction and attempts to identify crucial issues and needs. The questionnaire should be filled out by person or persons most familiar with details of highway safety analyses carried out in your jurisdiction. Where such analyses are not usually carried out in-house, it is expected that your jurisdiction will still provide the best answers possible along with information (under Question 6b) on contractors that undertake these studies. These contractors may be contacted later on by the synthesis coordinator for follow-up information. Please return the completed questionnaire and supporting documents by May 15, 2000 to:

Dr. Bhagwant Persaud
Department of Civil Engineering
Ryerson Polytechnic University
350 Victoria Street, Toronto M5B 2K3
Canada

You may fax your response to him at 416-979-5122.

If you have any questions, you may contact him by telephone [416-979-5345, extension 6464, or 416-622-3672, or by e-mail (bpersaud@acs.ryerson.ca)].


PART I: GENERAL INFORMATION

1. Agency name, mailing, website addresses:

2. Your name, title and contact information (office address, phone, fax, e-mail):

3. Agency jurisdiction or responsibility (county, state, city, etc.):

4. Approximate road mileage (by road classification if possible): Indicate if miles or kilometers.

5. Population of jurisdiction:

6a. Types of safety analyses undertaken in past 5 years using collision data/models. (Identify for each whether this is done primarily in-house or by consultants.) Mostly in-house Mostly by others

1. Before and after evaluations
2. Identification of hazardous locations
3. Cost benefit analyses in development of countermeasures
4. Analysis of collision trends
5. Collision rate comparisons of locations with different features
6. Cross-sectional evaluations
7. Comparison group evaluations
8. Risk estimation/analyses/evaluations
9. Other (identify in space below)

For each type of analysis, append list of titles/references for published documents and provide samples of documents available only in-house 6b. For analysis types undertaken primarily by consultants, provide contact information for consultant that has done the most work in the past 5 years.

Study Type #

Consultant contact (company name, contact phone and e-mail address)


PART II: DETAILS OF SAFETY ANALYSES UNDERTAKEN IN PAST 5 YEARS

A. Before And After Evaluations Of Countermeasures

7. Check box(es) for technique(s) used in before and after evaluations. Empirical Bayes Simple before and after comparison of collision frequency Simple before and after comparison of collision rates Analysis of collision diagrams before and after Traffic conflict techniques Other (Identify) 8. Check boxes for statistical tools used in before and after evaluations. Use of a comparison group Statistical verification of validity of comparison group Time series analysis Significance tests (e.g., t-test, chi-square test, F-test) Collision prediction regression models Log odds-ratio method Likelihood functions Logistic regression Sampling techniques Other (specify)

9. Answer the following if you do have a guideline or recommended practice for the length of pre and post periods for countermeasure evaluations: How many years of data do you consider as a minimum? Before:____years; After:____years How many years of data do you consider as a maximum? Before:____years; After:____years
B. Identification Of Blackspots (High Collision Locations)

10a Check box for technique used in the systematic screening of the road network to identify potentially hazardous locations for further investigation. Number of collisions Collision rate Combination of collision rate and frequency Collision prediction model Empirical Bayes technique (Please provide reference or documentation) Risk estimation/analyses/evaluation methods (Please provide reference or documentation) Hazard Index (Describe briefly and provide documentation of procedure if possible)

Other: (Describe briefly and provide documentation of procedure if possible)

10b. How often is the screening process carried out?

Every_____years


C. Develop Collision Reduction Factors For Countermeasures/Features

11. Check box(es) for technique(s) used to develop collision reduction factors. Literature review Before and after evaluation of countermeasures implemented in your jurisdiction Regression models for locations in your jurisdiction Comparison of collision experience at locations with and without feature Risk/effectiveness evaluation Value engineering exercise Other: Describe briefly

D. Research

12. In the boxes below, rank the subject areas for safety research carried out by your jurisdiction in the past 10 years, in terms of the level of effort expended. (Use Rank 1 for highest level of effort.) Safety effect of countermeasures Develop procedures for safety analyses State of the art reports on safety knowledge Develop multivariate models Identification of high risk road travel patterns and issues Other (Identify) 13. Identify basic statistical tools used in research projects in the past 10 years. Analysis of variance Weighted least squares regression Logistic regression Poisson/negative binomial regression ARIMA modeling Log odds-ratio methods Other (Identify) Ordinary least squares regression Sampling techniques Multinomial probit models Empirical Bayes procedures Dimensional analysis Conditional probability analysis

E. All Studies

14. Software used (Identify which of the following have been used for safety analyses). SAS STATPAK SPSS MINITAB GENSTAT GLIM EBEST HISAM HISAFE LIMDEP ROADSIDE/RSAP CART BMDP MICROBENCOST WESVAR KNOWLEDGESEEKER SUDAAN SPLUS Other, including programs developed in-house (Identify and describe typical application)

15. Identify other resources used as a basis for statistical analysis of collision data.
Ezra Hauer's book: Observational before-after studies in road safety
ITE's Traffic Safety Toolbox
IHSDM preliminary materials
FHWA/NCHRP reports (Identify)
Text books on statistics (Identify)
Other (Identify below)

PART III: PROBLEMS/ISSUES IN SAFETY ANALYSES

Issue 1: Underreporting of collisions

16. What presently are the minimum criteria for collisions to be reported in your database? Minimum property damage level (identify $ level) Definite injury Possible injury Tow away of vehicle None Other (explain) 17a. Does your jurisdiction currently allow for self-reporting of crashes? 17b. If so, identify starting year: _____________. 17c. What are the criteria for determining when a crash should be self- or police reported? 18. How have reporting criteria varied over time? (e.g., definition of what is reportable, change from all police reporting to some self-reporting, and time frames associated with these changes). Y N

19. Describe any other time trends in reporting practice? 20a. What difficulties are posed in your safety analyses by collision under-reporting?

20b. How do you handle these difficulties analytically? Ignore PDO collisions Assume that reporting levels are constant Ceased doing collision analysis Other (Explain)

Issue 2: Time trends in collision experience 21. Identify and describe the types of time trends, other than trends in reporting practice, that affect collision experience (e.g., changes resulting from jurisdiction-wide improvement/degradation of safety).

22. What types of safety analyses are affected by time trends in collision experience? For each provide a brief description of how you account for the time trends.
Before and after evaluations
Identification of hazardous locations
Risk estimation/analyses/evaluations
Development of collision reduction factors
Development/application of collision prediction models

Issue 3: Changes in traffic volumes in before and after studies 23. Are both before and after volumes typically available for treated locations? Y N

24. Identify how you analytically adjust for traffic volume changes in evaluating the safety effect of a countermeasure. Adjustments based on collision rates Adjustment based on collision prediction models/safety performance functions No adjustments made Other (Explain)

Issue 4: Regression to the mean (i.e., the problem due to randomly high count of collisions making a site appear more unsafe than it really is) 25. Are you aware of the difficulties posed by this phenomenon? Y N

26a. Do you conduct analyses in which difficulties are posed by this phenomenon? Y N

26b. If so, how do you account for regression to the mean in before and after evaluation and in the identification of hazardous locations? (Provide reference to documented procedures.)

Issue 5: Adequacy of traffic volume/exposure data

27. What types of safety analyses are traffic volume data used for? Before and after evaluations of countermeasures Identification of hazardous locations Calculation of accident rates Development/application of accident prediction models Risk estimation/analyses/evaluations 28. How would you rank the quality/availability of traffic volume data used in your traffic safety analyses? Use a scale of 1 to 5 with 1 being high quality.


29. Have you used any exposure measures other than traffic volume? If yes, identify: Number of vehicles registered Number of truck permits Other (Explain)

Number of drivers registered Toll receipts

Issue 6: Identifying comparison sites in before and after studies

30. If you use comparison sites in before and after safety evaluations, what method/rationale do you use to determine how many comparison sites are needed? Use as many as are available Formal statistical method (Give reference) Guestimate 31. Are you able to test for comparability of treatment and comparison groups? If so, provide a brief description of how you do this. Y N

32. Identify difficulties in identifying a comparison group (Rank with 1 being most crucial). Insufficient numbers of suitable locations Possible comparison sites are affected by treatment due to collision/traffic migration All similar sites are treated leaving no sites available for comparison Impossible to do random assignment to treatment and comparison groups Limited resources for data collection Data only/mainly available for treatment group Missing information on important variables Other (Identify)

Issue 7: Information on safety effectiveness in developing countermeasures

33. Do you have an established list of collision reduction/modification factors? If yes, estimate the following: % from in-house studies =

% from outside sources =

34. Rate your overall confidence level in the collision reduction factors (Scale of 1 to 5 with 1 being very confident): In-house studies: Studies from outside sources:

35. Of those collision reduction factors that come from in-house studies, provide rough estimates of the percentage that comes from before and after evaluations and the percentage that are based on cross-sectional analyses of data or from multivariate regression models. Before and after evaluations = % Cross-sectional/regression models = %

Issue 8: Appropriate skills, resources to conduct highway safety analysis

36. What is the highest level of university education in your group of safety analysts? PhD Master's degree Bachelor's degree

37. How many employees in your jurisdiction currently perform safety analysis of collision data?

38. How many of your safety analysts have university degrees or other formal training in statistics or related mathematical fields?

39. Do you feel that a lack of sufficient numbers of appropriately skilled personnel has hampered your ability to conduct safety studies? Y N 40a. Do you feel that your ability to conduct safety analysis is hampered by the lack of readily available information on the proper conduct of these studies? Y N 40b. If yes, please elaborate on information needs:

Issue 9: Ability to link collision and related databases

41a. Do you have a facility for automatically linking collision and other databases relevant to safety analyses? Y N 41b. If so, identify the location types for which the following linked databases exist: Interchanges Intersections Road sections

Collision and traffic volume Collision and geometric/inventory Collision, traffic and geometric/inventory 42a. Do you use hospital/EMS data?

42b. Have you been able to link hospital/EMS data to any other pertinent databases? Explain.

Ranking of Issues 1 to 9

43. For the issues 1 to 9 above (copied below), rank in order the top 3 in terms of how critical they are for enhancing highway safety analysis in your jurisdiction. Underreporting of collisions Ability to link collisions and related databases Appropriate skills, resources to conduct highway safety analysis Information on safety effectiveness in developing countermeasures Identifying comparison sites in before and after studies Time trends in collision experience Adequacy of traffic volume/exposure data Regression to the mean Changes in traffic volumes in before and after studies


APPENDIX B
Summary of Survey Responses

A discussion of the results of the survey questionnaire is provided in chapter 3, which also discusses particular practices undertaken by respondents. This appendix provides a more detailed review of survey responses focusing on the number of responses to each survey question. Unfortunately, not all questions received a satisfactory number of responses, or in some cases, meaningful responses. Despite an effort to make the questionnaire as comprehensible as possible, it is clear that some questions were not well understood by some respondents. As a result, the number of responses does not always add up to the total number of respondents. However, overall the responses were of high quality and provided very valuable information on practices in highway safety analysis. The key responses to the questionnaire are summarized and tables provided where appropriate under the subject areas of General Information, Details of Safety Analyses, and Problems/Issues in Safety Analyses.

GENERAL INFORMATION

Q. Respondents were asked to provide the approximate road mileage, population, and what types of safety analyses have been undertaken in the past 5 years in their jurisdiction, both in-house and through consultants. Respondents were also invited to submit safety projects they have recently undertaken.

DETAILS OF SAFETY ANALYSIS UNDERTAKEN IN PAST 5 YEARS

BEFORE AND AFTER EVALUATIONS OF COUNTERMEASURES

Q7. Check boxes for techniques used in before and after evaluations.

Analysis technique used (number of responses, % of respondents):
Empirical Bayes: 4 (12%)
Simple before and after comparison of collision frequency: 31 (94%)
Simple before and after comparison of collision rates: 29 (88%)
Analysis of collision diagrams before and after: 27 (82%)
Traffic conflict techniques: 7 (21%)

Q8. Check boxes for statistical tools used in before and after evaluations.

Statistical tool used (number of responses, % of respondents):
Use of a comparison group: 11 (33%)
Statistical verification of validity of comparison group: 5 (15%)
Time series analysis: 7 (21%)
Significance tests (e.g., t-test, chi-square test, F-test): 13 (39%)
Collision prediction regression models: 8 (24%)
Log odds-ratio methods: 2 (6%)
Likelihood functions: 1 (3%)
Logistic regression: 2 (6%)
Sampling techniques: 4 (12%)
Traffic conflict techniques: 5 (15%)

Q9. Answer the following if you do have a guideline or recommended practice for the length of pre and post periods for countermeasure evaluations: How many years of data do you consider as a minimum? How many years of data do you consider as a maximum? Number of years for before and after periods 1 2 3 4+ Number of responses After period Before period minimum maximum 9 7 13 1 12 14

Before period minimum 9 8 12 1

After period maximum 11 12 14

IDENTIFICATION OF HIGH COLLISION LOCATIONS

Q10a. Check box for technique used in the systematic screening of the road network to identify potentially hazardous locations for further investigation.

Technique used (number of responses, % of respondents):
Number of collisions: 25 (76%)
Collision rate: 22 (67%)
Combination of collision rate and frequency: 26 (79%)
Collision prediction model: 3 (9%)
Empirical Bayes technique: 2 (6%)
Risk estimation/analyses/evaluation methods: 1 (3%)
Hazard index: 11 (33%)

Q10b.

How often is the screening process carried out?

Screening interval (number of responses, % of respondents):
Less than 1 year: 3 (9%)
Every year: 27 (82%)
Every 2 years: 3 (9%)

TECHNIQUES USED IN DEVELOPING COLLISION REDUCTION FACTORS

Q11

Check boxes for techniques used to develop collision reduction factors.

Technique used (number of responses, % of respondents):
Literature review: 24 (73%)
Before and after evaluation of countermeasures implemented in your jurisdiction: 22 (67%)
Regression models for locations in your jurisdiction: 3 (9%)
Comparison of collision experience at locations with and without feature: 11 (33%)
Risk/effectiveness evaluation: 1 (3%)
Value engineering exercise: 3 (9%)

RESEARCH

Q12. In the boxes below, rank the subject areas for safety research carried out by your jurisdiction in the past 10 years, in terms of the level of effort expended. (Use Rank 1 for highest level of effort.)

Research activity (No. 1 / No. 2 / No. 3 / No. 4 / No. 5 rankings):
Safety effect of countermeasures: 9 / 7 / 7 / 3 / 1
Development of procedures for safety analysis: 10 / 3 / 7 / 2 / 0
State-of-the-art reports on safety knowledge: 3 / 6 / 7 / 3 / 2
Development of multivariate models: 1 / 1 / 2 / 5 / 7
Identification of high risk road travel patterns and issues: 7 / 5 / 2 / 4 / 3

Q13. Identify basic statistical tools used in research projects in the past 10 years.

Technique used in research projects (number of responses, % of respondents):
Analysis of variance: 8 (24%)
Ordinary least squares regression: 7 (21%)
Weighted least squares regression: 3 (9%)
Logistic regression: 2 (6%)
Poisson/negative binomial regression: 9 (27%)
Sampling techniques: 5 (15%)
Empirical Bayes procedures: 5 (15%)
ARIMA modeling: 0 (0%)
Log odds-ratio methods: 1 (3%)
Dimensional analysis: 0 (0%)
Conditional probability analysis: 0 (0%)

Q14. Software used (identify which of the following have been used for safety analyses).

Software used (number of responses, % of respondents):
SAS: 12 (36%)
EBEST: 1 (3%)
LIMDEP: 1 (3%)
GENSTAT: 1 (3%)
ROADSIDE/RSAP: 5 (15%)
MICROBENCOST: 5 (15%)
SPSS: 4 (12%)
GLIM: 2 (6%)
HISAFE: 1 (3%)

Q15. Identify other resources used as a basis for statistical analysis of collision data.

Respondents frequently reported FHWA and NCHRP documents as safety resources, although few were specified. Specific resources included:
Ezra Hauer's book Observational Before-After Studies in Road Safety
FHWA report The Costs of Highway Crashes
Office of Safety and Traffic Operations
FHWA-RD-99-53, Assessment of Techniques for Cost Effectiveness of Highway Accident Countermeasures
FHWA-TS-18-219
NCHRP Report 162, Methods for Evaluating Highway Safety Improvements
Institute of Transportation Engineers Traffic Safety Toolbox
Safety Cost Effectiveness of Incremental Change in Cross-section Design and Crash Models for Rural Intersections (Montana)
TAC Safety Analysis of Roadway Geometry and Ancillary Features, 1997
Interactive Highway Safety Design Model preliminary materials

PROBLEMS/ISSUES IN SAFETY ANALYSES

UNDER-REPORTING OF COLLISIONS

Q16.

What presently are the minimum criteria for collisions to be reported in your database?

Minimum criterion for reporting collisions (number of respondents, % of respondents):
Less than $500 damage: 4 (12%)
$500 damage: 7 (21%)
$750 damage: 3 (9%)
$1,000 damage: 16 (48%)
Definite injury: 5 (15%)
Possible injury: 8 (24%)
Tow away of vehicle: 5 (15%)
None: 1 (3%)

Q17a.

Does your jurisdiction currently allow for self-reporting of crashes? Yes: 15 respondents (45%).

Q17b.

If so, identify starting year. Self-reporting, where permitted, has been implemented in the 1980s and 1990s.

Q17c.

What are the criteria for determining when a crash should be self- or police reported? Self-reporting is permitted when an injury does not occur and the property damage sustained is over a given amount.

Q18.

How have reporting criteria varied over time? (e.g., definition of what is reportable, change from all police reporting to some self-reporting, and time frames associated with these changes). There has been an increase in the minimum level of property damage for required reporting.

Q19. Describe any other time trends in reporting practice?

Time trends have included less reporting by police and changes in collision report forms.

Q20a. What difficulties are posed in your safety analyses by collision under-reporting?

The only difficulty reported was comparing locations and making comparisons across time.

Q20b. How do you handle these difficulties analytically?

Analytical method (number of responses, % of respondents):
Ignore PDO collisions: 6 (18%)
Assume that reporting levels are constant: 15 (45%)
Ceased doing collision analysis: 0 (0%)

TIME TRENDS IN COLLISION EXPERIENCE

Q21.

Identify and describe the types of time trends, other than trends in reporting practice, that affect collision experience (e.g., changes resulting from jurisdiction-wide improvement or degradation of safety). Most respondents did not list any time trends that affect collision experience. Those trends that were given included speed limit legislation, improved vehicle safety, the increased use of seat belts, graduated licensing, enforcement practices, emergency response, and physical changes between urban/suburban and rural environments.

Q22.

What types of safety analyses are affected by time trends in collision experience? For each provide a brief description of how you account for the time trends.

Safety analysis affected (number of respondents, % of respondents):
Before and after evaluations: 24 (73%)
Identification of hazardous locations: 20 (61%)
Risk estimation/analyses/evaluations: 4 (12%)
Development of collision reduction factors: 8 (24%)
Development/application of collision prediction models: 4 (12%)

No method of accounting for time trends was given under this question other than using multiple years of data.

CHANGES IN TRAFFIC VOLUMES IN BEFORE AND AFTER STUDIES

Q23.

Are both before and after volumes typically available for treated locations? Yes: 30 respondents (91%).


Q24.

Identify how you analytically adjust for traffic volume changes in evaluating the safety effect of a countermeasure.

Method used for adjusting for traffic volumes (number of respondents, % of respondents):
Adjustments based on collision rates: 22 (67%)
Adjustment based on collision prediction models/safety performance functions: 4 (12%)
No adjustments made: 9 (28%)

REGRESSION TO THE MEAN

Q25.

Are you aware of the difficulties posed by this phenomenon? Yes: 28 respondents (85%).

Q26a.

Do you conduct analyses in which difficulties are posed by this phenomenon? Yes: 19 respondents (58%).

Q26b.

If so, how do you account for regression to the mean in before and after evaluation and in the identification of hazardous locations? Most respondents reported using multiple years of data. The California Department of Transportation and the Ministere des Transports du Quebec reported the use of the Empirical Bayes method. The Nebraska Department of Highways reported the use of significance tests.

ADEQUACY OF TRAFFIC VOLUME/EXPOSURE DATA

Q27.

What types of safety analyses are traffic volume data used for?

Type of safety analysis traffic volume is used for (number of respondents, % of respondents):
Before and after evaluations of countermeasures: 33 (100%)
Identification of hazardous locations: 33 (100%)
Calculation of accident rates: 33 (100%)
Development/application of accident prediction models: 7 (21%)
Risk estimation/analyses/evaluations: 5 (15%)


Q28.

How would you rank the quality/availability of traffic volume data used in your traffic safety analyses? Use a scale of 1 to 5 with 1 being high quality.

Ranking of traffic volume data quality (number of respondents, % of respondents):
1: 8 (24%)
2: 9 (27%)
3: 13 (39%)
4: 2 (6%)
5: 1 (3%)

Q29.

Have you used any exposure measures other than traffic volume? If yes, identify.

Other exposure measure used (number of respondents, % of respondents):
Number of vehicles registered: 10 (30%)
Number of drivers registered: 10 (30%)
Number of truck permits: 2 (6%)

IDENTIFYING COMPARISON SITES IN BEFORE AND AFTER STUDIES

Q30.

If you use comparison sites in before and after safety evaluations, what method/rationale do you use to determine how many comparison sites are needed?

Method for selecting comparison sites (number of respondents, % of respondents):
Use as many as are available: 13 (39%)
Formal statistical method: 3 (9%)
Guestimate: 9 (27%)

Q31.

Are you able to test for comparability of treatment and comparison groups? If so, provide a brief description of how you do this. Yes: 6 respondents (18%).

The California Department of Transportation reported the use of chi-squared tests on table counts of treatment and comparison groups for the before period. The North Carolina Department of Transportation reported the use of the chi-squared test and odds ratio. The Mississippi Department of Transportation reported a comparison of collision rates between sites.


Q32.

Identify difficulties in identifying a comparison group (rank with 1 being most crucial). Few respondents rated all options for difficulties in identifying comparison groups. As such, the following table shows the number of mentions for each option regardless of rank given.

Difficulty in identifying comparison groups (number of respondents, % of respondents):
Insufficient numbers of suitable locations: 17 (52%)
Possible comparison sites are affected by treatment due to collision/traffic migration: 11 (33%)
All similar sites are treated, leaving no sites available for comparison: 8 (24%)
Impossible to do random assignment to treatment and comparison groups: 8 (24%)
Limited resources for data collection: 22 (67%)
Data only/mainly available for treatment group: 11 (33%)
Missing information on important variables: 16 (48%)

INFORMATION ON SAFETY EFFECTIVENESS IN DEVELOPING COUNTERMEASURES

Q33.

Do you have an established list of collision reduction/modification factors? Yes: 25 respondents (76%).

If yes, estimate the % from in-house studies and % from outside sources.

% range (number of respondents, from in-house studies / from outside sources):
0-20: 17 / 5
21-40: 2 / 2
41-60: 2 / 2
61-80: 2 / 2
81-100: 5 / 17

Q34.

Rate your overall confidence level in the collision reduction factors (scale of 1 to 5 with 1 being very confident).

Confidence level in collision reduction factors used (number of respondents, % of respondents):
1 (very confident): 13 (39%)
2: 12 (36%)
3: 3 (9%)


Q35.

Of those collision reduction factors that come from in-house studies, provide rough estimates of the percentage that comes from before and after evaluations and the percentage that are based on cross-sectional analyses of data or from multivariate regression models.

% range (number of respondents, from before and after studies / from cross-sectional and regression model sources):
0-20: 3 / 12
21-40: 0 / 0
41-60: 1 / 1
61-80: 0 / 0
81-100: 12 / 3

APPROPRIATE SKILLS, RESOURCES TO CONDUCT HIGHWAY SAFETY ANALYSIS

Q36.

What is the highest level of university education in your group of safety analysts?

Highest degree (number of respondents, % of respondents):
Ph.D.: 7 (21%)
Master's degree: 10 (30%)
Bachelor's degree: 17 (52%)

Q37.

How many employees in your jurisdiction currently perform safety analysis of collision data? The number of employees of respondents performing safety analysis ranged from 1 to 200.

Q38.

How many of your safety analysts have university degrees or other formal training in statistics or related mathematical fields? Most analysts have an engineering degree that includes statistics/mathematics training. It was not clear from the responses what training the other analysts may have.

Q39.

Do you feel that a lack of sufficient numbers of appropriately skilled personnel has hampered your ability to conduct safety studies?

Number of yes responses: 19 (58% of respondents)

Q40a.

Do you feel that your ability to conduct safety analysis is hampered by the lack of readily available information on the proper conduct of these studies?

Number of yes responses: 16 (48% of respondents)

Q40b.

If yes, please elaborate on information needs: There were few responses to this question. Some respondents cited a need for information on the safety effectiveness of countermeasures.

ABILITY TO LINK COLLISION AND RELATED DATABASES

Q41a.

Do you have a facility for automatically linking collision and other databases relevant to safety analyses?

Number of yes responses: 24 (73% of respondents)

Q41b.

If so, identify the location types for which the following linked databases exist.

Number of respondents with linked data

Linked data      Collision and      Collision and          Collision, traffic, and
                 traffic volume     geometric/inventory    geometric/inventory
Interchanges           5                    8
Intersections          8                    3                         1
Road sections          7                    4                        15

Q42a.

Do you use hospital/EMS data?

Number of yes responses: 6 (19% of respondents)

Q42b.

Have you been able to link hospital/EMS data to any other pertinent databases? Explain.

Number of yes responses: 8 (24% of respondents)

Maine, Minnesota, Nebraska, New York, Oklahoma, Maryland, and West Virginia reported being a part of the CODES project.

RANKING OF ISSUES

Q43.

For the issues 1 to 9 above, rank in order the top 3 in terms of how critical they are for enhancing highway safety analysis in your jurisdiction.

Issue                                                               No. 1 rankings   No. 2 rankings   No. 3 rankings
Under-reporting of collisions                                              7                2                1
Ability to link collision and related databases                            9                3                7
Appropriate skills, resources to conduct highway safety analysis          12                4                7
Information on safety effectiveness in developing countermeasures          4                8                5
Identifying comparison sites in before and after studies                   0                3                0
Time trends in collision experience                                        0                2                1
Adequacy of traffic volume/exposure data                                   0                8                4
Regression to the mean                                                     1                0                4
Changes in traffic volume in before and after studies                      0                1                2


APPENDIX C
Some Electronic Resources Relevant to Highway Safety Analysis

This appendix provides limited guidance on electronic resources that may be valuable for highway safety analysts. The list of resources is by no means comprehensive. However, every attempt was made to be current at the time of writing this synthesis.
A. GENERIC STATISTICAL SOFTWARE THAT HAS BEEN OR COULD BE USED FOR HIGHWAY SAFETY ANALYSIS

The following information, dated around January 2000, is taken directly from the web site of TRB Committee A5011: Statistical Methodology and Statistical Computer Software in Transportation Research. http://www.a5011.gati.org/statistical_software_resource.htm. The TRB Committee on Statistical Methodology and Statistical Computer Software in Transportation Research (A5011) has assembled the following table to assist with the selection of a statistical software package. Clicking on an underlined package name opens a new page with comments supplied by transportation professionals and a link to the homepage for that software. Clicking on other package names opens the homepage for that software. The website cautions that the list is provided as a service to the transportation industry by TRB Committee A5011 and therefore should in no way be construed as a recommendation by TRB or Committee A5011 to purchase any particular product.

ADModelBuilder: Otter Research Ltd's main area of interest is in the production and application of nonlinear statistical models for macroeconomic analysis, financial modeling, and natural resource management.

AMOS: Structural equation models.

Automatch: Data linkage software.

BMDP: Comprehensive library of statistical routines from simple data description to advanced multivariate analysis, backed by extensive documentation.

BUGS: Software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods.

CART: Decision-tree software (Classification and Regression Trees); combines an easy-to-use GUI with advanced features for data mining, data preprocessing, and predictive modeling.

DataDesk: Provides interactive tools for data analysis and display based on the concepts and philosophy of exploratory data analysis.

EaST: Software for planning and interim monitoring of group sequential clinical trials.

Egret: Tools for design and analysis of epidemiological studies.

GENSTAT: GENeral STATistics package. Range of statistics includes basic statistics, design and analysis of designed experiments, regression (linear, nonlinear, and generalized linear), multivariate analysis techniques, time-series, survival analysis, spatial analysis, and resampling methods.

GLIM: GLIM (Generalised Linear Interactive Modelling) is a flexible, interactive, statistical analysis program developed by the GLIM Working Party of the Royal Statistical Society. It provides a framework for statistical analysis through the fitting of generalized linear models to data, although its uses are considerably wider than this.

JMP: JMP's goal is to analyze data in as graphical a way as possible, which lets you discover more, interact more, and understand more (product of the SAS Institute).

LIMDEP: General econometrics program for estimating linear and nonlinear regression models and limited and qualitative dependent variable models for cross-section, time-series, and panel data.

LogXact: Software for small-sample logistic regression.

MARS: Innovative and flexible modeling tool that automates the building of accurate predictive models for continuous and binary dependent variables (multivariate adaptive regression splines).

Minitab: Intuitive user interface, broad statistical capabilities, presentation-quality graphics, and powerful macro language.

Mplus: Statistical modeling program using a structural equation modeling (SEM) framework for both continuous and categorical outcomes, plus data handling and statistical features.

NCSS: Number Cruncher Statistical System. A comprehensive and accurate, easy-to-learn, statistical and data analysis system.

NqueryAdvisor: Helps investigators and statisticians to select the most efficient sample size for research studies.

PowerandPrecision: Program for statistical power analysis and confidence intervals.

RATS: Regression Analysis of Time Series. A leading econometrics/time-series analysis software package used for analyzing time-series and cross-sectional data, developing and estimating econometric models, forecasting, and much more.

SAS: Integrated suite of software for enterprise-wide information delivery built around four data-driven tasks common to virtually any application: data access, data management, data analysis, and data presentation.

SCA: Power to analyze time-series data using comprehensive modeling capabilities; delivers accurate and dependable forecasts.

SHAZAM: Primary strength is the estimation and testing of many types of regression models. The SHAZAM command language has great flexibility and provides capabilities for programming procedures.

SOLAS: Comprehensive missing data analysis tool that allows one to implement a thorough, principled, and informed approach to a missing data problem.

S-Plus: Data analysis, data mining, and statistical modeling; programmable.

SPSS: Established leader in business intelligence, especially data mining, as well as three vertical markets: survey/market research, quality improvement, and scientific research.

SST: Statistical Software Tools; geared toward the estimation of complicated statistical models.

STATA: Complete statistical, graphical, and data-management capabilities; programmable.

Statgraphics: Easy-to-learn, easy-to-use PC-based statistics package with features like StatAdvisor, which gives you instant interpretations of your results; StatFolio, the new way to automatically save and reuse your analyses; and truly interactive graphics.

Statistica: Comprehensive, integrated statistical data analysis, graphics, database management, and custom application development system featuring a wide selection of basic and advanced analytic procedures for science, engineering, business, and data mining applications.

Statlets: Java applets for statistical analysis and graphics.

StatXact: Software for small-sample categorical and nonparametric data.

SUDAAN: Specifically designed for analysis of cluster-correlated data from studies involving recurrent events, longitudinal data, repeated measures, multivariate outcomes, multistage sample designs, stratified designs, unequally weighted data, and without-replacement samples.

SYSTAT: Research quality statistics and interactive graphics for scientists, engineers, and statisticians.

B. SOME USEFUL WEB SITES

http://www4.nationalacademies.org/trb/crp.nsf/reference%5Cappendices/NCHRP+Overview
This is an NCHRP web page that will provide information on an upcoming web manual on statistical methods for those undertaking transportation research in general. The manual is expected to be available in 2001.

http://www.a5011.gati.org/statistical_software_resource.htm
Transportation Research Board Committee A5011: Statistical Methodology and Statistical Computer Software in Transportation Research.

http://www.bts.gov/programs/statpol/btsguide.html
Bureau of Transportation Statistics' Guide to Good Statistical Practice: A Handbook for Data Program Managers and Analysts.

http://davidmlane.com/hyperstat/index.html
HyperStat Online is an introductory-level hypertext statistics book.

http://www.tfhrc.gov/safety/pubs.htm
Federal Highway Administration's safety-related research is published or summarized here. Many of these publications relate to statistical analysis in highway safety and some are referenced in this synthesis.

http://www.tfhrc.gov/safety/ihsdm/brief.htm
Federal Highway Administration web pages devoted to providing details and updates for the Interactive Highway Safety Design Model (IHSDM).

http://www.tfhrc.gov/safety/hsis/hsis2.htm
The Highway Safety Information System is a multistate database, maintained by FHWA, that contains crash, roadway inventory, and traffic volume data.

http://www.nhtsa.dot.gov/people/ncsa/codes/CODESindex.htm
Information on the CODES (Crash Outcome Data Evaluation System) project: a collaborative approach, led by the National Highway Traffic Safety Administration (NHTSA), to generating medical and financial outcome information relating to motor vehicle crashes and using this outcome-based data as the basis for decisions related to highway traffic safety.

http://roadsafetyresearch.com
A recently developed website that will serve as a depository for research by Ezra Hauer, who has contributed substantially to the new methods for highway safety analysis. Of special relevance are the critical reviews of safety knowledge and the computational tools, e.g., a spreadsheet for before and after analysis using a comparison group.


APPENDIX D
A Primer on the Application of Some Basic Statistical Tools to Highway Safety Analyses

This appendix addresses some of the more common difficulties, identified on the basis of the survey and literature search, in applying statistical methods to highway safety analysis. The intent is to provide some insights into how to diagnose these problems and how to resolve them. The focus is on statistical issues presented by the non-ideal conditions peculiar to the analysis of highway safety data. This synthesis is being coordinated with NCHRP Project 20-45, which is developing a website to provide guidance on the application of basic statistical tools in transportation. That guidance is, accordingly, outside the scope of this appendix. The intent behind this appendix is not to provide a full treatise on statistical methods, as should properly be done in a textbook, but rather to provide an overview of the common problems in highway safety analyses and potential solutions.
PROBLEMS/PITFALLS IN HIGHWAY SAFETY ANALYSIS

Regression to the Mean (RTM)

The common method for evaluating the safety effect of highway improvements is a simple comparison of annual accident counts before and after improvements. It is the simplest of techniques and requires less data than the newer techniques, but overestimates the safety effect if sites were identified for improvement on the basis of an unusually high accident count (Pendleton 1996; Hauer 1997). This is because an unusually high count is likely to decrease subsequently even if no improvement were implemented, a phenomenon known as regression to the mean. As long as improvement projects are motivated, at least in part, by safety concerns, then RTM is likely to be at play and its effects must be accounted for, because research has shown that these are likely to be of the same order as the real safety effects of most treatments (Hauer et al. 1983).

The phenomenon has been well documented in published literature and is recognized by many safety analysts, but an illustrative example may be in order here to emphasize the uncertainties of RTM. Table D1 is assembled from the data presented in Hauer (1997) for 1,072 San Francisco intersections grouped according to the specific numbers of accidents occurring in 1974–1976. For the same intersections in each row, the average number of accidents per intersection for 1977 is also shown. Thus, for example, those 218 intersections that had exactly 1 accident in 1974–1976 recorded, in total, 120 accidents in 1977, for an average of 0.55 accidents per intersection (as shown in Table D1, column 6). There was no real change in safety at these intersections between 1974–1976 and 1977 in that accidents averaged over all intersections remained essentially constant over the years at approximately 1.1 accidents per intersection per year. Yet, as the table shows, intersections that had exactly 4 accidents in 1974–1976 (1.33/year) recorded, on average, 1.08 accidents in 1977, a decrease of 19 percent. Each group of intersections with 4 or more accidents in 1974–1976 (more than the average of 1.1 per year) recorded substantial reductions in accidents the following year; conversely, each group with 3 or fewer accidents (i.e., less than the average of 1.1 per year) experienced an increase. These changes have nothing to do with safety and are artifacts of the RTM phenomenon.

TABLE D1 ILLUSTRATING THE REGRESSION TO THE MEAN PHENOMENON

To appreciate the magnitude of the problem, imagine that the 54 intersections with 6 accidents in 3 years (a total of 324 accidents in 3 years, or 108 per year) were treated at the end of 1974–1976 and recorded, for example, a total of 72 accidents in 1977. A conventional before and after comparison would estimate the treatment effect as a reduction of 108 − 72 = 36 accidents per year, or 33.3 percent [= 100(36)/108]. Yet, as the last column in Table D1 shows, this would be a gross overestimate, since the reduction due to RTM alone (and not to safety) would have been 24 accidents per year, or 22 percent. It stands to reason that the conventional before and after comparison should not be done unless it can be demonstrated that the before period accident count is not unusually high (or unusually low). In practice this is hard to demonstrate, because only a truly random selection of treatment sites, which is almost never done in road safety management, will guarantee that there is no RTM. Finally, it should be noted that although the illustration pertained to intersections and to a 3-year before period, as Hauer et al. (1983) and others have demonstrated, the danger of ignoring RTM is no less severe for road segments and other entity types and for longer before periods.

Comparison Sites and Spillover and Migration Effects

It is common to use a treatment-comparison experimental design to control for effects not due to the treatment. The treatment effects would be underestimated if, as some of these studies have found, there is a decrease in target accidents at comparison sites that is due to spillover effects of the treatment. Measures such as red light cameras are believed to have such effects. To illustrate the problem, consider the following hypothetical example for which before and after accident data are provided.

                                                            Before   After
Accidents at treated sites                                    100      64
Accidents at comparison sites (assuming no spillover)         140     112

In a treatment-comparison design, assuming no RTM for convenience, one first calculates that 100 × 112/140 = 80 accidents would have occurred on the treated sites in the after period had the treatment not been implemented. Since 64 accidents materialized, the treatment is then correctly estimated to have reduced accidents by [100 × (80 − 64)/80] = 20 percent. Assume now that, contrary to our assumption of no spillover effect, some of the treatment effect indeed spilled over to the control sites and therefore 98 accidents were observed in the after period (instead of 112). One would now estimate, erroneously, that 100 × 98/140 = 70 accidents would have occurred on the treated sites had they not been treated. The effect of the treatment is now estimated, incorrectly, to be [100 × (70 − 64)/70] = 8.6 percent! Conversely, the treatment effects would be overestimated if there were migration effects and an increase in accidents at the comparison sites due to compensatory behavior of drivers. The installation of all-way stop control and other speed control measures are believed to sometimes cause such effects.

The Problem with Accident Rates

In some jurisdictions accident rate is used directly or indirectly as a hazard measure to flag locations for safety investigation. Average annual daily traffic data (AADTs) are used directly in the computation of this measure; i.e., accident rate = accident frequency/AADT (or some scalar multiple of this). The problem, as the extensive literature on safety performance functions shows, is that the relationship between accident frequency and AADT is not linear (Pendleton 1996). Figure D1, which depicts the safety performance function for injury accidents for two-lane rural roads in Ontario, illustrates the inherent nonlinearity and the difficulties with the linearity assumption. The relationship depicted is of the form

Accidents/km/unit of time = a(AADT)^b

where a = 0.00398 and b = 0.812 are regression coefficients calibrated from data. A value of b = 1 would have indicated a linear relationship.

FIGURE D1 Safety performance function for two-lane rural arterial highways in Ontario.

The nonlinearity depicted in Figure D1 points to an inherent flaw in the use of accident rate as a measure of safety. Specifically, comparing accident rates of two entities at different traffic levels to judge relative safety may lead to erroneous conclusions. According to Figure D1, the accident rate (the slope of a line from the origin to a point on the curve) is expected to be lower at higher traffic volumes. Thus, saying that when two rates are equal they indicate equivalent levels of hazard may be completely false if different AADT levels are involved. A further complication in the use of accident rates arises in evaluating the safety effect of a retrofit measure. If such a measure was implemented on a road section, and it is observed that traffic increases and the accident rate drops after implementation, no real improvement may have taken place at all since, as seen in Figure D1, these changes can occur without the safety performance function changing. A real improvement takes place only if the accident frequency after implementation is lower than the value that pertained before for the same volume level. Thus, caution should be exercised in comparing accident rates to declare the extent of benefits arising from the safety improvement. The upshot of all this is that the use of accident rates to compare sites in regard to their safety levels is potentially problematic. The most valid basis of comparison using accident rates is for the relatively rare cases when the traffic volume levels are the same or when the relationship between accidents and AADT is linear.
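To see the consequence numerically, the short Python sketch below evaluates the Ontario function quoted above at a few traffic volumes and prints the implied rate; the function names and the example AADT values are ours, chosen purely for illustration.

# Safety performance function for two-lane rural roads in Ontario (from the text):
# accidents/km/unit of time = a * AADT^b, with a = 0.00398 and b = 0.812.
a, b = 0.00398, 0.812

def expected_accidents(aadt):
    """Expected accident frequency per km per unit of time at a given AADT."""
    return a * aadt ** b

def accident_rate(aadt):
    """Accident 'rate' formed by dividing expected frequency by AADT."""
    return expected_accidents(aadt) / aadt

for aadt in (2_000, 5_000, 10_000, 20_000):
    print(f"AADT {aadt:6d}: frequency {expected_accidents(aadt):.3f}, "
          f"rate {accident_rate(aadt):.7f}")
# Because b < 1, the rate falls as AADT rises, so two sites with equal rates
# but different AADTs do not have equivalent expected accident frequencies.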

Time Trends in Accidents

In all jurisdictions, the total number of accidents will fluctuate from year to year. This creates difficulties when comparing data from one period with that from another, such as is done in before and after evaluations. Previous before and after evaluations of safety treatments have typically not accounted for the impacts of such time trends in data, so it is in order to explore the issue here in some depth. Consider, for example, data for Vancouver, British Columbia. The following are the approximate total numbers of reported accidents for the years 1992 to 1996.

Year                  1992     1993     1994     1995     1996
Reported accidents   25,000   21,500   18,500   19,500   15,500

Although there was an increase from 1994 to 1995, there appears to be an overall declining trend. Such a trend may actually be regarded as typical, but it is always difficult to speculate on the underlying reasons. It may reflect a decline in travel; it is possible that there were increasing levels of accident under-reporting (indeed, it is known that sometime in 1996 there was an administrative decision to conserve resources by reducing the level of accident reporting); or even that road safety programs as a whole were having a substantial and increasingly beneficial effect over the data period. Or, perhaps, the year-to-year differences can only be attributed to random fluctuation in accident counts. Whatever the reasons, such jurisdiction-wide trends must be accounted for in before and after evaluations, and methods for doing so have now been documented (Hauer 1997). To illustrate the implications of not accounting for such trends, consider that there were approximately 25,000 accidents in 1992 and approximately 19,000/year in 1994–1995, approximately 25 percent fewer. Thus, if a site was treated in 1993, and 1992 is used as the before period and 1994–1995 as the after period (avoiding the year 1996, in which accidents are known to be under-reported), then an observed reduction in accidents of, say, 20 percent, which is of the order expected for a good safety measure, would not seem to be that impressive when one considers that the entire jurisdiction, which typically includes mostly untreated locations, experienced a 25 percent decline in accidents during the same period.
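One simple way to gauge the size of such a trend effect is to scale a site's before-period count by the jurisdiction-wide ratio before comparing it with the after-period count. The Python sketch below does only that; the treated-site counts are hypothetical, the jurisdiction figures are the approximate Vancouver numbers quoted above, and the full comparison-group method, with its variance calculations (Hauer 1997), involves considerably more than this.

# Approximate jurisdiction-wide counts from the text.
jurisdiction_before = 25_000            # 1992
jurisdiction_after_per_year = 19_000    # 1994-1995 average
trend_ratio = jurisdiction_after_per_year / jurisdiction_before  # about 0.76

# Hypothetical treated site: 50 accidents/year before, 40/year after (a 20% drop).
site_before, site_after = 50.0, 40.0

# Expected after-period count had the site simply followed the jurisdiction trend.
expected_without_treatment = site_before * trend_ratio

naive_effect = 100 * (site_before - site_after) / site_before
trend_adjusted = 100 * (expected_without_treatment - site_after) / expected_without_treatment

print(f"naive reduction: {naive_effect:.0f}%")              # 20%
print(f"trend-adjusted reduction: {trend_adjusted:.0f}%")   # about -5%, i.e., no apparent benefit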

Traffic Volume Changes

It is important to account for changes in traffic volume between the before and after periods, because many measures, e.g., the installation of a traffic signal, are known to

cause changes in traffic volume, which, in turn, are known to be very important in explaining observed differences in accident experience. Thus, to evaluate the effect of a treatment, it is vital to account for both changes in safety due to the treatment and changes due to traffic volume. Although precise traffic counts are rarely available, changes in traffic volume can be estimated on the basis of traffic count samples from the before and after periods, providing there is no systematic bias in these counts. One common way of accounting for traffic volume changes, as revealed in the survey of jurisdictions, is to normalize for traffic volume by comparing accident rates before and after a treatment. However, as pointed out earlier, the use of accident rates implies a linear relationship between accidents and traffic volume, an implication that is now known to be generally false.

To appreciate the difficulty created by not accounting for traffic volume changes, or by using accident rates for this purpose, consider the safety performance function taken from Lord (2000) for signalized intersections:

Accidents/year = 0.0002195(F1^0.534)(F2^0.566)e^(0.00000892 F2)

where F1 and F2 are the total entering AADTs on the major and minor roads, respectively. Suppose a treated signalized intersection had F1 = 16,000 and F2 = 8,000 before treatment. The expected accident frequency, using these values in the equation above, is

0.0002195(16,000^0.534)(8,000^0.566)e^(0.00000892 × 8,000) = 6.71 accidents/year

Suppose there was a 10 percent reduction in traffic (both F1 and F2) following treatment due to shifts in the traffic pattern in the jurisdiction; the expected accident frequency without the treatment is reduced to 5.93, a reduction of around 12 percent. Thus, to ignore the effects of the decrease in traffic would mean that the treatment effect would be overestimated by a factor of the order of 12 percent. For this reason alone the results of a simple before and after comparison are likely to be incorrect because changes in traffic volume have not been accounted for.

Note also that the change in accidents is larger than the change in traffic; therefore, extending the simple before and after method by normalizing for traffic volume changes would not work, because comparing accident rates before and after assumes that accidents are proportional to traffic volumes; that is, that a 10 percent decrease in traffic results in a 10 percent decrease in accidents. To see this, consider the accident rates for the example. For the before period, using the safety performance function, we calculated 6.71 accidents/year for a total entering AADT of 24,000, obtaining 0.77 accidents per million entering vehicles. If, in the after period, the AADT were reduced by 10 percent, then, using the same equation, we calculated a frequency of 5.93 accidents/year without the treatment. Based on the reduced AADT of 21,600, we would then calculate an accident rate of 0.75 accidents per million entering vehicles. Thus, the accident rate has changed even though we assumed no treatment effect.
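The arithmetic in the preceding paragraphs is easy to reproduce. The Python sketch below evaluates the Lord (2000) function quoted above before and after a 10 percent traffic reduction; the helper names are ours, and the sketch assumes, as the text does, that the treatment itself has no effect.

import math

def expected_accidents_per_year(f1, f2):
    """Signalized-intersection SPF quoted in the text (Lord 2000)."""
    return 0.0002195 * f1**0.534 * f2**0.566 * math.exp(0.00000892 * f2)

def rate_per_million_entering(f1, f2):
    """Accidents per million entering vehicles implied by the SPF."""
    return expected_accidents_per_year(f1, f2) * 1e6 / ((f1 + f2) * 365)

before = expected_accidents_per_year(16_000, 8_000)               # about 6.71
after_no_treatment = expected_accidents_per_year(14_400, 7_200)   # about 5.93

print(f"change in expected accidents: {100 * (1 - after_no_treatment / before):.1f}%")  # ~12%
print(f"rate before: {rate_per_million_entering(16_000, 8_000):.2f}")   # ~0.77
print(f"rate after:  {rate_per_million_entering(14_400, 7_200):.2f}")   # ~0.75
# A 10% traffic drop changes both the expected frequency and the rate,
# even though no treatment effect was assumed.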

Uncertainty in Estimates

Safety analyses, such as the comparison of accident experience between groups or before and after an improvement, are typically based on samples that are used to make inferences about the population. As such, there is uncertainty in the resulting estimates, which could be considerable given the small sample sizes typically available for safety analyses. Statements of uncertainty are vital to the proper interpretation of the results. It is often the case that estimates of uncertainty are either not provided, are incorrectly calculated, or are improperly interpreted.

In conventional statistical methodology, estimates of uncertainty or variance are used to examine whether or not observed differences between sample properties are statistically significant. To do so, a statistical test is used to assess whether or not to reject the null hypothesis that there is no difference. A Type I error occurs if this hypothesis is rejected when in fact it is true that there is no difference. Washington (1999) points out that the Type II error, leading to a conclusion that there is no difference when in fact there is, is also important, but is rarely assessed, largely because common software packages do not provide this information. That a Type II error might have occurred is one of the reasons why one should be cautious about concluding that statistically insignificant results are not important. Other possible explanations for such results are, according to Washington: (a) the expected effect did not manifest itself in the data used, (b) the sample size was too small to discern an effect, (c) the size of the effect was too small to detect, (d) the inherent variability in the data is too large to discern an effect, and (e) there is no effect. In a similar vein, Hauer (1997, p. 68) argues against the use of statistical tests of significance in road safety, suggesting that the question of practical interest in before and after studies is: What is the size of the safety effect and how accurately is it known? Regardless of one's philosophy, there is still a need to express uncertainty in estimates and to properly calculate and interpret the measures.
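To make the Type II error point concrete, the following Python sketch simulates a naive before and after comparison of Poisson accident counts and estimates how often a real 20 percent reduction would be flagged as significant at a low-accident site. The scenario, the simple normal-approximation test, and all the numbers are our own illustrative choices, not a recommended evaluation method.

import math
import random

random.seed(1)

def poisson(lam):
    """Draw from a Poisson distribution (Knuth's method; adequate for the means used here)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def power_of_naive_test(mean_before, true_reduction, years=3, sims=20_000, z=1.96):
    """Share of simulated sites at which a simple count comparison flags the change."""
    hits = 0
    for _ in range(sims):
        before = poisson(mean_before * years)
        after = poisson(mean_before * (1 - true_reduction) * years)
        se = math.sqrt(before + after)      # normal approximation to the difference
        if se > 0 and (before - after) / se > z:
            hits += 1
    return hits / sims

for mean_before in (2, 5, 15):
    p = power_of_naive_test(mean_before, true_reduction=0.20)
    print(f"{mean_before:2d} accidents/year: power {p:.2f}, Type II error {1 - p:.2f}")

At accident frequencies typical of individual sites, the simulated power is small, which is exactly the caution raised above about interpreting statistically insignificant results.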

Deriving Accident Modification Factors (AMFs) from Cross-Section Studies

Most experts recognize that the preferred way to develop AMFs is to do a before and after study of entities in which only the element of interest has changed. This type of study

is often not possible for a variety of reasons, including the difficulty of getting large enough treatment samples, the challenges of specifying a proper control group unaffected by treatment, and the fact that there are often changes to other elements as well. As an alternative, cross-sectional studies could provide AMF information under certain conditions. In a common form, cross-sectional analysis looks at the accident experience of entities with and without some feature, and attributes the difference in safety to the feature. Because it is difficult to find entities that vary in only one feature, cross-sectional analysis is often accomplished through multivariate models in which researchers attempt to account for all variables that affect safety. If such attempts are successful, the models can then be used to estimate the change in accidents that results from a unit change in a variable, that is, the AMF. At present, the science of assembling AMFs from multivariate accident models is not fully developed; therefore, validation of AMFs so determined is especially important. Such AMFs could be inaccurate because of the difficulty of accounting for all of the factors that affect safety. For example, intersections with left-turn lanes also tend to have illumination. If an accident model is used to estimate an AMF for left-turn lanes, and the presence of illumination is not accounted for in the model, the difference in model predictions with and without left-turn lanes could be partly due to illumination differences. Ironically, it is precisely because a variable is found to be correlated with another variable that it may be omitted during the model fitting exercise. Including correlated variables could lead to effects that are actually counterintuitive (e.g., illumination increases night-time accidents). Another reason why the effect of an element that may affect safety cannot be captured in a model is that the sample used to develop the model is too small or that there is little or no variation in the element. For example, the effect of illumination cannot be captured if all locations in a sample are illuminated.

SOLUTIONS

Avoiding Selection Bias

Ignoring Safety in Site Selection

One way of avoiding selection bias is to not use safety as a consideration in the selection of sites for treatment. This strategy defeats the purpose of safety improvement programs because measures are likely to have the largest safety benefits where a safety concern is manifested in a high accident frequency. In practice, despite their best intentions, it is rather difficult for authorities to consciously ignore safety in selecting sites for improvement, and only a random selection of sites would guarantee that safety was not considered in site selection.

Random Allocation of Sites to Treatment and Comparison Groups

Sites could be selected for possible treatment on the basis of the safety record and then randomly allocated to either a treatment or a comparison group. This would create similar accident frequency distributions in the two groups, allowing for RTM effects to be controlled for (Davis et al. 1999). In practice, this method of project selection is problematic, because there may be liability issues if some sites that are included in the comparison group are worthier of treatment than some sites in the treatment group. In addition, there are the ethical ramifications of making a conscious decision to effectively ignore sites in need of treatment.

Many Years of Data

It is widely believed that the use of many years of before period data will minimize, if not eliminate, RTM effects. Although it is true that these effects get smaller as more years of data are used, it would take an impractically long before period for these effects to be considered negligible. Hauer and Persaud (1983) found that for before periods of as long as 6 years, the RTM effect could be of the same order of magnitude as the safety effects of many countermeasures.

Use of Post-Selection Data in Before and After Studies

It is recognized that one of the best ways, at least in theory, of avoiding RTM in before and after comparisons is to use only data after the decision is made to implement a treatment. This ensures that a randomly high count that may have been used in the selection is not used in the analysis. In practice, however, the resulting before periods can be too short to provide sufficient accidents for meaningful results.

Specifying an Appropriate Comparison Group in Before and After Studies

One variation of the conventional before and after study is the simple before and after study with a comparison group. The use of an untreated comparison group of sites similar to the treated ones can account for unrelated effects such as time and travel trends, but will not account for RTM unless sites are matched on the basis of accident occurrence. There are immense practical difficulties in achieving this ideal, as illustrated in Pendleton (1996). In addition, the necessary assumption that the comparison group is unaffected by the treatment is difficult to test and can be an unreasonable one in some situations. Most fundamentally, the comparison group needs to be similar to the treatment group in all of the possible factors that could influence safety. A recent paper by Scopatz

(1998) points to the difficulties of fulfilling this need by examining the results from Hingson et al. (1996) that lowering legal blood alcohol concentration (BAC) limits to 0.08 percent resulted in a 16 percent reduction in the probability that a fatally injured driver would have a BAC above that level. The treatment group was composed of states that passed a lower legal BAC law, whereas the comparison states retained a 0.10 percent BAC legal limit. Scopatz points out that there are numerous differences other than legal BAC limits between law and comparison states. Therefore, it is impossible to conclude that the passage of a law, as opposed to some other uncontrolled-for factor, accounts for the results. To support this point, Scopatz showed that if logically valid but different comparison states are chosen, the results change dramatically, and in most cases are in fact consistent with a conclusion of no effect. Tarko et al. (1998) found that a similar situation would arise even in an analysis confined to a single state, in which the treatment group is in one county and the comparison group is in another. Ideally, the comparison group should be drawn from the same jurisdiction as the treatment group. The difficulty is that the pool available for the comparison group could be too small if most or all elements are treated or at least are affected by the treatment. In the BAC case, the law applied to all drivers in a state; and, as discussed earlier, measures such as red light cameras are believed to have significant spillover effects to untreated sites.

Use of Safety Performance Functions to Evaluate Treatments and Identify Sites with Promise

Modern safety management practice requires the application of a fundamental tool, called a safety performance function (SPF), often called an accident prediction model, which is an equation describing the relationship between safety and the amount of traffic at a location (road section, intersection, etc.). It is, in essence, the accident frequency expected per unit of time based on the location's traffic and other characteristics. Presented next are some basics of safety performance functions with illustrative applications. Much of what is presented is taken from a recent paper by Persaud (2000).

Illustrating the Basics of Safety Performance Functions

The theory behind the application of SPFs is best presented by way of an example. The illustration pertains to the safety of four-legged signalized intersections. The calibrated SPF (Lord 2000), based on 1990–1995 data for Toronto, is given by the following equation:

SP = 0.0002195(F1^0.534)(F2^0.566)e^(0.00000892 F2)

where SP is the expected annual accident frequency and F1 and F2 are the total entering AADTs on the major and minor roads, respectively. This SPF reveals a great deal about the safety of an intersection even if its accident history is not known. For example, an intersection with F1 = 16,000 and F2 = 8,000 would have an expected accident frequency or safety performance (SP) of

SP = 0.0002195(16,000^0.534)(8,000^0.566)e^(0.00000892 × 8,000) = 6.71 accidents/year

It is also possible to calculate the expected accident rate as 6.71 × 10^6/(24,000 × 365) = 0.77 accidents per million entering vehicles. If the major AADT were reduced by 50 percent, then SP = 4.63, correctly indicating that the intersection is safer with less traffic. However, the accident rate increases to 0.79. A 50 percent increase in the AADT would result in an increased SP of 8.33, but a reduced accident rate of 0.71. This illustration serves to emphasize the point made earlier that accident rates should not be used as an index in comparing the safety performance of different intersections. The result that the safety performance is based on the relative amounts of traffic on the major and minor approaches is another strong indication that accident rate should not be used as a safety index, since the rate is based on total entering volume, regardless of the distribution of this total among the approaches.

The most obvious use of the SPF is to compare the actual accident experience at an intersection against this expected value. Thus, if the example intersection recorded 36 accidents in 3 years, an average of 12 accidents/year, one could say that the intersection is performing worse than expected, since 12.0 is greater than the 6.71 accidents/year expected on the basis of the safety performance of similar intersections. This is true even though the larger than expected number of accidents in 3 years might be partly due to a random up-fluctuation in the accident count.

The next part of the text addresses how SPFs can be used to smooth random fluctuation in accident counts. So far, the estimate of safety (6.71 accidents/year) pertains to an average intersection for which there is information about the AADT (F1 = 16,000 and F2 = 8,000), control type (signalized), and number of legs (four). For a specific intersection having these characteristics, the accident history, if known, must also be used in estimating its safety. To combine the two sources of information, the SPF and the accident record, another piece of information about the SPF is needed. It is a parameter, usually called k, which is related to the uncertainty in the SPF and is also an output of the statistical procedure used for calibrating the SPFs from data. For four-legged signalized intersections in Toronto, the calibrated value of k was 6.91. This value is

used in the empirical Bayes procedure to smooth the randomness of the accident count as follows: For the example intersection that recorded 36 accidents in the last 3 years, we found that such intersections may be expected to have 6.71 accidents/year on the basis of its AADT, control type, and number of legs. We need a weight for combining this information, the 6.71 accidents expected at an average intersection of this kind, with the information that 36 accidents were recorded in 3 years at this specific intersection. In other words, we need to refine the actual accident count to estimate an expected value in the long run, by accounting for the randomness of annual accident occurrence. The formula for computing the weight (w) is

w = k/[k + (n × SP)]

where n is the number of years of accident data. For k = 6.91, SP = 6.71, and n = 3,

w = 6.91/[6.91 + (3 × 6.71)] = 0.256

The refined expected number of accidents/year (m) is calculated by

m = (w × SP) + (1 − w)(X)/n

where X is the number of accidents recorded in n years. Thus

m = 0.256(6.71) + (1 − 0.256)(36)/3 = 10.65 accidents/year

The difference between the refined long run estimate of 10.65 accidents/year and the average of 12.0 accidents/year actually recorded in a 3-year period is not trivial. However, it is commonly believed that averaging the counts over a longer period will provide a close enough estimate of the long run average and eliminate the need for refining this value by using the safety performance estimate. This belief is convenient, but it is also erroneous. To see this, suppose, for illustration, that the average of 12.0 accidents/year was based on 60 accidents in 5 years rather than 36 accidents in 3 years. It turns out that the value of m in this case is 11.10 (using n = 5 and X = 60 in the above equations), still well below the value of 12.0 averaged over 5 years. Thus, the refinement is still necessary, because the difference between 12.0 and 11.10 is of the order of the changes in safety obtained from most intersection countermeasures.

Application to Network Screening to Identify Locations for Safety Investigation

Screening of the network of intersections or road sections to identify locations for safety investigation could be carried out based on the values calculated previously (Pendleton 1996). One possible process is to use the value of m calculated above, weighted by accident severity, as an index to prioritize locations for safety investigation. Another newly developed process is based on the concept of the Potential for Safety Improvement (PSI) that is now being implemented by some jurisdictions. The PSI is the difference between a location's actual safety performance (m) and the expected safety performance for locations with similar classification (e.g., four-legged signalized) and traffic volumes (SP). Theoretical details and a validation of the procedure are given in Persaud and Lyon (1999). Thus, for the previous example, the value of the PSI is 3.94 (= 10.65 − 6.71). It is recommended that PSIs be calculated for each accident severity type [fatal, injury, property damage only (PDO)] using the procedures outlined and that a PSI index be calculated by weighting the PSIs by the relative economic value of fatal, injury, and PDO accidents. Locations can then be ranked for further investigation in descending order of the PSI index.
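The empirical Bayes refinement just described can be scripted in a few lines. The Python sketch below reproduces the weight, the refined estimate m, and the PSI for the example intersection; the function name is ours, and the SPF value, k, and accident counts are taken directly from the text.

def eb_refined_estimate(sp, k, accident_count, years):
    """Empirical Bayes estimate combining the SPF prediction with the site's count."""
    w = k / (k + years * sp)                        # weight given to the SPF prediction
    m = w * sp + (1 - w) * accident_count / years   # refined accidents/year
    return w, m

sp, k = 6.71, 6.91   # from the Toronto SPF for four-legged signalized intersections

for count, years in ((36, 3), (60, 5)):
    w, m = eb_refined_estimate(sp, k, count, years)
    print(f"{count} accidents in {years} years: w = {w:.3f}, m = {m:.2f}, PSI = {m - sp:.2f}")
# The 3-year case gives m of about 10.65 (PSI about 3.94), well below the raw
# average of 12.0, and even 5 years of data (m about 11.10) does not remove the gap.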

Application to Evaluation of Treatment While Accounting for Traffic Volume Changes, Time Trends in Accidents, and Uncertainty in Estimates

The most recent developments in methods for the proper conduct of before and after evaluations not only resolve the difficulties of RTM but also provide ways of addressing other issues raised earlier: accounting for traffic volume changes, time trends in accidents, and uncertainty in estimates. Suppose the example intersection was identified as hazardous using the procedure in the previous section and was treated. In addition, suppose that, in the after period, the goal of the treatment was achieved and the safety of the intersection was closer to that expected of similar intersections, with a recorded count of 15 accidents in 2 years, or 7.5 per year. Then, to properly estimate the effect of the treatment, one should compare the expected safety without the treatment to the actual count recorded after treatment. It is also necessary that the expected safety without the treatment account for changes in traffic volume and/or time trends in accident occurrence. Methods for doing these adjustments in properly estimating the expected safety without the treatment can be found in Hauer (1997). Recent applications can be found in Persaud et al. (1997) and Persaud et al. (2001). An excerpt from the latter application is reproduced here.

For the purposes of this illustration, it is assumed that adjustments for time trends and volume changes are not required. Then, the value of m already calculated can be used to estimate the expected safety without the treatment and the safety effect of the treatment as follows:

The change in safety resulting from the treatment = B − A

where A is the count of accidents in an after period of y years, and B is the expected number of accidents in y years without the treatment and is simply the product of y and m. Thus, for the example illustration, the change in safety in the 2-year after period is 6.30 accidents [= (2 × 10.65) − 15]. This translates into a percentage reduction of 100 × (6.30/21.30) = 29.6 percent, or an AMF of 15/21.30 = 0.70. Simply using the 12 accidents/year as an estimate of the expected safety without the treatment would result in a reduction of 37.5 percent [= (24 − 15)/24]. The treatment benefit is 9 accidents (= 24 − 15) in 2 years, compared to the value of 6.30 obtained with the proper procedure, an overestimate of 30 percent, definitely not a trivial amount! Of course, it is necessary to also specify the uncertainty in the change in safety before placing too much confidence in it. For one treated intersection it is likely that the uncertainty is large. Therefore, the accumulated effects of a number of applications of the same treatment need to be evaluated in order to have sufficient confidence in the estimate of treatment effect. Methods for estimating the appropriate variances and for accumulating the effects over a number of treatments can be found in Hauer (1997).
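The two competing calculations in this paragraph can be laid out explicitly; the variable names below are ours and the numbers are those of the running example.

m = 10.65            # EB estimate of accidents/year without treatment (from above)
after_count = 15     # accidents observed in the after period
after_years = 2

B = m * after_years                      # expected accidents without treatment
delta = B - after_count                  # estimated change in safety
percent_reduction = 100 * delta / B
amf = after_count / B

print(f"B = {B:.2f}, reduction = {percent_reduction:.1f}%, AMF = {amf:.2f}")
# Compare with the naive estimate based on the raw before average of 12/year:
naive_B = 12 * after_years
print(f"naive reduction = {100 * (naive_B - after_count) / naive_B:.1f}%")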
Example Application of State of the Art Empirical Bayes Methodology for Before and After Studies

[Extracted and modified from Persaud et al. (2001)]. The intent is to provide a demonstration of the application of the EB methodology rather than a step-by-step example that analysts can follow in doing a before and after study. This demonstration should provide insights into the level of complexity and the data and analytical requirements for conducting a before and after study using state of the art methodology. In Maryland, five rural intersections were converted from stop control to roundabouts in the mid-1990s. Consider one such intersection, which was converted in 1994, for which the crash counts and AADTs on the approaches were as follows.

                                 Before Conversion   After Conversion
Months (years) of crash data         56 (4.67)           38 (3.17)
Count of total crashes                  34                  14
Major approaches AADT                10,654              11,956
Minor approaches AADT                 4,691               5,264
Crashes/year                           7.28                4.42

Estimating B: The Crashes That Would Have Occurred in the After Period Without the Conversion

The full project report describes how safety performance functions were assembled for the various types of intersections converted. This is no trivial task. For rural stop controlled intersections in Maryland, the safety performance function gives the estimate (P) of the number of total crashes/year during the before period as

P (crashes/year) = 0.000379 (major road AADT)^0.256 (minor road AADT)^0.831
                 = 0.000379 (10,654)^0.256 (4,691)^0.831 = 4.58

This is the expected number of crashes per year at similar rural intersections (in terms of traffic control and approach AADTs). Next, the expected annual number of crashes during the before period is estimated as

mb = (k + xb)/[(k/P) + n]

where xb is the count of crashes during the before period of length n years, and k = 4.0 is the over-dispersion parameter estimated for this safety performance function. Thus, the expected annual number of crashes during the before period at the specific intersection under consideration is

mb = (4.0 + 34)/[(4/4.58) + 4.67] = 6.860

To estimate B, the length of the after period and the differences in the AADTs between the before and after periods must be considered. This is accomplished by first multiplying the expected annual number of crashes in the before period by R, the ratio of the annual regression predictions for the after and before periods. (Note: The full methodology described in Chapter 12 of Hauer's book can be applied for cases for which the AADT is known for each year and for which a time trend measure such as the total number of accidents in each year

for the jurisdiction is known; this application is a simplification since no time trend measure is available, and there is only one AADT estimate for each of the before and after periods.) For this case, in the after period,

crashes/year = 0.000379 (11,956)^0.256 (5,264)^0.831 = 5.19

The ratio R of the after period to the before period regression predictions is

R = 5.19/4.58 = 1.133

which gives

ma = R × mb = 1.133 × 6.860 = 7.772 crashes/year

Finally, to estimate B, the number of crashes that would have occurred in the after period had the conversion not taken place, ma is multiplied by ya, the length of the after period in years. Thus

B = 7.772 × 3.17 = 24.61

Recall from the previous table that 14 crashes actually occurred. Following Hauer (1997), the variance of B is given by

Var(B) = (ya × R)^2 × Var(mb), where Var(mb) = mb/[(k/P) + n]

Estimation of Safety Effect

In the estimation of changes in crashes, the estimate of B is summed over all intersections in the converted group and compared with the count of crashes during the after period in that group (Hauer 1997). For the five conversions in Maryland, this table gives the estimates of B, the variance of these estimates, and the count of crashes in the after period; summed over the five sites, B = 105.19, Var(B) = 71.29, and the after period count is A = 44 crashes. The variance of B is summed over all conversions. The variance of the after period counts, A, assuming that these are Poisson distributed, is equal to the sum of the counts. There are two ways to estimate the safety effect, as shown here. For each, the estimation of the variance is illustrated.

Method 1: Reduction in Expected Number of Crashes (δ)

This is the difference between the sums of the Bs and As over all sites in a conversion group. Let B denote the sum of the Bs and A the sum of the As; then

δ = B − A

For the Maryland conversion data in the previous table,

δ = 105.19 − 44 = 61.19

The variance of δ is given by

Var(δ) = Var(B) + Var(A)

For the Maryland conversion data in the previous table,

Var(δ) = 71.29 + 44 = 115.29

The standard deviation of δ is 10.74. In a conventional statistical test, one would find that the reduction in crashes of 61.19 is statistically significant at the 5 percent level because it is larger than 1.96 standard deviations.

Method 2: Index of Effectiveness (θ)

A biased estimate of θ is given by the ratio of the after period count, A, to the expected crashes without conversion, B; thus θ = A/B. The percent change in crashes is in fact 100(1 − θ); thus, a value of θ = 0.7 indicates a 30 percent reduction in crashes. From Hauer (1997), an approximate unbiased estimate of θ is given by

θ = (A/B)/[1 + Var(B)/B^2]

For the Maryland conversion data in the previous table,

θ = (44/105.19)/[1 + 71.29/105.19^2] = 0.416

This translates into a reduction of 58.4 percent. The variance of θ is given by

Var(θ) ≈ θ^2 [Var(A)/A^2 + Var(B)/B^2]/[1 + Var(B)/B^2]^2

For the Maryland conversion data in the previous table, this gives a variance of approximately 0.005, or a standard deviation for θ of roughly 0.07.
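A compact sketch of the whole chain of calculations may help to see how the pieces fit together. The inputs are the values quoted above; the function and variable names are ours, and the single-site variance line uses the standard empirical Bayes expression rather than a formula reproduced from the source, so it should be read as an approximation of the published procedure.

import math

def spf(major_aadt, minor_aadt):
    """Maryland rural stop-controlled SPF quoted in the text."""
    return 0.000379 * major_aadt**0.256 * minor_aadt**0.831

# One converted intersection (values from the table above).
k, n_before, x_before, y_after = 4.0, 4.67, 34, 3.17

P_before = spf(10_654, 4_691)                      # about 4.58 crashes/year
m_b = (k + x_before) / (k / P_before + n_before)   # EB estimate, about 6.86
R = spf(11_956, 5_264) / P_before                  # about 1.133
B_site = y_after * R * m_b                         # about 24.6 expected after-period crashes
var_B_site = (y_after * R) ** 2 * m_b / (k / P_before + n_before)

print(f"P = {P_before:.2f}, mb = {m_b:.3f}, R = {R:.3f}, B = {B_site:.2f}")

# Group-level effect for all five conversions, using the totals quoted in the text.
B, var_B, A = 105.19, 71.29, 44
delta = B - A
var_delta = var_B + A                              # after counts assumed Poisson
theta = (A / B) / (1 + var_B / B**2)               # approximately unbiased index
var_theta = theta**2 * (1 / A + var_B / B**2) / (1 + var_B / B**2) ** 2

print(f"delta = {delta:.2f} (sd {math.sqrt(var_delta):.2f}), "
      f"theta = {theta:.3f} (sd {math.sqrt(var_theta):.3f})")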


FIGURE D2 Likelihood functions for safety effect of roundabout conversions.

These results can be presented and preserved in the form of likelihood functions, which are more appropriate for expressing uncertainty than the conventional confidence intervals. Figure D2 shows the likelihood functions for the Maryland conversions as well as for other conversions from stop control, which all happened to be in urban areas. First, consider the Maryland function. It is seen that the point estimate of θ = 0.416 is most likely and all other values are less likely. Values lower than 0.3 and larger than 0.65 are quite unlikely, because the relative likelihood is less than 5 percent. The function certainly indicates that a value of θ = 1, which indicates no safety effect, is almost totally unlikely, which is good news for those who decided to install the roundabout. In all of this there was no need to resort to conventional hypothesis testing, since we have answered the question: what is the safety effect and how accurately is it known? The likelihood function for the urban stops indicates that θ is likely larger than for rural conversions (the safety effect is smaller), in that the most likely value is now 0.47. When all stops, urban and rural, are placed in the same group, the information base becomes richer, as is evidenced

by the smaller spread in the combined likelihood function. However, the information is not as useful, because it appears that separate values of θ are applicable for urban and rural roundabout conversions from stop control.

Information on Safety Performance Functions

As discussed earlier, SPFs, also known as accident prediction models, form the backbone of the new methodologies for highway safety analysis. Thus, it is informative to document some of the more recent sources for these functions. The desideratum is for each agency to calibrate and maintain a set of functions for roads in the jurisdiction. However, the large amount of required data and human and financial resources presents a formidable obstacle. Research has shown that functions can be transferred across jurisdictions by making adjustments for differences in accident experience. The procedure for making these adjustments is found in Harwood et al. (1999). The tables at the end of this appendix (Tables D2–D4) document some of the latest sources for safety performance functions.
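Harwood et al. (1999) give the full transfer procedure; in its simplest and most widely used form the adjustment is a single calibration factor, the ratio of observed to SPF-predicted crashes on a sample of sites in the receiving jurisdiction. The sketch below shows only that simple form, with entirely hypothetical local data, and is not a substitute for the published procedure.

def spf_prediction(major_aadt, minor_aadt):
    """Borrowed SPF (here, the Maryland rural stop-controlled function from the text)."""
    return 0.000379 * major_aadt**0.256 * minor_aadt**0.831

# Hypothetical local calibration sample: (major AADT, minor AADT, years, observed crashes).
local_sites = [
    (9_500, 3_200, 5, 18),
    (12_000, 5_500, 5, 31),
    (7_800, 2_100, 5, 9),
]

predicted = sum(spf_prediction(f1, f2) * years for f1, f2, years, _ in local_sites)
observed = sum(obs for *_, obs in local_sites)
calibration_factor = observed / predicted

print(f"calibration factor = {calibration_factor:.2f}")
# Predictions for local sites would then be spf_prediction(...) * calibration_factor.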


APPENDIX E
Review of A Sample of Relevant Methodology from Non-Mainstream Types of Safety Analyses
Statistical methods used in several aspects of academictype safety research are quite distinct from statistical methods used in highway safety analysis traditionally conducted in state and local agencies. It is nevertheless useful for highway safety analysts to have an appreciation for these research-oriented methods. This is not only because it may be possible to apply some of these methods in the more research-type analysis that they perform, but also because they often need to review research results in order to acquire safety knowledge pertinent to their day-to-day activities. To provide this appreciation, this appendix presents summary capsules on some of the methods used in research that should be of interest to highway safety analysts in state and local jurisdictions. These are summaries and are not intended to be critical reviews. They include: Log-linear analysis Contingency table analysis Induced exposure/risk estimation Logit models Ordered probit models Logistic models Meta analysis Factor analysis Data imputation. tionally, contingency tables, which record the number of responses for each combination of variable values, are used. However, when the number of variables is greater than two, the process can be arduous. Four log-linear models to predict the frequency of a cell were developed with three variables and two-way interaction terms between these variables. Odds multipliers were computed from the fitted model to compare the response between age groups and collision-related variables. A log-linear model with three variables and two-way interactions is of the form:

where log mijk=log expected frequency of cell in which x =i, y=j, and z=k, and v=overall effect. All other terms are the effect of the level of each variable and the interaction between variables on the response variable. The four models developed were for (1) age, injury severity, and ADT; (2) age ADT, first harmful event (e.g. rearend); (3) age, roadway character (straight or curved), and speed ratio (speed/posted speed); (4) age, location, and alcohol involvement. The models confirmed the tendencies of different driver groups to experience different collision types. Among the conclusions were that older drivers tend to be involved in angle and turning collisions and that young and middle-aged drivers have a higher likelihood of being involved in collisions where alcohol is involved. Log-Linear Analysis Example 2: Abdelwahab, H.T. and A. Abdel-Aty Mohamed, Log-Linear Analysis of the Relationship Between Alcohol Involvement and Driver Characteristics in Traffic Crashes, 3rd Transportation Specialty Conference of the Canadian Society for Civil Engineering, London, Ontario, June 810, 2000. The study investigated the relationship between driver age, gender, and location of residence (local, in-state, out of state, and foreign) on the probability of alcohol being a factor in a traffic collision in the state of Florida, given that the collision has occurred. Collisions were defined as alcohol related if any one of the drivers involved had consumed alcohol prior to the collision. The study goal was to investigate the associations between the driver-related variables and to identify higher risk groups. Log-linear modeling, which allows the modeling of categorical data,

The websites documented in Appendix C provide more information on the methods as well as references to relevant textbooks. Log-Linear Analysis In log-linear analysis the log of the dependent variable is modeled as a linear function of the independent variables. In road safety, log-linear models have been used instead of contingency tables to identify groups of drivers or other conditions that increase accident risk or severity. The fitting of log-linear models to data allows the calculation of odds multipliers that express the increased or decreased risk associated with any change in a variable included in the model. Log-Linear Analysis Example 1: Abdel-Aty, M.A., C.L. Chen, and J.R. Schott, An Assessment of the Effect of Driver Age on Traffic Accident Involvement Using LogLinear Models, Accident Analysis and Prevention, Vol. 30, No. 6, pp. 851861. The authors investigate the effect of driver age on collision involvement given that a collision has occurred. Conven-

Log-Linear Analysis Example 2: Abdelwahab, H.T. and A. Abdel-Aty Mohamed, Log-Linear Analysis of the Relationship Between Alcohol Involvement and Driver Characteristics in Traffic Crashes, 3rd Transportation Specialty Conference of the Canadian Society for Civil Engineering, London, Ontario, June 8-10, 2000.

The study investigated the relationship between driver age, gender, and location of residence (local, in-state, out of state, and foreign) and the probability of alcohol being a factor in a traffic collision in the state of Florida, given that the collision has occurred. Collisions were defined as alcohol related if any one of the drivers involved had consumed alcohol prior to the collision. The study goal was to investigate the associations between the driver-related variables and to identify higher risk groups. Log-linear modeling, which allows the modeling of categorical data, was the adopted approach. Three models were developed. The first model used the variables gender (male or female), DUI (yes for alcohol involved, no otherwise), and traffic-way (straight or curved roadway). The second modeled DUI, traffic-way, and residency (local, in-state, out of state, or foreign). The third modeled DUI, traffic-way, and age (15-19, 20-24, 25-64, 65-79, 80+). Results indicate that a larger percentage of young (20-24) and middle-aged (25-64) drivers are involved in alcohol-related crashes. Males and in-state and local drivers are also predicted to be more likely to be involved in crashes that involve alcohol.

Log-Linear Analysis Example 3: Rosman, D.L., M.W. Knuiman, and G.A. Ryan, An Evaluation of Road Crash Injury Severity Measures, Accident Analysis and Prevention, Vol. 28, No. 2, pp. 163-170.

Log-linear models were developed to investigate measures of injury severity based on the Abbreviated Injury Scale (AIS) and length of stay in hospital. Independent variables used were the sex and age of the injured person and road-user type (car occupant, motorcyclist, bicyclist, or pedestrian). Dependent variables modeled were the number of injuries, number of serious injuries, injury severity score, length of stay in hospital, and AIS.

Contingency Table Analysis

Contingency tables display data classified by the levels of the discrete independent variables. For a two-variable table of variables X and Y, where X has r levels and Y has c levels, the contingency table has (r x c) cells. Each cell contains the observed number of accidents that occurred under the conditions that define the cell. Contingency tables display data in an easy-to-use format and allow significance tests, such as the chi-square test, to be readily applied.

Contingency Table Analysis Example: Dissanayake, S. and L.J. John, Use of Induced Exposure Method to Study the Highway Crash Involvement of Driver Groups Under Different Light Conditions, 3rd Transportation Specialty Conference of the Canadian Society for Civil Engineering, London, Ontario, June 8-10, 2000.

Chi-square statistics and contingency table analysis were used to explore the relationship between driver age (15-25, 25-65, 65+), light condition (daylight, dusk, dawn, darkness with street lights, and darkness without street lights), and crashes. For two random classification variables X and Y (in this case age and light condition), where X has r levels and Y has c levels, the contingency table has (r x c) cells, and each cell contains the number of observations for that combination of X and Y. In contingency table analysis, the observed number of crashes in each cell is compared to the expected number of crashes obtained by assuming a null hypothesis. The chi-square goodness-of-fit test statistic determines whether the observed and expected values differ significantly; if they do, the null hypothesis used in obtaining the expected values is judged to be false. Based on the critical chi-square value, a decision is made regarding the acceptance or rejection of the null hypothesis. Three types of analyses were undertaken. The first tested whether driver age and light condition were independent when considering crash involvement. The second tested whether driver age and light condition were independent when considering driving exposure. The third tested whether crash frequencies were proportional to driving exposure. To test whether random variables X and Y are dependent on each other, the null hypothesis is that they are independent. Under this hypothesis, the expected frequency for a cell is found by multiplying the total number of observations by the probabilities that X = x and Y = y, estimated from the marginal distributions of the row and column variables, respectively. To consider driving exposure, an induced exposure measure was used: only the not-at-fault drivers were considered, in order to represent the amount of travel by each group. Considering at-fault drivers as well would introduce biases if some driver groups were over-represented in the crash data. The analysis determined that crash involvement depends on driver age and light condition. It was also found that driving exposure depends on driver age and light condition. Finally, the analysis showed that crash involvement of drivers did not occur in proportion to their exposure by light condition, and that certain driver groups are over-represented in crash involvement under certain light conditions. Over-representation of a driver group was measured as the percentage of at-fault drivers divided by the percentage of not-at-fault drivers; a value over 1.0 indicates over-representation of that age group in crashes. The study came to the following conclusions: As lighting conditions worsen, the involvement of older drivers in crashes increases. Under all light conditions, older drivers are more likely to be involved in crashes. Younger driver crash involvement under dark, street-lighted conditions was lower than under all other conditions. Middle-aged drivers are under-represented except in dark conditions with no street lights. For all light conditions, middle-aged drivers had a lower crash involvement than young and older drivers. Older drivers had a higher crash involvement than younger drivers except under the dawn and dark-without-street-light conditions.
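As a rough illustration of the mechanics described in this example, the sketch below applies scipy's chi-square test of independence to a hypothetical age-by-light-condition crash table and computes the at-fault/not-at-fault over-representation ratio; all counts are invented.

import numpy as np
from scipy.stats import chi2_contingency

# Rows: age groups (15-25, 25-65, 65+); columns: five light conditions.
observed = np.array([[300, 40, 35, 120, 60],
                     [500, 50, 45, 150, 70],
                     [150, 20, 25, 30, 15]])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.1f}, dof = {dof}, p = {p_value:.4f}")

# Over-representation measure used in the study: percent of at-fault drivers
# divided by percent of not-at-fault drivers in each group (> 1.0 indicates
# over-representation). Counts are hypothetical.
at_fault = np.array([120.0, 150.0, 60.0])
not_at_fault = np.array([150.0, 220.0, 40.0])
ratio = (at_fault / at_fault.sum()) / (not_at_fault / not_at_fault.sum())
print("over-representation by age group:", ratio.round(2))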

Induced Exposure/Risk Estimation

Induced exposure is a method of estimating the risk of driver groups or other accident conditions, such as traffic control, using the relative involvement of each group/condition in observed accidents compared with the relative proportion of that group/condition in the entire population.

Induced Exposure/Risk Estimation Example 1: DeYoung, D.J., R.C. Peck, and C.J. Helander, Estimating the Exposure and Fatal Crash Rates of Suspended/Revoked and Unlicensed Drivers in California, Accident Analysis and Prevention, Vol. 29, No. 1, pp. 17-24, 1997.

Quasi-induced exposure was used to determine whether unlicensed drivers or drivers with suspended or revoked licenses were over-involved in accidents. Data from the National Highway Traffic Safety Administration's Fatal Accident Reporting System were used to calculate accident and exposure rates for drivers in California. Only multivehicle accidents in which just one driver was assigned fault were used, to comply with the methodology. Involvement ratios (IRs) were calculated for each group by dividing the percentage of the group in the at-fault drivers subset by the percentage of the group in the innocent drivers subset:

IR = percent of drivers in the at-fault group / percent of drivers in the innocent group

An IR greater than 1 indicates over-involvement and an IR less than 1 indicates under-involvement. The results indicated that suspended/revoked drivers are over-represented in fatal crashes by a factor of 2 to 5 compared with fully licensed drivers. Potential sources of bias in the methodology cited by the authors included: the proportion of unlicensed drivers was 33 percent higher in the entire dataset than in the subset of accidents where only one driver was deemed at fault; drivers who die in a crash are less likely to be assigned fault, so if any of the groups is over- or under-represented among fatalities, the rates will be affected; and, if innocent drivers have characteristics making them more likely to be involved in a collision, the exposure of those groups will be overestimated.

Induced Exposure/Risk Estimation Example 2: Vitetta, B.A. and M.A. Abdel-Aty, Using Induced Exposure to Investigate the Effect of Driver Factors in Traffic Safety, 3rd Transportation Specialty Conference of the Canadian Society for Civil Engineering, London, Ontario, June 8-10, 2000.

Quasi-induced exposure methods were used to identify high-risk driver groups using a relative crash involvement ratio. Quasi-induced exposure is based solely on crashes in which fault has been assigned to only one of the drivers involved; for a single-vehicle crash, that driver is always at fault. The use of quasi-induced exposure assumes that the distribution of nonresponsible drivers in the population of two-vehicle crashes closely matches the distribution of the entire population of drivers. It also assumes that the driver type of the nonresponsible driver is independent of the driver type of the responsible driver. Crash propensity was examined for different age groups (15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+) versus vehicle type, road type, divided versus undivided roadways, straight versus curved roadways, and various other conditions. Three statistics were calculated. The relative crash involvement ratio is

RCIR = V / TVN

where RCIR = relative crash involvement ratio, or crash risk; V = fraction of the same class of drivers or vehicles; and TVN = fraction of the same class of not-responsible drivers or vehicles in a two-vehicle crash, or relative exposure.

The relative propensity is

RCIR_r = RCIR_s / RCIR_t

where RCIR_r = relative crash involvement ratio for relative propensity, RCIR_s = relative crash involvement ratio for single-vehicle crashes, and RCIR_t = relative crash involvement ratio for two-vehicle crashes.
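A minimal sketch of the involvement-ratio arithmetic used in these two examples is shown below, using hypothetical counts of at-fault and not-at-fault drivers from two-vehicle crashes.

import pandas as pd

counts = pd.DataFrame({
    "group": ["unlicensed", "suspended_revoked", "fully_licensed"],
    "at_fault": [180, 260, 4200],
    "not_at_fault": [70, 110, 4800]})

# IR (or RCIR) = share of the group among at-fault drivers divided by its share
# among not-at-fault (innocent) drivers; values > 1 indicate over-involvement.
counts["IR"] = ((counts["at_fault"] / counts["at_fault"].sum()) /
                (counts["not_at_fault"] / counts["not_at_fault"].sum()))
print(counts)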

Induced Exposure/Risk Estimation Example 3: Stewart, D.E., Statistical Analytical Methodology for Measuring, Comparing and Interpreting Road Travel Risks: The Relationships Between Data Input Requirements and Analytical Frameworks, Proceedings of the Canadian Multidisciplinary Road Safety Conference X, Toronto, Canada, June 8-11, 1997.

Stewart detailed a risk estimation methodology based on accident rates and statistical principles. The methodology defines how to estimate the relative risk of population groups under specific conditions and how to measure the statistical accuracy of these estimates. One risk estimation measure, named the Proportion Risk, Rp(I|TGi,TCi,Ti), is the risk measure for target group TGi of experiencing an accident of type I, given the travel conditions TCi, during a time period Ti. It is defined in terms of p(I|TGi,TCi,Ti), the proportional representation of the target group in accidents of type I in the given conditions and time period, and p(E|TGi,TCi,Ti), the proportional representation of the target group's road travel in the given conditions and time period.

A value of less than one indicates that the target group is potentially a low-risk group for the specific accident type under the given conditions and time period. A value of more than one indicates that the target group is potentially a high-risk group for the specific accident type under the given conditions and time period. A value of 0 indicates that the group is potentially neither a high- nor a low-risk group. The accuracy of the risk estimation is evaluated in terms of 95 percent confidence limits. The equations to calculate these upper and lower bounds are not provided here, but may be found in the paper. When evaluating the relative risk, the upper and lower confidence limits must be considered: if 0 is within the upper and lower bounds of the risk estimate, then the estimate is not statistically significant in indicating the target group as being high or low risk. Other measures of risk included those estimated using accident frequency and those comparing the relative risk between target driver groups and/or road conditions.
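Because the paper's defining equations and confidence-limit formulas are not reproduced here, the following sketch is only a loose illustration of the underlying idea: it contrasts a target group's share of accidents with its share of travel and uses a simple bootstrap, rather than Stewart's formulas, to attach an interval to the comparison. All counts are hypothetical.

import numpy as np

rng = np.random.default_rng(1)
group_accidents, total_accidents = 240, 1800   # accidents of type I
group_travel, total_travel = 900, 10000        # e.g., surveyed trips

# Parametric bootstrap of the ratio (share of accidents) / (share of travel).
ratios = []
for _ in range(5000):
    acc = rng.binomial(total_accidents, group_accidents / total_accidents)
    trav = rng.binomial(total_travel, group_travel / total_travel)
    ratios.append((acc / total_accidents) / (trav / total_travel))

point = (group_accidents / total_accidents) / (group_travel / total_travel)
lo, hi = np.percentile(ratios, [2.5, 97.5])
print(f"accident share / travel share = {point:.2f} (95% interval {lo:.2f}-{hi:.2f})")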

Logit Models

Logit models determine which factors significantly affect the outcome of an event and can be used to predict the likelihood of each possible outcome of an event given the characteristics of the independent variables included in the model. Logit models are applicable when outcomes are not continuous, e.g., injury scales or high- versus low-risk groups.

Logit Models Example 1: Mannering, F.L. and L.L. Grodsky, Statistical Analysis of Motorcyclists' Perceived Accident Risk, Accident Analysis and Prevention, Vol. 27, No. 1, pp. 21-31.

A multinomial logit model was developed to determine what factors significantly influence motorcyclists' estimates of their likelihood of becoming involved in an accident if they continue to ride for 10 more years. A questionnaire was used to collect data characterized by four categories: rider characteristics (e.g., age), exposure (e.g., miles driven per year), experience (e.g., years of having a motorcycle license), and behavioral attributes (e.g., a stated preference for consistently exceeding the speed limit). The questionnaire responses on the riders' estimates of their likelihood of being involved in a crash in the next 10 years were grouped into low (0-20 percent), medium (30-70 percent), and high (80-100 percent). Logit models predict the likelihood of a response given the characteristics represented in the model. The logit models took the form

P_ni = exp(U_ni) / sum over I of exp(U_nI)

where P_ni = probability that rider n would categorize themselves as having a low, medium, or high risk (category i) of being in an accident in the next 10 years, U_ni = linear function of the variables that determine the probability of rider n considering themselves in the low-, medium-, or high-risk group, and the summation is over the three risk categories. The parameters of the linear functions were estimated using a maximum likelihood procedure. Among the study findings were that age, gender, and experience are significant determinants of the estimate of self-risk, and that riders were generally aware of their relative crash risks.
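A minimal sketch of a standard multinomial logit of this kind is shown below. It uses simulated rider data and statsmodels' MNLogit class; the variable names are invented and the fitted coefficients are meaningless, but the structure (a linear function U for each outcome, estimated by maximum likelihood) mirrors the description above.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "miles_per_year": rng.integers(500, 15000, n),
    "years_licensed": rng.integers(0, 40, n)})
# Simulated response: 0 = low, 1 = medium, 2 = high perceived risk.
df["risk_class"] = rng.integers(0, 3, n)

X = sm.add_constant(df[["age", "miles_per_year", "years_licensed"]])
mnl = sm.MNLogit(df["risk_class"], X).fit(disp=False)
print(mnl.summary())
# Predicted category probabilities for each rider (low, medium, high).
print(mnl.predict(X)[:5])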

Logit Models Example 2: Chang, L.-Y. and F. Mannering, Analysis of Vehicle Occupancy and the Severity of Truck- and Non-Truck-Involved Accidents, Transportation Research Record 1635, Transportation Research Board, National Research Council, Washington, D.C., January 1999, pp. 93-104.

A nested-logit model was developed to explore the relationship between vehicle occupancy and accident severity and the severity differences between truck-involved and non-truck-involved crashes. Separate models were calibrated for truck-involved and non-truck-involved crashes. The nested logit was used to model vehicle occupancy and severity concurrently, assuming that the number of vehicle occupants affects crash severity (with more occupants, the likelihood of a more severe outcome increases). Results showed that increased severity was more likely for truck-involved crashes, high speed limits, crashes occurring when a vehicle is making a right or left turn, and rear-end types of collisions.

Logit Models Example 3: Shankar, V., F. Mannering, and W. Barfield, Statistical Analysis of Accident Severity on Rural Freeways, Accident Analysis and Prevention, Vol. 28, No. 3, pp. 391-401.

A nested-logit model was developed for predicting accident severity, given that an accident has occurred, based on various collision and geometric variables. Severity was classified as one of property damage only (PDO), possible injury, evident injury, or disabling injury or fatality. A nested-logit model differs from a logit model in that there are two levels to the model, which allows categories with shared characteristics to be modeled on a second level. The authors found that a nested-logit model that treated PDO and possible injury accidents as having shared characteristics fit the data best.

Ordered Probit Models

Ordered probit models are similar to logit models. They determine which factors significantly affect the outcome of an event and can be used to predict the likelihood of each possible outcome, but they differ in that they allow for unequal differences between the ordinal categories of the dependent variable (e.g., they do not assume that the difference between no injury and minor injury is the same as the difference between a severe injury and a fatality for a unit change in an explanatory variable).

Ordered Probit Models Example: Duncan, C.S., A.J. Khattak, and F.M. Council, Applying the Ordered Probit Model to Injury Severity in Truck-Passenger Car Rear-End Collisions, Transportation Research Record 1635, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 63-71.

Ordered probit modeling was used to examine the occupant characteristics and roadway and environmental conditions that influence injury severity in rear-end collisions between trucks and passenger cars. The model is of the form

y* = Bx + e

where y* is a latent variable underlying the dependent variable injury severity (coded as 0, 1, 2, 3, or 4), B is a vector of estimated parameters, x is the vector of explanatory variables, and e is a random error term. Given a crash, an individual falls into severity category n if y* lies between the estimated thresholds for categories n-1 and n; the probability of each category is then obtained from the cumulative distribution of y*. Two models were developed, one with the basic variables and the other including interactions among the independent variables. Results revealed that an increased severity risk exists for higher speed crashes, those occurring at night, for women, when alcohol is involved, and for crashes in which a passenger car rear-ends a truck at a large speed differential between the two vehicles.
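The following sketch fits an ordered probit to simulated severity data. It assumes statsmodels' OrderedModel class (available in recent versions); the covariates, coefficients, and thresholds are invented for illustration and do not come from the study above.

import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 800
X = pd.DataFrame({
    "speed_limit": rng.integers(25, 70, n),
    "darkness": rng.integers(0, 2, n),
    "wet_road": rng.integers(0, 2, n)})
# Latent propensity y* = X*beta + error; observed severity from thresholds.
latent = 0.03 * X["speed_limit"] + 0.4 * X["darkness"] + rng.normal(size=n)
severity = pd.cut(latent, bins=[-np.inf, 1.0, 1.8, 2.4, 3.0, np.inf],
                  labels=False)  # integer severity codes 0 (none) to 4 (fatal)

# No constant is added: the estimated thresholds play that role.
model = OrderedModel(severity, X, distr="probit").fit(method="bfgs", disp=False)
print(model.summary())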

Logistic Regression

Logistic regression models identify factors that affect the likelihood of an outcome, e.g., a crash resulting in a fatality, and can be used to predict the outcome of an event. Logistic regression models are applicable to dichotomous data.

Logistic Regression Example 1: Li, L., K. Kim, and L. Nitz, Predictors of Safety Belt Use Among Crash-Involved Drivers and Front Seat Passengers: Adjusting for Over-Reporting, Accident Analysis and Prevention, Vol. 31, No. 6, 1999, pp. 631-638.

The study examined the relationship between the personal characteristics of crash-involved motor vehicle occupants, driving circumstances, and the use of safety belts, including the association between alcohol and safety belt use. It also examined whether the results for crash-involved motorists match those found in roadside observational studies of safety belt use. A logistic regression model was developed to predict the likelihood of safety belt use given the personal and crash characteristics. A logistic model is of the form

log[p / (1 - p)] = b0 + b1x1 + ... + bkxk

where p is the probability of the outcome (here, safety belt use) and x1, ..., xk are the explanatory variables. The model form was then extended to allow for the over-reporting of safety belt use to police. Two models were developed. The first considered all drivers and front-seat passengers of motor vehicles. The second considered only drivers and front-seat passengers who sustained at least a nonincapacitating injury. Results indicate that older drivers and rainy weather increase the likelihood of safety belts being worn. Males, the involvement of alcohol, weekends, and nighttime decrease the likelihood of safety belts being used. The authors also conclude that the association of alcohol with the non-use of safety belts is underestimated without accounting for the over-reporting of seatbelt use.
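A minimal sketch of a binary logistic regression of this general kind, on simulated data with invented variable names, is shown below; exponentiating the coefficients gives the odds ratios that such studies typically report.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 1000
df = pd.DataFrame({
    "age": rng.integers(16, 90, n),
    "male": rng.integers(0, 2, n),
    "alcohol": rng.integers(0, 2, n),
    "nighttime": rng.integers(0, 2, n)})
# Simulated belt use with made-up coefficients.
logit_p = -0.5 + 0.02 * df["age"] - 0.4 * df["male"] - 0.9 * df["alcohol"]
df["belt_used"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

fit = smf.logit("belt_used ~ age + male + alcohol + nighttime", data=df).fit(disp=False)
print(fit.summary())
# Odds ratios: multiplicative change in the odds of belt use per unit change.
print(np.exp(fit.params))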

Logistic Regression Example 2: Kim, K., S. Kim, and E. Yamashita, Alcohol-Impaired Motorcycle Crashes in Hawaii, 1986 to 1995: An Analysis, Transportation Research Record 1734, Transportation Research Board, National Research Council, Washington, D.C., January 2000, pp. 77-85.

A logistic regression model was developed to explain the likelihood of alcohol impairment among crash-involved motorcycle riders in police-reported motorcycle crashes. The logistic model was of the form

log[Pr(I) / (1 - Pr(I))] = b0 + b1*A + b2*W + b3*N + b4*O

where Pr(I) = the probability of impairment, A = age, W = weekend, N = nighttime, and O = nonresident status. A likelihood ratio was used to assess the model fit by testing the null hypothesis that the covariates have no effect on the response variable. The ratio is calculated by subtracting the log-likelihood value of the full model from the log-likelihood value of a model with only the intercept term. Results indicated that impairment was more likely for middle-aged riders and for unlicensed riders who did not wear a helmet, and that impairment-related crashes are more likely to occur at night, on weekends, and in rural areas.

Logistic Regression Example 3: Krull, K.A., A.J. Khattak, and F.M. Council, Injury Effects of Rollovers and Events Sequence in Single-Vehicle Crashes, Transportation Research Record 1717, Transportation Research Board, National Research Council, Washington, D.C., January 2000, pp. 46-54.

Logistic regression models were developed to investigate the driver, roadway, and crash characteristics that influence the likelihood of fatal or incapacitating injuries given that a single-vehicle run-off-the-road crash has occurred. Two models were developed: one for all crashes of this type and a second for the subset in which the vehicle rolled over. Among the study findings were that the use of safety belts and slick roadways lead to a reduced likelihood of fatal or incapacitating injury. An increased likelihood exists when the vehicle rolls over after leaving the road, in particular for "hit point object, then rolled" and "hit longitudinal object, then rolled" crashes.

Logistic Regression Example 4: Donelson, A.C., K. Ramachandran, K. Zhao, and A. Kalinowski, Rates of Occupant Deaths in Vehicle Rollover: The Importance of Fatality Risk Factors, Transportation Research Record 1665, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 109-117.

Multivariate logistic regression models were developed to explore the effect of various factors on the likelihood of fatalities in single-vehicle rollover crashes involving light-duty trucks. The study objectives were to quantify the effect of fatality risk factors, adjust fatality-based rates for that influence, and assess how well the adjusted rates measured differences among various groupings of vehicles. The models were calibrated using crashes of all severities. Counts and rates were then adjusted for the effect of the higher risk of those conditions that resulted in a fatality by multiplying the observed number of fatalities by the ratio of the sum of probabilities of fatality at a base condition (i.e., each variable set to its safest level) to the sum of probabilities of fatality under the observed conditions.

Logistic Regression Example 5: McGinnis, R.G., L.M. Wissinger, R.T. Kelly, and C.O. Acuna, Estimating the Influences of Driver, Highway, and Environmental Factors on Run-off-Road Crashes Using Logistic Regression, Presented at the 78th Annual Meeting of the Transportation Research Board, Washington, D.C., January 1999.

Logistic regression was used to identify factors that differentiate run-off-road crashes from non-run-off-road crashes. Fifty-six separate models were created, one for each combination of seven age groups, two gender groups, and four highway classification groups. The authors found that stratifying the data by age and gender revealed differences in the effect of variables on drivers depending on their age and gender. Variables associated with an increased likelihood of a run-off-road crash included a nonintersection location, presence of a horizontal curve, rural highway, alcohol involvement, slippery pavement, no street lighting, and a high speed limit. Impacts of the interactions between variables on the likelihood of a run-off-road crash were not found to be statistically significant.

Logistic Regression Example 6: Lin, T.-D., P.P. Jovanis, and C.-Z. Yang, Modeling the Safety of Truck Driver Service Hours Using Time-Dependent Logistic Regression, Transportation Research Record 1407, Transportation Research Board, National Research Council, Washington, D.C., 1993, pp. 1-10.

The study developed a time-dependent logistic regression model of the crash risk of truck drivers, including both multiday factors (i.e., hours on and off duty) and continuous driving time. A cluster analysis was used to define 10 time-based driving patterns; for example, some patterns showed irregular driving during the first few days and then regular hours afterwards. Interaction terms describing time-dependent effects with the independent variables, for example the interaction between driving pattern and driving hour, were also considered. Results showed that driving time had the most influence on accident risk. Driver age and off-duty hours had small effects, except that drivers with less than 9 hours of off-duty time had a higher risk than drivers with longer rest periods. Drivers with at least 10 years of driving experience had a smaller accident risk. Drivers with infrequent driving patterns and a tendency toward night driving showed a higher accident risk.

Meta-Analysis Using Log-Odds

Meta-analysis is a statistical technique that combines the independent estimates of safety effectiveness from separate studies into one estimate by weighting each individual estimate according to its variance.

Meta-Analysis Example 1: Elvik, R., The Safety Value of Guardrails and Crash Cushions: A Meta-Analysis of Evidence From Evaluation Studies, Accident Analysis and Prevention, Vol. 27, No. 4, 1995, pp. 523-549.

A meta-analysis technique was applied to 32 studies to evaluate the safety effects of median barriers, guardrails, and crash cushions. In this study, the log-odds meta-analysis method was used. The effect of the three safety devices on accident rate was estimated by an odds ratio defined as

odds ratio = (ACC_G / VKT_G) / (ACC_W / VKT_W)

where ACC = total number of accidents; VKT = vehicle kilometers of travel; the subscript G indicates the presence of a median barrier, guardrail, or crash cushion; and the subscript W indicates the absence of a median barrier, guardrail, or crash cushion. The statistical weight for each study result is calculated from the accident counts entering the odds ratio. The estimated mean effect on accidents across all studies is calculated using the log-odds method as

mean effect = exp[ sum of wi ln(Ei) / sum of wi ]

where Ei = the estimated effect of study i, and wi = the statistical weight assigned to study i. Also analyzed were the effects on accident severity, defined as the change in the probability of a fatal or injury accident given that an accident occurred; different odds ratio and statistical weight formulae apply to these effects.

The study also tested for publication bias in the studies used. Publication bias occurs when research results are not published because the results are counterintuitive, e.g., an increase or no effect on accidents when a decrease was expected. Publication bias was investigated using a graphical method called the funnel graph method. Each study result is plotted on a graph in which the horizontal axis shows the result and the vertical axis shows the statistical weight assigned. If there is no publication bias, the scatterplot of results should resemble an upside-down funnel: as sample size increases, the dispersion of estimates should converge, because larger samples give more accurate results. If the tails of the funnel are not symmetrical, publication bias may exist. Publication bias was not found to be an issue in this study. Results indicated that median barriers increase accident rate by 30 percent but reduce fatalities by 20 percent and injuries by 10 percent in the event of an accident. Guardrails were estimated to reduce accident rate by 27 percent, fatalities by 44 percent, and injuries by 52 percent in the event of an accident. Crash cushions were estimated to reduce accident rate by 84 percent, fatalities by 69 percent, and injuries by 68 percent in the event of an accident.

Meta-Analysis Example 2: Elvik, R., A Meta-Analysis of Studies Concerning the Safety Effects of Daytime Running Lights on Cars, Accident Analysis and Prevention, Vol. 28, No. 6, 1996, pp. 685-694.

The log-odds meta-analysis method was used to evaluate the safety effectiveness of daytime running lights using 17 separate studies. The use of daytime running lights was estimated to result in a 10-15 percent reduction in the number of multivehicle daytime accidents.
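The log-odds pooling step can be sketched as follows. The study effects and weights below are hypothetical, and Elvik's exact weight formulas are not reproduced; the sketch simply takes a weighted mean of the log effects and transforms back, treating the weights as inverse variances for an approximate confidence interval.

import numpy as np

# Hypothetical study results: estimated effect (ratio scale, 1.0 = no effect)
# and the statistical weight assigned to each study (larger = more precise).
effects = np.array([0.85, 0.70, 1.10, 0.90, 0.78])
weights = np.array([120.0, 45.0, 30.0, 200.0, 80.0])

# Weighted mean on the log scale, then transformed back to the ratio scale.
log_mean = np.sum(weights * np.log(effects)) / np.sum(weights)
pooled = np.exp(log_mean)
# Approximate standard error of the pooled log effect under inverse-variance
# weighting (weights taken as 1/variance).
se = np.sqrt(1.0 / np.sum(weights))
lo, hi = np.exp(log_mean - 1.96 * se), np.exp(log_mean + 1.96 * se)
print(f"pooled effect = {pooled:.2f} (95% CI {lo:.2f}-{hi:.2f})")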

Meta-Analysis Example 3: Elvik, R., The Effects on Accidents of Studded Tires and Laws Banning Their Use: A Meta-Analysis of Evaluation Studies, Accident Analysis and Prevention, Vol. 31, No. 1, 1999, pp. 125-134.

The log-odds meta-analysis method was used to evaluate the effects on accidents of studded tires and of laws banning their use. Studies with a strong methodological approach were used to estimate an accident reduction of 5 percent on snow- and ice-covered roads, a 2 percent reduction on dry roads, and a reduction of 4 percent for all road conditions combined; however, these results were not statistically significant.

Factor Analysis

The main aim of factor analysis is to combine independent variables with a common background into a new variable and to understand the extent to which these new variables, or factors, explain the variance in the dependent variable. Frequently, factor analysis is applied to data collected through survey methods, where a large number of question responses are likely to be correlated.

Factor Analysis Example 1: Chliaoutakis, J.El., et al., The Impact of Young Drivers' Lifestyle on Their Road Traffic Accident Risk in Greater Athens Area, Accident Analysis and Prevention, Vol. 31, No. 6, 1999, pp. 771-780.

The relationship between the lifestyle of young drivers and their accident risk was examined through factor analysis and logistic regression. Data on 146 males and 95 females were collected through a questionnaire containing 116 variables, divided into those concerning socio-demographic characteristics, attitudes and behaviors while driving, lifestyle and driving style, and accidents experienced. Through factor analysis, the 74 variables related to lifestyle were combined into 10 factors that explained 44.1 percent of the variance in accident experience. The 10 factors were given the following names, which reflect the variables included under each factor: culture, sport activity, elegance, car addiction, alcohol and drugs, interest in public affairs, amusement, aggressive behavior, religiousness, and car as a hobby. To illustrate how factors are made up of variables, the factor "aggressive behavior" was composed of variables involving a definite risk to other road users. The highest loadings (or relative weights) for this factor were punishing other people for various reasons (0.66), illegal overtaking (0.64), running red lights (0.59), bullying (0.54), and making indecent gestures or swearing at other drivers (0.52). The study concluded that there is a relationship between lifestyle and accident risk: some lifestyle aspects, such as alcohol consumption, are related to a higher accident risk, and others, such as religiousness, are related to a lower accident risk.
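A minimal sketch of an exploratory factor analysis on questionnaire-style data is shown below, using scikit-learn's FactorAnalysis on simulated responses. Because the data are random, the extracted factors and loadings are meaningless, but the output has the same form as the loadings quoted above.

import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n_respondents, n_items = 240, 12
# Simulated 11-point questionnaire responses (0-10) for 12 items.
responses = pd.DataFrame(rng.integers(0, 11, size=(n_respondents, n_items)),
                         columns=[f"item_{i+1}" for i in range(n_items)])

fa = FactorAnalysis(n_components=3, random_state=0).fit(responses)
# Loadings: the weight of each item on each extracted factor; items with high
# loadings on the same factor are combined into one new variable.
loadings = pd.DataFrame(fa.components_.T, index=responses.columns,
                        columns=["factor_1", "factor_2", "factor_3"])
print(loadings.round(2))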

Factor Analysis Example 2: Kanellaidis, G., J. Golias, and K. Zarifopoulos, A Survey of Drivers' Attitudes Toward Speed Limit Violations, Journal of Safety Research, Vol. 26, No. 1, 1995, pp. 31-40.

Compliance with speed limits on urban and interurban roads was analyzed in relation to drivers' views on the relationship between speeding and the risk of accidents. The dominant factors relating to speeding were identified using factor analysis. Data were collected by means of a questionnaire administered to 207 drivers. The information collected included 10 possible reasons why drivers themselves may speed and reasons they believe other drivers may speed. Possible reasons for speeding were:

1. they do not pay attention to the speed limit signs,
2. they do not consider the speed limit signs as reliable,
3. they do not agree with speed limits,
4. they are in a hurry,
5. they want to keep up with other traffic,
6. absence of traffic police,
7. they are emotionally upset,
8. they want to show off to other drivers,
9. they overestimate their driving abilities, and
10. they underestimate the driving risk at high speeds.

Each possible reason was to be ranked by the respondents on an 11-point scale. Three factors were constructed from the 10 possible reasons for speeding; together they explained 63 percent of the total variance, both for the reasons concerning the drivers themselves and for those concerning other drivers. The first factor is interpreted as accounting for reasons related to egocentric behavior of the drivers. Factor 2 accounts for reasons attributable to external influences that are not permanent. Factor 3 relates to the notion of speed limits (application of limits, reliability of signs, etc.).

Data Imputation

Data imputation methods fill in missing information in a dataset: knowledge of the cases with non-missing values is used to predict the likely values of the missing variables.

Data Imputation Example: Rubin, D.B., J.L. Schafer, and R. Subramanian, Multiple Imputation of Missing Blood Alcohol Concentration (BAC) Values in FARS, Report DOT HS 808 816, National Highway Traffic Safety Administration, Washington, D.C., 1998.

This report describes a methodology for imputing blood alcohol concentration (BAC) levels for missing values in the Fatality Analysis Reporting System. The approach taken was to simulate 10 specific values of BAC for each missing value. From these imputed data, valid statistical inferences, including variances, confidence limits, and tests, could be drawn. The authors point out the advantage of this approach over estimating the probability that the involved person falls within a predefined category of BAC (e.g., from BAC = 0.05 to 0.1): estimating specific values also allows analysis of nonstandard boundaries of alcohol involvement and may provide more accurate estimates of variance, which is greater because of the missing data. The approach modeled BAC as two variables, using characteristics of the accident and of the involved person as covariates. The first stage modeled whether the involved person's BAC is equal to 0 or greater than 0, using a conventional log-linear model for cross-classified categorical data. If BAC was estimated as 0, the continuous BAC variable was regarded as missing; if BAC was not equal to 0, a specific level of BAC was estimated in the second stage, which used conventional linear regression. In the modeling procedure, it was not assumed that the same covariates that predict the probability of a non-zero BAC also predict the level of BAC. Prior to modeling, the population was separated so that models could be calibrated by vehicle class. The covariates included police-reported drinking, age, gender, use of restraint, injury severity, license status (no valid license, valid license), previous incidents (none, one incident, two or more incidents), day of week, time of day, vehicle role (single vehicle, multiple vehicle striking, multiple vehicle struck), and relation to roadway (on roadway, not on roadway). Once the covariates for the first- and second-stage models were chosen, multiple imputations of the missing data were created using a general location model (GLOM), which is a combination of the first-stage log-linear model and the second-stage linear regression procedure. The missing data were imputed using simulated values of the parameters, seeded with 10 different random numbers. The authors concluded that the multiple imputation procedure was an improvement over the previous method based on a three-class linear discriminant model. A simplified sketch of this two-stage approach appears at the end of this appendix.

Other Models/Analyses of Potential Relevance

Gebers, M.A., Exploratory Multivariable Analyses of California Driver Record Accident Rates, Transportation Research Record 1635, Transportation Research Board, National Research Council, Washington, D.C., 1999, pp. 72-80.

Regression models were developed to predict accident experience among California drivers using driver record information. Two types of models were developed: the first used frequency data (i.e., the number of crashes, 0, 1, 2, etc., a driver experienced in the time period) and the second used categorical data (i.e., whether or not a driver experienced a crash in the time period). For frequency data, ordinary least-squares, weighted least-squares, Poisson, and negative binomial models were calibrated. Weights for the weighted least-squares regression were determined by first using ordinary least-squares regression and then dividing the sample data into quartiles based on the predicted values; the individual weights were calculated as the standard deviation of each quartile. For categorical data, linear probability and logistic regression models were calibrated. Each model included the variables of prior total citations, prior total accidents, license class, age, gender, medical condition on record, and having a license restriction on record. The author concluded that all modeling techniques performed similarly when predicting future crashes.
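The two-stage imputation idea described under Data Imputation above can be sketched as follows. This is not the report's GLOM procedure: for simplicity the first stage uses a logistic rather than a log-linear model, the data and covariates are simulated, and only a single imputation is drawn (the report drew 10, seeded with different random numbers).

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 2000
df = pd.DataFrame({
    "police_reported_drinking": rng.integers(0, 2, n),
    "nighttime": rng.integers(0, 2, n),
    "age": rng.integers(16, 90, n)})
# Simulated "true" BAC: zero for most cases, positive for a drinking-related subset.
p_pos = 1 / (1 + np.exp(-(-2.0 + 2.5 * df["police_reported_drinking"] + 0.8 * df["nighttime"])))
positive = rng.binomial(1, p_pos)
df["bac"] = np.where(positive == 1, np.round(rng.normal(0.12, 0.05, n).clip(0.01), 3), 0.0)
missing = rng.random(n) < 0.3          # 30 percent of BAC values set to missing
df.loc[missing, "bac"] = np.nan

X = sm.add_constant(df[["police_reported_drinking", "nighttime", "age"]])
obs = df["bac"].notna()

# Stage 1: probability that BAC is positive, fit on the observed cases
# (substituting a logistic model for the report's log-linear model).
stage1 = sm.Logit((df.loc[obs, "bac"] > 0).astype(int), X[obs]).fit(disp=False)
# Stage 2: BAC level among the observed positive cases.
pos = obs & (df["bac"] > 0)
stage2 = sm.OLS(df.loc[pos, "bac"], X[pos]).fit()
sigma = np.sqrt(stage2.scale)

# One imputation: draw zero/positive status, then a level for the positives.
drawn_pos = rng.binomial(1, stage1.predict(X[~obs]))
levels = (stage2.predict(X[~obs]) + rng.normal(0, sigma, (~obs).sum())).clip(0.0)
df.loc[~obs, "bac"] = np.where(drawn_pos == 1, levels, 0.0)
print(df["bac"].describe())

Repeating the final draw several times, analyzing each completed dataset separately, and then combining the results yields multiple-imputation inferences of the kind described in the report.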

