Development of Macro-Level Crash Prediction Models, using Advanced Statistical and Machine Learning Methods
Mohammadpour, Iman | 2023
- Type of Document: Ph.D. Dissertation
- Language: Farsi
- Document No: 56553 (09)
- University: Sharif University of Technology
- Department: Civil Engineering
- Advisor(s): Nassiri, Habibollah
- Abstract:
- Road casualties are the fifth leading cause of death in Iran. Adopting proper countermeasures requires evaluating the consequences of implemented policies. Although crash time series models have been developed, they have not accommodated the multivariate, seasonal, and non-linear nature of crash data. Meanwhile, interpretable crash causal analysis frameworks are descriptive and lack predictive power. Moreover, unobserved heterogeneity between observations has been widely overlooked in the crash causal analysis literature. This thesis introduces a novel causal analysis methodology that combines the interpretability and predictive power of Structural Equation Modeling (SEM) and Bayesian Network (BN) models. It also introduces Random Forest (RF) regression for short-term crash time series prediction. The prediction accuracy of the developed model is compared with the classical Box-Jenkins approach in out-of-sample forecasts. To this end, the police crash report and loop detector datasets are aggregated at two spatiotemporal levels: country-wide monthly and province-wide daily. The monthly country-wide data from 2016 to 2021 were used for time series analysis, where the first 55 data points were used as the training sample and the remaining ten months as the test sample. The random forest model (MAPE=2.6) with the exogenous variables of traffic flow, crash year, and month outperformed the best SARIMA(1,0,0)(1,0,0)_12 model (MAPE=5.7) with traffic flow as the regressor. In addition, average speed had a negative linear association with total crashes, while it had an increasing effect on fatal crashes.
To clarify these findings by showing how crash mechanisms differ by severity level, the loop detector and crash data were aggregated daily for rural multilane highways of Tehran province, Iran, covering two years, 2020–2021. Partial least squares SEM (PLS-SEM) was employed for crash causal analysis, along with finite mixture partial least squares (FIMIX-PLS) segmentation to account for potential unobserved heterogeneity between observations. The results indicated that the higher the mean speed and the lower the traffic volume, the higher the odds of distracted driving. Distracted driving was in turn associated with more crashes involving vulnerable road users and more single-vehicle crashes, leading to a higher frequency of severe crashes. Moreover, lower mean speed and higher traffic volume were positively correlated with the percentage of tailgating violations, which in turn predicted multi-vehicle crashes, the main predictor of non-severe crash outcomes. Finally, the hybrid SEM-BN modelling framework was used to explore the expected consequences of different policies, revealing the efficiency of the developed instrument.
- Keywords:
- Bayesian Network ; Time Series Analysis ; Crash Frequency Prediction ; Crash Causation Analysis ; Machine Learning ; Structural Equations Modeling ; Box-Jenkins Time Series ; Road Accidents
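The out-of-sample comparison described in the abstract rests on two ideas: a chronological train/test split and the mean absolute percentage error (MAPE). The following is a minimal sketch of both, not the dissertation's implementation. The function names, the toy monthly counts, and the last-value baseline forecast are all assumptions made for illustration; the thesis itself compares a random forest against a SARIMA model.

```python
def chronological_split(series, n_train):
    """Split a time-ordered sequence into train and test parts without shuffling."""
    return series[:n_train], series[n_train:]


def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical monthly crash counts (illustrative only, not data from the thesis).
counts = [100, 110, 120, 130, 125, 140]

# Earlier observations train the model; later ones are held out for testing.
train, test = chronological_split(counts, 4)

# Placeholder forecast: repeat the last training value (an assumed naive baseline).
naive_forecast = [train[-1]] * len(test)

score = mape(test, naive_forecast)
print(f"Out-of-sample MAPE: {score:.2f}%")
```

Any forecasting model can replace the naive baseline here, provided it is fitted on `train` only and scored on `test`, so that the reported error reflects genuinely unseen months.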