A platform for research: civil engineering, architecture and urbanism
Machine learning or discrete choice models for car ownership demand estimation and prediction?
Discrete choice models are widely used to explain transportation behaviors, including a household's decision to own a car. They show how some distinct choice of human behavior or preference influences a decision. They are also used to project future demand estimates to support policy exploration. This latter use for prediction is indirectly aligned with and conditional to the model's estimation which aims to fit the observed data. In contrast, machine learning models are derived to maximize prediction accuracy through mechanisms such as out-of-sample validation, non-linear structure, and automated covariate selection, albeit at the expense of interpretability and sound behavioral theory. We investigate how machine learning models can outperform discrete choice models for prediction of car ownership using transportation household survey data from Singapore. We compare our household car ownership model (multinomial logit model) against various machine learning models (e.g. Random Forest, Support Vector Machines) by using 2008 data to derive, i.e. estimate models that we then use to predict 2012 ownership. The machine learning models are inferior to the discrete choice model when using discrete choice features. However, after engineering features more appropriate for machine learning they are superior. These results highlight both the cost of applying machine learning models in econometric contexts and an opportunity for improved prediction and better urban policy making through machine learning models with appropriate features.
Machine learning or discrete choice models for car ownership demand estimation and prediction?
Discrete choice models are widely used to explain transportation behaviors, including a household's decision to own a car. They show how some distinct choice of human behavior or preference influences a decision. They are also used to project future demand estimates to support policy exploration. This latter use for prediction is indirectly aligned with and conditional to the model's estimation which aims to fit the observed data. In contrast, machine learning models are derived to maximize prediction accuracy through mechanisms such as out-of-sample validation, non-linear structure, and automated covariate selection, albeit at the expense of interpretability and sound behavioral theory. We investigate how machine learning models can outperform discrete choice models for prediction of car ownership using transportation household survey data from Singapore. We compare our household car ownership model (multinomial logit model) against various machine learning models (e.g. Random Forest, Support Vector Machines) by using 2008 data to derive, i.e. estimate models that we then use to predict 2012 ownership. The machine learning models are inferior to the discrete choice model when using discrete choice features. However, after engineering features more appropriate for machine learning they are superior. These results highlight both the cost of applying machine learning models in econometric contexts and an opportunity for improved prediction and better urban policy making through machine learning models with appropriate features.
Machine learning or discrete choice models for car ownership demand estimation and prediction?
Paredes, Miguel (author) / Hemberg, Erik (author) / O'Reilly, Una-May (author) / Zegras, Chris (author)
2017-06-01
116583 byte
Conference paper
Electronic Resource
English
Enhanced utility estimation algorithm for discrete choice models in travel demand forecasting
Springer Verlag | 2025
|Measuring uncertainty in discrete choice travel demand forecasting models
Online Contents | 2016
|Measuring uncertainty in discrete choice travel demand forecasting models
Taylor & Francis Verlag | 2016
|