Towards Sustainable Safe Driving: A Multimodal Fusion Method for Risk Level Recognition in Distracted Driving Status
Precise driving status recognition is a prerequisite for human–vehicle collaborative driving systems towards sustainable road safety. In this study, a simulated driving platform was built to capture multimodal information simultaneously, including vision-modal data representing driver behaviour and sensor-modal data representing vehicle motion. The multisource data are used to quantify the risk of distracted driving status at four levels: safe driving, slight risk, moderate risk, and severe risk, rather than merely detecting action categories. A multimodal fusion method called the vision-sensor fusion transformer (V-SFT) was proposed to incorporate the vision-modal data of driver behaviour and the sensor-modal data of vehicle motion. Feature concatenation was employed to aggregate the representations of the different modalities. Successive internal interactions were then performed to capture spatiotemporal dependencies. Finally, the representations were clipped and mapped into the four risk-level label spaces. The proposed approach was evaluated under different modality inputs on the collected datasets and compared with several baseline methods. The results showed that V-SFT achieved the best performance, with a recognition accuracy of 92.0%. They also indicate that fusing multimodal information effectively improves driving status understanding, and that the extensibility of V-SFT is conducive to integrating additional modal data.
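The abstract outlines a concatenation-based transformer fusion pipeline: project each modality into a shared space, concatenate the tokens, run a transformer encoder for internal interaction, then map a clipped summary token to the four risk levels. As an illustration only, the following minimal sketch (assuming PyTorch; the class name VSFTSketch and all dimensions are hypothetical, not the authors' released implementation) shows how such a pipeline can be wired up:

# Minimal sketch of the fusion scheme the abstract describes. Assumptions:
# PyTorch; frame-level vision features and vehicle-motion sensor readings
# arrive as pre-extracted sequences. Not the authors' code.
import torch
import torch.nn as nn

class VSFTSketch(nn.Module):
    def __init__(self, vision_dim=512, sensor_dim=16, d_model=256,
                 n_heads=4, n_layers=2, n_classes=4):
        super().__init__()
        # Project each modality into a shared embedding space.
        self.vision_proj = nn.Linear(vision_dim, d_model)
        self.sensor_proj = nn.Linear(sensor_dim, d_model)
        # Learnable class token; its final state is "clipped" for labels.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer,
                                             num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, vision_feats, sensor_feats):
        # vision_feats: (B, T_v, vision_dim) driver-behaviour features
        # sensor_feats: (B, T_s, sensor_dim) vehicle-motion measurements
        v = self.vision_proj(vision_feats)
        s = self.sensor_proj(sensor_feats)
        cls = self.cls_token.expand(v.size(0), -1, -1)
        # Feature concatenation aggregates both modalities into one sequence.
        tokens = torch.cat([cls, v, s], dim=1)
        tokens = self.encoder(tokens)   # successive internal interactions
        return self.head(tokens[:, 0])  # logits over the four risk levels

# Example: a batch of 8 clips, 16 video frames, 32 sensor readings each.
logits = VSFTSketch()(torch.randn(8, 16, 512), torch.randn(8, 32, 16))
print(logits.shape)  # torch.Size([8, 4])

Concatenating along the token axis (rather than averaging modality features) lets the encoder's attention model cross-modal dependencies directly, which matches the abstract's claim that fusion improves driving status understanding and that further modalities can be appended as extra tokens.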
Huiqin Chen (author) / Hao Liu (author) / Hailong Chen (author) / Jing Huang (author)
2023
Article (Journal)
Electronic Resource
Metadata by DOAJ is licensed under CC BY-SA 1.0
Editor's Note - Distracted and Driving Don't Mix
Online Contents | 2012
Distracted Driving Performance Measures: Spectral Power Analysis
British Library Online Contents | 2015
Distracted Motor Vehicle Driving at Highway-Rail Grade Crossings
British Library Online Contents | 2015