Safety compliance checking of construction behaviors using visual question answering
Abstract
Unsafe construction behavior, one of the leading causes of accidents and casualties, can be reduced by strengthening construction inspection. However, current methods rely either on manual inspection or on inefficient cross-modal models built on multiple backbone networks. To address these problems, a “rule-question” transformation and annotation system is formulated, and unsafe behavior detection is recast as a visual reasoning task: visual question answering (VQA). The VQA model is built on a vision-and-language Transformer, and unsafe behavior is identified from the answers the model outputs. A dataset containing 16 safety rules and 2386 related construction images is used to fine-tune and validate the VQA model. The results show that the developed VQA model achieves an average recall of 0.81 at a faster inference speed. Finally, an applet for safety report generation is implemented to demonstrate the feasibility and practicality of VQA-based safety compliance checking.
Highlights
VQA is applied to construction safety compliance checking.
A “rule-question” transformation and annotation system was constructed for the VQA application.
A dataset containing 16 safety rules and 2386 images was built for VQA modeling and evaluation.
The VQA model achieved a recall of 0.81 at a real-time response speed.
Safety reports were automatically generated based on the proposed framework.
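The "rule-question" idea described in the abstract can be illustrated with a minimal sketch: each safety rule is paired with a closed question and the answer that indicates compliance, and an image is flagged when the model's answer differs. Everything named here is an assumption made for illustration, not the authors' implementation: the `RuleQuestion` type, the two example rules, the `check_image` helper, and the `stub_vqa` function that stands in for a fine-tuned VQA model. The record does not list the paper's 16 rules.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass(frozen=True)
class RuleQuestion:
    rule: str              # safety rule in plain language (hypothetical wording)
    question: str          # closed question derived from the rule
    compliant_answer: str  # answer that indicates the rule is satisfied


# Hypothetical examples only; the paper's actual rules are not given here.
RULE_QUESTIONS: List[RuleQuestion] = [
    RuleQuestion(
        rule="Workers must wear a helmet.",
        question="Is every worker wearing a helmet?",
        compliant_answer="yes",
    ),
    RuleQuestion(
        rule="Workers must not stand under a suspended load.",
        question="Is anyone standing under a suspended load?",
        compliant_answer="no",
    ),
]


def check_image(
    image: Any,
    answer_fn: Callable[[Any, str], str],
    rule_questions: Sequence[RuleQuestion] = RULE_QUESTIONS,
) -> List[str]:
    """Return the rules whose question was not answered compliantly."""
    violations = []
    for rq in rule_questions:
        # Normalize the model output before comparing it to the expected answer.
        answer = answer_fn(image, rq.question).strip().lower()
        if answer != rq.compliant_answer:
            violations.append(rq.rule)
    return violations


def stub_vqa(image: Any, question: str) -> str:
    """Stand-in for a VQA model (assumption): returns canned answers."""
    canned = {
        "Is every worker wearing a helmet?": "No",
        "Is anyone standing under a suspended load?": "no",
    }
    return canned[question]


violated_rules = check_image("site_photo.jpg", stub_vqa)
```

In practice `answer_fn` would wrap a real vision-and-language model; the list of violated rules is the kind of output a safety report generator could then format.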
Ding, Yuexiong (author) / Liu, Muyang (author) / Luo, Xiaowei (author)
11 September 2022
Article (journal)
Electronic resource
English