Construction site accident analysis using text mining and natural language processing techniques
Abstract Workplace safety is a major concern in many countries. Among various industries, the construction sector is identified as the most hazardous workplace. Construction accidents not only cause human suffering but also result in huge financial losses. To prevent the recurrence of similar accidents and to make scientific risk control plans, analysis of accidents is essential. In the construction industry, fatality and catastrophe investigation summary reports are available for past accidents. In this study, text mining and natural language processing (NLP) techniques are applied to analyze construction accident reports. More specifically, five baseline models (support vector machine (SVM), linear regression (LR), K-nearest neighbor (KNN), decision tree (DT), and Naive Bayes (NB)) and an ensemble model are proposed to classify the causes of the accidents. In addition, the Sequential Quadratic Programming (SQP) algorithm is used to optimize the weight of each classifier in the ensemble model. Experimental results show that the optimized ensemble model outperforms the other models considered in this study in terms of average weighted F1 score. The results also show that the proposed approach is more robust to cases of low support. Moreover, an unsupervised chunking approach is proposed to extract common objects that cause accidents, based on grammar rules identified in the reports. As harmful objects are one of the major factors leading to construction accidents, identifying such objects is extremely helpful for mitigating potential risks. Certain limitations of the proposed methods are discussed, and suggestions for future improvements are provided.
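The weight optimization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic classifier outputs, the helper names, and the choice of a log-loss surrogate objective (the weighted F1 score itself is not smooth) are all assumptions made here for illustration. SciPy's SLSQP solver stands in for the SQP step, with the weights constrained to be non-negative and to sum to one.

```python
import numpy as np
from scipy.optimize import minimize


def weighted_f1(y_true, y_pred, n_classes):
    """Support-weighted average of per-class F1 scores."""
    total = 0.0
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom > 0 else 0.0
        total += f1 * np.sum(y_true == c)
    return total / len(y_true)


def neg_log_likelihood(w, probas, y_true):
    """Smooth surrogate objective (assumption): log loss of the blended probabilities."""
    blended = np.tensordot(w, probas, axes=1)  # shape (n_samples, n_classes)
    picked = blended[np.arange(len(y_true)), y_true]
    return -np.mean(np.log(picked + 1e-12))


def fit_ensemble_weights(probas, y_true):
    """probas has shape (n_models, n_samples, n_classes)."""
    n_models = probas.shape[0]
    w0 = np.full(n_models, 1.0 / n_models)  # start from uniform weights
    result = minimize(
        neg_log_likelihood,
        w0,
        args=(probas, y_true),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_models,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


def synthetic_probas(y_true, n_classes, strength, rng):
    """Hypothetical classifier output: higher strength means a more accurate model."""
    logits = strength * np.eye(n_classes)[y_true]
    logits = logits + rng.normal(size=(len(y_true), n_classes))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


# Synthetic stand-in for the predictions of three base classifiers.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=300)
probas = np.stack([synthetic_probas(y, 3, s, rng) for s in (2.0, 0.5, 0.0)])

weights = fit_ensemble_weights(probas, y)
y_pred = np.tensordot(weights, probas, axes=1).argmax(axis=1)
print("weights:", weights.round(3))
print("weighted F1:", round(weighted_f1(y, y_pred, 3), 3))
```

In practice the weights would be fitted on held-out validation predictions rather than on the data used to train the base classifiers, so that the ensemble does not simply reward overfitted models.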
Highlights Text mining and natural language processing techniques can be successfully applied to analyze accident reports in text format. Optimized ensemble models outperform single models in terms of F1 score. A rule-based approach is suitable for object extraction when the grammar structure used in the text is standard.
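A rule-based object extraction of the kind described above can be sketched as follows. The grammar rule used here (a verb, then a trigger preposition, then an optional determiner and adjectives, then one or more nouns), the trigger words, and the hand-tagged example sentences are hypothetical choices for illustration; the record does not list the rules the authors identified. The input is assumed to be already part-of-speech tagged with Penn Treebank style tags.

```python
from collections import Counter


def extract_objects(tagged, triggers=("by", "from", "in")):
    """Return noun sequences that follow a verb plus a trigger preposition.

    tagged is a list of (word, pos_tag) pairs. The rule is an assumed
    example: VB* IN(trigger) DT? JJ* NN+ ; only the NN+ part is kept.
    """
    objects = []
    i, n = 0, len(tagged)
    while i < n - 1:
        _, tag = tagged[i]
        next_word, next_tag = tagged[i + 1]
        if tag.startswith("VB") and next_tag == "IN" and next_word.lower() in triggers:
            j = i + 2
            if j < n and tagged[j][1] == "DT":  # optional determiner
                j += 1
            while j < n and tagged[j][1].startswith("JJ"):  # optional adjectives
                j += 1
            start = j
            while j < n and tagged[j][1].startswith("NN"):  # the object itself
                j += 1
            if j > start:
                objects.append(" ".join(w.lower() for w, _ in tagged[start:j]))
            i = j
        else:
            i += 1
    return objects


# Hand-tagged example sentences (hypothetical, not taken from the reports).
reports = [
    [("Employee", "NN"), ("was", "VBD"), ("struck", "VBN"), ("by", "IN"),
     ("a", "DT"), ("falling", "JJ"), ("steel", "NN"), ("beam", "NN"), (".", ".")],
    [("Worker", "NN"), ("fell", "VBD"), ("from", "IN"), ("the", "DT"),
     ("ladder", "NN"), (".", ".")],
    [("Employee", "NN"), ("fell", "VBD"), ("from", "IN"), ("a", "DT"),
     ("ladder", "NN"), (".", ".")],
]

counts = Counter(obj for report in reports for obj in extract_objects(report))
print(counts.most_common())
```

Counting the extracted phrases across many reports gives a ranked list of frequently involved objects. As the highlights note, such rules only work well when the reports follow a standard grammatical pattern.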
Zhang, Fan (author) / Fleyeh, Hasan (author) / Wang, Xinru (author) / Lu, Minghui (author)
Automation in Construction ; 99 ; 238-248
2018-12-18
11 pages
Article (Journal)
Electronic Resource
English