Asphalt Pavement Crack Image Screening by Transformer-Based Model
Traditional digital image processing algorithms and Convolutional Neural Network (CNN) models do not reach the desired accuracy in asphalt pavement crack image screening. We introduce Vision Transformer (ViT), a Transformer-based method for automatic classification of asphalt pavement images. ViT consists of a linear projection layer, an embedding layer, a Transformer block and an MLP head layer. To verify the ability of ViT on the crack image screening task, we use asphalt pavement datasets from two sources and fine-tune the model with a transfer learning strategy, starting from ViT weights pre-trained on the ImageNet dataset. The ViT model is compared with three popular deep learning models; ViT achieves 98.5% accuracy and a 99.2% F1 score on the self-collected complex asphalt pavement dataset. The results show that ViT and its variants perform better than the classical CNN ResNet50 in screening asphalt pavement cracks. The Transformer-based screening method for asphalt pavement crack images performs well and has good research prospects.
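The step that feeds the linear projection layer named in the abstract can be illustrated with a minimal sketch: the image is cut into non-overlapping square patches, and each patch is flattened into one vector. The abstract does not state the input resolution or patch size, so the 224 x 224 input and 16 x 16 patches used below are assumptions taken from the common ViT-Base/16 configuration, and the function names `split_into_patches` and `token_count` are hypothetical helpers, not part of the authors' code.

```python
def split_into_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened square patches.

    Patches are non-overlapping and returned in row-major order; each patch
    is a flat list of patch_size * patch_size * C values.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    flat.extend(pixel)  # append all channel values of the pixel
            patches.append(flat)
    return patches


def token_count(height, width, patch_size, with_class_token=True):
    """Sequence length seen by the Transformer block.

    One token per patch, plus one optional classification token
    (an assumption: the abstract does not say how the MLP head is fed).
    """
    n = (height // patch_size) * (width // patch_size)
    return n + 1 if with_class_token else n


if __name__ == "__main__":
    # Tiny 4 x 4 RGB image whose pixel value encodes its position.
    image = [[[r * 4 + c] * 3 for c in range(4)] for r in range(4)]
    patches = split_into_patches(image, patch_size=2)
    print(len(patches), len(patches[0]))
    # Assumed ViT-Base/16 setting: 224 x 224 input, 16 x 16 patches.
    print(token_count(224, 224, 16))
```

In a full model, each flattened patch would then be multiplied by a learned projection matrix and combined with a position embedding before entering the Transformer block; those learned parts are omitted here.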
Ziwei, Wang (author) / Jun, Feng (author) / Tian, Zhang (author)
2022-07-01
763458 byte
Conference paper
Electronic Resource
English