Vector-Quantized Variational Teacher and Multimodal Collaborative Student for Crack Segmentation via Knowledge Distillation
This paper proposes a novel method for real-time crack segmentation in infrastructure inspection that achieves state-of-the-art performance. The approach leverages knowledge distillation: a vector-quantized variational autoencoder (VQ-VAE) acts as the “teacher,” extracting informative representations and learning a codebook, while a multimodal collaborative student (MCS) uses the learned codebook for improved crack segmentation. The framework, which incorporates Teacher’s Codebook Cheating (TCC), achieves high accuracy and efficiency. With only 0.59 million parameters, the model improves both the speed and the precision of crack segmentation, achieving a Dice score of 93.19, an Intersection over Union (IoU) of 0.8723, and a mean pixel accuracy of 97.52. Notably, the model processes 89.3 frames per second (FPS), outperforming all other state-of-the-art methods despite using a smaller input size; its efficiency stems from its simplicity, which makes it well suited to resource-constrained environments. These results highlight the effectiveness of the method for real-time crack segmentation, paving the way for more automated and accessible infrastructure inspection.
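For readers unfamiliar with the three metrics quoted in the abstract, the following is a minimal sketch of how Dice, IoU, and mean pixel accuracy are commonly computed for a binary crack mask. It is not the authors' evaluation code: the function names, the nested-list mask format, and the definition of mean pixel accuracy as the average of per-class accuracies (crack and background) are assumptions made for illustration, and the abstract does not state which definition the paper uses.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two same-shaped binary masks (nested lists of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # crack pixel correctly predicted
            elif p and not t:
                fp += 1      # background predicted as crack
            elif t:
                fn += 1      # crack pixel missed
            else:
                tn += 1      # background correctly predicted
    return tp, fp, fn, tn


def dice_score(pred, target):
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def iou_score(pred, target):
    """IoU = TP / (TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def mean_pixel_accuracy(pred, target):
    """Average of per-class accuracy over the classes present in the target (assumed definition)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    per_class = []
    if tp + fn:
        per_class.append(tp / (tp + fn))   # accuracy on crack pixels
    if tn + fp:
        per_class.append(tn / (tn + fp))   # accuracy on background pixels
    return sum(per_class) / len(per_class) if per_class else 1.0


# Hypothetical 2x4 masks, used only to exercise the functions.
pred = [[1, 1, 0, 0],
        [0, 1, 0, 0]]
target = [[1, 0, 0, 0],
          [0, 1, 1, 0]]

print(dice_score(pred, target), iou_score(pred, target), mean_pixel_accuracy(pred, target))
```

For a single pair of masks the two overlap metrics are tied by Dice = 2·IoU / (1 + IoU), so they always rank a prediction the same way. The identity need not hold exactly for scores averaged over a dataset.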
J. Comput. Civ. Eng.
Qiu, Shi (author) / Zaheer, Qasim (author) / Shah, S. Muhammad Ahmed Hassan (author) / Ai, Chengbo (author) / Wang, Jin (author) / Zhan, You (author)
1 May 2025
Article (journal)
Electronic resource
English