Deep learning-based text detection and recognition on architectural floor plans
Abstract: An important aspect of automatic floor plan analysis is the extraction of textual information, as it is essential for a thorough understanding of the drawing. This paper presents a text extraction approach utilizing a deep learning-based object detection model and state-of-the-art Optical Character Recognition (OCR) methods. The paper contributes to the research community in three ways: First, it introduces additional annotations to existing data sets to encompass text elements. Second, it proposes a specialized data synthesis pipeline, allowing for generating training images that mimic important characteristics of real data. Finally, it documents a comparative study of deep learning-based object detection architectures (Tesseract, EAST, CRAFT, Faster R-CNN, YOLOv5, YOLOR, YOLOv7, and YOLOv8) and OCR tools (PARSEq, MATRN, EasyOCR, and Tesseract) for the task. Results indicate that YOLOv7 yields the best text detection performance (up to 97.5% wmAP) and PARSEq excels in character recognition (85.2% CER). The data sets are made available.
Highlights:
- Data set established for text detection on floor plans with bounding box annotations.
- Proposed customizable synthetic data generation method for text on floor plans.
- Comparison of deep learning detection models enhanced with synthetic data.
- Comparison of OCR tools on data set highlights need for improvement.
- Modular text extraction pipeline for downstream task-specific adoption.
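The abstract reports OCR quality as a character error rate (CER). This record does not give the paper's exact definition, so the sketch below assumes the common one: the Levenshtein edit distance between the recognized string and the ground-truth string, divided by the length of the ground truth. The function names and the sample room labels are illustrative and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical floor plan label and an OCR output with one wrong character.
    print(character_error_rate("Bedroom", "Bedr0om"))
```

Under this definition a perfect transcription scores 0, and the value can exceed 1 when the hypothesis is much longer than the reference.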
Schönfelder, Phillip (author) / Stebel, Fynn (author) / Andreou, Nikos (author) / König, Markus (author)
25 October 2023
Article (journal)
Electronic resource
English
SpaceScope: Spatial Content-Based Retrieval of Architectural Floor Plans
British Library Conference Proceedings | 2003
Recognition and Indexing of Architectural Features in Floor Plans on the Internet
British Library Conference Proceedings | 2000