A platform for science: civil engineering, architecture and urbanism
Building Codes Part-of-Speech Tagging Performance Improvement by Error-Driven Transformational Rules
To enable full automation, automated code compliance checking systems must extract regulatory information from building codes and convert it into computable representations. This conversion is a natural language processing (NLP) task that requires highly accurate part-of-speech (POS) tagging of building codes. Existing POS taggers, however, do not reach that accuracy on building codes. To address this need, the authors propose improving POS tagger performance with error-driven transformational rules that revise machine-tagged POS results. The proposed method uses a syntactic and semantic rule-based NLP approach combined with a structure inspired by transfer learning. It generates a group of transformational rulesets, ordered from simple to complex, that convert the machine taggers’ results to the corresponding human-labeled gold standard. The transformational rules draw on syntactic and semantic information from domain texts. Every rule is constrained not to introduce new errors while fixing existing errors of the machine taggers. Only the last ruleset, which fixes the most common errors remaining in the textual data after all other rules have been applied, is exempt from this constraint. An experimental test on part-of-speech tagged building code (PTBC) data shows that the method eliminated 82.7% of the errors in the POS tagging results, raising POS tagging accuracy on building codes from 89.13% to 98.12%.
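The idea of an error-driven rule that may fix tagging errors but must not introduce new ones can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` class, the `is_safe` and `apply_safe_rules` helpers, the example sentence, and its machine and gold tags are all hypothetical and chosen only for illustration. The final lines check that the reported accuracies are consistent with the reported 82.7% error reduction.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Token = Tuple[str, str]  # (word, POS tag)


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: retag from_tag as to_tag when the conditions hold."""
    from_tag: str
    to_tag: str
    word: Optional[str] = None      # lexical condition on the token itself
    prev_tag: Optional[str] = None  # syntactic condition on the previous tag

    def apply(self, sent: Sequence[Token]) -> List[Token]:
        out = list(sent)
        for i, (w, t) in enumerate(sent):
            if t != self.from_tag:
                continue
            if self.word is not None and w.lower() != self.word:
                continue
            if self.prev_tag is not None and (i == 0 or sent[i - 1][1] != self.prev_tag):
                continue
            out[i] = (w, self.to_tag)
        return out


def accuracy(tagged: Sequence[Token], gold: Sequence[Token]) -> float:
    correct = sum(t[1] == g[1] for t, g in zip(tagged, gold))
    return correct / len(gold)


def is_safe(rule: Rule, machine: Sequence[Token], gold: Sequence[Token]) -> bool:
    """True if the rule fixes at least one error and introduces none."""
    revised = rule.apply(machine)
    fixed = introduced = 0
    for (_, m), (_, r), (_, g) in zip(machine, revised, gold):
        if m != g and r == g:
            fixed += 1
        if m == g and r != g:
            introduced += 1
    return fixed > 0 and introduced == 0


def apply_safe_rules(rules: Sequence[Rule], machine: Sequence[Token],
                     gold: Sequence[Token]) -> List[Token]:
    """Apply candidate rules in order, skipping any that would add errors."""
    current = list(machine)
    for rule in rules:
        if is_safe(rule, current, gold):
            current = rule.apply(current)
    return current


# Hypothetical sentence with made-up machine output and gold labels.
words = "Exit doors shall swing in the direction of egress travel".split()
gold_tags = ["NN", "NNS", "MD", "VB", "IN", "DT", "NN", "IN", "NN", "NN"]
machine_tags = ["VB", "NNS", "MD", "NN", "IN", "DT", "NN", "IN", "NN", "NN"]
gold = list(zip(words, gold_tags))
machine = list(zip(words, machine_tags))

candidates = [
    Rule("NN", "VB"),                 # too broad: would break correct nouns
    Rule("NN", "VB", prev_tag="MD"),  # a noun tag right after a modal becomes a verb
    Rule("VB", "NN", word="exit"),    # lexical fix for one domain term
]
revised = apply_safe_rules(candidates, machine, gold)

# Consistency check of the figures reported in the abstract.
error_before = 1 - 0.8913
error_after = 1 - 0.9812
error_reduction = (error_before - error_after) / error_before
```

In this toy setting the unconditioned rule is rejected because it would retag correctly tagged nouns, while the two narrower rules are accepted. The abstract notes that the final ruleset in the proposed method is exempt from this constraint, which the sketch does not model.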
Xue, Xiaorui (author) / Zhang, Jiansong (author)
07.07.2020
Article (journal)
Electronic resource
Unknown
Inductive Improvement of Part-of-Speech Tagging and Its Effect on a Terminology of Molecular Biology
British Library Conference Proceedings | 2005
Transformational Performance Solutions
Online Contents | 2015