Sequential Pattern Mining Algorithm Based on Text Data: Taking the Fault Text Records as an Example
Sequential pattern mining (SPM) is an effective and important method for analyzing time series. This paper proposes an SPM algorithm for mining fault sequential patterns in text data. Because text data is poorly structured and the same concept can be expressed in many different ways, traditional SPM algorithms cannot be applied to text data directly. The proposed algorithm is designed to solve this problem. First, the similarity of fault text records is measured and similar faults are grouped into one class. For this step, the paper proposes a new text similarity measurement model based on word embedding distance. Compared with classic text similarity measurement methods, this model achieves good results in short text classification. Then, on the basis of the fault classification, the paper proposes an SPM algorithm with an event window, a soft time constraint that makes it possible to obtain a chosen number of sequential patterns as needed. Finally, the fault text records of a certain aircraft are used as experimental data for mining fault sequential patterns. The experiments show that the algorithm can effectively mine sequential patterns in text data. The proposed algorithm can be applied to text time series data in many fields, such as industry, business, and finance.
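The abstract does not give the details of the event-window algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that each fault record has already been assigned a class label, that an "event window" means a maximum number of later events in which a follow-up fault may occur, and that support is counted once per sequence. The function name `mine_pairs_with_event_window`, the parameters `window` and `min_support`, and the fault labels are all hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def mine_pairs_with_event_window(
    sequences: Sequence[Sequence[str]],
    window: int,
    min_support: int,
) -> Dict[Tuple[str, str], int]:
    """Find ordered pairs (a, b) where b occurs within `window` events after a.

    Support is the number of sequences that contain the pair at least once.
    This is an illustrative simplification limited to patterns of length 2.
    """
    support: Counter = Counter()
    for seq in sequences:
        seen = set()  # pairs found in this sequence, counted once
        for i, first in enumerate(seq):
            # Only look at the next `window` events after position i.
            for second in seq[i + 1 : i + 1 + window]:
                seen.add((first, second))
        support.update(seen)
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical fault-class sequences, one per aircraft or time period.
records = [
    ["hydraulic_leak", "pump_failure", "gear_jam"],
    ["hydraulic_leak", "sensor_drift", "pump_failure"],
    ["sensor_drift", "hydraulic_leak", "pump_failure"],
]

narrow = mine_pairs_with_event_window(records, window=1, min_support=2)
wide = mine_pairs_with_event_window(records, window=2, min_support=2)
print(narrow)
print(wide)
```

In this toy data, widening the window from one event to two admits more patterns at the same support threshold, which is one way a window can act as a soft constraint on how many patterns are returned.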
Xinglong Yuan (author) / Wenbing Chang (author) / Shenghan Zhou (author) / Yang Cheng (author)
2018
Article (Journal)
Electronic Resource
Unknown
Metadata by DOAJ is licensed under CC BY-SA 1.0
Sustainable Fault Diagnosis of Imbalanced Text Mining for CTCS-3 Data Preprocessing
DOAJ | 2021
Text-mining building maintenance work orders for component fault frequency
Taylor & Francis Verlag | 2019
Text-mining building maintenance work orders for component fault frequency
British Library Online Contents | 2019
British Library Online Contents | 2018
Corporate Social Responsibility and Corporate Performance: A Hybrid Text Mining Algorithm
DOAJ | 2020