Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications

Pal, Alok Ranjan / Saha, Diganta / Dash, Niladri Sekhar / Pal, Antara

Abstract: This paper reports how a supervised methodology, with necessary modifications, has been adopted for word sense disambiguation in Bangla. Initially, the Naïve Bayes probabilistic model, adopted as the baseline method for sense classification, yields a moderate result of 81% accuracy when applied to a database of the 19 most frequently used ambiguous Bangla words. On an experimental basis, the baseline method is then modified with two extensions: (a) inclusion of a lemmatization process in the system, and (b) bootstrapping of the operational process. As a result, accuracy improves slightly, to 84%, which is a positive signal for the disambiguation process as a whole because it opens scope for further modification of the existing method for better results. The data sets used for this experiment include the Bangla POS-tagged corpus obtained from the Indian Languages Corpora Initiative and the Bangla WordNet, an online sense inventory developed at the Indian Statistical Institute, Kolkata. The paper also reports the challenges and pitfalls of the work that were closely observed and addressed to achieve the expected level of accuracy.
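
The three ingredients named in the abstract (a Naïve Bayes sense classifier, lemmatization of context words, and bootstrapping) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names, the lookup-table lemmatizer, the add-one smoothing, the confidence threshold, and the toy English data for the word "bank" are all assumptions made for illustration, and bootstrapping is interpreted here as simple self-training.

```python
import math
from collections import Counter, defaultdict

# Hypothetical lemma table; a real system would use a proper Bangla lemmatizer.
LEMMAS = {"rivers": "river", "loans": "loan", "boats": "boat"}


def lemmatize(words):
    """Map each context word to its lemma, leaving unknown words unchanged."""
    return [LEMMAS.get(w, w) for w in words]


class NaiveBayesWSD:
    """Bag-of-words Naive Bayes sense classifier with add-one smoothing."""

    def __init__(self):
        self.sense_counts = Counter()            # training examples per sense
        self.word_counts = defaultdict(Counter)  # context word counts per sense
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (context_words, sense_label) pairs."""
        for words, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(words)
            self.vocab.update(words)

    def log_scores(self, words):
        """Return log P(sense) + sum of log P(word | sense) for each sense."""
        total = sum(self.sense_counts.values())
        known = [w for w in words if w in self.vocab]  # skip unseen words
        scores = {}
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            score = math.log(n / total)
            for w in known:
                score += math.log((self.word_counts[sense][w] + 1) / denom)
            scores[sense] = score
        return scores

    def predict_with_confidence(self, words):
        """Return the best sense and its normalized posterior probability."""
        scores = self.log_scores(words)
        best = max(scores, key=scores.get)
        z = sum(math.exp(s - scores[best]) for s in scores.values())
        return best, 1.0 / z

    def predict(self, words):
        return self.predict_with_confidence(words)[0]


def bootstrap(labeled, unlabeled, threshold=0.9, rounds=3):
    """Self-training: add confidently labeled contexts, then retrain."""
    labeled, pool = list(labeled), list(unlabeled)
    model = NaiveBayesWSD()
    model.train(labeled)
    for _ in range(rounds):
        remaining, added = [], 0
        for words in pool:
            sense, conf = model.predict_with_confidence(words)
            if conf >= threshold:
                labeled.append((words, sense))
                added += 1
            else:
                remaining.append(words)
        if added == 0:
            break
        pool = remaining
        model = NaiveBayesWSD()
        model.train(labeled)
    return model


# Toy sense-tagged contexts for "bank" (illustrative only, not from the paper).
LABELED = [
    (["river", "water", "boat"], "shore"),
    (["river", "fishing", "mud"], "shore"),
    (["money", "loan", "account"], "finance"),
    (["money", "deposit", "cash"], "finance"),
]
UNLABELED = [["river", "water", "mud"], ["money", "loan", "cash"]]

baseline = NaiveBayesWSD()
baseline.train(LABELED)
extended = bootstrap(LABELED, UNLABELED)
```

In this sketch, lemmatizing a context before classification (for example `baseline.predict(lemmatize(["loans", "cash"]))`) lets inflected forms share counts with their lemma, and `bootstrap` enlarges the training set with contexts the current model labels confidently. Whether either step helps on real data depends on the corpus and the threshold chosen.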

Document information

Title:

Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications

Contributors:

Pal, Alok Ranjan (author) / Saha, Diganta (author) / Dash, Niladri Sekhar (author) / Pal, Antara (author)

Published in:

Journal of The Institution of Engineers (India): Series B ; 99 ; 519-526

Publication date:

2018-05-24

Size:

8 pages

ISSN:

2250-2114 , 2250-2106

DOI:

https://doi.org/10.1007/s40031-018-0337-5

Type of media:

Article (Journal)

Type of material:

Electronic Resource

Language:

English

Keywords:

Natural language processing , Word sense disambiguation , Naïve Bayes method , Lemmatization , Bootstrapping , Engineering , Communications Engineering, Networks

Similar titles

Translation-based Word Sense Disambiguation: Appendices

Lyse, Gunn Inger | BASE | 2011

Free access

Word Sense Disambiguation Using the Hopfield Model of Neural Networks

Sreenivasa Rao, M. / Pujari, A. K. | British Library Online Contents | 1999

An Intelligent Information Retrieval System using Automatic Word Sense Disambiguation

Ramasubramanian, P. G. / Agah, A. / Gauch, S. E. | British Library Online Contents | 2007

Research of Word Sense Disambiguation Based on Soft Pattern

Jia, K.L. | British Library Online Contents | 2011

Learning Word Sense Disambiguation in biomedical text with difference between training and test distributions

Son, Jeong-Woo / Park, Seong-Bae | British Library Online Contents | 2012