The Effects of Resampling on Classifying Imbalanced Datasets
Most classifiers perform well when the classes in a dataset are roughly balanced; problems arise when the dataset is imbalanced. This paper addresses the imbalance problem by applying oversampling, undersampling, and hybrid resampling approaches. The dataset contains 400 observations and 4 variables, with 70% of the observations belonging to one class and the remaining 30% to the other. The training set contained 188 and 97 instances of the majority and minority class, respectively, and the test set contained 92 and 23. Three classification models were used: Random Forest, Support Vector Machine, and Naïve Bayes. Resampling was applied to the training instances. Results show that the classifiers’ accuracy increases after the imbalance is treated.
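The abstract does not say which resampling algorithms were used, so the following is only a minimal sketch of plain random oversampling and random undersampling, applied to training labels with the 188/97 class counts reported above. The function names `random_oversample` and `random_undersample` and the label strings `"majority"` and `"minority"` are assumptions made for illustration, not taken from the paper.

```python
import random
from collections import Counter


def _indices_by_class(labels):
    """Group sample indices by their class label."""
    groups = {}
    for i, y in enumerate(labels):
        groups.setdefault(y, []).append(i)
    return groups


def random_oversample(labels, seed=0):
    """Return indices where each class is padded, by sampling with
    replacement, up to the size of the largest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    target = max(len(idx) for idx in groups.values())
    out = []
    for idx in groups.values():
        out.extend(idx)                                    # keep every original sample
        out.extend(rng.choices(idx, k=target - len(idx)))  # add duplicates
    return out


def random_undersample(labels, seed=0):
    """Return indices where each class is cut down, by sampling without
    replacement, to the size of the smallest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    target = min(len(idx) for idx in groups.values())
    out = []
    for idx in groups.values():
        out.extend(rng.sample(idx, target))
    return out


# Training class counts reported in the abstract (label names are hypothetical).
train_labels = ["majority"] * 188 + ["minority"] * 97

over = random_oversample(train_labels)
under = random_undersample(train_labels)

over_counts = Counter(train_labels[i] for i in over)
under_counts = Counter(train_labels[i] for i in under)
print(over_counts)
print(under_counts)
```

Both functions return indices rather than copies of the data, so the same selection can be applied to the feature rows and the labels. Only the training split is resampled here, so the 92/23 test split keeps its original class distribution.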
Obaid, Waleed (author) / Nassif, Ali Bou (author)
2022-02-21
1633042 byte
Conference paper
Electronic Resource
English
DOAJ | 2024