ENSEMBLE LEARNING APPROACHES FOR CLASSIFICATION WITH HIGH-DIMENSIONAL DATA

  • Cao Truong Tran Institute of Information and Communication Technology, Le Quy Don Technical University

Tóm tắt

Classification with high-dimensional data is a significant challenge in machine learning because the abundance of features in high-dimensional data makes it difficult to identify mean­ingful patterns, which leads to overfitting and reduced classification performance. Moreover, the computational cost of processing high-dimensional data is often prohibitively expensive, requiring specialized hardware or optimized algorithms. Ensemble learning is a powerful machine learning technique that combines multiple models to improve classification accuracy. By aggregating the predictions of multiple models, ensemble learning can reduce overfitting, increase robustness, and improve performance on a wide range of real-world classification problems. Ensemble learning is effective for classification with high-dimensional data because it can combine multiple models to mitigate the effects of the curse of dimensionality, reduce overfitting, and enhance generalization performance. By using different learning algorithms or subsets of features, ensemble learning can improve the diversity of the models, leading to better overall performance on high-dimensional data. This paper proposes two hybrid ensemble machine learning approaches that integrate random subspace ensemble with bagging and boosting to enhance classification performance with high-dimensional data. Experimental results demonstrate that these methods significantly improve classification accuracy with high­dimensional data.

điểm /   đánh giá
Phát hành ngày
2023-07-11
Chuyên mục
Bài viết