This project explores the application of advanced SMOTE (Synthetic Minority Over-sampling Technique) variants, namely Borderline-SMOTE, Borderline-SMOTE2, and CURE-SMOTE, to address classification challenges on imbalanced datasets. Using machine learning models such as Random Forest and K-Nearest Neighbors, the study evaluates the performance gains achieved through these oversampling techniques across three diverse datasets (a minimal usage sketch follows the list):
- Mammography Dataset
- Credit Card Fraud Dataset
- ParkourMaker Dataset
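
The sketch below illustrates the assumed workflow, not the project's exact code: oversample the training split with Borderline-SMOTE from the `imbalanced-learn` library (its `kind` parameter switches between the Borderline-SMOTE and Borderline-SMOTE2 variants) and evaluate a Random Forest on the held-out test data. The synthetic dataset stands in for one of the imbalanced datasets listed above; CURE-SMOTE is not part of `imbalanced-learn` and would require a separate implementation.

```python
from imblearn.over_sampling import BorderlineSMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced dataset (e.g. ~2% minority class,
# roughly comparable to fraud-style imbalance).
X, y = make_classification(n_samples=5000, weights=[0.98, 0.02], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Oversample only the training data; kind="borderline-2" would select Borderline-SMOTE2.
smote = BorderlineSMOTE(kind="borderline-1", random_state=42)
X_res, y_res = smote.fit_resample(X_train, y_train)

# Train Random Forest on the resampled data and evaluate on the untouched test split.
clf = RandomForestClassifier(random_state=42)
clf.fit(X_res, y_res)
print(classification_report(y_test, clf.predict(X_test)))
```

Oversampling is applied only after the train/test split so that no synthetic samples leak into the evaluation data.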
This project received the highest possible grade of 1.0 for its comprehensive approach to handling imbalanced datasets and its effective implementation of the SMOTE variants. CURE-SMOTE in particular yielded significant improvements in minority-class prediction accuracy, most notably on the credit card fraud dataset when combined with the Random Forest algorithm.