Class Imbalance Reduction and Training Data Selection for Cross Project Defect Prediction
Keywords: Cross project defect prediction, defect prediction, class imbalance, machine learning algorithms, software metrics
This research aims to predict defects in a target project using data from other projects, an approach known as cross-project defect prediction (CPDP). Defect prediction supports the rational allocation of testing resources by identifying software modules that are likely to be defective before a product is released. When a project has no historical defect data, CPDP is an alternative. Several methods are available to improve the predictive performance of CPDP models, but modern methods have not been compared with one another. Choosing appropriate training data significantly improves the CPDP process. This study proposes a class imbalance reduction (CIR) algorithm that balances the defective and non-defective records of a data set by taking the structure of the data distribution into account. The algorithm is compared with several machine learning classifiers used as benchmarks for handling class imbalance: KNN, Decision Tree, Naive Bayes, Random Forest, AdaBoost, Gradient Boosting, and Extra Trees.
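As a rough illustration of the cross-project setting, the sketch below balances the classes of a source project and then classifies modules of a separate target project. It is not the CIR algorithm, whose details are not given here: plain random oversampling stands in for it. The metric values, the function names `oversample_minority` and `knn_predict`, and the pure-Python k-nearest-neighbour classifier are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size.

    Stand-in for a class balancing step; NOT the CIR algorithm.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_size = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(majority_size - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical source project: (lines of code, complexity), label 1 = defective.
source_X = [(10, 1), (12, 2), (15, 1), (11, 1), (14, 2), (13, 1), (90, 9), (95, 10)]
source_y = [0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical target project with no defect history used for training.
target_X = [(12, 1), (92, 9), (14, 2), (88, 8)]
target_y = [0, 1, 0, 1]

# Balance the source data, then predict defects in the target project.
bal_X, bal_y = oversample_minority(source_X, source_y)
predictions = [knn_predict(bal_X, bal_y, q) for q in target_X]

# Recall on the defective class, a common measure under class imbalance.
true_pos = sum(1 for p, t in zip(predictions, target_y) if p == 1 and t == 1)
recall = true_pos / sum(target_y)
print(Counter(bal_y), predictions, recall)
```

In practice the balancing step and the classifier would be swapped for the methods under comparison, and real software metrics would replace the toy values.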