The invention discloses a software defect prediction method, which specifically comprises the following steps: dividing a software defect prediction data set into three clusters of data samples according to the defect characteristics of the data samples in the data set; for each cluster, selecting a suitable sampling method according to the distribution of defective samples in that cluster and generating new defective data samples; and synthesizing a new data set from these samples to train a software defect prediction model. The method can effectively address both inter-class imbalance and intra-class imbalance in the data set and improves the software defect prediction effect.
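
The steps above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the candidate sampling methods, or the rule for choosing between them, so everything concrete here is an assumption made for illustration: plain k-means with k = 3 stands in for the clustering step, interpolation between two defective samples (in the style of SMOTE) is used when a cluster holds at least two defective samples, simple duplication is used when it holds exactly one, and each cluster is topped up until its defective count matches its non-defective count. The function names `kmeans`, `oversample_cluster`, and `rebalance` are hypothetical.

```python
import random


def kmeans(points, k=3, iters=20, seed=0):
    """Plain k-means (assumed stand-in for the clustering step).

    Returns one cluster index per point. Requires len(points) >= k.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters keep theirs.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def oversample_cluster(defective, n_new, rng):
    """Generate n_new synthetic defective samples for one cluster.

    Assumed selection rule: interpolate when two or more defective samples
    exist, duplicate when only one exists, generate nothing when none exist.
    """
    if n_new <= 0 or not defective:
        return []
    if len(defective) == 1:
        return [list(defective[0]) for _ in range(n_new)]
    new = []
    for _ in range(n_new):
        a, b = rng.sample(defective, 2)
        t = rng.random()
        # Point on the segment between two existing defective samples.
        new.append([x + t * (y - x) for x, y in zip(a, b)])
    return new


def rebalance(X, y, k=3, seed=0):
    """Return (X_new, y_new): the original data plus synthetic defective samples.

    Labels: 1 = defective, 0 = non-defective.
    """
    rng = random.Random(seed)
    assign = kmeans(X, k=k, seed=seed)
    X_new, y_new = list(X), list(y)
    for c in range(k):
        defective = [x for x, lab, a in zip(X, y, assign) if a == c and lab == 1]
        clean = [x for x, lab, a in zip(X, y, assign) if a == c and lab == 0]
        # Top up the cluster until defective and non-defective counts match.
        synthetic = oversample_cluster(defective, len(clean) - len(defective), rng)
        X_new.extend(synthetic)
        y_new.extend([1] * len(synthetic))
    return X_new, y_new
```

The rebalanced pair returned by `rebalance` would then serve as the training set for whichever classifier is used as the defect prediction model. Balancing inside each cluster, rather than over the whole data set, is what lets a sketch like this touch intra-class imbalance as well as inter-class imbalance.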