Software defect prediction method

A software defect prediction technology, applied in software testing/debugging, computer components, error detection/correction, etc., that addresses inter-class and intra-class imbalance in data sets, with the effects of improving software quality, improving prediction accuracy, and reducing cost.

Pending Publication Date: 2021-06-04
SHANGHAI MARITIME UNIVERSITY

AI Technical Summary

Problems solved by technology

[0005] The present invention proposes a software defect prediction method that divides the samples of a software defect prediction data set into three clusters using the K-means clustering algorithm and selects a suitable sampling method for each cluster according to the characteristics of its data samples. This addresses the imbalance between classes and within classes of the data set and improves the software defect prediction effect.
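The clustering step can be illustrated with a minimal sketch. It is not the patented implementation: the farthest-point initialization, the two-dimensional toy samples, and the function names `kmeans` and `defect_ratio_by_cluster` are assumptions made for illustration. It splits the samples into three clusters and reports the share of defective samples in each, which is the kind of per-cluster characteristic the method relies on.

```python
import math


def kmeans(points, k=3, iterations=50):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    # Assumption: deterministic farthest-point initialization.
    # The source does not say how the initial centroids are chosen.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    assignment = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no sample changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


def defect_ratio_by_cluster(assignment, labels, k=3):
    """Share of defective samples (label 1) in each cluster."""
    ratios = {}
    for c in range(k):
        member_labels = [y for a, y in zip(assignment, labels) if a == c]
        ratios[c] = sum(member_labels) / len(member_labels) if member_labels else 0.0
    return ratios


# Hypothetical toy data: two metric values per module, label 1 = defective.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
    (9.0, 1.0), (9.2, 1.1), (8.9, 0.8),
]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 0]

assignment = kmeans(points, k=3)
ratios = defect_ratio_by_cluster(assignment, labels, k=3)
print(assignment)
print(ratios)
```

On real data a library implementation of K-means would normally replace the hand-written loop; the loop is spelled out here only to keep the sketch self-contained.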

Examples

Embodiment Construction

[0036] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments. Advantages and features of the present invention will be apparent from the following description and claims. It should be noted that the drawings are all in a very simplified form and are not drawn to precise scale; they serve only to illustrate the embodiments of the present invention conveniently and clearly.

[0037] As shown in Figure 1, the software defect prediction method proposed by the present invention comprises the following steps:

[0038] S1. Obtain a software defect prediction data set, and perform feature selection processing on the data set;

[0039] In this embodiment, the MDP data set published by NASA is used, and the CM1, JM1, KC1, and PC3 data sets are selected as the software defect prediction data set. The software defect prediction data set also includes two types of data s...
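The excerpt does not say which feature selection technique step S1 uses. As one possible illustration only, the sketch below ranks features by the absolute Pearson correlation between each metric and the binary defect label and keeps the top-ranked ones. The metric names, the toy rows, and the helper names `pearson` and `select_features` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, names, keep):
    """Return the `keep` feature names most correlated with the label."""
    columns = list(zip(*rows))  # one tuple of values per feature
    scores = {name: abs(pearson(col, labels)) for name, col in zip(names, columns)}
    return sorted(names, key=lambda name: scores[name], reverse=True)[:keep]


# Hypothetical module metrics; label 1 = defective.
names = ["loc", "cyclomatic_complexity", "blank_lines"]
rows = [
    (10, 1, 3),
    (200, 12, 3),
    (15, 2, 3),
    (180, 9, 3),
    (20, 1, 3),
    (25, 7, 3),
]
labels = [0, 1, 0, 1, 0, 0]

selected = select_features(rows, labels, names, keep=2)
print(selected)
```

A constant column such as `blank_lines` in this toy data carries no information about the label, so it receives a score of zero and is dropped.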

Abstract

The invention discloses a software defect prediction method comprising the following steps: the data set is divided into three clusters of data samples according to the defect characteristics of the data samples in a software defect prediction data set; for each cluster, a suitable sampling method is selected according to the distribution of defective samples in that cluster and used to generate new defective data samples; and the resulting synthesized data set is used to train a software defect prediction model. The method can effectively address both the inter-class imbalance and the intra-class imbalance of the data set and improves the software defect prediction effect.
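The per-cluster sampling idea can be sketched as follows. The excerpt does not state which sampling method is paired with which kind of cluster, so the selection rule here is an assumption: a cluster with no defective samples gets nothing, a cluster with a single defective sample gets duplicates, and a cluster with two or more gets new points interpolated between pairs of defective samples (a simplified, SMOTE-like step without the nearest-neighbour search). The balancing target and all function names are likewise hypothetical.

```python
import random


def interpolate(a, b, rng):
    """New point on the segment between two defective samples."""
    t = rng.random()
    return tuple(x + t * (y - x) for x, y in zip(a, b))


def oversample_cluster(samples, labels, rng):
    """Synthetic defective samples for one cluster (hypothetical rule)."""
    defective = [s for s, y in zip(samples, labels) if y == 1]
    clean_count = len(samples) - len(defective)
    # Assumption: balance defective against non-defective inside the cluster.
    needed = max(0, clean_count - len(defective))
    if not defective:
        return []  # no defective sample to build on
    if len(defective) == 1:
        return [defective[0]] * needed  # only duplication is possible
    return [interpolate(*rng.sample(defective, 2), rng) for _ in range(needed)]


def rebalance(clusters, seed=0):
    """Merge all clusters plus their synthetic samples into one data set."""
    rng = random.Random(seed)
    samples, labels = [], []
    for cluster_samples, cluster_labels in clusters:
        synthetic = oversample_cluster(cluster_samples, cluster_labels, rng)
        samples += list(cluster_samples) + synthetic
        labels += list(cluster_labels) + [1] * len(synthetic)
    return samples, labels


# Hypothetical clusters, each given as (samples, labels); label 1 = defective.
clusters = [
    ([(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1)], [0, 0, 0, 0]),
    ([(5.0, 5.0), (5.2, 5.1), (4.9, 5.3), (5.1, 4.8)], [1, 0, 0, 0]),
    ([(9.0, 1.0), (9.4, 1.2), (9.1, 0.8), (8.8, 1.1), (9.2, 0.9)], [1, 1, 0, 0, 0]),
]

new_samples, new_labels = rebalance(clusters)
print(len(new_samples), sum(new_labels))
```

Because synthetic samples are generated inside each cluster separately, sparse groups of defective samples receive new points of their own instead of being outweighed by denser groups.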

Description

Technical field

[0001] The invention relates to the field of software defect prediction in software repository mining, in particular to a software defect prediction method based on a clustering combination sampling method.

Background technique

[0002] Identifying possible defects in software in advance through software defect prediction technology helps reduce test costs, improve test efficiency, and improve software performance and quality. However, software defect prediction accuracy is still not satisfactory because of the impact of data class imbalance.

[0003] Traditional sampling techniques, such as random oversampling methods, generate new defective samples for the overall data set. This produces a large number of duplicate samples in dense areas of defective samples, while too few samples are generated in sparse areas of defective samples, resulting in the number of overall samples reaching the expected value, to solve the problem of between-class imbal...
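The weakness of random oversampling described in the background can be seen in a small sketch. The toy coordinates, the target count, and the function name `random_oversample` are assumptions for illustration. Random oversampling only copies existing defective samples, so it creates no new distinct points, and most copies land where defective samples are already dense.

```python
import random
from collections import Counter


def random_oversample(defective, target_count, seed=0):
    """Duplicate randomly chosen defective samples up to target_count."""
    rng = random.Random(seed)
    extra = rng.choices(defective, k=target_count - len(defective))
    return defective + extra


# Hypothetical defective samples: six in a dense area, one in a sparse area.
dense = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.0), (1.0, 1.2), (1.2, 1.1), (0.8, 0.9)]
sparse = [(8.0, 7.5)]
defective = dense + sparse

oversampled = random_oversample(defective, target_count=70)
counts = Counter(oversampled)
dense_total = sum(counts[p] for p in dense)
sparse_total = counts[sparse[0]]

print("distinct points:", len(counts))
print("copies in dense area:", dense_total)
print("copies in sparse area:", sparse_total)
```

The overall class count reaches the target, but the sparse area is still represented by a single distinct point, which is the intra-class imbalance the clustering-based approach is meant to address.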

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06K9/62
CPC: G06F11/3688, G06F11/3692, G06F18/23213, G06F18/24323, G06F18/214
Inventors: 王颖, 任洪敏
Owner: SHANGHAI MARITIME UNIVERSITY