Data and Ensemble Machine Learning Fusion Based Intelligent Software Defect Prediction System

Adnan Khan, Muhammad, Ghazal, T M, Abbas, Sagheer, Aftab, Shabib, Al Hamadi, Hussam and Yeob Yeun, Chan (2023) Data and Ensemble Machine Learning Fusion Based Intelligent Software Defect Prediction System. Computers, Materials & Continua, 75 (3). pp. 6083-6100. ISSN 1546-2226

Full text not available from this repository.

Abstract

The software engineering field has long focused on creating high-quality software despite limited resources. Detecting defects before the testing stage of software development can enable quality assurance engineers to concentrate on problematic modules rather than all the modules. This approach can enhance the quality of the final product while lowering development costs. Identifying defective modules early on can allow for early corrections and ensure the timely delivery of a high-quality product that satisfies customers and instills greater confidence in the development team. This process is known as software defect prediction, and it can improve end-product quality while reducing the cost of testing and maintenance. This study proposes a software defect prediction system that utilizes data fusion, feature selection, and ensemble machine learning fusion techniques. A novel filter-based metric selection technique is proposed in the framework to select the optimum features. A three-step nested approach is presented for predicting defective modules to achieve high accuracy. In the first step, three supervised machine learning techniques, including Decision Tree, Support Vector Machines, and Naïve Bayes, are used to detect faulty modules. The second step involves integrating the predictive accuracy of these classification techniques through three ensemble machine-learning methods: Bagging, Voting, and Stacking. Finally, in the third step, a fuzzy logic technique is employed to integrate the predictive accuracy of the ensemble machine learning techniques. The experiments are performed on a fused software defect dataset to ensure that the developed fused ensemble model can perform effectively on diverse datasets. Five NASA datasets are integrated to create the fused dataset: MW1, PC1, PC3, PC4, and CM1. According to the results, the proposed system exhibited superior performance to other advanced techniques for predicting software defects, achieving a remarkable accuracy rate of 92.08%.
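Illustrative note (not part of the published abstract): the second and third steps described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names `majority_vote` and `fuzzy_fuse`, the linear membership functions, and the "at least two ensembles agree" rule base are all hypothetical, since the abstract does not specify the fuzzy rules or the voting scheme used.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting over base classifiers (an assumed voting scheme).

    predictions: one list of 0/1 labels per classifier, all the same length.
    Returns the most common label for each module.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def fuzzy_fuse(bagging, voting, stacking):
    """Fuse three ensemble defect scores in [0, 1] with a toy fuzzy rule base.

    Assumed memberships: high(p) = p, low(p) = 1 - p.
    Assumed rules: a module is "defective" if at least two ensembles score
    high, and "clean" if at least two score low (AND = min, OR = max).
    Returns 1 (defective) or 0 (clean).
    """
    scores = (bagging, voting, stacking)
    pairs = [(0, 1), (0, 2), (1, 2)]
    # Firing strength of "at least two ensembles are high".
    defective = max(min(scores[i], scores[j]) for i, j in pairs)
    # Firing strength of "at least two ensembles are low".
    clean = max(min(1 - scores[i], 1 - scores[j]) for i, j in pairs)
    return 1 if defective >= clean else 0


# Hypothetical labels from three base classifiers for three modules.
base_labels = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
voted = majority_vote(base_labels)

# Hypothetical defect scores from the three ensembles for one module.
decision = fuzzy_fuse(0.9, 0.4, 0.7)
```

With these particular membership functions the rule base reduces to thresholding the median score at 0.5, which keeps the example easy to check by hand; a real fuzzy system would likely use richer membership functions and a defuzzification step.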

Affiliation: Skyline University College
SUC Author(s): Adnan Khan, Muhammad and Ghazal, T M ORCID: https://orcid.org/0000-0003-0672-7924
All Author(s): Adnan Khan, Muhammad, Ghazal, T M, Abbas, Sagheer, Aftab, Shabib, Al Hamadi, Hussam and Yeob Yeun, Chan
Item Type: Article
Subjects: B Information Technology > BF Software Engineering
B Information Technology > BL Machine Learning
B Information Technology > BM Artificial Intelligence
B Information Technology > BT Data Management
Divisions: Skyline University College > School of IT
Depositing User: Mr Mosys Team
Date Deposited: 25 Dec 2023 13:42
Last Modified: 25 Dec 2023 13:42
URI: https://research.skylineuniversity.ac.ae/id/eprint/718
Publisher URL: https://doi.org/10.32604/cmc.2023.037933
Publisher OA policy: https://v2.sherpa.ac.uk/id/publication/37365
