Three-stage data generation algorithm for multiclass network intrusion detection with highly imbalanced dataset

Almomani, A, Gupta, B B, Chui, Kwok Tai, Chaurasia, Priyanka, Arya, Varsha and Alhalabi, Wadee (2023) Three-stage data generation algorithm for multiclass network intrusion detection with highly imbalanced dataset. International Journal of Intelligent Networks, 4. pp. 202-210. ISSN 2666-6030

Abstract

The Internet plays a crucial role in our daily routines. Ensuring cybersecurity for Internet users provides a safe online environment. Automatic network intrusion detection (NID) using machine learning algorithms has recently received increased attention. The NID model is prone to bias towards the classes with more training samples because the datasets are highly imbalanced across different types of attacks. The challenge in generating additional training data for minority classes is that the generated data are insufficient. This study addresses the challenge by proposing a three-stage data generation algorithm that extends the data generation ability using the synthetic minority over-sampling technique, a generative adversarial network (GAN), and a variational autoencoder. A convolutional neural network is employed to extract representative features from the data, which are fed into a support vector machine with a customised kernel function. An ablation study evaluated the effectiveness of the three-stage data generation, feature extraction, and customised kernel. This was followed by a performance comparison between the proposed model and existing studies. The findings revealed that the proposed NID model achieved an accuracy of 91.9%–96.2% on four benchmark datasets. In addition, it outperformed existing methods such as GAN-based deep neural networks, a conditional Wasserstein GAN-based stacked autoencoder, a synthetic minority over-sampling technique-based random forest, and a variational autoencoder-based deep neural network, by 1.51%–28.4%.
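
As an illustration of the interpolation idea behind the synthetic minority over-sampling technique named in the abstract, the following is a minimal sketch only. It is not the authors' three-stage algorithm and does not cover the GAN or variational autoencoder stages. The function name `smote_like_oversample`, its parameters, and the toy data are hypothetical and chosen for this example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: three minority-class samples with two features each.
minority_samples = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_samples = smote_like_oversample(minority_samples, n_new=5)
```

Each synthetic sample lies on a line segment between two existing minority samples, so the new points stay inside the region already occupied by that class. In practice a maintained library implementation would normally be preferred over a hand-written one.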

Affiliation: Skyline University College
SUC Author(s): Almomani, A ORCID: https://orcid.org/0000-0002-8808-6114 and Gupta, B B
All Author(s): Almomani, A, Gupta, B B, Chui, Kwok Tai, Chaurasia, Priyanka, Arya, Varsha and Alhalabi, Wadee
Item Type: Article
Uncontrolled Keywords: Convolutional neural network, Data generation, Generative adversarial network, Kernel function, Multiclass classification, Network intrusion detection, Support vector machine, Synthetic minority over-sampling technique
Subjects: B Information Technology > BD Big Data Analytics
B Information Technology > BM Artificial Intelligence
B Information Technology > BW Computer Networks
Divisions: Skyline University College > School of IT
Depositing User: Mr Mosys Team
Date Deposited: 25 Dec 2023 13:28
Last Modified: 25 Dec 2023 13:28
URI: https://research.skylineuniversity.ac.ae/id/eprint/745
Publisher URL: https://doi.org/10.1016/j.ijin.2023.08.001