NETWORK INTRUSION DETECTION SYSTEMS USING SUPERVISED MACHINE LEARNING CLASSIFICATION AND DIMENSIONALITY REDUCTION TECHNIQUES: A SYSTEMATIC REVIEW

(Received: 21-Aug.-2021, Revised: 4-Nov.-2021, Accepted: 14-Nov.-2021)

Authors Zein Ashi, Laila Aburashed, Mahmoud Al-Qudah, Abdallah Qusef

Keywords #Network intrusion detection #Machine learning #Supervised learning #Dimensionality #Systematic review

Abstract Protecting the confidentiality, integrity and availability of cyberspace and network (NW) assets has become an increasing concern. The rapid growth of the Internet and the emergence of new computing systems (such as the Cloud) create strong incentives for intruders. Security engineers therefore have to develop new technologies to match the growing threats to NWs. Advanced techniques based on machine learning (ML) and dimensionality reduction have emerged to help security engineers build more efficient and effective NW Intrusion Detection Systems (NIDSs). This systematic review surveys the most recent NIDSs that use supervised ML classification and dimensionality reduction techniques. It shows how the ML classifiers, dimensionality reduction techniques and evaluation metrics used have improved NIDS construction. The main aim of this study is to provide up-to-date knowledge for researchers new to the field.
