Sequential Pattern Mining for Intrusion Detection System with Feature Selection on Big Data

  • Fidalcastro, A (Sathyabama University Department of Computer Science and Engineering)
  • Baburaj, E (Sun College of Engineering and Technology Department of Computer Science and Engineering)
  • Received : 2016.05.03
  • Accepted : 2017.03.09
  • Published : 2017.10.31

Abstract

Big data refers to data sets so large and varied that commonly used data-processing software cannot handle them. A large network generates a high volume of multi-dimensional log data covering both normal and abnormal activity, all of which must be processed. An Intrusion Detection System (IDS) is required to monitor the network and detect malicious nodes and activities, but the massive amount of data makes threats and attacks difficult to detect. Sequential pattern mining, which has become popular because it takes into account the quantities, profits, and time order of items, can be used to identify patterns of malicious activity. We propose a sequential pattern mining algorithm with fuzzy logic feature selection and fuzzy weighted support for huge volumes of network logs, to be implemented on Apache Hadoop YARN, which addresses the problem of speed and time constraints. Fuzzy logic feature selection picks the important features from the feature set. Fuzzy weighted support assigns weights to the inputs and avoids multiple scans. In our simulation, we use attack logs from an NS-2 MANET environment and compare the proposed algorithm with the state-of-the-art sequential pattern mining algorithm SPADE and with a Support Vector Machine in a Hadoop environment.
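
The abstract does not give the algorithm itself, so the following is only a minimal, single-machine sketch of the general idea of weighted support for a sequential pattern. It is not the authors' method. The triangular membership function, the function names (`triangular`, `contains`, `weighted_support`), and the sample log sequences and weights are all assumptions made for illustration.

```python
from typing import Hashable, Sequence


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular fuzzy membership (assumed shape): 0 outside (a, c), 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def contains(sequence: Sequence[Hashable], pattern: Sequence[Hashable]) -> bool:
    """True if pattern occurs in sequence in order (gaps between items allowed)."""
    it = iter(sequence)
    # "item in it" advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def weighted_support(
    pattern: Sequence[Hashable],
    sequences: Sequence[Sequence[Hashable]],
    weights: Sequence[float],
) -> float:
    """Share of total weight carried by the sequences that contain pattern."""
    total = sum(weights)
    if total == 0:
        return 0.0
    matched = sum(w for s, w in zip(sequences, weights) if contains(s, pattern))
    return matched / total


# Hypothetical event sequences, one per node, with hypothetical weights.
logs = [["scan", "flood", "drop"], ["scan", "drop"], ["hello", "data"]]
weights = [1.0, 0.5, 0.5]
print(weighted_support(["scan", "drop"], logs, weights))  # 0.75
```

In this sketch a pattern is frequent when its weighted support exceeds a chosen threshold. The weights could come from a fuzzy membership value such as `triangular(...)` applied to a selected feature, which is again an assumption rather than something the abstract specifies.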
