Malware Family Recommendation using Multiple Sequence Alignment

  • 조인겸 (Department of Computer Software Engineering, Hanyang University)
  • 임을규 (Division of Computer Engineering, Hanyang University)
  • Received : 2015.07.22
  • Accepted : 2015.11.13
  • Published : 2016.03.15

Abstract

Malware authors spread variants of their malware in order to evade detection. Antivirus products based on static analysis have difficulty detecting such variants, so dynamic analysis based on API call information is necessary. In this paper, we propose a malware family recommendation method that assists malware analysts in classifying malware variants into families. The method extracts the API call information of each malware family through dynamic analysis and then applies the multiple sequence alignment technique to the extracted API call information. A signature for each family is extracted from the alignment results. Based on the similarity between an unknown malware sample and these signatures, the method recommends up to three candidate families for the sample. We also measured the accuracy of the proposed method in an experiment using real malware samples.
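The pipeline described in the abstract (per-family API call sequences, a signature per family, similarity-based ranking of up to three candidates) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the alignment tool settings, the signature extraction rule, or the similarity measure. As a stand-in for true multiple sequence alignment, the sketch folds a pairwise longest common subsequence (LCS) over a family's traces to obtain a crude consensus, and it scores a sample by the fraction of a signature it covers. The family names, the API traces, and all function names below are hypothetical.

```python
from functools import reduce


def lcs(a, b):
    """Return one longest common subsequence of two API call sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Trace back through the table to recover the subsequence itself.
    out = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def family_signature(sequences):
    """Crude consensus: calls shared, in order, by every trace in the family.

    Assumption: this replaces the MSA-based signature extraction in the paper.
    """
    return reduce(lcs, sequences)


def similarity(sample, signature):
    """Fraction of the signature that appears, in order, in the sample."""
    if not signature:
        return 0.0
    return len(lcs(sample, signature)) / len(signature)


def recommend(sample, signatures, k=3):
    """Return up to k family names, most similar signature first."""
    ranked = sorted(
        signatures,
        key=lambda family: similarity(sample, signatures[family]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical traces; real ones would come from a sandbox run.
FAMILIES = {
    "dropper": [
        ["CreateFileW", "WriteFile", "CloseHandle", "CreateProcessW"],
        ["CreateFileW", "ReadFile", "WriteFile", "CreateProcessW"],
    ],
    "bot": [
        ["socket", "connect", "send", "recv"],
        ["socket", "connect", "Sleep", "send", "recv"],
    ],
    "regmod": [
        ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"],
        ["RegOpenKeyExW", "RegQueryValueExW", "RegSetValueExW", "RegCloseKey"],
    ],
    "keylog": [
        ["SetWindowsHookExW", "GetMessageW", "WriteFile"],
        ["SetWindowsHookExW", "GetMessageW", "CreateFileW", "WriteFile"],
    ],
}
SIGNATURES = {name: family_signature(seqs) for name, seqs in FAMILIES.items()}
UNKNOWN = ["CreateFileW", "WriteFile", "socket", "connect", "send", "CreateProcessW"]

if __name__ == "__main__":
    print(recommend(UNKNOWN, SIGNATURES))  # ['dropper', 'bot', 'keylog']
```

Folding pairwise LCS keeps the sketch short, but it only retains calls present in every trace; an MSA-based signature can also tolerate gaps and partial matches, which is presumably why the paper uses alignment instead.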

Acknowledgement

Supported by: Institute for Information & Communications Technology Promotion (IITP)
