Implementing Data Mining for Detection of Malware from Code
Keywords:
Data Mining, malware, virus data sets
Abstract
In this paper we discuss data mining techniques that we have successfully applied to cyber security. The research investigates the use of data mining methods for detecting malware (malicious programs) and proposes a framework as an alternative to traditional signature-based detection. The applications include malicious code detection by mining binary executables, anomaly detection, and data stream mining. Malicious executables are a serious security threat today, especially new, previously unseen ones, which often arrive as email attachments and are created at a rate of thousands every year. Our research is closely related to information retrieval and classification techniques and borrows a number of ideas from those fields. Current anti-virus systems attempt to detect new malicious programs with heuristics generated by hand, an approach that is costly and often ineffective. We present a data mining framework that detects new, previously unseen malicious executables accurately and automatically. The framework automatically found patterns in our data set and used these patterns to detect a set of new malicious binaries. Compared with a traditional signature-based method, our method more than doubles the detection rate for new malicious executables.
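As a rough illustration of what mining binary executables for patterns can look like, the sketch below counts overlapping byte n-grams in each sample and trains a multinomial naive Bayes classifier on them. The abstract does not state which features or classifier the framework uses, so the n-gram features, the naive Bayes model, the names `byte_ngrams` and `NaiveBayesNgram`, and the toy byte strings are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


class NaiveBayesNgram:
    """Multinomial naive Bayes over byte n-grams (hypothetical helper class)."""

    def __init__(self, n: int = 2):
        self.n = n
        self.counts = {}       # label -> Counter of n-gram frequencies
        self.docs = Counter()  # label -> number of training samples
        self.vocab = set()     # all n-grams seen during training

    def fit(self, samples):
        for data, label in samples:
            grams = byte_ngrams(data, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, data: bytes) -> str:
        grams = byte_ngrams(data, self.n)
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Log prior plus add-one smoothed log likelihood of each n-gram.
            score = math.log(self.docs[label] / total_docs)
            for gram, count in grams.items():
                score += count * math.log((counter[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data: made-up byte strings, not real executables or malware.
train = [
    (b"\x90\x90\x90\xcc\xcc\xeb\xfe", "malicious"),
    (b"\x90\x90\xcc\xcc\xcc\xeb\xfe\x90", "malicious"),
    (b"hello world, plain text", "benign"),
    (b"world of plain hello text", "benign"),
]

model = NaiveBayesNgram(n=2).fit(train)
print(model.predict(b"\x90\x90\xcc\xeb\xfe"))
print(model.predict(b"plain hello"))
```

Unlike a signature scanner, which needs an exact match against a known sample, a statistical model of this kind can label a byte sequence it has never seen, as long as it shares patterns with the training examples. A realistic system would need a large labeled corpus, feature selection, and evaluation on held-out samples.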
License
Copyright (c) 2014 COMPUSOFT: An International Journal of Advanced Computer Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.