Outlier Detection via Online Oversampling in High Dimensional Space
Keywords:
Anomaly detection, online updating, oversampling, principal component analysis
Abstract
Anomaly detection is an important topic in data mining. Many applications, such as intrusion detection and credit card fraud detection, require an efficient method for identifying deviated data instances. However, most anomaly detection methods are implemented in batch mode and thus cannot be easily extended to large-scale problems without incurring heavy computation and memory costs. This paper proposes an online oversampling principal component analysis (osPCA) algorithm that aims to detect outliers in a large amount of data via an online updating technique. Prior principal component analysis (PCA)-based approaches store the entire data matrix or covariance matrix; the proposed method does not, which makes it especially useful for online or large-scale problems. By oversampling the target instance and extracting the principal direction of the data, osPCA determines how anomalous the target instance is according to the variation of the resulting dominant eigenvector. Because osPCA does not need to perform eigen analysis explicitly, it is applicable to online applications. Compared with the well-known power method for PCA and other popular anomaly detection algorithms, our experimental results verify that the proposed method is feasible in terms of both efficiency and accuracy.
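The oversampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's online algorithm: for clarity it recomputes the dominant eigenvector with a full eigendecomposition (`numpy.linalg.eigh`), which is exactly the step the proposed method is designed to avoid. The function names `dominant_direction` and `ospca_scores`, the oversampling ratio `r`, and the score definition (one minus the absolute cosine between the original and the oversampled principal directions) are assumptions made for illustration. What the sketch does share with the abstract is that the oversampled mean and covariance are derived from stored summary statistics, so the target instance is never physically duplicated.

```python
import numpy as np

def dominant_direction(cov):
    """Return the unit eigenvector of the largest eigenvalue of a symmetric matrix."""
    # eigh returns eigenvalues in ascending order, so the last column is dominant.
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -1]

def ospca_scores(X, r=0.1):
    """Score each row of X by how much oversampling it rotates the principal direction.

    r is a hypothetical oversampling ratio: the target instance is treated as if
    it were duplicated r * n times, where n is the number of rows.
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    mu = X.mean(axis=0)
    Q = X.T @ X / n                      # second-moment matrix of the data
    u = dominant_direction(Q - np.outer(mu, mu))

    scores = np.empty(n)
    for t, x in enumerate(X):
        # Mean and covariance of the data set with x oversampled, in closed form.
        mu_t = (mu + r * x) / (1.0 + r)
        cov_t = (Q + r * np.outer(x, x)) / (1.0 + r) - np.outer(mu_t, mu_t)
        u_t = dominant_direction(cov_t)
        # Eigenvectors are defined up to sign, hence the absolute value.
        scores[t] = 1.0 - abs(u @ u_t)
    return scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data stretched along the first axis, plus one off-axis point.
    X = rng.normal(size=(200, 2)) * np.array([3.0, 0.3])
    X = np.vstack([X, [5.0, 8.0]])
    scores = ospca_scores(X, r=0.1)
    print("index with the largest score:", int(np.argmax(scores)))
```

A larger score means that oversampling the instance changes the principal direction more, which is the signal the abstract describes. The synthetic data in the usage section is invented for this sketch and is not taken from the paper's experiments.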
License
Copyright (c) 2014 COMPUSOFT: An International Journal of Advanced Computer Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.