Optimization of DBSCAN algorithm using MapReduce method on network traffic data

Authors

  • Fejer HN, Directorate General of Education in Dewanyah
  • Falih MA, Directorate General of Education in Babylon

Keywords:

DBSCAN algorithm, MapReduce method, Network traffic data

Abstract

In this paper, a new method is proposed to address the weaknesses of previous algorithms. The proposed method performs density-based clustering within the MapReduce programming model. In our analysis, misleading data were used to demonstrate how the density-based clustering algorithm behaves and to expose the weakness of the base method on such data. Local clustering was then tested against competing methods on standard clustering data, and its superiority over these methods was established. When moving from local clustering to distributed clustering, the misleading data were used again to assess clustering quality. The quality of distributed clustering is lower than that of local clustering, but it remains superior to the base method. The advantage of the proposed method over competing methods in clustering quality was clearly established through distributed clustering of network data. Finally, the method for choosing the flexible parameter is described by evaluating the homogeneity and completeness criteria and the effect of this parameter on different types of data.
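The homogeneity and completeness criteria mentioned in the abstract can be illustrated with a minimal sketch. It follows the standard conditional-entropy definitions of Rosenberg and Hirschberg (2007) and is not the authors' implementation. The function names and the toy labels are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(
        (c / n) * log(c / given_counts[g]) for (_, g), c in joint.items()
    )


def homogeneity_completeness(true_labels, cluster_labels):
    """Return (homogeneity, completeness) for one clustering.

    homogeneity  = 1 - H(class | cluster) / H(class)
    completeness = 1 - H(cluster | class) / H(cluster)
    A score is defined as 1.0 when its denominator entropy is zero.
    """
    if len(true_labels) != len(cluster_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and equal length")
    h_class = _entropy(true_labels)
    h_cluster = _entropy(cluster_labels)
    homogeneity = (
        1.0
        if h_class == 0
        else 1.0 - _conditional_entropy(true_labels, cluster_labels) / h_class
    )
    completeness = (
        1.0
        if h_cluster == 0
        else 1.0 - _conditional_entropy(cluster_labels, true_labels) / h_cluster
    )
    return homogeneity, completeness


if __name__ == "__main__":
    # Hypothetical toy labels: two true classes, each split into two clusters.
    truth = [0, 0, 1, 1]
    over_split = [0, 1, 2, 3]
    print(homogeneity_completeness(truth, over_split))
```

In this toy case every cluster contains a single class, so homogeneity stays at 1, while splitting each class across two clusters halves completeness. A parameter that makes the clustering more or less fine-grained trades these two scores against each other, which is why they are a natural pair for tuning it.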

Published

2024-02-26

How to Cite

Fejer, H. N., & Falih, M. (2024). Optimization of DBSCAN algorithm using MapReduce method on network traffic data. COMPUSOFT: An International Journal of Advanced Computer Technology, 8(03), 3097–3102. Retrieved from https://ijact.in/index.php/j/article/view/487

Section

Original Research Article
