Comparative analysis of the data structure for mining all frequent itemsets

Authors

  • Khongtuk, T., Faculty of Business Administration and Information Technology, Rajamangala University of Technology Suvarnabhumi, Suphanburi, Thailand

Keywords:

data mining, apriori algorithm, itemset

Abstract

Discovering all frequent itemsets is one of the most important steps in association rule mining. Typically, a minimum support threshold is used as the criterion for selecting interesting itemsets. Many researchers have focused on improving the efficiency of frequent itemset mining algorithms in various ways, for example through data reduction, new data structures, and search space reduction. This study analyzes the advantages and disadvantages of four types of data structure: Map Itemset - Horizontal Data, Map Itemset - Vertical Data, Map Different Set - Horizontal Data, and Map Different Set - Vertical Data. The experiment was conducted on six datasets, both dense and sparse, from the UCI standard datasets. The results show that Map Different Set - Horizontal Data reduces the size of the datasets more than the other techniques on dense datasets, while Map Itemset - Horizontal Data reduces the size more than the other techniques on sparse datasets.
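
The abstract does not define the four structures in detail, so the following is only an illustrative sketch, not the author's implementation. It assumes that "horizontal" means a transaction-to-items layout, that "vertical" means an item-to-transaction-IDs layout (a tidset), and that a "different set" stores the transaction IDs that do not contain an item (a diffset). The toy transactions and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data in horizontal layout: transaction ID -> items.
# These are NOT transactions from the study's UCI datasets.
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"a", "b"},
    4: {"b", "c"},
}


def to_vertical(horizontal):
    """Convert tid -> items into item -> set of tids (a tidset)."""
    vertical = defaultdict(set)
    for tid, items in horizontal.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def to_diffsets(vertical, all_tids):
    """For each item, keep only the tids that do NOT contain it."""
    return {item: all_tids - tids for item, tids in vertical.items()}


def support_from_tidset(tidset):
    """Support is the number of transactions containing the item."""
    return len(tidset)


def support_from_diffset(diffset, n_transactions):
    """Support is recovered as total transactions minus the diffset size."""
    return n_transactions - len(diffset)


def frequent_items(vertical, min_support):
    """Return the items whose support meets the minimum support count."""
    return sorted(i for i, t in vertical.items() if len(t) >= min_support)


all_tids = set(transactions)
vertical = to_vertical(transactions)
diffsets = to_diffsets(vertical, all_tids)

# Number of stored transaction IDs under each representation.
stored_ids = {
    "tidsets": sum(len(t) for t in vertical.values()),
    "diffsets": sum(len(d) for d in diffsets.values()),
}
```

In this small, dense example the tidsets hold 9 transaction IDs in total while the diffsets hold 3, and both give a support of 3 for item "a". This is in line with the abstract's observation that difference-based structures shrink dense data more; on sparse data, where each item appears in few transactions, the diffsets would be the larger representation.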

Published

2024-02-26

How to Cite

Khongtuk, T. (2024). Comparative analysis of the data structure for mining all frequent itemsets. COMPUSOFT: An International Journal of Advanced Computer Technology, 7(11), 2867–2873. Retrieved from https://ijact.in/index.php/j/article/view/452

Section

Original Research Article
