NUCLEAR: An Efficient Method for Mining Frequent Itemsets and Generators from Closed Frequent Itemsets

Huy Quang Pham, Duc Tran, Ninh Bao Duong, Philippe Fournier-Viger and Alioune Ngom

Abstract


Frequent itemset (FI) mining is an interesting data mining task. Rather than mining the FIs directly from the data, it is often preferable to mine only the closed frequent itemsets (CFIs) first and then extract the FIs from each CFI. However, some algorithms require the generators of each CFI in order to extract the FIs, which incurs an extra cost. In this paper, we introduce an effective algorithm, called NUCLEAR, which derives the FIs from the lattice of CFIs without needing the generators. It can also enumerate the generators in a similar fashion. Experimental results show that NUCLEAR is effective compared with previous studies; in particular, the time for extracting the FIs is usually much smaller than the time for mining the CFIs.
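To make the idea of extracting FIs from CFIs concrete, the following is a minimal Python sketch of a naive baseline, not the NUCLEAR algorithm itself. It relies only on the standard property that every frequent itemset is a subset of some closed frequent itemset and that its support equals the largest support among the CFIs containing it. The function name `frequent_itemsets_from_closed` and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets_from_closed(closed):
    """Naive baseline (hypothetical helper, not NUCLEAR).

    closed: dict mapping each closed frequent itemset (a frozenset)
            to its support count.
    Returns a dict mapping every frequent itemset to its support.
    """
    supports = {}
    for cfi, sup in closed.items():
        items = sorted(cfi)
        # Every non-empty subset of a CFI is frequent.
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                key = frozenset(subset)
                # The support of an itemset is the maximum support
                # over all closed itemsets that contain it.
                if sup > supports.get(key, 0):
                    supports[key] = sup
    return supports


# Toy data (assumed): transactions {a}, {a,b}, {a,b,c}, {a,b,c}
# with minimum support 2 give these closed frequent itemsets.
closed = {
    frozenset("a"): 4,
    frozenset("ab"): 3,
    frozenset("abc"): 2,
}
fis = frequent_itemsets_from_closed(closed)
for itemset, sup in sorted(fis.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), sup)
```

This baseline enumerates the subsets of every CFI and so revisits the same itemset many times; avoiding that kind of redundancy, without first computing generators, is the sort of cost the abstract says NUCLEAR addresses by working on the lattice of CFIs.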

Keywords


Association rule, minimal association rule, kernel and extendable set, frequent itemset, closed frequent itemset, mining frequent itemset from closed frequent itemset, NUCLEAR.


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

IT in Industry @ http://www.it-in-industry.com . ISSN (Online): 2203-1731; ISSN (Print): 2204-0595