Association rule learning

Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.[1]

Alternative measures of interestingness

In addition to confidence, other measures of interestingness for rules have been proposed. Some popular measures are:

  • All-confidence[7]
  • Collective strength[8]
  • Leverage[9]

Several more measures are presented and compared by Tan et al.[10] and by Hahsler.[4] Looking for techniques that can model what the user has known (and using these models as interestingness measures) is currently an active research trend under the name of “Subjective Interestingness.”


Statistically sound associations

One limitation of the standard approach to discovering associations is that by searching massive numbers of possible associations to look for collections of items that appear to be associated, there is a large risk of finding many spurious associations. These are collections of items that co-occur with unexpected frequency in the data, but only do so by chance. For example, suppose we are considering a collection of 10,000 items and looking for rules containing two items in the left-hand-side and 1 item in the right-hand-side. There are approximately 1,000,000,000,000 such rules. If we apply a statistical test for independence with a significance level of 0.05 it means there is only a 5% chance of accepting a rule if there is no association. If we assume there are no associations, we should nonetheless expect to find 50,000,000,000 rules. Statistically sound association discovery[18][19] controls this risk, in most cases reducing the risk of finding any spurious associations to a user-specified significance level.


Many algorithms for generating association rules have been proposed.

Some well-known algorithms are Apriori, Eclat and FP-Growth, but they only do half the job, since they are algorithms for mining frequent itemsets. Another step needs to be done after to generate rules from frequent itemsets found in a database.

Apriori algorithm

Apriori[13] uses a breadth-first search strategy to count the support of itemsets and uses a candidate generation function which exploits the downward closure property of support.

Eclat algorithm

Eclat[14] (alt. ECLAT, stands for Equivalence Class Transformation) is a depth-first search algorithm based on set intersection. It is suitable for both sequential as well as parallel execution with locality-enhancing properties.[20][21]



The ASSOC procedure[24] is a GUHA method which mines for generalized association rules using fast bitstrings operations. The association rules mined by this method are more general than those output by apriori, for example “items” can be connected both with conjunction and disjunctions and the relation between antecedent and consequent of the rule is not restricted to setting minimum support and confidence as in apriori: an arbitrary combination of supported interest measures can be used.

OPUS search

OPUS is an efficient algorithm for rule discovery that, in contrast to most alternatives, does not require either monotone or anti-monotone constraints such as minimum support.[25] Initially used to find rules for a fixed consequent[25][26] it has subsequently been extended to find rules with any item as a consequent.[27] OPUS search is the core technology in the popular Magnum Opus association discovery system.


A famous story about association rule mining is the “beer and diaper” story. A purported survey of behavior of supermarket shoppers discovered that customers (presumably young men) who buy diapers tend also to buy beer. This anecdote became popular as an example of how unexpected association rules might be found from everyday data. There are varying opinions as to how much of the story is true.[28] Daniel Powers says:[28]

In 1992, Thomas Blischok, manager of a retail consulting group at Teradata, and his staff prepared an analysis of 1.2 million market baskets from about 25 Osco Drug stores. Database queries were developed to identify affinities. The analysis “did discover that between 5:00 and 7:00 p.m. that consumers bought beer and diapers”. Osco managers did NOT exploit the beer and diapers relationship by moving the products closer together on the shelves.

Other types of association rule mining

Multi-Relation Association Rules: Multi-Relation Association Rules (MRAR) are association rules where each item may have several relations. These relations indicate indirect relationship between the entities. Consider the following MRAR where the first item consists of three relations live innearby and humid: “Those who live in a place which is nearby a city with humid climate type and also are younger than 20 -> their health condition is good”. Such association rules are extractable from RDBMS data or semantic web data.[29]

Contrast set learning is a form of associative learning. Contrast set learners use rules that differ meaningfully in their distribution across subsets.[30][31]

Weighted class learning is another form of associative learning in which weight may be assigned to classes to give focus to a particular issue of concern for the consumer of the data mining results.

High-order pattern discovery facilitate the capture of high-order (polythetic) patterns or event associations that are intrinsic to complex real-world data. [32]

K-optimal pattern discovery provides an alternative to the standard approach to association rule learning that requires that each pattern appear frequently in the data.

Approximate Frequent Itemset mining is a relaxed version of Frequent Itemset mining that allows some of the items in some of the rows to be 0.[33]

Generalized Association Rules hierarchical taxonomy (concept hierarchy)

Quantitative Association Rules categorical and quantitative data

Interval Data Association Rules e.g. partition the age into 5-year-increment ranged

Sequential pattern mining discovers subsequences that are common to more than minsup[clarification needed] sequences in a sequence database, where minsup is set by the user. A sequence is an ordered list of transactions.[34]

Subspace Clustering, a specific type of Clustering high-dimensional data, is in many variants also based on the downward-closure property for specific clustering models.[35]

Warmr is shipped as part of the ACE data mining suite. It allows association rule learning for first order relational rules.[36]


  1. ^Piatetsky-Shapiro, Gregory (1991), Discovery, analysis, and presentation of strong rules, in Piatetsky-Shapiro, Gregory; and Frawley, William J.; eds., Knowledge Discovery in Databases, AAAI/MIT Press, Cambridge, MA.
  2. ^ Jump up to:ab c d e f Agrawal, R.; Imieliński, T.; Swami, A. (1993). “Mining association rules between sets of items in large databases”. Proceedings of the 1993 ACM SIGMOD international conference on Management of data – SIGMOD ’93. p. 207. CiteSeerX doi:10.1145/170035.170072. ISBN 978-0897915922.
  3. ^ Jump up to:ab c Hahsler, Michael (2005). “Introduction to arules – A computational environment for mining association rules and frequent item sets” (PDF). Journal of Statistical Software.
  4. ^ Jump up to:ab Michael Hahsler (2015). A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules.
  5. ^Hipp, J.; Güntzer, U.; Nakhaeizadeh, G. (2000). “Algorithms for association rule mining — a general survey and comparison”. ACM SIGKDD Explorations Newsletter. 2: 58–64. CiteSeerX doi:10.1145/360402.360421.
  6. ^Brin, Sergey; Motwani, Rajeev; Ullman, Jeffrey D.; Tsur, Shalom (1997). “Dynamic itemset counting and implication rules for market basket data”. Proceedings of the 1997 ACM SIGMOD international conference on Management of data – SIGMOD ’97. pp. 255–264. CiteSeerX doi:10.1145/253260.253325. ISBN 978-0897919111.
  7. ^Omiecinski, E.R. (2003). “Alternative interest measures for mining associations in databases”. IEEE Transactions on Knowledge and Data Engineering. 15: 57–69. CiteSeerX doi:10.1109/TKDE.2003.1161582.
  8. ^Aggarwal, Charu C.; Yu, Philip S. (1998). “A new framework for itemset generation”. Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems – PODS ’98. pp. 18–24. CiteSeerX doi:10.1145/275487.275490. ISBN 978-0897919968.
  9. ^Piatetsky-Shapiro, Gregory; Discovery, analysis, and presentation of strong rules, Knowledge Discovery in Databases, 1991, pp. 229-248
  10. ^Tan, Pang-Ning; Kumar, Vipin; Srivastava, Jaideep (2004). “Selecting the right objective measure for association analysis”. Information Systems. 29 (4): 293–313. CiteSeerX doi:10.1016/S0306-4379(03)00072-3.
  11. ^Tan, Pang-Ning; Michael, Steinbach; Kumar, Vipin (2005). “Chapter 6. Association Analysis: Basic Concepts and Algorithms” (PDF). Introduction to Data Mining. Addison-Wesley. ISBN 978-0-321-32136-7.
  12. ^Jian Pei; Jiawei Han; Lakshmanan, L.V.S. (2001). “Mining frequent itemsets with convertible constraints”. Proceedings 17th International Conference on Data Engineering. pp. 433–442. CiteSeerX doi:10.1109/ICDE.2001.914856. ISBN 978-0-7695-1001-9.
  13. ^ Jump up to:ab Agrawal, Rakesh; and Srikant, Ramakrishnan; Fast algorithms for mining association rules in large databases Archived2015-02-25 at the Wayback Machine, in Bocca, Jorge B.; Jarke, Matthias; and Zaniolo, Carlo; editors, Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), Santiago, Chile, September 1994, pages 487-499
  14. ^ Jump up to:ab Zaki, M. J. (2000). “Scalable algorithms for association mining”. IEEE Transactions on Knowledge and Data Engineering. 12 (3): 372–390. CiteSeerX doi:10.1109/69.846291.
  15. ^Hájek, P.; Havel, I.; Chytil, M. (1966). “The GUHA method of automatic hypotheses determination”. Computing. 1 (4): 293–308. doi:10.1007/BF02345483.
  16. ^Hájek, Petr; Rauch, Jan; Coufal, David; Feglar, Tomáš (2004). “The GUHA Method, Data Preprocessing and Mining”. Database Support for Data Mining Applications. Lecture Notes in Computer Science. 2682. pp. 135–153. doi:10.1007/978-3-540-44497-8_7. ISBN 978-3-540-22479-2.
  17. ^Webb, Geoffrey (1989). “A Machine Learning Approach to Student Modelling”. Proceedings of the Third Australian Joint Conference on Artificial Intelligence (AI 89): 195–205.
  18. ^Webb, Geoffrey I. (2007). “Discovering Significant Patterns”. Machine Learning. 68: 1–33. doi:10.1007/s10994-007-5006-x.
  19. ^Gionis, Aristides; Mannila, Heikki; Mielikäinen, Taneli; Tsaparas, Panayiotis (2007). “Assessing data mining results via swap randomization”. ACM Transactions on Knowledge Discovery from Data. 1 (3): 14–es. CiteSeerX doi:10.1145/1297332.1297338.
  20. ^Zaki, Mohammed Javeed; Parthasarathy, Srinivasan; Ogihara, Mitsunori; Li, Wei (1997). “New Algorithms for Fast Discovery of Association Rules”: 283–286. CiteSeerX hdl:1802/501.
  21. ^Zaki, Mohammed J.; Parthasarathy, Srinivasan; Ogihara, Mitsunori; Li, Wei (1997). “Parallel Algorithms for Discovery of Association Rules”. Data Mining and Knowledge Discovery. 1 (4): 343–373. doi:10.1023/A:1009773317876.
  22. ^Han (2000). Mining Frequent Patterns Without Candidate Generation. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. SIGMOD ’00. pp. 1–12. CiteSeerX doi:10.1145/342009.335372. ISBN 978-1581132175.
  23. ^Witten, Frank, Hall: Data mining practical machine learning tools and techniques, 3rd edition[page needed]
  24. ^Hájek, Petr; Havránek, Tomáš (1978). Mechanizing Hypothesis Formation: Mathematical Foundations for a General Theory. Springer-Verlag. ISBN 978-3-540-08738-0.
  25. ^ Jump up to:ab Webb, Geoffrey I. (1995); OPUS: An Efficient Admissible Algorithm for Unordered Search, Journal of Artificial Intelligence Research 3, Menlo Park, CA: AAAI Press, pp. 431-465 online access
  26. ^Bayardo, Roberto J., Jr.; Agrawal, Rakesh; Gunopulos, Dimitrios (2000). “Constraint-based rule mining in large, dense databases”. Data Mining and Knowledge Discovery. 4 (2): 217–240. doi:10.1023/A:1009895914772.
  27. ^Webb, Geoffrey I. (2000). “Efficient search for association rules”. Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining – KDD ’00. pp. 99–107. CiteSeerX doi:10.1145/347090.347112. ISBN 978-1581132335.
  28. ^ Jump up to:ab “DSS News: Vol. 3, No. 23”.
  29. ^Ramezani, Reza, Mohamad Sunni ee, and Mohammad Ali Nematbakhsh; MRAR: Mining Multi-Relation Association Rules, Journal of Computing and Security, 1, no. 2 (2014)
  30. ^GI Webb and S. Butler and D. Newlands (2003). On Detecting Differences Between Groups. KDD’03 Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
  31. ^Menzies, T.; Ying Hu (2003). “Computing practices – Data mining for very busy people”. Computer. 36 (11): 22–29. doi:10.1109/MC.2003.1244531.
  32. ^Wong, A.K.C.; Yang Wang (1997). “High-order pattern discovery from discrete-valued data”. IEEE Transactions on Knowledge and Data Engineering. 9 (6): 877–893. CiteSeerX doi:10.1109/69.649314.
  33. ^Liu, Jinze; Paulsen, Susan; Sun, Xing; Wang, Wei; Nobel, Andrew; Prins, Jan (2006). “Mining Approximate Frequent Itemsets in the Presence of Noise: Algorithm and Analysis”. Proceedings of the 2006 SIAM International Conference on Data Mining. pp. 407–418. CiteSeerX doi:10.1137/1.9781611972764.36. ISBN 978-0-89871-611-5.
  34. ^Zaki, Mohammed J. (2001); SPADE: An Efficient Algorithm for Mining Frequent Sequences, Machine Learning Journal, 42, pp. 31–60
  35. ^Zimek, Arthur; Assent, Ira; Vreeken, Jilles (2014). Frequent Pattern Mining. pp. 403–423. doi:10.1007/978-3-319-07821-2_16. ISBN 978-3-319-07820-5.
  36. ^King, R. D.; Srinivasan, A.; Dehaspe, L. (Feb 2001). “Warmr: a data mining tool for chemical data”. J Comput Aided Mol Des. 15(2): 173–81. Bibcode:2001JCAMD..15..173K. doi:10.1023/A:1008171016861. PMID 11272703.

Ofer Abarbanel – Executive Profile

Ofer Abarbanel online library

Ofer Abarbanel online library

Ofer Abarbanel online library

Ofer Abarbanel online library