Association Rules

Association rules are another unsupervised learning method. No “prediction” is performed; instead, the method is used to discover relationships within the data. Example questions are:
• Which of my products tend to be purchased together?
• What will other people who are like this person or product tend to buy, watch, or click on among the other products we may have to offer?

In the online retailer example we analyzed in the previous lesson, we could use association rules to discover which products are purchased together within the group that yielded the maximum LTV. For example, if we set up the data appropriately, we could explore further to discover which products people in GP4 tend to buy together and derive any logical reasons for the high rate of returns. We can also discover the profile of purchases for people in different groups (for example, people who buy high-heel shoes and expensive purses tend to be in GP4, or people who buy walking shoes and camping gear tend to be in GP2).

The goal with association rules is to discover “interesting” relationships among the variables, and the definition of “interesting” depends on the algorithm used for the discovery. The rules you discover are of the form: when I observe X, I also tend to observe Y. One example of “interesting” relationships is the set of rules whose “confidence”, a measure of how strongly a rule can be stated based on the data, is greater than or equal to a pre-defined threshold.
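To make the idea of a confidence threshold concrete, here is a minimal Python sketch. It assumes the commonly used definition of confidence for a rule X → Y, namely the fraction of transactions containing X that also contain Y; the text above does not spell out a formula, so treat that definition as an assumption. The toy transactions, the confidence helper, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy transactions; each one is the set of items in a single purchase.
transactions = [
    {"high heels", "purse"},
    {"high heels", "purse", "scarf"},
    {"high heels", "scarf"},
    {"walking shoes", "camping gear"},
    {"walking shoes", "camping gear", "purse"},
]


def confidence(x, y, transactions):
    """Fraction of transactions containing all of x that also contain all of y.

    Assumed definition: count(X and Y together) / count(X).
    Returns 0.0 when no transaction contains x.
    """
    x, y = set(x), set(y)
    with_x = [t for t in transactions if x <= t]
    if not with_x:
        return 0.0
    with_both = [t for t in with_x if y <= t]
    return len(with_both) / len(with_x)


MIN_CONFIDENCE = 0.6  # hypothetical pre-defined threshold

# Rule: "when I observe high heels, I also tend to observe a purse".
rule_conf = confidence({"high heels"}, {"purse"}, transactions)
is_interesting = rule_conf >= MIN_CONFIDENCE
print(f"confidence = {rule_conf:.2f}, interesting = {is_interesting}")
```

In this toy data, three transactions contain high heels and two of those also contain a purse, so the rule clears the assumed threshold. With a stricter threshold the same rule would be discarded, which is why the choice of threshold shapes what counts as “interesting”.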

Association rules are specifically designed for in-database mining over transactions in databases. They are applied to transactions that consist of “itemsets”. Itemsets are discrete sets of items that are linked together; for example, an itemset could be the set of retail items purchased together in one transaction. Association rules are sometimes referred to as Market Basket Analysis, and you can think of an itemset as everything in your shopping basket. We can also group the tasks done in one day, or the set of links clicked by a user in a single session, into a basket or itemset for discovering associations. “Apriori” is one of the earliest and most commonly used algorithms for association rules, and we will focus on Apriori in the rest of this lesson.
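As a rough illustration of how raw records become itemsets, the Python sketch below groups a click log into one basket per session. The log layout (session id, link), the sample rows, and the build_itemsets helper are hypothetical; the same grouping would apply to (transaction id, product) rows from a retail database.

```python
from collections import defaultdict

# Hypothetical click log: (session_id, link) pairs, one row per click.
clicks = [
    ("s1", "/home"),
    ("s1", "/shoes"),
    ("s2", "/home"),
    ("s1", "/purses"),
    ("s2", "/camping"),
    ("s1", "/shoes"),  # repeated click; an itemset keeps each item once
]


def build_itemsets(rows):
    """Group (basket_id, item) rows into one set of items per basket."""
    baskets = defaultdict(set)
    for basket_id, item in rows:
        baskets[basket_id].add(item)
    return dict(baskets)


itemsets = build_itemsets(clicks)
for session_id, items in sorted(itemsets.items()):
    print(session_id, sorted(items))
```

Using a set per basket reflects the description of an itemset as a discrete set of items: the order of clicks and repeated clicks on the same link are dropped, and only which items occurred together is kept.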