Skip to content

eMAGTechLabs/go-apriori

Repository files navigation

Go-Apriori

Go-Apriori is a simple go implementation of the Apriori algorithm for finding frequent sets and association rules

PkgGoDev Build Status Go Report Card

Short Apriori Algorithm description

Apriori is a classic algorithm for learning association rules. Apriori is designed to operate on databases / data sets containing transactions (for example, collections of items bought by customers)

The algorithm extracts useful information from large amounts of data. For example, the information that a customer who purchases 'butter' also tends to buy 'jam' at the same time is acquired from the association rule below:

  • Support: The percentage of task-relevant data transactions for which the pattern is true.
Support (Butter->Jam) = ( No of transactions containing both 'butter' and 'jam' ) / ( Total no of transactions )
  • Confidence: The measure of certainty or trustworthiness associated with each discovered pattern.
Confidence (Butter->Jam) = ( No of transactions containing both 'butter' and 'jam' ) / ( No of transactions containing 'butter' )
  • Lift: This measure of how likely item 'jam' is purchased when item 'butter' is purchased, while controlling for how popular item 'butter' is
Lift (Butter->Jam) =  ( No of transactions containing both 'butter' and 'jam' ) / ( No of transactions containing 'butter' ) * ( No of transactions containing 'jam' )

The algorithm aims to find the rules which satisfy both a minimum support threshold and a minimum confidence threshold.

  • Item: article in the basket.
  • Itemset: a group of items purchased together in a single transaction.

How it works

  • Find all frequent itemsets:
    • Get frequent items:
      • Items whose occurrence is greater than or equal to the minimum support threshold.
    • Get frequent itemsets:
      • Generate candidates from frequent items.
      • Prune the results to find the frequent itemsets.
  • Generate association rules from frequent itemsets:
    • Rules which satisfy the minimum support, minimum confidence and minimum lift thresholds.

Usage

How to get

go get github.com/eMAGTechLabs/go-apriori

Options

type Options struct {
    minSupport    float64 // The minimum support of relations (float).
    minConfidence float64 // The minimum confidence of relations (float).
    minLift       float64 // The minimum lift of relations (float).
    maxLength     int     // The maximum length of the relation (integer).
}

Note: If maxLength is set to 0, no max length will be taken into consideration

How to use

import "github.com/eMAGTechLabs/go-apriori"

transactions := [][]string{
    {"beer", "nuts", "cheese"},
    {"beer", "nuts", "jam"},
    {"beer", "butter"},
    {"nuts", "cheese"},
    {"beer", "nuts", "cheese", "jam"},
    {"butter"},
    {"beer", "nuts", "jam", "butter"},
    {"jam"},
}
apriori := NewApriori(transactions)
results := apriori.Calculate(NewOptions(0.1, 0.5, 0.0, 0))

Sample Output

[
...
    {
        supportRecord: {items: [beer cheese jam nuts] support:0.125 } 
        orderedStatistic: [
            { base: [beer cheese jam] add: [nuts] confidence: 1 lift: 1.6 }
            { base: [beer cheese nuts] add: [jam] confidence: 0.5 lift: 1 }
            { base: [cheese jam nuts] add: [beer] confidence: 1 lift: 1.6 }
        ]
    }
...
]

Inspiration

Contributing

Thanks for your interest in contributing! There are many ways to contribute to this project. Get started here.

Tags

#apriori-algorithm #go