"The Rete Match Algorithm is an efficient method for comparing a large collection of patterns to a large collection of objects. It finds all the objects that match each pattern. The algorithm was developed for use in production system interpreters, and it has been used for systems containing from a few hundred to more than a thousand patterns and objects" - C. Forgy
Boilerplate/PoC of a version of the Rete Algorithm implementated in Elixir
Rete is a complex stateful algorithm, this is an attempt of reproducing it with some slight modifications, using a functional immutable language such as Elixir/Erlang. Read more about Rete
- Erlang/OTP 24
- Elixir 1.14.2 (compiled with Erlang/OTP 22)
The algorithm utilizes symbols to create an internal representation of the world. Each element in the real world is converted into a triple known as a "Working Memory Element" (Retex.Wme.t()
), represented as {Entity, attribute, attribute_value}.
The world is represented through facts (WMEs) and Rules. A Rule consists of two essential parts: the "given" (right side) and the "then" (left side).
To perform inference, the rule generates a directed graph starting from a common and generic Root node, which branches out to form leaf nodes. The branches from the Root node correspond to the initial part of the WME, representing the working memory elements or "Entity". For instance, if we want to represent a customer's account status as "silver", we would encode it as "{Customer, account_status, silver}". Alternatively, with the use of a struct, we can achieve the same representation as Retex.Wme.new("Customer", "account status", "silver").
Now, let's explore how this would appear when compiling the rete algorithm with Retex:
flowchart
2332826675[==silver]
3108351631[Root]
3860425667[Customer]
3895425755[account_status]
3108351631 --> 3860425667
3860425667 --> 3895425755
3895425755 --> 2332826675
example nr. 1
Now, let's examine the graph, which consists of four nodes in the following order:
- The Root node
- This node serves as the root for all type nodes, such as Account, Customer, God, Table, and so on.
- The Customer node
- Also known as a Type node, it stores each known "type" of entity recognized by the algorithm.
- The account_status node
- Referred to as a Select node, it represents the attribute name of the entity being described.
- the ==silver node
- Known as a Test node, it includes the == symbol, indicating that the value of Customer.account_status is checked against "silver" as a literal string (tests can use all Elixir comparison symbols).
By expanding this network, we can continue mapping various aspects of the real world using any desired triple. Let's consider the entity representing a Flight, specifically its number of miles. We can represent this as {Flight, miles, 100} to signify a flight with a mileage of 100. Now, let's incorporate this into our network and observe the resulting graph:
Let's add this to our network and check what kind of graph we will get:
flowchart
2102090852[==100]
2332826675[==silver]
3108351631[Root]
3801762854[miles]
3860425667[Customer]
3895425755[account_status]
4112061991[Flight]
3108351631 --> 3860425667
3108351631 --> 4112061991
3801762854 --> 2102090852
3860425667 --> 3895425755
3895425755 --> 2332826675
4112061991 --> 3801762854
example nr. 2
Now we begin to observe the modeling of more complex scenarios. Let's consider the addition of our first inference to the network, which involves introducing our first rule.
The rule we want to encode states that when the Customer's account_status is "silver" and the Flight's miles are exactly "100," we should apply a discount to the Customer entity.
Let's examine how our network will appear after incorporating this rule:
flowchart
2102090852["==100"]
2332826675["==silver"]
2833714732["[{:Discount, :code, 50}]"]
3108351631["Root"]
3726656564["Join"]
3801762854["miles"]
3860425667["Customer"]
3895425755["account_status"]
4112061991["Flight"]
2102090852 --> 3726656564
2332826675 --> 3726656564
3108351631 --> 3860425667
3108351631 --> 4112061991
3726656564 --> 2833714732
3801762854 --> 2102090852
3860425667 --> 3895425755
3895425755 --> 2332826675
4112061991 --> 3801762854
Now we have constructed our network, which possesses a symbolic representation of the world and describes the relationships between multiple entities and their values to trigger a rule. Notably, the last node in the graph is represented as {:Discount, :code, 50}.
Let's examine how we can interpret this graph step by step:
- At the first level, we encounter the Root node, which serves as a placeholder.
- At the second level, we find the Flight and Customer nodes branching out from the Root node. It's important to note that they are at the same level.
- Both the Flight and Customer nodes branch out only once since they each have only one attribute.
- Each attribute node (==100 and ==silver) branches out once again to indicate that if we encounter the attribute Customer.account_status, we should verify that its value is indeed "silver."
- The last two nodes (==100 and ==silver) both connect to a new anonymous node called the Join node.
- The Join node branches out only once, leading to the right-hand side of the rule (also known as the production node).
This structure of the graph allows us to represent and process complex relationships and conditions within our network.
Join nodes are also what is called "beta memory" in the original C. Forgy paper. To make it simple we can assert that they group together a set of conditions that need to be true in order for a rule to fire. In our last example, the rule is:
# pseudocode
given: Flight.miles == 100 and Customer.account_status == "silver"
then: Discount.code == 50
In the graph representation, the Join node corresponds to the "and" in the "given" part of the rule. Its purpose is to evaluate and combine the conditions associated with its two parent nodes. Notably, a Join node can only have and will always have exactly two parents (incoming edges), which is a crucial characteristic of its design.
By utilizing Join nodes, the network is able to effectively represent complex conditions and evaluate them in order to trigger the corresponding rules.
Production nodes, as named in the Forgy paper, refer to the right-hand side of a rule (also known as the "given" part). These nodes are exclusively connected to one incoming Join node in the network.
To clarify, the purpose of a production node is to represent the actions or outcomes specified by the rule. It captures the consequences that should occur when the conditions specified in the Join node's associated "given" part are met. This relationship ensures that the rule's right-hand side is only triggered when the conditions of the Join node are satisfied.
Once we have a graph like the following and we know how to read it let's imagine we want to use to make inference and so to understand if we can give out such discount code to our customer.
flowchart
2102090852["==100"]
2332826675["==silver"]
2833714732["[{:Discount, :code, 50}]"]
3108351631["Root"]
3726656564["Join"]
3801762854["miles"]
3860425667["Customer"]
3895425755["account_status"]
4112061991["Flight"]
2102090852 --> 3726656564
2332826675 --> 3726656564
3108351631 --> 3860425667
3108351631 --> 4112061991
3726656564 --> 2833714732
3801762854 --> 2102090852
3860425667 --> 3895425755
3895425755 --> 2332826675
4112061991 --> 3801762854
This process is called adding WMEs (working memory elements) to the network. As you might have already guessed there is very little difference between a WME and a part of a rule.
Retex
exposes the function Retex.add_wme(t(), Retex.Wme.t())
which takes the network itself and a WME struct and tries to activate as many nodes as possible traversing the graph from the Root until each reachable branch executing a series of "tests" at each node. Let's see step by step how it would work.
Let's rewrite that same graph adding some names to the edges so we can reference them in the description:
flowchart
2102090852["==100"]
2332826675["==silver"]
2833714732["[{:Discount, :code, 50}]"]
3108351631["Root"]
3726656564["Join"]
3801762854["miles"]
3860425667["Customer"]
3895425755["account_status"]
4112061991["Flight"]
2102090852 --a--> 3726656564
2332826675 --b--> 3726656564
3108351631 --c--> 3860425667
3108351631 --d--> 4112061991
3726656564 --e--> 2833714732
3801762854 --f--> 2102090852
3860425667 --g--> 3895425755
3895425755 --h--> 2332826675
4112061991 --i--> 3801762854
Let's see what happens when adding the following working memory element to the Retex algorithm Retex.Wme.new(:Flight, :miles, 100
- Retex will receive the WME and start testing the network from the Root node which passes down anything as it doesn't test for anything
- The
Root
branches out inn
nodes (Flight and Customer)- the branch
d
will find a type node with value "Flight" and this is the first part of the WME so the test is passing- the next branch from
d
isi
which connects Flight tomiles
and so we test that the second part of the triple is exactlymiles
: the test is passing again- the next branch from
i
isf
which finds a test node== 100
which is the case of our new WME and so the test is passing- next is
a
which connects to theJoin
node which needs to be tested: a test for a Join node asserts that all incoming connections are active (their test passed) and given that the branchb
is not yet tested the traversal for now ends here and the Join remains only 50% activated
- next is
- the next branch from
- the next branch from
- the second branch to test is
c
which connects toCustomer
and this is not matchingFlight
so we can't go any further
- the branch
After adding the WME Retex.Wme.new(:Flight, :miles, 100)
the only active branches and nodes are d, i, f and a
Our rule can't be activated because the parent node Join
is not fully active yet and so it can't propagate the current WME to the production node (which is sad but fair)
Let's see what happens when adding the following working memory element to the Retex algorithm Retex.Wme.new(:Customer, :account_status, "silver")
- Retex will receive the WME and start testing the network from the Root node which passes down anything as it doesn't test for anything
- The
Root
branches out inn
nodes (Flight and Customer)- the branch
c
will find a type node with value "Customer" and this is the first part of the WME so the test is passing- the next branch from
c
isg
which connects Customer toaccount_status
and so we test that the second part of the triple is exactlyaccount_status
: the test is passing again- the next branch from
h
finds a test node== "silver"
which is the case of our new WME and so the test is passing- next is
b
which connects to theJoin
node which needs to be tested: a test for a Join node asserts that all incoming connections are active (their test passed) and given that the brancha
is also already active (we stored that in a map) we can continue the traversal- Now we find a production node which tells us that the Discount code can be applied
- next is
- the next branch from
- the next branch from
- the second branch to test is
d
which doesn't match so we can stop the traversal
- the branch
After adding the WME Retex.Wme.new(:Customer, :account_status, "silver")
all nodes are active and so the production node ends up in the agenda (just an elixir list to keep track of all production nodes which are activated)
We have now done inference and found an applicable rule. All we need to do now is to add the new WME to the network to check if any other node can be activated in the same way.
Now imagine adding more and more complex rules and following the same strategy to find activable production nodes. The conditions will all be joined by a Join
(and) node
and will point to a production.
- Retex compiles the rules using a directed acyclic graph data structure
- The activation of nodes is done using a State Monad and Forward Chaining.
- A list of bindings is stored at each active node in order to generate complete matches from partial ones
def deps do
[
{:retex, git: "https://github.com/lorenzosinisi/retex"}
]
end
If you want you can have a predefined generic DSL and the wrapper NeuralBridge so that you don't have to build the rest of the Expert System from zero
# or visit https://neuralbridge.fly.dev <- an online playground to create rules and add facts
def deps do
[
{:neural_bridge, git: "https://github.com/lorenzosinisi/neural_bridge"}
]
end
alias NeuralBridge.{Engine, Rule}
engine = Engine.new("test")
rules = [
Rule.new(
id: 1,
given: """
Person's name is equal "bob"
""",
then: """
Person's age is 23
"""
),
Rule.new(
id: 2,
given: """
Person's name is equal $name
Person's age is equal 23
""",
then: fn production ->
require Logger
bindings = Map.get(production, :bindings)
Logger.info(inspect(bindings))
end
)
]
engine = Engine.add_rules(engine, rules)
engine = Engine.add_facts(engine, "Person's name is \"bob\"")
rule = List.first(engine.rule_engine.agenda)
engine = Engine.apply_rule(engine, rule)
Enum.each(engine.rule_engine.agenda, fn pnode ->
Engine.apply_rule(engine, pnode)
end)
end # will log %{"$name" => "bob"}
alias NeuralBridge.{Engine, Rule}
engine = Engine.new("doctor_AI")
engine =
Engine.add_rules(engine, [
Rule.new(
id: 1,
given: """
Patient's fever is greater 38.5
Patient's name is equal $name
Patient's generic_weakness is equal "Yes"
""",
then: """
Patient's diagnosis is "flu"
"""
),
Rule.new(
id: 2,
given: """
Patient's fever is lesser 38.5
Patient's name is equal $name
Patient's generic_weakness is equal "No"
""",
then: """
Patient's diagnosis is "all good"
"""
)
])
engine =
Engine.add_facts(engine, """
Patient's fever is 39
Patient's name is "Aylon"
Patient's generic_weakness is "Yes"
""")
## contains Patient's diagnnosis
[
%_{
action: [
%Retex.Wme{
identifier: "Patient",
attribute: "diagnosis",
value: "flu"
}
],
bindings: %{"$name" => "Aylon"}
}
] = engine.rule_engine.agenda
- Run
mix test
- Run
elixir benchmark/rule_chain.exs
- Use at your own risk
- This is just a template for complexer implementations of the described algorithms
## Resources and documentation
- Rete
- Retex at ElixirConf EU
- Retex is available as playground here: https://neuralbridge.fly.dev
References:
Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem* by
Charles L. Forgy
Department of Computer Science, Carnegie-Mellon University,
Pittsburgh, PA 15213, U.S.A.