Skip to content

danieljtait/treed

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

treed

Methods to compute the edit distance between (ordered) trees.

The Tree-to-tree correction problem, Kho-Chung Tai

Simple Fast Algorithms for the Editing Distance between Trees and Related Problems, K. Zhang and D. Shasha

Specifying the cost function

The problem specification allows for three modifications in order to transform one tree into another. Deleting an existing node in the source tree, adding a new node to the target tree, and then modifying a node by relabeling.

These are specified by the following arguments to tree_distance. Note that in all of the below the integer variables represent indexes into the trees nodes in a postorder iteration;

  • delete_node_cost_fn: (i: int, t: Tree) -> int | float representing the cost of deleteing the ith node of tree t, (in postorder).
  • insert_node_cost_fn: (j: int, t: Tree) -> int | float represing the cost of inserting the jth node into tree t.
  • modify_node_cost_fn: (i: int, j: int, t1: Tree, t2: Tree) -> int | float the cost of relabelling the ith node of t1 to the jth node of t2.
def modify_node_cost_fn(
    i: int, j: int, t1: Tree, t2: Tree
) -> int:
    """Cost incurred if we have to relabel"""
    n1 = t1.nodes[i]; n2 = t2.nodes[j]
    return 0 if n1.name == n2.name else 1

tree_dist = tree_distance(
    t1, t2,
    modify_node_cost_fn=modify_node_cost_fn)
    
td = tree_dist[(len(t1.nodes)-1, len(t2nodes)-1)]

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages