Skip to content
/ XGES Public

Structure learning with extremely greedy equivalence search

Notifications You must be signed in to change notification settings

ANazaret/XGES

Repository files navigation

Extremely Greedy Equivalence Search

Extremely Greedy Equivalence Search (XGES) is an efficient algorithm for learning the structure of a causal graph from observational data. It improves upon the Greedy Equivalence Search (GES) algorithm with a more accurate search strategy and a more efficient implementation.

The algorithm is described in the paper "Extremely Greedy Equivalence Search" (see citation below)

In this repo, we provide:

  • a pure python implementation of the algorithm in xges/ available with pip install xges.
    • it can use numba for faster execution if the package is installed.
    • examples of usage are available in examples/simple.py.
  • a pure c++ implementation of the algorithm in src-cpp/, which is at least ~10x faster than the python implementation.
  • code to reproduce the experiments in the paper in paper_experiments/.
    • an ad-hoc python wrapper calling the cpp executable is available in paper_experiments/benchmarks.py.
    • a notebook to generate the figures in the paper in paper_experiments/paper.ipynb.

Using the python package

The python package can be installed with pip:

pip install xges

The package can be used as follows:

from xges import XGES

data = ...
xges = XGES()
pdag = xges.fit(data)  # PDAG object representing the Markov equivalence class (MEC)

# PDAG object with only directed edges, representing an arbitrary DAG in the MEC
a_dag = pdag.get_dag_extension()

# networkx DiGraph object with two edges for undirected PDAG edges
networkx_pdag = pdag.to_networkx()
networkx_dag = a_dag.to_networkx()

adjacency_pdag = pdag.to_adjacency_matrix()
adjacency_dag = a_dag.to_adjacency_matrix()

Reproducing the experiments

The experiments can be reproduced by running the paper_experiments/benchmarks.py script (after compiling the c++ code in src-cpp). The figures are generated in the notebook paper_experiments/paper.ipynb.

Building the c++ code

Use the CMakeLists.txt file to build the code.

The code can be run with the following command:

xges --input data.npy --output out.csv --stats stats.csv -v1

The input file should be a numpy file with the data matrix. The output file will contain the CPDAG. The stats file will contain some statistics collected during the execution of the algorithm. -v1 is the verbosity level. It can be set to 0, 1, or 2.

More options can be found by running xges --help.

Citation

If you use this code, please cite the following paper:

@inproceedings{nazaret2021extremely,
  title={Extremely Greedy Equivalence Search},
  author={Nazaret, Achille and Blei, David},
  booktitle={Uncertainty in Artificial Intelligence},
  year={2024}
}

About

Structure learning with extremely greedy equivalence search

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages