A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models.

ItookapillinLY/SISSO

Version SISSO.3.2, September, 2022.
This code is licensed under the Apache License, Version 2.0

If you are using this code, please cite:
R. Ouyang, S. Curtarolo, E. Ahmetcik, M. Scheffler, and L. M. Ghiringhelli, Phys. Rev. Mater. 2, 083802 (2018).

Features

  • Regression & Classification
    Ref.: [R. Ouyang et al., Phys. Rev. Mater. 2, 083802 (2018)]
  • Multi-Task Learning (MT-SISSO)
    Ref.: [R. Ouyang et al., J. Phys.: Mater. 2, 024002 (2019)]
  • Variable Selection assisted Symbolic Regression (VS-SISSO, see VarSelect.py in 'utilities')
    Ref.: [Z. Guo et al., J. Chem. Theory Comput. 18, 4945 (2022)]
  • Sign-Constrained Multi-Task Learning (SCMT-SISSO)
    Ref.: [J. Wang et al., https://arxiv.org/abs/2301.06884]

(Please refer to the Refs. and the SISSO_guide.pdf for more details on using the code.)

Installation

A Fortran MPI compiler is required to compile the parallel SISSO program. Below are two options for compiling the program with an IntelMPI compiler (other compilers may work as well). In the folder 'src', run one of:
(1) mpiifort -fp-model precise var_global.f90 libsisso.f90 DI.f90 FC.f90 SISSO.f90 -o ~/bin/SISSO
(2) mpiifort -O2 var_global.f90 libsisso.f90 DI.f90 FC.f90 SISSO.f90 -o ~/bin/SISSO

Note:

  • Option (1) gives better accuracy and run-to-run reproducibility of floating-point calculations; option (2) is ~2X faster than (1), but tiny run-to-run variations may occur between processors of different types, e.g. Intel and AMD.
  • If MPI-related errors occur during compilation, try opening the file 'var_global.f90' and replacing the line "use mpi" with "include 'mpif.h'". However, "use mpi" is strongly encouraged (see https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node411.htm).
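The two compile options differ only in their compiler flags, so a build helper can assemble either command from the same source list. The following is a minimal sketch, not part of the SISSO distribution: the function name `compile_command` and its parameters are hypothetical, while the compiler, flags, source order, and output path are taken from the commands above.

```python
# Source files in the order used by the installation commands.
SOURCES = ["var_global.f90", "libsisso.f90", "DI.f90", "FC.f90", "SISSO.f90"]

# Compiler flags for the two documented options.
OPTION_FLAGS = {
    1: ["-fp-model", "precise"],  # better accuracy and reproducibility
    2: ["-O2"],                   # faster, tiny run-to-run variations possible
}


def compile_command(option, compiler="mpiifort", output="~/bin/SISSO"):
    """Return the compile command for option 1 or 2 as a list of arguments.

    Hypothetical helper: it only builds the command and does not run it.
    """
    if option not in OPTION_FLAGS:
        raise ValueError("option must be 1 or 2")
    return [compiler, *OPTION_FLAGS[option], *SOURCES, "-o", output]


if __name__ == "__main__":
    # Print both commands so they can be pasted into a shell inside 'src'.
    for option in (1, 2):
        print(" ".join(compile_command(option)))
```

Passing a different `compiler` value is the place to experiment with other MPI Fortran compilers; whether the Intel-specific flag of option (1) is accepted by them is not something this sketch checks.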

Modules of the program:

  • var_global.f90 ! declaring the global variables
  • libsisso.f90 ! subroutines and functions for mathematical operations
  • DI.f90 ! model sparsification (descriptor identification)
  • FC.f90 ! feature construction
  • SISSO.f90 ! the main program

Running SISSO

Input files: "SISSO.in" and "train.dat"; templates for both can be found in 'input_templates'.
Command-line usage:
SISSO > log ! You may first need to remove the stack-size limit by running the command 'ulimit -s unlimited'
On computer clusters, use a command such as the following in your submission script:
mpirun -np number_of_cores SISSO > log
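For scripted runs, the same steps can be wrapped in a small launcher. This is a minimal sketch under stated assumptions: the helper names (`missing_inputs`, `sisso_command`, `run_sisso`) are hypothetical, the `SISSO` executable is assumed to be on the PATH, and the stack-size limit is assumed to have been lifted in the calling shell, since the script does not change it.

```python
import subprocess
from pathlib import Path

# Input files named in this README.
REQUIRED_INPUTS = ("SISSO.in", "train.dat")


def missing_inputs(workdir):
    """Return the required input files that are absent from workdir."""
    root = Path(workdir)
    return [name for name in REQUIRED_INPUTS if not (root / name).is_file()]


def sisso_command(n_cores=1, executable="SISSO"):
    """Build the launch command: plain executable for one core, mpirun otherwise."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if n_cores == 1:
        return [executable]
    return ["mpirun", "-np", str(n_cores), executable]


def run_sisso(workdir, n_cores=1, executable="SISSO"):
    """Run SISSO in workdir, writing standard output to 'log' as in the usage above."""
    missing = missing_inputs(workdir)
    if missing:
        raise FileNotFoundError("missing input files: " + ", ".join(missing))
    with open(Path(workdir) / "log", "w") as log:
        result = subprocess.run(sisso_command(n_cores, executable),
                                cwd=workdir, stdout=log)
    return result.returncode
```

Checking the inputs before launching avoids spending a cluster allocation on a run that would stop immediately.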

Output:

  • File "SISSO.out": overall information from feature construction to model building
  • Folder "models": the top ranked descriptors/models
  • Folder "SIS_subspaces": SIS-selected subspaces (feature data and expressions)
  • Folder "desc_dat": the data for the best descriptor/model
  • File "convexnd_hull": the vertices of the nD convex hulls in classification
  • File "VS_results": the results from the VS-SISSO run.
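After a run, a quick check of the working directory shows whether the main outputs were written. The sketch below is illustrative only: the function name `summarize_outputs` is hypothetical, and it assumes that every run writes "SISSO.out", "models", "SIS_subspaces", and "desc_dat". The classification and VS-SISSO outputs are left out because they depend on the type of run.

```python
from pathlib import Path

# Outputs listed above that are assumed here to be written by every run.
EXPECTED_FILES = ("SISSO.out",)
EXPECTED_FOLDERS = ("models", "SIS_subspaces", "desc_dat")


def summarize_outputs(workdir):
    """Map each expected output name to True if it exists with the right type."""
    root = Path(workdir)
    status = {name: (root / name).is_file() for name in EXPECTED_FILES}
    status.update({name: (root / name).is_dir() for name in EXPECTED_FOLDERS})
    return status


if __name__ == "__main__":
    # Report on the current directory.
    for name, present in summarize_outputs(".").items():
        print(f"{name}: {'found' if present else 'missing'}")
```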

User guide

More details on using this code can be found in the SISSO_guide.pdf

About

Created and maintained by Runhai Ouyang. Please feel free to open issues on GitHub or contact Ouyang
([email protected]) with any problems, comments, or suggestions regarding the code.

Other SISSO-related codes

MATLAB: https://github.com/NREL/SISSORegressor_MATLAB
Python interface: https://github.com/Matgenix/pysisso
SISSO++: https://gitlab.com/sissopp_developers/sissopp
