This project collects IUPAC names that are actually used in the literature into a CCZero database. Every contributor guarantees that their contribution is CCZero and free of any legal claim to the contrary. Autogenerated IUPAC names are forbidden: each name must be found in the literature. A name also qualifies when it appears in the literature only as part of a larger name, provided that it is a valid IUPAC name by itself. No metadata on the origin of a name is recorded; the copyright-free fact recorded here is simply that the name exists.
Our ambition is to have 1M IUPAC names within the first year.
This repository is very simple: it consists of a single, sorted list of IUPAC names in the iupac-names.txt file.
Each line in that file is a valid IUPAC name, where valid means that OPSIN can generate a SMILES string from it.
The list is sorted case-insensitively and contains only unique names, where names that differ only in letter case count as duplicates. On GNU/Linux, the reference algorithm for this process is:
sort -f iupac-names.txt | uniq -i | tee tmp.txt | wc -l ; mv tmp.txt iupac-names.txt
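
For contributors who want to check a list before committing, the same normalization can be approximated in Python. This is a minimal sketch, not the project's tooling: the function names `normalize` and `is_normalized` are hypothetical, and the ordering only approximates `sort -f` in the C locale for ASCII names. Other locales, or names with non-ASCII characters, may be ordered differently, so the shell pipeline above remains the reference.

```python
def normalize(names):
    """Sort names ignoring case, then drop case-insensitive duplicates.

    Hypothetical helper that approximates `sort -f | uniq -i`:
    the sort key folds to upper case first and falls back to the
    raw string to break ties, and the first name of each run of
    case-insensitive duplicates is kept.
    """
    ordered = sorted(names, key=lambda s: (s.upper(), s))
    result = []
    previous_key = None
    for name in ordered:
        key = name.upper()
        if key != previous_key:  # first name of a new run
            result.append(name)
            previous_key = key
    return result


def is_normalized(names):
    """Return True if the list is already sorted and duplicate-free."""
    names = list(names)
    return names == normalize(names)
```

To check the repository file, one could read it with `open("iupac-names.txt", encoding="utf-8").read().splitlines()` and pass the resulting list to `is_normalized`.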