This is a poorly named repo that has grown beyond its original purpose.
This repo contains scripts that gather various bits of data from a few CNCF projects, along with some datasets released under a Creative Commons license.
The datasets can be found in the datasets directory.
There are some obsolete scripts that I haven't updated in ages, but owners_details.py and get_more_owners.py are the up-to-date ones used to generate the datasets.