update contributors list #1010

Merged: 1 commit into yshui:next from update-contributors-list on Jul 4, 2023

Collaborator

@absolutelynothelix absolutelynothelix commented Jan 24, 2023

it's (almost) automatically generated from the git log and the contributors' emails are (trivially) obfuscated

about the script used: ideally, it should be written in python but i don't know python (just like i don't know javascript: i used to be good at it, but getting back to it after a long time proved that's not true anymore :D). the `git log` command should also be integrated into it so that you only have to run a single command to get a nice contributors list.

if someone rewrites this script in python, it could be included in the source tree for convenience and run before each release, but for now i'll just leave it here:

#!/usr/bin/env node

import { readFile, writeFile } from 'node:fs';

// the header of the new CONTRIBUTORS file
let contributors = 'Sorted in alphabetical order. Feel free to open an issue ' +
	'or create a\npull request if you want to change or remove your mention.' +
	'\n\n';

// the CONTRIBUTORS file is supposed to be created with the following command in
// order to extract as many names and emails from the `git log` as possible:
// ```sh
// git log --format="%an|%ae%n%aN|%aE%n%cn|%ce%n%cN|%cE%n%gn|%ge%n%gN|%gE" |\
// sort | uniq > CONTRIBUTORS
// ```
readFile('CONTRIBUTORS', 'utf-8', (error, data) => {
	if (error) {
		console.error('something went wrong while reading the contributors ' +
			'file :(');

		process.exit(1);
	}

	// turn a long string into an array and iterate over it
	data = data.split('\n');
	for (let i = 0; i < data.length; i++) {
		// if someone ever contributes under a name containing the "|"
		// character we will have to think of a new separator :')
		data[i] = data[i].split('|');

		// convenience aliases
		let name = data[i][0];
		let email = data[i][1];

		// contributors with these names will be discarded. used to gracefully
		// remove duplicates based on contributor's preference.
		const name_blacklist = [
			// i believe that Yuxuan Shui prefers `Yuxuan Shui` instead of this
			// one
			'yshui'
		];

		// contributors with these emails will be discarded. used to gracefully
		// remove duplicates based on contributor's preference. and to remove
		// non-contributors (bots, etc.).
		const email_blacklist = [
			// i believe that Adam Jackson prefers `[email protected]` instead of
			// this one
			'[email protected]',
			// The Gitter Badger
			'[email protected]',
			// GitHub
			'[email protected]',
			// Que Quotion prefers `[email protected]` instead of this one
			'[email protected]',
			// i prefer `[email protected]` instead of this one
			'[email protected]'
		];

		// discard invalid and blacklisted contributors
		if (!name || name_blacklist.includes(name) || !email ||
			email_blacklist.includes(email)) {
			data.splice(i--, 1);

			continue;
		}

		if (!email.includes('@') || email.endsWith('noreply.github.com')) {
			// if an email doesn't contain the "@" character or if it's a
			// private github email, assume that the contributor didn't provide
			// an email
			email = '';
		} else {
			// otherwise, obfuscate and format the provided email
			email = `<${email.replace('@', ' at ').toLowerCase()}>`;
		}

		// if the previous contributor has the same name as the current one join
		// the current email to the previous one and discard the current
		// contributor
		if (i > 0 && name === data[i - 1][0]) {
			// skip empty emails so that no trailing space is appended
			if (email) {
				data[i - 1][1] += `${data[i - 1][1] ? ' ' : ''}${email}`;
			}

			data.splice(i--, 1);

			continue;
		}

		// we only edit the email
		data[i][1] = email;
	}

	// append the list of contributors to the predefined header
	for (let i = 0; i < data.length; i++) {
		const name = data[i][0];
		const email = data[i][1];

		contributors += name;
		if (email) {
			contributors += ` ${email}`;
		}

		contributors += '\n';
	}

	writeFile('CONTRIBUTORS', contributors, error => {
		if (error) {
			console.error('something went wrong while writing the ' +
				'contributors file :(');

			process.exit(1);
		}

		// print the reminder only once the file has actually been written
		console.log('don\'t forget to review the changes! the script may ' +
			'need to be modified to handle new contributors.');
	});
});

@codecov

codecov bot commented Jan 24, 2023

Codecov Report

Merging #1010 (0de0f3f) into next (05ef18d) will increase coverage by 0.08%.
The diff coverage is n/a.

❗ Current head 0de0f3f differs from pull request most recent head 23538d8. Consider uploading reports for the commit 23538d8 to get more accurate results

@@            Coverage Diff             @@
##             next    #1010      +/-   ##
==========================================
+ Coverage   37.74%   37.82%   +0.08%     
==========================================
  Files          48       48              
  Lines       10869    10844      -25     
==========================================
  Hits         4102     4102              
+ Misses       6767     6742      -25     

see 7 files with indirect coverage changes

@absolutelynothelix absolutelynothelix marked this pull request as draft February 9, 2023 04:09
@absolutelynothelix
Collaborator Author

absolutelynothelix commented Feb 9, 2023

i think that it's better to only replace the @ character with at. i'll update the script and the CONTRIBUTORS file later.
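
A minimal sketch of that obfuscation, assuming it amounts to replacing only the "@" character, lowercasing the address and wrapping it in angle brackets as the script in this thread does. The function name `obfuscate_email` and the sample address are made up for illustration.

```python
def obfuscate_email(email: str) -> str:
    """Replace only the "@" character and wrap the address in angle brackets."""
    # Lowercasing mirrors what the script in this thread does.
    return f"<{email.replace('@', ' at ').lower()}>"


# Hypothetical address, not taken from the contributors list.
print(obfuscate_email("Jane.Doe@example.org"))  # <jane.doe at example.org>
```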

@tryone144
Collaborator

I think automating this task is a good idea.

> Ideally, it'd be written in python, but i don't know python (just like i don't know javascript, i used to be good at it, but getting back to it after a long time proved that it's not true anymore :D). and the git log command should be integrated into it, so you only have to run a single command to get a nice contributors list.
>
> if someone will rewrite this script in python it could be included in the source tree for convenience and be run before each release, but for now i just leave it here.

Got a bit of free time... 🙈

Python script: I've rewritten the above script in Python. It does not write the file directly; instead, redirect its stdout to the CONTRIBUTORS file.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Generate contributors list from git commit history."""

import shutil
import subprocess

# Contributors with these names will be discarded.
# Used to gracefully remove duplicates based on contributor's preference.
NAME_BLACKLIST = {
    "yshui",                        # prefer 'Yuxuan Shui'
}

# Contributors with these emails will be discarded.
# Used to gracefully remove duplicates based on contributor's preference,
# and to remove non-contributors (bots, etc.).
EMAIL_BLACKLIST = {
    "[email protected]",     # prefer "[email protected]"
    "[email protected]",         # prefer "[email protected]"
    "[email protected]",       # prefer "[email protected]"
    # Bots
    "[email protected]",           # GitHub
    "[email protected]",             # The Gitter Badger
}

HEADER = (
    "Sorted in alphabetical order. Feel free to open an issue or create a\n"
    "pull request if you want to change or remove your mention.\n"
)

GIT_LOG_FORMAT = "%an\t%ae%n%aN\t%aE%n%cn\t%ce%n%cN\t%cE%n%gn\t%ge%n%gN\t%gE"


def get_git_contributors():
    """Get sorted list of contributors from git commit history."""
    git_exe = shutil.which("git")
    if git_exe is None:
        raise FileNotFoundError("git executable not found in PATH")
    command = (git_exe, "log", f"--format={GIT_LOG_FORMAT}")
    result = subprocess.run(command, shell=False,
                            capture_output=True, text=True)

    result.check_returncode()

    contributors = {
        line.strip()
        for line in result.stdout.splitlines()
        if line.strip()
    }

    return sorted(contributors, key=str.casefold)


def format_contributors(contributor_list):
    """Format name and email address of each contributor."""
    contributors = []

    for contributor in contributor_list:
        # partition() tolerates lines whose email part is empty
        name, _, email = contributor.partition("\t")
        name = name.strip()
        email = email.strip()

        # ignore empty and blacklisted entries
        if (not name or name in NAME_BLACKLIST
                or not email or email in EMAIL_BLACKLIST):
            continue

        # remove invalid or private github email addresses
        if "@" not in email or email.endswith("noreply.github.com"):
            email = None

        # rudimentarily obfuscate the provided email address
        if email:
            email = f"<{email.replace('@', ' at ').lower()}>"

        if contributors and contributors[-1][0] == name:
            if email:
                contributors[-1][1].append(email)
        else:
            contributors.append((name, [email] if email else []))

    return contributors


def main():
    print(HEADER)

    contributors = format_contributors(get_git_contributors())
    for name, emails in contributors:
        if not emails:
            print(name)
        else:
            print(name, " ".join(emails))


if __name__ == "__main__":
    main()
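
The merging step in `format_contributors` can also be illustrated in isolation. The following is only a sketch of the same idea using `itertools.groupby`, not the code from the script above; `merge_emails` and the sample entries are hypothetical, and the input is assumed to be already sorted so that identical names are adjacent.

```python
from itertools import groupby


def merge_emails(pairs):
    """Collapse consecutive (name, email) pairs that share a name."""
    merged = []
    for name, group in groupby(pairs, key=lambda pair: pair[0]):
        # Drop missing emails so that a name without any email stays bare.
        emails = [email for _, email in group if email]
        merged.append((name, emails))
    return merged


# Hypothetical, already sorted and obfuscated entries.
sample = [
    ("Alice Example", "<alice at example.org>"),
    ("Alice Example", "<alice at example.com>"),
    ("Bob Example", None),
]

for name, emails in merge_emails(sample):
    print(" ".join([name, *emails]))
```

Like the loop in the script, this only merges neighbouring entries, so it relies on the sort order to bring duplicates together.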

@absolutelynothelix
Collaborator Author

absolutelynothelix commented Apr 13, 2023

@tryone144, perfect, thank you! i've updated the CONTRIBUTORS file using your script to reflect the new contributors and it works well except for a small difference in sorting, but that's ok.

if you wish, you could do a pull request in this branch to add the script to the source tree (i believe that's how it's supposed to be done to preserve your authorship?), naming it whatever you like and placing it wherever you like.
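
On the sorting difference mentioned above: the thread does not say what caused it, but one plausible source is that the Python script sorts with `key=str.casefold`, while the order produced by the shell `sort` depends on the locale. The sketch below uses made-up names and only contrasts Python's default code point ordering with the case-insensitive one.

```python
# Hypothetical names, chosen only to show the effect of letter case.
names = ["yshui", "Zed Example", "adam"]

# Default ordering compares code points, so uppercase sorts before lowercase.
code_point_order = sorted(names)

# Case-insensitive ordering, as used by the Python script in this thread.
casefold_order = sorted(names, key=str.casefold)

print(code_point_order)  # ['Zed Example', 'adam', 'yshui']
print(casefold_order)    # ['adam', 'yshui', 'Zed Example']
```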

@absolutelynothelix absolutelynothelix marked this pull request as draft June 17, 2023 14:16
@absolutelynothelix
Collaborator Author

i think that this pull request can be merged without waiting for @tryone144. he could do a pull request adding his script to the source tree later if he wants. btw, a couple of notes about his script:

  • the last placeholder in the GIT_LOG_FORMAT must be %gE (sorry, it's my typo);
  • a new NAME_BLACKLIST entry: "Monsterovich" # prefer 'Nikolay Borodin'.
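
Applied to the Python script, the two notes above would look roughly like this. The constants are the ones from the script; the helper `mismatched_pairs` is a hypothetical sanity check added here for illustration. It assumes each name placeholder should be paired with an email placeholder of the same letter case, so the `%gN`/`%ge` typo gets flagged.

```python
# Corrected format: the last placeholder is %gE instead of %ge.
GIT_LOG_FORMAT = "%an\t%ae%n%aN\t%aE%n%cn\t%ce%n%cN\t%cE%n%gn\t%ge%n%gN\t%gE"

NAME_BLACKLIST = {
    "yshui",         # prefer 'Yuxuan Shui'
    "Monsterovich",  # prefer 'Nikolay Borodin'
}


def mismatched_pairs(fmt):
    """Return name/email placeholder pairs whose letter case differs."""
    bad = []
    for chunk in fmt.split("%n"):
        name_placeholder, email_placeholder = chunk.split("\t")
        if name_placeholder[-1].isupper() != email_placeholder[-1].isupper():
            bad.append(chunk)
    return bad


print(mismatched_pairs(GIT_LOG_FORMAT))  # []
```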

@absolutelynothelix absolutelynothelix marked this pull request as ready for review June 27, 2023 17:30
@yshui
Owner

yshui commented Jul 4, 2023

nice thought replacing @ with at there 👍

@yshui yshui merged commit 3d82e76 into yshui:next Jul 4, 2023
@absolutelynothelix absolutelynothelix deleted the update-contributors-list branch July 4, 2023 16:16