Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infra: Add pep2bib.py for generating BibTeX entries #2085

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

freundTech
Copy link

This PR adds a pep2bib.py script, which generates a bibtex file for use in LaTeX.

An example entry looks like this:

@techreport{pep634,
    author = "Brandt Bucher and Guido van Rossum",
    title = "PEP 634: Structural Pattern Matching: Specification",
    institution = "Python Software Foundation",
    year = "2020",
    month = sep,
    type = "PEP",
    number = "634",
    url = "https://www.python.org/dev/peps/pep-0634/"
}

Should you think that this repository is not the right place to host this script just close the PR and I'll try to maintain it in a separate repository. Having it in here would make it more discoverable for other people needing to cite PEPs though.

@AA-Turner
Copy link
Member

I'd generally be in favour of this, with two caveats.

  1. is is possible to avoid the extra dependency?
  2. we should really tighten the pep parsing helpers -- currently they use pure Docutils, so don't take advantage of e.g. the role overrides defined in pep2html. In addition, if PEP 676 is approved I imagine Sphinx-exclusive syntax will start to be used, which the parsing helpers would entirely error on. Not asking you to solve the problem in this PR, more something we need to consider (although perhaps only after PEP 676 has been pronounced upon)

@freundTech
Copy link
Author

Generating the bib file manually instead of using a library shouldn't be too hard. I'd have to look into how exactly escaping in bib works. Other than that it should be pretty straight forward.
That would eliminate the additional dependency.

Copy link
Member

@CAM-Gerlach CAM-Gerlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! BibTeX output would indeed be very useful, to make the PEPs easier to cite. However, I think if we're going to add the script here to the upstream repo, it might be a good idea to at least consider how we might integrate it into our actual end product that 99.9% of users see—the PEPs website. Some kind of "cite as BibTeX" link on PEPs might be quite useful, for instance. That could be deferred to another PR, but I feel we should at least have some kind of plan to actually use this on the site, since the number of users who are actually going to clone the repo, set up their environment and run the script just to get a BibTeX citation seems likely to be vanishingly small.

Also, I'm not sure I see the wisdom in adding more pep2{output}.py docutils-only legacy script infrastructure (unless PEP 676 is deferred or rejected for some reason), nor following the somewhat quaint design patterns of the two-decade old scripts used here, but I defer to @AA-Turner on that judgement and don't comment further on points related to that in the PR, where the choices (API, structure, style, etc) match those in the other pep2{output} scripts even if they don't match modern best practices.

A few other notes:

  • This script should presumably be listed in the PEP infrastructure section in the .github/CODEOWNERS file with @AA-Turner as owner
  • I imagine you were attempting to mirror the other legacy build scripts, but given we nominally require Python 3.8-3.9 for building this repo, it would be nicer use f-strings instead of legacy printf interpolation, at least for the newely added file

Comment on lines +16 to +21
name_first_regex = re.compile(r'(.*)<.*>')
mail_first_regex = re.compile(r'.*\((.*)\)')
name_only_regex = re.compile(r'(.*)')


months = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are top-level constants, so I'd think it would make sense to make them UPPER_SNAKE_CASE like the others.

Comment on lines +7 to +9

# For bibliography
pybtex >= 0.24.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# For bibliography
pybtex >= 0.24.0

I would strongly advise against adding this to the requirements file, as this means all CIs and users that set up an env for our repo in a normal fashion will have to install it, when it isn't (yet) actually used anywhere other than a niche manual command. I suggest handling it just like we did the lint command and installing it on demand instead.

Comment on lines +1 to +3
import re
import datetime
import time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import re
import datetime
import time
import datetime
import re
import time

Comment on lines +54 to +56
return authors


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return authors
return authors

author_string = first_line_starting_with(full_path, 'Author:')
authors = parse_authors(author_string)
authors = authors_to_bib(authors)
url = 'https://www.python.org/dev/peps/pep-%0.4d/' % n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to factor this into a top-level constant, so its easier to update if and when we move to peps.python.org

@AA-Turner
Copy link
Member

I agree it makes sense to integrate into the website (either in a peps.python.org/tools or on each page).

I don't want to ask you to do work that may prove fruitless, so @freundTech it may be best to hold off for a short while to see if PEP 676 gets accepted. I can then give you pointers on where we'd want to integrate this.

Cam points out we require Python 3.9 as a minimum, which might also allow for slightly cleaner code when it comes to it.

A

@CAM-Gerlach
Copy link
Member

Cam points out we require Python 3.9 as a minimum, which might also allow for slightly cleaner code when it comes to it.

Yeah, there is a lot of cruft that could be cleaned up here, but I didn't comment on anything that followed the pattern of the other similarly-named legacy scripts here (the creation of which as a formal PEP post-dated pep2html.py by a year, and for which Python 1.5 was the latest released version at the time), which I assumed the author's intent was to match.

Sidenote: Technically, our build job (for the legacy build system) is still hard-pinned to Python 3.8 (which I was going to fix in a forthcoming PR), which is why I mentioned 3.8/3.9 instead of just 3.9.

@AA-Turner
Copy link
Member

which I was going to fix in a forthcoming PR

Personally I'd hold off on updating it -- it requires at least testing to ensure everything works, and (contingent on 676 being accepted) the file will just be deleted.

I won't object if you want to do the work, though!

@CAM-Gerlach
Copy link
Member

CAM-Gerlach commented Jan 21, 2022

Personally I'd hold off on updating it -- it requires at least testing to ensure everything works, and (contingent on 676 being accepted) the file will just be deleted.

Good point; I was going to update a few other things (the linting, etc) and was prepared to fully test it but wasn't sure of the timeline on that. To note, is there a reason we don't have at least the non-deploy steps of deploy-gh-pages.yml running in a workflow triggered on PRs (as well as main, where it is already running), to ensure the new build system works in parallel to the old during this consideration/transition period, to ensure there isn't a regression or other change (in the build system or the reST source files) that breaks it but not the legacy one? For now I've just been manually testing all my PRs on both every time I make a change, which gets a bit tedious.

@AA-Turner
Copy link
Member

To note, is there a reason we don't have at least the non-deploy steps of deploy-gh-pages.yml running in a workflow triggered on PRs (as well as main, where it is already running), to ensure the new build system works in parallel to the old during this consideration/transition period, to ensure there isn't a regression or other change (in the build system or the reST source files) that breaks it but not the legacy one? For now I've just been manually testing all my PRs on both every time I make a change, which gets a bit tedious.

fixed

@CAM-Gerlach
Copy link
Member

Noticed that; thanks @AA-Turner

@encukou
Copy link
Member

encukou commented Jan 24, 2022

Is making PEPs easier to cite a good change? I worry about people who miss the fact that most PEPs are initial design discussions rather than documentation.

@AA-Turner
Copy link
Member

@encukou -- I'd agree, but guys that are going to cite PEPs will cite PEPs regardless, so a page with the button that also has a paragraph noting what you said may sway the more conscientious cite-or.

A PEP also has the benefit that it changes much less than documentation, making it marginally easier to cite, correctness aside.

If we add this feature here, perhaps a similar one might be considered for the Python docs in general?

A

@CAM-Gerlach
Copy link
Member

CAM-Gerlach commented Jan 24, 2022

Not to mention, the more likely alternative is that people don't cite anything, so better to cite the PEP than nothing...and generally, academic citations (likely most people using BibTeX, including myself) are much more likely to be interested in design discussion, motivation and rationale, and perhaps (in many PEPs) a formal specification, rather than implementation documentation—and in academia, one often needs to cite the foundational papers that incipiated a specific idea, alongside more current work on the topic, tracing the thread of knowledge back whence it came.

@CAM-Gerlach CAM-Gerlach changed the title Added pep2bib.py for generating bibtex entries Add pep2bib.py for generating BibTeX entries Jan 25, 2022
@hugovk
Copy link
Member

hugovk commented Mar 13, 2022

I don't want to ask you to do work that may prove fruitless, so @freundTech it may be best to hold off for a short while to see if PEP 676 gets accepted. I can then give you pointers on where we'd want to integrate this.

PEP 676 has been accepted.

@AA-Turner
Copy link
Member

My current thoughts are adding a step to the build process to generate a .bib file for each PEP (using this script or similar), and add a download link for each one in the PEP sidebar.

A

@hugovk hugovk added the infra Core infrastructure for building and rendering PEPs label Oct 25, 2022
@hugovk hugovk changed the title Add pep2bib.py for generating BibTeX entries Infra: Add pep2bib.py for generating BibTeX entries Oct 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA signed infra Core infrastructure for building and rendering PEPs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants