Add prek hook to auto-update copyright year in NOTICE files #67146
xchwan wants to merge 9 commits
Signed-off-by: Xch1 <qchwan@gmail.com>
This will have the undesirable side effect that all PRs will start failing on 01.01.2027, when everyone who could help might be partying hard (and not waking up early the next day). I am not sure we want that. If anything, this could be a manual prek hook, and we should have a workflow scheduled maybe every week or so to check whether the year has changed and post a notification on Slack in the
I think the simplest fix is to strip the year check out of
What do you think about this?
Signed-off-by: Xch1 <qchwan@gmail.com>
I went ahead and removed the year check from check_notice_files.py in this PR. My concern is that enforcing the current year in CI guarantees a failure on January 1st every year unless someone manually updates and merges the year bump before then. It is a chicken-and-egg problem: the fix requires a PR to pass CI, but CI is broken until the fix lands. The ASF declaration check (The Apache Software Foundation) is retained. For the year itself, I think it is better left to the release manager to run prek run update-notice-year --all-files before the first release of each new year. Happy to hear if there is a better approach.
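To make the rollover failure concrete, here is a minimal sketch. The expected string mirrors the year check in check_notice_files.py; the helper name year_check_passes and the sample NOTICE text are illustrative assumptions, not code from this PR.

```python
def year_check_passes(content: str, current_year: int) -> bool:
    """Mimic the year check: pass only if the end year equals the current year."""
    expected = f"Copyright 2016-{current_year} The Apache Software Foundation"
    return not ("Copyright" in content and expected not in content)


# Hypothetical NOTICE content that was correct throughout 2026.
notice = "Apache Airflow\nCopyright 2016-2026 The Apache Software Foundation\n"

for year in (2026, 2027):
    # The same file flips from passing to failing when the calendar year changes.
    print(year, year_check_passes(notice, year))
```

Nothing in the repository changes between the two iterations; only the clock does, which is why every open PR would start failing at once.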
        print(f"⚠️ {path}: no standard ASF copyright line found, skipping")
        return False
    if match.group(2) == CURRENT_YEAR:
        return False
    new_content = COPYRIGHT_RE.sub(rf"\g<1>{CURRENT_YEAR}\3", content)
    path.write_text(new_content)
    print(f"✅ {path}: updated {match.group(2)} → {CURRENT_YEAR}")
nit: Although some existing prek checks already include emoji, it would be better to avoid adding more of them if possible.
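For reference, a self-contained sketch of the update step this hunk belongs to. The COPYRIGHT_RE pattern is not visible in the hunk, so the one below is an assumption that only matches how the groups are used (group 1 is the prefix, group 2 the end year, group 3 the suffix). The function name update_notice_year and the plain-text messages without emoji are likewise illustrative.

```python
from __future__ import annotations

import re
import tempfile
from pathlib import Path

# Assumed pattern (not shown in the hunk): group 1 is the prefix up to the
# dash, group 2 is the end year, group 3 is the ASF suffix.
COPYRIGHT_RE = re.compile(r"(Copyright 2016-)(\d{4})( The Apache Software Foundation)")


def update_notice_year(path: Path, current_year: str) -> bool:
    """Rewrite the end year in ``path``; return True only if the file changed."""
    content = path.read_text()
    match = COPYRIGHT_RE.search(content)
    if match is None:
        print(f"{path}: no standard ASF copyright line found, skipping")
        return False
    if match.group(2) == current_year:
        return False
    # \g<1> keeps the group reference unambiguous next to the year digits.
    new_content = COPYRIGHT_RE.sub(rf"\g<1>{current_year}\3", content)
    path.write_text(new_content)
    print(f"{path}: updated {match.group(2)} -> {current_year}")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        notice = Path(tmp) / "NOTICE"
        notice.write_text("Apache Airflow\nCopyright 2016-2025 The Apache Software Foundation\n")
        print(update_notice_year(notice, "2026"))  # first run rewrites the file
        print(update_notice_year(notice, "2026"))  # second run finds nothing to do
```

Returning a boolean lets the caller count changed files, and the early return on a matching year keeps the hook idempotent.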
    notice_files = sorted(
        f for f in repo_root.rglob("NOTICE") if not any(part in EXCLUDE_DIRS for part in f.parts)
    )
I don't think we need to sort those files before the validation.
sorted() ensures the output order is deterministic across different operating systems and filesystems, since rglob() does not guarantee a consistent traversal order. Without it, the list of updated files would appear in a different order on each run, making the output harder to read and diff.
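A small sketch of the discovery step, to show what the sort buys. The EXCLUDE_DIRS values and the function name find_notice_files are hypothetical; the real exclusion set is not visible in the hunk.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical exclusion set; the real EXCLUDE_DIRS value is not shown in the hunk.
EXCLUDE_DIRS = {"node_modules", ".git"}


def find_notice_files(repo_root: Path) -> list[Path]:
    """Return NOTICE files under repo_root in a stable, sorted order."""
    return sorted(
        f for f in repo_root.rglob("NOTICE") if not any(part in EXCLUDE_DIRS for part in f.parts)
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Create the directories in a deliberately unsorted order.
        for rel in ("providers/b", "providers/a", "node_modules/pkg", "."):
            (root / rel).mkdir(parents=True, exist_ok=True)
            (root / rel / "NOTICE").write_text("The Apache Software Foundation\n")
        for f in find_notice_files(root):
            print(f.relative_to(root))
```

The sort costs little for a few hundred paths, and it makes the hook's log output comparable between runs and machines.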
from __future__ import annotations

import sys
from datetime import datetime
from pathlib import Path

CURRENT_YEAR = str(datetime.now().year)
ASF_DECLARATION = "The Apache Software Foundation"

errors = 0

for notice_file in sys.argv[1:]:
    content = Path(notice_file).read_text()

    expected = f"Copyright 2016-{CURRENT_YEAR} The Apache Software Foundation"
    if "Copyright" in content and expected not in content:
        print(f"❌ {notice_file}: Missing expected string: {expected!r}")
    if ASF_DECLARATION not in content:
        print(f"❌ {notice_file}: Missing expected string: {ASF_DECLARATION!r}")
        errors += 1

sys.exit(1 if errors else 0)
How about keeping this file as-is and only adding a log message, printed before the exit 1, that tells the contributor to run prek run update-notice-year --all-files?
Showing the next step to fix the check would be better IMO.
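A rough sketch of what this suggestion could look like, assuming the year check is kept. The function name check_notice_files, the FIX_COMMAND constant and the message wording are illustrative, and the check returns an exit code instead of calling sys.exit so that it can be exercised directly.

```python
from __future__ import annotations

import tempfile
from datetime import datetime
from pathlib import Path

FIX_COMMAND = "prek run update-notice-year --all-files"


def check_notice_files(paths: list[str], current_year: str) -> int:
    """Return 1 if any NOTICE file has a stale year, printing the fix command once."""
    expected = f"Copyright 2016-{current_year} The Apache Software Foundation"
    errors = 0
    for notice_file in paths:
        content = Path(notice_file).read_text()
        if "Copyright" in content and expected not in content:
            print(f"{notice_file}: Missing expected string: {expected!r}")
            errors += 1
    if errors:
        # Tell the contributor how to fix the failure before exiting non-zero.
        print(f"To fix, run: {FIX_COMMAND}")
    return 1 if errors else 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stale = Path(tmp) / "NOTICE"
        stale.write_text("Copyright 2016-2000 The Apache Software Foundation\n")
        exit_code = check_notice_files([str(stale)], str(datetime.now().year))
        print(f"exit code: {exit_code}")
```

In the actual hook the return value would be passed to sys.exit. The hint shortens the time to a fix, but the check still fails for every PR once the year changes.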
I'm still a bit concerned: even with the helpful message, all PRs will still be blocked on January 1st until someone runs the command and gets a fix merged. The error message makes it clearer how to fix the failure, but it doesn't prevent CI from failing in the first place. That is the concern Jarek mentioned.
Signed-off-by: Xch1 <qchwan@gmail.com>
Signed-off-by: Xch1 <qchwan@gmail.com>
issue: #60540
Airflow has 170+ NOTICE files that require a manual year update every year (e.g. 2016-2025 → 2016-2026), with no automation in place.
Adds update-notice-year hook