insert-license detection does not ignore spaces after comment symbols #79

Open
XuehaoSun opened this issue Aug 2, 2023 · 5 comments

@XuehaoSun

Take Python as an example:
# Copyright... (one space after the #) and
#  Copyright... (two spaces after the #)
The hook does not treat these two lines as the same, so it fails to detect the second form just because it has two spaces after the #.
As a result, it automatically adds the license again.

@Lucas-C
Owner

Lucas-C commented Aug 2, 2023

I agree that this is annoying.

Have you tried to fuzzy-match your license?
cf. https://github.com/Lucas-C/pre-commit-hooks#fuzzy-license-matching

@XuehaoSun
Author

Of course, but then I would have to delete the TODO from each file after running the hook. I'd prefer the hook to skip insertion automatically when a fuzzy match is found, rather than forcing a TODO to be inserted, because I don't think a spacing difference is worth fixing individually in every file.
So I ended up using --skip-license-insertion-comment to prevent the license from being inserted automatically, but this causes --use-current-year to fail, which is obviously less reasonable than using fuzzy matching to decide when to skip.

@Lucas-C
Owner

Lucas-C commented Aug 3, 2023

> So, I ended up choosing --skip-license-insertion-comment to avoid it from being automatically inserted, but this will cause --use-current-year fail, which is obviously not as reasonable as using fuzzy matching to skip.

Does that mean that your problem is solved?

Otherwise, it's not clear to me what solution you suggest?

@XuehaoSun
Author

I think it would be more convenient to decide whether to skip insertion based on the fuzzy matching ratio.
But this feature has not been implemented yet, so I can only use --skip-license-insertion-comment as a substitute.

@peterjc

peterjc commented Sep 2, 2024

I typically use a Python style formatting tool like black or similar, so I rarely have to worry about # Copyright... (one space) versus #  Copyright... (two spaces), which is nice. That said, it seems practical to ignore any extra space(s) (or other whitespace like tabs?) after the # when matching the license text.

I was expecting the comparison to be done on the comment block from the file with the comment syntax removed, which might be harder than it seems given the assorted commenting syntax configurations. However, looking at the code, the comparison is in fact done on the actual file contents versus the expected license block with the configured comment marker and one space.

See https://github.com/Lucas-C/pre-commit-hooks/blob/v1.5.5/pre_commit_hooks/insert_license.py#L167 which inserts one space when preparing the expected license block, not just for inserting into the file if missing, but also used for finding the license: https://github.com/Lucas-C/pre-commit-hooks/blob/v1.5.5/pre_commit_hooks/insert_license.py#L549
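
To make the effect concrete, here is a simplified illustration of that kind of exact comparison. It is not the hook's actual code: the helper names `prefix_license` and `contains_block` and the sample license text are made up for this sketch, and it only assumes the behaviour described above (comment marker plus exactly one space, compared line by line).

```python
# Simplified illustration only; not the hook's actual implementation.
# All names here (prefix_license, contains_block) are hypothetical.

def prefix_license(license_lines, comment_prefix="#"):
    """Build the expected block: comment marker, exactly one space, license text."""
    return [f"{comment_prefix} {line}" for line in license_lines]


def contains_block(file_lines, expected):
    """Return True if `expected` appears as a contiguous run of lines."""
    n = len(expected)
    return any(
        file_lines[i:i + n] == expected
        for i in range(len(file_lines) - n + 1)
    )


license_lines = ["Copyright (c) Example Corp", "Licensed under the MIT License"]
expected = prefix_license(license_lines)

one_space = [
    "# Copyright (c) Example Corp",
    "# Licensed under the MIT License",
    "",
    "print('hi')",
]
two_spaces = [
    "#  Copyright (c) Example Corp",
    "#  Licensed under the MIT License",
    "",
    "print('hi')",
]

found_with_one_space = contains_block(one_space, expected)
found_with_two_spaces = contains_block(two_spaces, expected)
```

With an exact comparison like this, the one-space header is found and the two-space header is not, which would explain why the license gets inserted a second time.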

i.e. the simplest fix I can see is to build a regular expression of the form "{opening comment}{at least one space}{line of license}" and use that in the search?
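
A minimal sketch of that idea, assuming a single-line comment prefix such as # and treating tabs as acceptable whitespace. The function name `build_license_pattern` and the sample license text are hypothetical, and this is not wired into the hook's real search code.

```python
import re


def build_license_pattern(license_lines, comment_prefix="#"):
    """Compile a regex matching the license block with any amount of
    whitespace (at least one space or tab) after the comment marker.

    Hypothetical helper for illustration; not part of the hook.
    """
    prefix = re.escape(comment_prefix)
    parts = []
    for line in license_lines:
        text = line.strip()
        if text:
            # {opening comment}{at least one space}{line of license}
            parts.append(rf"{prefix}[ \t]+{re.escape(text)}[ \t]*")
        else:
            # A blank license line is just a bare comment marker.
            parts.append(rf"{prefix}[ \t]*")
    return re.compile("^" + "\n".join(parts) + "$", re.MULTILINE)


license_lines = ["Copyright (c) Example Corp", "Licensed under the MIT License"]
pattern = build_license_pattern(license_lines)

two_spaces = (
    "#  Copyright (c) Example Corp\n"
    "#\tLicensed under the MIT License\n"
    "print('hi')\n"
)
no_space = (
    "#Copyright (c) Example Corp\n"
    "#Licensed under the MIT License\n"
)

matches_two_spaces = pattern.search(two_spaces) is not None
matches_no_space = pattern.search(no_space) is not None
```

Here the header with two spaces (or a tab) after the # is matched, while a header with no space at all is still rejected. Multi-line comment styles with separate start and end markers would need extra handling that this sketch does not attempt.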
