Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check_digests is under-documented, confusing everyone #124

Open
wumpus opened this issue Dec 22, 2020 · 0 comments
Open

check_digests is under-documented, confusing everyone #124

wumpus opened this issue Dec 22, 2020 · 0 comments

Comments

@wumpus
Copy link
Collaborator

wumpus commented Dec 22, 2020

When reporting problems, class DigestChecker expects check_digests to be 'raise' or 'log', elsewise it prints nothing. It does always set self._passed to False. So the boolean value check_digests=True passes all of the tests (thanks to _passed), but it confused the first user who attempted to use it.

The documentation:

  • CHANGELIST.rst says it's a boolean argument.
  • The CLI sets check_digests=True for "warcio check". The code then uses record.digest_checker.problems to print problems.
  • README.rst fails to mention check_digests at all.

The tests:

  • sets check_digests=True and then checks record.digest_checker.passed
  • do test 'raise' and 'log'
  • the 'problems' property is only checked for 0 length, not for any non-zero cases

The user code from issue #123 is using 'raise', but to start with, the user set it to True and had no idea that they should check .passed, due to lack of documentation.

There's a real reason for 'raise' being an option, it's because iterators can't raise and be resumed. warcio check needs to be able to fully read partially corrupted files. However, that's no excuse for confusing everyone.

Action:

  • add a test for problems being consistent with passed
  • document the current, complicated interface
  • Think about whether it's too clever

Thanks to @dlazesz for pointing this problem out.

@wumpus wumpus changed the title check_digests is a buggy POS check_digests is under-documented, confusing everyone Dec 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant