emcd commented Nov 9, 2025

Replace internal MIME type detection, charset encoding detection, and bytes-to-string decoding with Detextive 2.0 API calls. This removes dependencies on chardet and puremagic packages and consolidates text detection functionality.

Major changes:

  • Update pyproject.toml to use detextive~=2.0
  • Remove chardet and puremagic dependencies
  • Replace parts.LineSeparators with detextive.LineSeparators
  • Replace _detect_charset() with detextive.infer_charset()
  • Replace _detect_mimetype_and_charset() with detextive.infer_mimetype_charset()
  • Replace manual content.decode() with detextive.decode()
  • Remove internal detection functions from acquirers.py
  • Update tests to accommodate Detextive's behavioral differences

Test updates:

  • Accept normalized charset names (e.g., 'iso8859-9' vs 'iso-8859-9')
  • Accept both TextualMimetypeInvalidity and ContentDecodeFailure exceptions
  • Update binary file tests to use PE/DMG headers that Detextive rejects
  • Accept empty files as valid text (Detextive behavior)
  • Document UTF-16-LE false positive detection in .auxiliary/notes/detextive-bugs.md
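The first two test updates above can be sketched as small test helpers. This is a minimal illustration, not the code from this PR: `same_charset` and `rejects_as_nontextual` are hypothetical names, and the two exception classes are local stand-ins that borrow the names mentioned above (the real classes are defined elsewhere and are not imported here). The comparison relies only on the standard library's `codecs.lookup`, which maps aliases such as 'iso-8859-9' to one canonical codec name.

```python
import codecs


def same_charset(left: str, right: str) -> bool:
    """Compare two charset labels by their canonical codec name.

    Hypothetical helper: codecs.lookup resolves aliases, so
    'iso-8859-9' and 'iso8859-9' refer to the same codec.
    """
    return codecs.lookup(left).name == codecs.lookup(right).name


# Local stand-ins for illustration only. The real exception classes
# are defined outside this sketch and may differ in base class.
class TextualMimetypeInvalidity(Exception):
    pass


class ContentDecodeFailure(Exception):
    pass


def rejects_as_nontextual(acquire, content: bytes) -> bool:
    """Return True if ``acquire`` raises either accepted exception.

    ``acquire`` is an assumed callable that takes raw bytes; a test
    would pass the function under test here.
    """
    try:
        acquire(content)
    except (TextualMimetypeInvalidity, ContentDecodeFailure):
        return True
    return False
```

A test written this way passes whether the detector reports the hyphenated or the normalized charset label, and whether binary content is rejected at the MIME type check or at the decoding step.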

claude and others added 4 commits November 9, 2025 23:04
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

Reduce comment length from 82-83 characters to fit within the 79
character line limit enforced by ruff.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

Add comprehensive report of tyro CLI parsing failure that prevents the
application from running. This is a pre-existing issue from the appcore
refactor (PR #9), not introduced by the Detextive 2.0 port.

The issue appears to be related to tyro's inability to parse type hints
involving _io.TextIOWrapper, likely from stdin/stdout/stderr usage.

Suggested fix: Switch to emcd-appcore[cli] dependency.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

Remove the `LineSeparators = __.detextive.LineSeparators` alias from
parts.py and update all code to use `__.detextive.LineSeparators`
directly instead of `_parts.LineSeparators`.

Changes:
- Remove alias from sources/mimeogram/parts.py
- Update Part dataclass to use __.detextive.LineSeparators type
- Replace all _parts.LineSeparators with __.detextive.LineSeparators in:
  * acquirers.py (4 occurrences)
  * formatters.py (1 occurrence)
  * parsers.py (3 occurrences, including return type)
  * updaters.py (1 occurrence in function signature)
- Update all test files to import and use detextive.LineSeparators:
  * test_110_parts.py
  * test_200_parsers.py
  * test_210_formatters.py
  * test_320_differences.py
  * test_330_interactions.py
  * test_500_acquirers.py
  * test_510_updaters.py
  * test_610_apply.py

This provides clearer provenance of the LineSeparators enum and avoids
confusion about where it's defined.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

emcd merged commit 1769844 into master on Nov 10, 2025
17 checks passed