fix(tools): raise a graceful ToolError when a memory file is not valid UTF-8 #1693

Open
WalkingDreams798 wants to merge 1 commit into anthropics:main from WalkingDreams798:fix/memory-tool-non-utf8

@WalkingDreams798

What

The local-filesystem memory tool reads files through _read_file_content (and its async twin _async_read_file_content), which decodes strict UTF-8 and only catches FileNotFoundError:

def _read_file_content(full_path: Path, memory_path: str) -> str:
    try:
        return full_path.read_text(encoding="utf-8")
    except FileNotFoundError as err:
        raise ToolError(...) from err

A memory file containing non-UTF-8 bytes therefore raises an uncaught UnicodeDecodeError from view / str_replace / insert, instead of the normalized ToolError used for every other failure mode. Because the encoding is hard-coded to utf-8 rather than taken from the platform's locale default, this reproduces on all platforms.
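Why the existing handler misses this case can be shown with the standard library alone. The sketch below is illustrative and does not use the SDK: `read_strict` is a hypothetical stand-in for the reader's decode step, and the `"<missing>"` return value is a placeholder, not the tool's real behavior.

```python
import tempfile
from pathlib import Path

# UnicodeDecodeError derives from ValueError, not OSError, so an
# `except FileNotFoundError` clause can never catch it.
assert issubclass(UnicodeDecodeError, ValueError)
assert not issubclass(UnicodeDecodeError, OSError)


def read_strict(path: Path) -> str:
    """Hypothetical stand-in: strict UTF-8 read that only handles a missing file."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return "<missing>"  # placeholder for the real ToolError path


escaped = None
with tempfile.TemporaryDirectory() as tmp:
    binary_file = Path(tmp) / "note.dat"
    # Same bytes as the repro: 0xff is never a valid UTF-8 start byte.
    binary_file.write_bytes(b"\xff\xfe\x80 binary")
    try:
        read_strict(binary_file)
    except UnicodeDecodeError as err:
        # The decode error passes straight through the FileNotFoundError handler.
        escaped = err
```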

Repro

import tempfile
from pathlib import Path
from anthropic.lib.tools._beta_builtin_memory_tool import BetaLocalFilesystemMemoryTool
from anthropic.types.beta import BetaMemoryTool20250818ViewCommand

tool = BetaLocalFilesystemMemoryTool(base_path=tempfile.mkdtemp())
tool.memory_root.mkdir(parents=True, exist_ok=True)
(tool.memory_root / "note.dat").write_bytes(b"\xff\xfe\x80 binary")

tool.view(BetaMemoryTool20250818ViewCommand(command="view", path="/memories/note.dat"))
# -> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff ...  (uncaught)

Change

Catch UnicodeDecodeError in both the sync and async readers and surface it as a ToolError ("… is not a valid UTF-8 text file …"). Added sync + async regression tests that write a binary file and assert the graceful ToolError.
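A minimal sketch of the shape of this change, assuming a local stand-in for `ToolError` so the snippet runs without the SDK. The function name `read_file_content`, the helper `check_non_utf8_is_graceful`, and the exact message wording are illustrative, not the code in this PR; the async reader would need the same extra `except` clause.

```python
import tempfile
from pathlib import Path


class ToolError(Exception):
    """Local stand-in for the SDK's ToolError (assumption, for a runnable sketch)."""


def read_file_content(full_path: Path, memory_path: str) -> str:
    """Read a memory file, normalizing both failure modes to ToolError."""
    try:
        return full_path.read_text(encoding="utf-8")
    except FileNotFoundError as err:
        # Illustrative message; the real wording is elided in the PR excerpt.
        raise ToolError(f"The path {memory_path} does not exist.") from err
    except UnicodeDecodeError as err:
        # New branch: binary or mis-encoded files become a graceful ToolError.
        raise ToolError(
            f"The file {memory_path} is not a valid UTF-8 text file."
        ) from err


def check_non_utf8_is_graceful() -> str:
    """Regression-style check: write binary bytes, expect ToolError, return its message."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "note.dat"
        target.write_bytes(b"\xff\xfe\x80 binary")
        try:
            read_file_content(target, "/memories/note.dat")
        except ToolError as err:
            # The original decode error stays attached for debugging.
            assert isinstance(err.__cause__, UnicodeDecodeError)
            return str(err)
    raise AssertionError("expected ToolError for a non-UTF-8 file")
```

Chaining with `from err` keeps the underlying `UnicodeDecodeError` available on `__cause__`, so callers see one normalized error type without losing the byte offset that failed to decode.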

Verification

  • uv run pytest -k non_utf8 tests/lib/tools/memory_tools/test_filesystem.py → 2 passed
  • uv run ruff check → all checks passed
  • uv run pyright (strict) → 0 errors, 0 warnings

Note: this repo's memory-tool filesystem suite has pre-existing failures on Windows (symlink/permission-mode tests) unrelated to this change; the two added tests pass and no previously-passing test regresses.

…d UTF-8

`view` / `str_replace` / `insert` read memory files via `_read_file_content`,
which decodes strict UTF-8 and only caught `FileNotFoundError`. A memory file
containing non-UTF-8 bytes therefore raised an uncaught `UnicodeDecodeError`
out of the tool (platform-independent, since the encoding is hard-coded to
utf-8), instead of a normalized `ToolError` like every other failure mode.

Catch `UnicodeDecodeError` in both the sync and async readers and surface it as
a `ToolError`. Added sync + async regression tests reading a binary file.
@WalkingDreams798 WalkingDreams798 requested a review from a team as a code owner June 19, 2026 18:52