Validate shard filenames in checkpoint index to prevent path traversal (silent weight injection) by Snakinya · Pull Request #46913 · huggingface/transformers

Snakinya · 2026-06-26T09:04:32Z

Summary

get_checkpoint_shard_files in src/transformers/utils/hub.py reads the weight_map of model.safetensors.index.json / pytorch_model.bin.index.json and feeds weight_map.values() straight into os.path.join(repo, subfolder, f) with no validation. When loading an untrusted Hub repo, those values are attacker-controlled, and an absolute path or a ..-bearing value resolves the shard outside the checkpoint directory. The escaped path is then opened by safe_open (.safetensors) or torch.load (.bin) and loaded into the model as weights.
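The escape follows from how path joining treats its arguments: an absolute component discards everything before it, and `..` components walk upward once the path is normalized. A minimal sketch of that behavior, using `posixpath` so the result is the same on every platform. The checkpoint directory and the `resolve` helper are hypothetical and only mirror the unvalidated join described above; they are not the code in hub.py.

```python
import posixpath

checkpoint_dir = "/cache/models/victim-repo"  # hypothetical checkpoint directory
subfolder = ""


def resolve(shard_name):
    # Join without validation, then normalize so ".." segments are collapsed.
    return posixpath.normpath(posixpath.join(checkpoint_dir, subfolder, shard_name))


benign = resolve("model-00001-of-00002.safetensors")
absolute = resolve("/tmp/attacker_dropper.safetensors")
traversal = resolve("../other-repo/model.safetensors")

print(benign)     # /cache/models/victim-repo/model-00001-of-00002.safetensors
print(absolute)   # /tmp/attacker_dropper.safetensors
print(traversal)  # /cache/models/other-repo/model.safetensors
```

Only the first result stays inside the checkpoint directory.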

This patch rejects any shard filename that is not a single relative basename (no path separators, no absolute path, no ..), and adds regression tests.

Why this matters / impact

Defaults don't help. The bypass is reachable under the full safe-loading default stack:

use_safetensors=True (default) — the shard the attacker redirects to is itself a legal .safetensors file, so the safetensors-first cascade is happy.
weights_only=True (torch 2.6+ default) — safe_open doesn't go through pickle at all, so this gate never sees the attacker file.
trust_remote_code=False (default) — the bug is on the weight-loading path, not the remote-code path; no opt-in is required.

End-to-end PoC, run against the latest main at the time of writing (9b6af5d, 17 minutes old):

# attacker publishes a Hub repo with:
#   config.json = {"model_type":"bert","architectures":["BertModel"]}
#   model.safetensors.index.json = {"metadata":{},
#       "weight_map":{"bert.embeddings.word_embeddings.weight":
#                     "<absolute path to attacker-staged .safetensors>"}}
# attacker-staged safetensors contains a tensor of shape (30522, 768) filled with 0.31415
from transformers import AutoModel
m = AutoModel.from_pretrained("<attacker repo>")   # all defaults
sd = m.state_dict()
key = next(k for k in sd if "word_embeddings" in k)  # bind the key first; the generator's k is not visible outside next()
print(key, f"{float(sd[key].mean()):.6f}")
# >>> embeddings.word_embeddings.weight 0.314150

The attacker-controlled value loads into the model's word_embeddings cleanly — no error, no MISSING entry in _missing_keys, no code execution. The victim ends up running a model whose behavior is silently controlled by an attacker-staged weights file (silent model backdoor / weight injection).

Two practical attack shapes:

Single-repo, same-host: attacker's index points the shard at another *.safetensors file path the attacker can stage on the victim host (e.g. another HF-cache file from a separately-staged attacker repo the victim was nudged to download earlier). The victim's from_pretrained(<malicious repo>) loads attacker weights without any code execution.
Cross-repo weight confusion on shared cache (multi-tenant training hosts): attacker repo A is downloaded and cached by one user; attacker repo B (completely separate) ships an index.json whose weight_map points at A's cached shard. A second user who loads the seemingly harmless repo B silently ends up running the weights from repo A.

Severity rationale: bypass of the three default safe-loading gates, silent (no error), and persistent (model behavior tampering) — without any code execution primitive.

What this PR changes

src/transformers/utils/hub.py — after collecting shard_filenames, reject any filename that is "", ".", "..", an absolute path, or is not equal to its own os.path.basename. The error message names the offending value and the index file. The check runs before either the local-folder branch (os.path.join(pretrained_model_name_or_path, subfolder, f)) or the Hub branch (cached_files(...) — which sends the raw filenames to hf_hub_download / snapshot_download) ever sees the value.
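The rule described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the code from the diff: the name `validate_shard_filenames` is hypothetical, and the explicit check for both `/` and `\` is an added assumption so that Windows-style separators are rejected on POSIX hosts as well.

```python
import os


def validate_shard_filenames(shard_filenames, index_filename):
    """Hypothetical helper: raise ValueError unless every name is a plain relative basename."""
    for name in shard_filenames:
        is_plain_basename = (
            isinstance(name, str)
            and name not in ("", ".", "..")
            and not os.path.isabs(name)
            # Assumption: reject both separator styles regardless of the host OS.
            and "/" not in name
            and "\\" not in name
            and name == os.path.basename(name)
        )
        if not is_plain_basename:
            raise ValueError(
                f"Invalid shard filename in checkpoint index {index_filename!r}: {name!r}"
            )
    return list(shard_filenames)


# A benign shard name passes through unchanged.
print(validate_shard_filenames(["model-00001-of-00002.safetensors"], "model.safetensors.index.json"))
```

Running the check before any path is built means neither the local-folder join nor the Hub download call ever receives a rejected value.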

tests/utils/test_hub_utils.py — new GetCheckpointShardFilesSecurityTests class:

test_rejects_absolute_path_in_weight_map
test_rejects_parent_traversal_in_weight_map
test_accepts_benign_relative_basename (sanity check: model-00001-of-00001.safetensors still loads)
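The shape of these three tests can be approximated without importing transformers. The sketch below is not the test file from the PR: `read_shard_names` is a hypothetical stand-in for the patched lookup, and the test class name is invented. It writes a throwaway index file per case and checks the accept/reject outcome.

```python
import json
import os
import tempfile
import unittest


def read_shard_names(index_path):
    """Hypothetical stand-in for the patched lookup: return validated shard names."""
    with open(index_path, encoding="utf-8") as f:
        weight_map = json.load(f)["weight_map"]
    names = sorted(set(weight_map.values()))
    for name in names:
        if name in ("", ".", "..") or os.path.isabs(name) or name != os.path.basename(name):
            raise ValueError(f"Invalid shard filename in checkpoint index {index_path}: {name!r}")
    return names


class ShardIndexValidationSketch(unittest.TestCase):
    def _write_index(self, shard_name):
        # Create a temporary checkpoint folder holding a one-entry index file.
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        index_path = os.path.join(tmp.name, "model.safetensors.index.json")
        with open(index_path, "w", encoding="utf-8") as f:
            json.dump({"metadata": {}, "weight_map": {"w": shard_name}}, f)
        return index_path

    def test_rejects_absolute_path(self):
        evil = os.path.join(tempfile.gettempdir(), "evil.safetensors")
        with self.assertRaises(ValueError):
            read_shard_names(self._write_index(evil))

    def test_rejects_parent_traversal(self):
        with self.assertRaises(ValueError):
            read_shard_names(self._write_index("../evil.safetensors"))

    def test_accepts_benign_relative_basename(self):
        name = "model-00001-of-00001.safetensors"
        self.assertEqual(read_shard_names(self._write_index(name)), [name])
```

Saved as a module, the class can be run with `python -m unittest`.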

Test commands run

PYTHONPATH=src TRANSFORMERS_OFFLINE=1 python -m unittest \
  tests.utils.test_hub_utils.GetCheckpointShardFilesSecurityTests -v

Result:

test_accepts_benign_relative_basename ... ok
test_rejects_absolute_path_in_weight_map ... ok
test_rejects_parent_traversal_in_weight_map ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.002s

OK

PoC verification before/after the patch:

Before: get_checkpoint_shard_files(...) returns ['/tmp/attacker_dropper.safetensors'], safe_open on the returned path yields a tensor with mean = 0.314150.
After: get_checkpoint_shard_files(...) raises ValueError: Invalid shard filename in checkpoint index ....

Coordination / duplicate-work check

gh pr list --repo huggingface/transformers --state open --search "weight_map path traversal" → no open PRs.
gh pr list --repo huggingface/transformers --state open --search "checkpoint shard basename" → no open PRs.
No tracking issue found for this exact code path.

No public issue exists yet for this vulnerability (no security advisory was found in the repo's published advisories either). Opening this public PR per the contributor's explicit choice; happy to also file a private GitHub Security Advisory if maintainers prefer to handle disclosure that way and squash this into a follow-up.

AI-assistance disclosure

Per the repository's AGENTS.md agentic contribution policy:

This patch, the regression tests, and the PoC verification were produced with the help of an AI coding agent.
The human submitter (Snakinya) reviewed every changed line, ran the included regression tests locally, and ran the PoC end-to-end against the latest main to confirm both the vulnerability and the fix.
The change is small (18 lines in hub.py, 51 in tests), fully visible in the diff, and contains no model code, no copy/sync machinery, and no doc changes that would trip the modular/copy lints.

Backwards compatibility

The only behavioral change is that ValueError is raised when an index.json ships a weight_map value that is not a relative basename. Legitimate sharded checkpoints produced by save_pretrained always use names like model-00001-of-00002.safetensors (verified against save_pretrained's shard-naming code), so no normal usage is affected. The added test test_accepts_benign_relative_basename locks this in.

The `weight_map` values parsed from `model.safetensors.index.json` / `pytorch_model.bin.index.json` are attacker-controlled when loading an untrusted Hub repo. Previously these values were passed verbatim to `os.path.join(repo, subfolder, f)`, so a crafted index pointing an entry at an absolute path or a `..`-bearing value would resolve the shard OUTSIDE the checkpoint directory. The escaped path is then opened by safe_open (.safetensors) or torch.load (.bin) and loaded into the model as weights.

This bypasses the loader's safe-loading contract: a malicious repo can silently substitute its declared weights with the contents of any arbitrary local .safetensors/.bin file (including one already cached on disk from a different, separately-staged attacker repo), with no trust_remote_code=True, no weights_only=False, and no opt-out of the safetensors-first cascade. The substituted weights load cleanly with no error and no missing-key report, so a victim running from_pretrained(<malicious repo>) ends up with attacker-controlled model behavior (silent model backdoor / weight injection).

Reject any shard filename that is not a single relative basename: no path separators, no absolute path, no '..'. Add regression tests that exercise the absolute-path and '..' rejection plus a benign baseline.

AI-assisted patch: the patch, tests, and PoC verification were produced with the help of an AI coding agent; the human submitter reviewed every changed line and ran the included regression tests locally.

Co-authored-by: Snakinya <Snakinya@users.noreply.github.com>

github-actions · 2026-06-26T09:34:12Z

CI Dashboard: View test results in Grafana

归青 and others added 2 commits June 25, 2026 16:34

Reformat test for ruff format (per CI feedback)

37edb24

Snakinya wants to merge 2 commits into huggingface:main from Snakinya:security/shard-index-path-traversal
