Fix RagTokenizer attribute delegation by Pruthvi226 · Pull Request #46919 · huggingface/transformers

Pruthvi226 · 2026-06-26T13:43:59Z

What does this PR do?

This PR updates RagTokenizer so that missing tokenizer attributes and methods are delegated to the active current_tokenizer.

Previously, only RagTokenizer.__call__, decode, and batch_decode were forwarded. Other tokenizer APIs such as encode, patch_token, and patch_token_id were not available through the wrapper, so accessing them raised AttributeError even though the active underlying tokenizer supported them.

The change adds a guarded __getattr__ implementation that looks up missing attributes on current_tokenizer. This keeps the existing input/target tokenizer switching behavior while making RagTokenizer behave more like the tokenizer it wraps.
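For readers who want to see the pattern in isolation, here is a minimal sketch of guarded attribute delegation. It is not the code from this PR: FakeTokenizer, WrapperSketch, and the switch_to_input_mode / switch_to_target_mode methods are hypothetical stand-ins, and only the current_tokenizer name comes from the description above. The guard reads current_tokenizer from the instance dictionary, so a lookup on a half-initialized object (for example during copying or unpickling) raises AttributeError instead of recursing.

```python
class FakeTokenizer:
    """Hypothetical stand-in for a real tokenizer; not a transformers class."""

    def __init__(self, name, patch_token, patch_token_id):
        self.name = name
        self.patch_token = patch_token
        self.patch_token_id = patch_token_id

    def encode(self, text):
        # Tag each whitespace-separated word so it is visible which tokenizer ran.
        return [f"{self.name}:{word}" for word in text.split()]


class WrapperSketch:
    """Hypothetical wrapper that mimics the delegation idea described above."""

    def __init__(self, question_encoder, generator):
        self.question_encoder = question_encoder
        self.generator = generator
        self.current_tokenizer = self.question_encoder

    def switch_to_input_mode(self):
        self.current_tokenizer = self.question_encoder

    def switch_to_target_mode(self):
        self.current_tokenizer = self.generator

    def __getattr__(self, name):
        # __getattr__ is only called after normal attribute lookup has failed.
        # Do not delegate dunder lookups (copy and pickle probe for these).
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        # Read from __dict__ directly so a missing current_tokenizer cannot
        # re-enter __getattr__ and recurse.
        current = self.__dict__.get("current_tokenizer")
        if current is None:
            raise AttributeError(
                f"{type(self).__name__!r} object has no attribute {name!r}"
            )
        # Raises AttributeError if the active tokenizer lacks the attribute,
        # which keeps hasattr() working on the wrapper.
        return getattr(current, name)


if __name__ == "__main__":
    wrapper = WrapperSketch(
        FakeTokenizer("question", "<patch>", 7),
        FakeTokenizer("generator", "<patch>", 9),
    )
    print(wrapper.encode("hello world"))   # delegated to the question encoder
    print(hasattr(wrapper, "not_a_real_attribute"))
    wrapper.switch_to_target_mode()
    print(wrapper.encode("hello world"))   # delegated to the generator
```

Because __getattr__ is consulted only when normal lookup fails, methods defined on the wrapper itself keep taking precedence, and only names the wrapper lacks fall through to the active tokenizer.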

A regression test was added to check:

RagTokenizer.encode(...) delegates to the question encoder in input mode
tokenizer attributes like patch_token and patch_token_id are accessible through the wrapper
missing attributes still behave correctly with hasattr
encode(...) delegates to the generator after switching to target mode

Code Agent Policy

The Transformers repo is currently being overwhelmed by a large number of PRs and issue comments written by code agents. We are currently bottlenecked by our ability to review and respond to them. As a result, we ask that new users do not submit pure code agent PRs at this time. You may use code agents in drafting or to help you diagnose issues. We'd also ask autonomous "OpenClaw"-like agents not to open any PRs or issues for the moment.

PRs that appear to be fully agent-written will probably be closed without review, and we may block users who do this repeatedly or maliciously.

This is a rapidly-evolving situation that's causing significant shockwaves in the open-source community. As a result, this policy is likely to be updated regularly in the near future. For more information, please read CONTRIBUTING.md.

I confirm that this is not a pure code agent PR.

I used an AI coding assistant while drafting this change. I reviewed the issue, source changes, tests, and final diff before submitting.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline and the Pull Request checks?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.

Issue: #35532

Did you make sure to update the documentation with your changes according to the guidelines?

No documentation update needed; this is a small tokenizer wrapper behavior fix.

Did you write any new necessary tests?

Added RagTokenizerTest::test_delegates_missing_attributes_to_current_tokenizer.

Tests

PYTHONPATH=src python -m pytest -q -p no:cacheprovider tests/models/rag/test_tokenization_rag.py -k "delegates_missing_attributes_to_current_tokenizer or save_load_pretrained_with_saved_config"
# Passed 3 consecutive runs locally.

PYTHONPATH=src python -m pytest -q -p no:cacheprovider tests/models/rag/test_tokenization_rag.py
# 2 passed, 2 skipped locally. Slow pretrained RAG tests were skipped.

python -m ruff check src/transformers/models/rag/tokenization_rag.py tests/models/rag/test_tokenization_rag.py
python -m ruff format --check src/transformers/models/rag/tokenization_rag.py tests/models/rag/test_tokenization_rag.py
git diff --check origin/main...HEAD -- src/transformers/models/rag/tokenization_rag.py tests/models/rag/test_tokenization_rag.py

github-actions · 2026-06-26T13:45:12Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: rag

github-actions · 2026-06-26T13:48:35Z

CI Dashboard: View test results in Grafana

Fix RagTokenizer attribute delegation

8ec4417

Pruthvi226 wants to merge 1 commit into huggingface:main from Pruthvi226:contribution-good-first-issue
