Clarification on the purpose and functionality of _pad_tensors_to_max_len in Trainer subclass #18

Open
oussaidene opened this issue Jan 31, 2025 · 0 comments

Hi,

I came across the following method in the Trainer subclass and wanted to ask for clarification on its purpose and functionality:

def _pad_tensors_to_max_len(self, tensor, max_length):
    if self.tokenizer is not None and hasattr(self.tokenizer, "pad_token_id"):
        # If PAD token is not defined at least EOS token has to be defined
        pad_token_id = (
            self.tokenizer.pad_token_id if self.tokenizer.pad_token_id is not None else self.tokenizer.eos_token_id
        )
    else:
        if self.model.config.pad_token_id is not None:
            pad_token_id = self.model.config.pad_token_id
        else:
            raise ValueError("Pad_token_id must be set in the configuration of the model, in order to pad tensors")
    tensor[tensor == -100] = self.tokenizer.pad_token_id
    padded_tensor = pad_token_id * torch.ones(
        (tensor.shape[0], max_length), dtype=tensor.dtype, device=tensor.device
    )
    padded_tensor[:, : tensor.shape[-1]] = tensor
    return padded_tensor
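
For context, here is a small standalone sketch of what I understand the padding step to do (the helper name, shapes, and values below are just illustrative, not from the repo): positions masked with -100 (the usual label ignore index) are first replaced with the pad token, and then every sequence is right-padded with the pad token up to max_length.

import torch

def pad_to_max_len(tensor, max_length, pad_token_id):
    # Replace the -100 ignore-index used for labels with the pad token
    tensor = tensor.clone()
    tensor[tensor == -100] = pad_token_id
    # Allocate a (batch, max_length) tensor filled entirely with the pad token
    padded = pad_token_id * torch.ones(
        (tensor.shape[0], max_length), dtype=tensor.dtype, device=tensor.device
    )
    # Copy the original values into the leftmost positions; the rest stays padded
    padded[:, : tensor.shape[-1]] = tensor
    return padded

labels = torch.tensor([[5, 6, -100], [7, -100, -100]])
print(pad_to_max_len(labels, max_length=5, pad_token_id=0))
# tensor([[5, 6, 0, 0, 0],
#         [7, 0, 0, 0, 0]])

If that reading is correct, I would appreciate confirmation of when this method is called and why the padding to a common max_length is needed.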

Thank you!
