
failing to use BART models - Breaking the generation loop! #42

Open
kontramind opened this issue Oct 18, 2023 · 3 comments

@kontramind commented Oct 18, 2023

Hi,

I'm trying to use, e.g., 'sshleifer/distilbart-cnn-6-6' and failing with the following message:

An error has occurred: Breaking the generation loop! To address this issue, consider fine-tuning the GReaT model for an longer period. This can be achieved by increasing the number of epochs. Alternatively, you might consider increasing the max_length parameter within the sample function. For example: model.sample(n_samples=10, max_length=2000) If the problem persists despite these adjustments, feel free to raise an issue on our GitHub page at: https://github.com/kathrinse/be_great/issues
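
For context, here is roughly the flow that triggers the error (a minimal sketch assuming the standard be_great API and the sklearn California housing frame; the hyperparameter values are placeholders, and our full training loop is in a comment below):

```python
from be_great import GReaT
from sklearn.datasets import fetch_california_housing

# Minimal sketch with placeholder hyperparameters; we assume the sklearn
# California housing data here. The error above is raised during sampling.
data = fetch_california_housing(as_frame=True).frame
great = GReaT('sshleifer/distilbart-cnn-6-6', batch_size=32, epochs=50)
great.fit(data)
samples = great.sample(n_samples=10, max_length=2000)
```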

Aleksandar

@unnir (Collaborator) commented Oct 18, 2023

Hi,

Could you please provide your training hyperparameters or your whole Python code?

@kontramind (Author) commented Oct 20, 2023

> Hi,
>
> Could you please provide your training hyperparameters or your whole Python code?

Hi @unnir,

Sure, here is the code. We run training on the California dataset.
Keep in mind that we also introduce a workaround for BelenGarciaPascual's question; Belen and I are collaborating on the same task, and we are planning to work on a proper PR.

In the code below, the total number of epochs is 8 × 9 = 72 (eight outer passes, with one fit per each of the nine columns).

```python
from pathlib import Path
from shutil import rmtree

from be_great import GReaT

# `base` (the LLM name, here 'sshleifer/distilbart-cnn-6-6'), `llm` (a short
# tag used in the directory names) and `data` (the California DataFrame) are
# defined earlier in our script.
batch_size = 32
steps = len(data) // batch_size

epochs = [0, 1, 2, 3, 4, 5, 6, 7]
columns = data.columns

for epoch in epochs:
    for idx, column in enumerate(columns):
        print(f'{epoch=} -> {column=}')
        great = GReaT(
            base,                                        # Name of the large language model used (see HuggingFace for more options)
            batch_size=batch_size,
            epochs=epoch * len(data.columns) + idx + 1,  # Cumulative number of epochs to train up to after this fit
            save_steps=steps,                            # Save model weights every x steps
            logging_steps=steps,                         # Log the loss and learning rate every x steps
            experiment_dir=f"aleks_{llm}_trainer",       # Name of the directory where all intermediate steps are saved
        )

        if epoch == 0 and idx == 0:
            trainer = great.fit(data, conditional_col=column)
        else:
            trainer = great.fit(data, conditional_col=column, resume_from_checkpoint=True)
            # Remove the checkpoint we just resumed from to keep disk usage bounded.
            rmtree(Path(f"aleks_{llm}_trainer") / f"checkpoint-{epoch * len(data.columns) * steps + idx * steps}")

        great.save(f"aleks_california_{llm}")

        for path in Path(f"aleks_{llm}_trainer").iterdir():
            if path.is_dir():
                print(f'{path=}')
```
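
Afterwards we sample along these lines (a sketch: `load_from_dir` is the standard be_great loader, and the `n_samples`/`max_length` values are taken from the example in the error message):

```python
from be_great import GReaT

# Sketch of the sampling step where the error above is raised; the model is
# loaded from the directory written by great.save() in the loop above.
great = GReaT.load_from_dir(f"aleks_california_{llm}")
samples = great.sample(n_samples=10, max_length=2000)
```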

@unnir (Collaborator) commented Oct 20, 2023

My suggestion, again, is to train the model longer, but I will try to reproduce the error and debug it.
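
Concretely, that suggestion amounts to something like the following sketch (the epoch count is an arbitrary illustrative value; `base` and `data` are as in the code above, and the larger `max_length` follows the error message's own example):

```python
from be_great import GReaT

# Sketch of the suggested remedy: train for more epochs (the count here is
# an arbitrary illustrative value), then sample with a larger max_length.
great = GReaT(base, batch_size=32, epochs=200)
great.fit(data)
samples = great.sample(n_samples=10, max_length=2000)
```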
