Code stuck after running 1 epoch on TPU #5797
Answered by sumanthd17
sumanthd17 asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 2 comments
Could you check again? I just ran your colab and it finished both epochs. Maybe it's random? So far I have not seen any random behaviour.
Thanks Adrian,
Sorry, I've been editing the notebook. It's running now, but I'm not sure what
the issue was. My bad: after my edits the error is no longer reproducible.
I will revert the notebook to the old version.
On Sun, Jan 24, 2021 at 12:00 AM Adrian Wälchli wrote:
> Could you check again? I just ran your colab and it finished both epochs.
> Maybe it's random? So far did not see random behaviour
Answer selected by edenlightning
❓ Questions and Help
What is your question?
I'm trying to run the LitAutoEncoder on TPUs, but the code runs for 1 epoch and gets stuck there.
Code
Reproducible Colab Notebook
Notebook
What's your environment?
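The environment details were not filled in for this question. As a minimal sketch of how they could be collected for a report like this one, the snippet below uses only the Python standard library. The package list (`torch`, `pytorch-lightning`, `torch-xla`) and the helper names `package_version` and `environment_report` are assumptions for illustration, not part of the original post or of any Lightning API.

```python
import importlib.metadata as md
import platform
import sys

# Assumed list of distributions relevant to a Lightning-on-TPU report;
# adjust the names to whatever is actually installed in the notebook.
PACKAGES = ("torch", "pytorch-lightning", "torch-xla")


def package_version(name):
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return md.version(name)
    except md.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES):
    """Build a short plain-text summary of the Python runtime and packages."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"OS: {platform.system()} {platform.release()}",
    ]
    lines += [f"{name}: {package_version(name)}" for name in packages]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Running this in a cell of the Colab notebook and pasting the output under this heading would let others compare library versions when the hang cannot be reproduced.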