Help implementing "Multivariate Time Series Representation Learning" paper #2059
sc12752 started this conversation in Development · Replies: 1 comment
Problem got solved by adjusting the dropout rate in the encoder. I assume dropout on the attention probabilities is harmful when applied to time series, especially given that another dropout is applied once the self-attention results are flattened.
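In DJL terms, a minimal sketch of that kind of fix might look like the following. The exact rate used is cut off above, so the 0 here is an assumption, and the `TransformerEncoderBlock` constructor arguments are taken from DJL's transformer package:

```java
import java.util.function.Function;

import ai.djl.ndarray.NDList;
import ai.djl.nn.Activation;
import ai.djl.nn.transformer.TransformerEncoderBlock;

public final class EncoderFix {

    /**
     * Builds the self-attention encoder with its internal dropout disabled,
     * while the separate dropout on the flattened features stays in place.
     */
    public static TransformerEncoderBlock encoder(int embeddingSize, int headCount, int hiddenSize) {
        Function<NDList, NDList> activation = Activation::relu;
        return new TransformerEncoderBlock(
                embeddingSize, headCount, hiddenSize,
                0f, // dropout probability: assumed set to 0; the exact value is cut off above
                activation);
    }
}
```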
Hi,
I'm trying to implement the classifier part of the "Transformer-based Framework for Multivariate Time Series Representation Learning" paper using DJL. My current code can be found in this GitHub repo; the most important parts there are the SupervisedClassifier block and the dataset test.
While doing this I follow the tsai Python implementation, using the same dataset and parameters they chose in that Colab; the relevant tsai source code can be found here. The general idea of the paper is as follows: use the self-attention encoder part from the "Attention is all you need" paper, then flatten the resulting features and use a final feed-forward layer as a classifier.
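Roughly, this is the shape of what I'm building in DJL (a simplified sketch, not my actual SupervisedClassifier; the sizes, dropout rates, and the single encoder layer are placeholders):

```java
import ai.djl.nn.Activation;
import ai.djl.nn.Blocks;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.core.Linear;
import ai.djl.nn.norm.Dropout;
import ai.djl.nn.transformer.TransformerEncoderBlock;

public final class TstClassifier {

    /** Encoder -> flatten -> dropout -> linear head; all sizes are placeholders. */
    public static SequentialBlock build(int embeddingSize, int headCount,
                                        int hiddenSize, int numClasses) {
        SequentialBlock net = new SequentialBlock();
        // Self-attention encoder from "Attention is all you need".
        net.add(new TransformerEncoderBlock(
                embeddingSize, headCount, hiddenSize, 0.1f, Activation::relu));
        // Flatten the (seqLength, embeddingSize) features into one vector per example.
        net.add(Blocks.batchFlattenBlock());
        // The second dropout, applied to the flattened features.
        net.add(Dropout.builder().optRate(0.1f).build());
        // Final feed-forward layer acting as the classifier.
        net.add(Linear.builder().setUnits(numClasses).build());
        return net;
    }
}
```

A full implementation would also stack several encoder layers and add positional encoding, which this sketch leaves out.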
My problem is that it seems I have implemented everything right, but it does not work well enough. In particular, when training the tsai implementation, accuracy starts at 0.5 and monotonically rises to 0.7 after 100 epochs. In my current implementation, accuracy starts at 0.5, jumps between 0.5 and ~0.62 as training progresses, and then ends at about ~0.5 again after 100 epochs. At this point I have no idea what I'm doing wrong and am looking for tips on what I could be missing in my implementation.
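For context, this is a simplified sketch of the kind of DJL training loop I use to get those per-epoch accuracy numbers; the dataset setup is omitted, the `TstClassifier` reference is from the sketch above, and the hyperparameters shown are placeholders for the ones from the tsai notebook:

```java
import ai.djl.Model;
import ai.djl.ndarray.types.Shape;
import ai.djl.training.DefaultTrainingConfig;
import ai.djl.training.EasyTrain;
import ai.djl.training.Trainer;
import ai.djl.training.dataset.Dataset;
import ai.djl.training.evaluator.Accuracy;
import ai.djl.training.listener.TrainingListener;
import ai.djl.training.loss.Loss;

public final class TrainTst {

    public static void train(Dataset trainSet, Dataset validateSet) throws Exception {
        // Placeholder hyperparameters; the real ones come from the tsai notebook.
        int embeddingSize = 128, headCount = 8, hiddenSize = 256, numClasses = 2;
        int batchSize = 64, seqLength = 100;

        try (Model model = Model.newInstance("tst-classifier")) {
            model.setBlock(TstClassifier.build(embeddingSize, headCount, hiddenSize, numClasses));

            DefaultTrainingConfig config =
                    new DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss())
                            .addEvaluator(new Accuracy()) // reports train/validation accuracy per epoch
                            .addTrainingListeners(TrainingListener.Defaults.logging());

            try (Trainer trainer = model.newTrainer(config)) {
                trainer.initialize(new Shape(batchSize, seqLength, embeddingSize));
                EasyTrain.fit(trainer, 100, trainSet, validateSet);
            }
        }
    }
}
```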
For reference, here's the tsai model printout, followed by my current classifier printout:
Thanks!