Gated Convolution #3

Open
MadanMl opened this issue Oct 19, 2021 · 5 comments

Comments

@MadanMl
Contributor

MadanMl commented Oct 19, 2021

In the code, the convolution operation is also applied to future time steps during training. Example: for the second iteration (t = 1), the stack of three time steps (0, 1, 2) is used for the local context extractor (i.e. the convolutional block with the gating unit). But shouldn't the gated convolution be limited to the current and previous positions only (i.e. only applying the convolution to the stacked vector of steps 0 and 1)?
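
A minimal sketch of what a causal (left-padded) gated convolutional unit could look like, assuming a PyTorch setup like the repository's; the class and parameter names here are hypothetical, not the actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalGatedConv1d(nn.Module):
    """Gated 1-D convolution restricted to the current and previous steps.

    Hypothetical sketch: left padding of (kernel_size - 1) means the output
    at step t never sees steps after t.
    """

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.pad = kernel_size - 1                     # pad on the left only
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size)

    def forward(self, x):                              # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))                    # no access to future steps
        a, b = self.conv(x).chunk(2, dim=1)            # split into signal and gate
        return a * torch.sigmoid(b)                    # GLU-style gating
```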

@jiaxiang-cheng
Owner

> In the code, the convolution operation is also applied to future time steps during training. Example: for the second iteration (t = 1), the stack of three time steps (0, 1, 2) is used for the local context extractor (i.e. the convolutional block with the gating unit). But shouldn't the gated convolution be limited to the current and previous positions only (i.e. only applying the convolution to the stacked vector of steps 0 and 1)?

Hi Madan (if that is your name)! Many thanks for bringing up this interesting point. I agree it could be a potential issue. Actually, when developing the model, I also tried taking only the previous states as inputs for feature extraction, and the difference was not obvious. Taking time step 2 for t = 1 can also work because, during the testing stage, we can perform prediction up to tmax - 1 instead of tmax. I tried it this way because I thought it could better extract the local features. But of course you can simply modify this yourself as well :)
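
For contrast, a centered (non-causal) variant matching the behaviour described above, where the output at step t also sees step t + 1; again a hypothetical sketch, not the repository's actual code:

```python
import torch
import torch.nn as nn


class CenteredGatedConv1d(nn.Module):
    """Centered (non-causal) gated convolution: the output at step t also
    sees step t + 1, so at inference RUL can only be predicted up to tmax - 1.
    Hypothetical sketch, not the repository's actual code.
    """

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size,
                              padding=kernel_size // 2)  # symmetric padding

    def forward(self, x):                                # x: (batch, channels, time)
        a, b = self.conv(x).chunk(2, dim=1)              # signal and gate halves
        return a * torch.sigmoid(b)
```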

@MadanMl
Contributor Author

MadanMl commented Oct 21, 2021

Yes, thank you, Cheng, for your reply. When you tried taking only two blocks at a time (i.e. time step 2 for t = 1), did you get better or worse results?

Could you share the hyperparameters with which we can reproduce the results mentioned in the paper (Remaining useful life estimation via transformer encoder enhanced by a gated convolutional unit. Journal of Intelligent Manufacturing, 1-10)?

@jiaxiang-cheng
Owner

> Yes, thank you, Cheng, for your reply. When you tried taking only two blocks at a time (i.e. time step 2 for t = 1), did you get better or worse results?
>
> Could you share the hyperparameters with which we can reproduce the results mentioned in the paper (Remaining useful life estimation via transformer encoder enhanced by a gated convolutional unit. Journal of Intelligent Manufacturing, 1-10)?

Hi Madan! Please note that I'm not the author of the paper, haha. I'm also trying to reproduce the experimental results reported in the paper. But anyway, to some extent, the current architecture is the best I have achieved so far (apart from some improvements of my own that I haven't made public). If you are interested, the changes you mentioned can simply be made and tested by yourself as well.

btw, please feel free to collaborate on this work if you have any interest. Thank you!

@MadanMl
Contributor Author

MadanMl commented Oct 21, 2021

Hi Mr. Cheng,
(Hi Madan! Please note that I'm not the author of the paper, haha.)
:) Then I will try to reproduce the results.
(btw, please feel free to collaborate on this work if you have any interest. Thank you!)
Perfect, I will do that, and I will also close this issue.
Thank you

@jjaay123

Hi Madan sir, were you able to reproduce the paper? If yes, can you share the hyperparameters? Thanks in advance, sir.
