
Did you upload the LSTM-related code? And can you share the experiment results? #3

Open
brb-chen opened this issue Jul 14, 2016 · 11 comments

Comments

@brb-chen

No description provided.

@stillbreeze
Collaborator

We couldn't try out the LSTMs due to a lack of GPU memory. If you wish to implement it, just follow the 2nd paper from the README. You need to compute the frame-wise feature vectors from the FC layer for each video and feed each one to a distinct timestep of an LSTM.

As for the results, I don't have a report, but the two-stream CNNs yielded accuracies a bit lower than those given in section 4.2 of the 2nd paper from the README:
Spatial - 65%-70%
Temporal - 50%-55%
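
If it helps, here is a rough Keras sketch of the per-frame FC feature extraction described above. The architecture, layer names, and shapes are purely illustrative stand-ins, not code from this repo:

```python
from keras.models import Model, Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Illustrative stand-in for the trained spatial-stream CNN
# (layer names and shapes are assumptions, not this repo's architecture).
cnn = Sequential()
cnn.add(Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Flatten())
cnn.add(Dense(4096, activation='relu', name='fc1'))
cnn.add(Dense(101, activation='softmax'))  # e.g. 101 UCF101 classes

# Cut the network at the FC layer to get a per-frame feature extractor
feature_extractor = Model(cnn.input, cnn.get_layer('fc1').output)

# frames: (num_frames, 224, 224, 3) array of one video's frames
# features = feature_extractor.predict(frames)
# -> shape (num_frames, 4096): one FC vector per frame, one LSTM timestep each
```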

@brb-chen
Author

The main cost comes from the per-frame feature extraction. Does optical flow help save computation?

Any idea how to reduce the cost of the front-end feature extraction phase?

Thanks!


@guoyang007

@stillbreeze I know we need to compute the frame-wise feature vector from the FC layer of each video and feed it to a distinct timestep of an LSTM, but how can we do that with Keras? Won't the Sequential model work? I'm confused about how to stack the outputs of the CNN layers into the LSTM. Can you share some ideas?

@kvignesh1420

kvignesh1420 commented Feb 18, 2017

@wadhwasahil

How many epochs did it take for the accuracy to settle (for the spatial stream)? And how long did it take on your system? Please let me know the specifications.

Cheers.

@wadhwasahil
Owner

@kvignesh1420 In the code we used 50 epochs for the spatial stream, but I am pretty sure we got the results before that. @stillbreeze Do you know the specs for the temporal training?

@stillbreeze
Collaborator

@guoyang007: Sorry for the late reply. One straightforward way, which most papers also follow, is to not train the whole network end-to-end: you train the CNN, freeze the weights, obtain the FC vectors, and then feed them into a separate LSTM model.
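
A minimal Keras sketch of that two-stage setup, assuming the FC feature sequences have already been extracted with the frozen CNN (all names, shapes, and hyperparameters here are illustrative, not our actual configuration):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Illustrative shapes: TIMESTEPS sampled frames per clip, FC_DIM features
# per frame (e.g. a 4096-d FC vector), NUM_CLASSES action classes.
TIMESTEPS, FC_DIM, NUM_CLASSES = 25, 4096, 101

# The CNN is trained first and its weights are never updated here; it is
# only used offline to produce the (TIMESTEPS, FC_DIM) sequence per video.
lstm_model = Sequential()
lstm_model.add(LSTM(256, input_shape=(TIMESTEPS, FC_DIM)))
lstm_model.add(Dense(NUM_CLASSES, activation='softmax'))
lstm_model.compile(optimizer='adam',
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])

# X: (num_videos, TIMESTEPS, FC_DIM) stacked FC vectors from the frozen CNN
# y: (num_videos, NUM_CLASSES) one-hot labels
# lstm_model.fit(X, y)
```

Since the LSTM sees only pre-extracted vectors, a plain Sequential model is enough; the CNN and LSTM never need to live in the same graph.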

@wadhwasahil: Sorry, but I can't remember the training configuration.

@kvignesh1420

@wadhwasahil
@guoyang007

Can you please upload the LSTM code if it's available? I am not able to extract the FC feature vectors and feed them into the LSTM. Help me out.

@wadhwasahil
Owner

Post some code so that we can help.

@kvignesh1420

@wadhwasahil
I have resolved the issue.

@wadhwasahil
Owner

But still, @kvignesh1420, how did you solve it? It would be better for other people if you could share your views or code.

@kvignesh1420

@wadhwasahil Give me some time. I will upload the whole procedure. :D
