
Pre-activation as output of VGGish #24

Open
eatsleepraverepeat opened this issue Nov 2, 2021 · 1 comment


eatsleepraverepeat commented Nov 2, 2021

Hello there,

when comparing this code to the implementation in tensorflow/models, I found that the two use different layers as the output of the VGGish model (if we treat the activation as a separate layer):

yours:

nn.ReLU(True))

google's: https://github.com/tensorflow/models/blob/f32dea32e3e9d3de7ed13c9b16dc7a8fea3bd73d/research/audioset/vggish/vggish_slim.py#L104-L106 (activation_fn=None)

It's also mentioned in the tensorflow/models README:

> Note that the embedding layer does not include a final non-linear activation, so the embedding value is pre-activation

Changing the output layer of VGGish in your implementation to the pre-activation one (i.e., without the final ReLU) makes the embeddings (almost) equal in both cases, for both the raw and the PCA'd ones.
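To illustrate why the two outputs differ (a minimal sketch, not VGGish itself): ReLU clamps negative components to zero, so a post-ReLU embedding is the elementwise max(0, ·) of the pre-activation embedding and loses all negative values:

```python
import torch

# ReLU zeroes out negative components, so a post-ReLU embedding
# discards the sign information present in the pre-activation one.
pre = torch.tensor([-0.5, 0.0, 1.2, -2.0])
post = torch.relu(pre)
print(post)  # tensor([0.0000, 0.0000, 1.2000, 0.0000])
```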

Thanks for porting though, great work!

@brentspell

First, I would like to echo the kudos for publishing this port of VGGish. I am implementing a Fréchet Audio Distance (FAD) library and will definitely make use of it.

For anyone else who arrives here looking for a workaround, the final ReLU can be removed from the pretrained VGGish model with the following snippet:

```python
import torch as pt

# Load the pretrained model, then drop the trailing ReLU from the
# embedding head so the model returns pre-activation embeddings.
vggish = pt.hub.load("harritaylor/torchvggish", "vggish")
vggish.embeddings = pt.nn.Sequential(*list(vggish.embeddings.children())[:-1])
```
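The same layer-stripping trick can be checked without downloading the hub model, using a stand-in `Sequential` head (the layer sizes below are illustrative assumptions, not the exact VGGish architecture):

```python
import torch

# Stand-in for an embedding head ending in ReLU (hypothetical sizes):
embeddings = torch.nn.Sequential(
    torch.nn.Linear(256, 128),
    torch.nn.ReLU(True),
)

# Rebuild the Sequential without its last child, leaving the final
# Linear layer as the output, i.e., the pre-activation embedding.
embeddings = torch.nn.Sequential(*list(embeddings.children())[:-1])

assert not isinstance(embeddings[-1], torch.nn.ReLU)
out = embeddings(torch.randn(1, 256))
print(out.shape)  # torch.Size([1, 128]); values may now be negative
```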
