Results in NeuralQA inconsistent with same model running on HF #60

Open
jvence opened this issue Oct 8, 2020 · 6 comments


jvence commented Oct 8, 2020

I've tested a model deployed on NeuralQA against the same model deployed on HF and noticed that the same inputs yield different outputs, even though it's exactly the same model. This could of course be attributed to a few things, but I can't identify the culprit.

Here's the context:

Question:
Are your handsets locked or unlocked?

Corpus:
['No, all our handsets are unlocked.','Since your SIM isn’t working in your handset while other SIM cards are, it might be an issue with your handset provider; or the mobile phone could be locked, meaning it only accepts SIM cards from a particular service provider. Please contact the handset dealer for more assistance.']

The following returns 'unlocked' which is the correct response:
See Demo on HuggingFace

I've configured the exact same model in NeuralQA (with relsnip disabled) and the result is 'locked', even though I'm feeding it exactly the same inputs.
Here is my log:

0:No, all our handsets are unlocked.
[{'answer': 'unlocked', 'took': 0.35032129287719727, 'start_probability': '0.92030567', 'end_probability': '0.00026586326', 'probability': '0.460418697912246', 'question': 'Are your handsets locked or unlocked?', 'context': 'no, all our handsets are unlocked '}]
1:Since your SIM isn’t working in your handset while other SIM cards are, it might be an issue with your handset provider; or the mobile phone could be locked, meaning it only accepts SIM cards from a particular service provider. Please contact the handset dealer for more assistance.
[{'answer': 'locked', 'took': 0.5319299697875977, 'start_probability': '0.9462091', 'end_probability': '0.007203659', 'probability': '0.48030819557607174', 'question': 'Are your handsets locked or unlocked?', 'context': 'since your sim isn ’ t working in your handset while other sim cards are, it might be an issue with your handset provider ; or the mobile phone could be locked , meaning it only accepts sim cards from a particular service provider. please contact the handset dealer for more assistance'}]

As you can see, the second answer gets a higher probability (0.480 vs. 0.460), which doesn't make sense given that it's exactly the same model.
The main difference is that NeuralQA feeds each corpus passage to the model independently, while in the HF example we feed the entire corpus at once.
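
One plausible contributor, offered here as an assumption rather than a confirmed diagnosis of NeuralQA, is that span probabilities come from a softmax taken over the tokens of each input separately. Under that assumption, a probability computed for one passage is not directly comparable with a probability computed for another. The sketch below uses invented start logits (not values from the log above) to show how the ranking can flip between per-passage and joint normalization:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical start logits, invented purely for illustration.
# Passage A has the highest raw logit (9.0) but also a strong competitor (8.0).
passage_a = [8.0, 2.0, 9.0]   # last token stands in for "unlocked"
# Passage B has a lower peak (6.0) but no competitor within the passage.
passage_b = [0.5, 6.0, 1.0]   # middle token stands in for "locked"

# Per-passage normalization: each passage is fed to the model on its own.
best_a = max(softmax(passage_a))
best_b = max(softmax(passage_b))

# Joint normalization: both passages are fed as a single context.
joint = softmax(passage_a + passage_b)
joint_a = max(joint[:len(passage_a)])
joint_b = max(joint[len(passage_a):])

print(f"per-passage: A={best_a:.3f}  B={best_b:.3f}")
print(f"joint:       A={joint_a:.3f}  B={joint_b:.3f}")
```

With these made-up numbers, passage B wins when each passage is normalized on its own, while passage A wins when both are normalized together. If this is what is happening, ranking candidate answers by their raw logit scores, or normalizing across all passages, would be one way to make the two setups agree.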

Any ideas on why this is happening?

Author

jvence commented Oct 8, 2020

Could this be related to #39?

@victordibia
Owner

@jvence,

Yup, it is definitely related to #39. The solution will be to rewrite that piece using the HF approach.
It's part of some work to convert the entire lib to use PyTorch. See #53.
Hoping to have some updates in the coming week or so.

Author

jvence commented Oct 9, 2020

Yes, further testing with multiple models confirms that the results given by NeuralQA are way off from the ones returned by the HF model. I hope this can be resolved soon, as it's critical to us. Thank you.

Author

jvence commented Oct 29, 2020

Hi @victordibia, just checking in to see if there's any update on this? Seems like a pretty critical issue. Thanks

Author

jvence commented Dec 8, 2020

@victordibia Is this project still maintained? We have not heard from you for a while. Hope everything is ok.

Author

jvence commented May 3, 2021

@victordibia It's a shame that this is no longer maintained. What are your plans vis-a-vis this project?
