Results in NeuralQA inconsistent with same model running on HF #60

jvence · 2020-10-08T15:24:45Z

I've tested a model deployed on NeuralQA against the same model deployed on HF and noticed that identical inputs yield different outputs, even though it's exactly the same model. This could be attributed to a few things, but I can't identify the culprit.

Here's the context:

Question:
Are your handsets locked or unlocked?

Corpus:
['No, all our handsets are unlocked.','Since your SIM isn’t working in your handset while other SIM cards are, it might be an issue with your handset provider; or the mobile phone could be locked, meaning it only accepts SIM cards from a particular service provider. Please contact the handset dealer for more assistance.']

The following returns 'unlocked' which is the correct response:
See Demo on HuggingFace

I've configured the exact same model in NeuralQA (with relsnip disabled) and the result is 'locked', even though I'm feeding exactly the same inputs.
Here is my log:

0:No, all our handsets are unlocked.
[{'answer': 'unlocked', 'took': 0.35032129287719727, 'start_probability': '0.92030567', 'end_probability': '0.00026586326', 'probability': '0.460418697912246', 'question': 'Are your handsets locked or unlocked?', 'context': 'no, all our handsets are unlocked '}]
1:Since your SIM isn’t working in your handset while other SIM cards are, it might be an issue with your handset provider; or the mobile phone could be locked, meaning it only accepts SIM cards from a particular service provider. Please contact the handset dealer for more assistance.
[{'answer': 'locked', 'took': 0.5319299697875977, 'start_probability': '0.9462091', 'end_probability': '0.007203659', 'probability': '0.48030819557607174', 'question': 'Are your handsets locked or unlocked?', 'context': 'since your sim isn ’ t working in your handset while other sim cards are, it might be an issue with your handset provider ; or the mobile phone could be locked , meaning it only accepts sim cards from a particular service provider. please contact the handset dealer for more assistance'}]

As you can see, the 2nd answer gets a higher probability, which doesn't make sense given that it's exactly the same model.
The main difference is that NeuralQA feeds each corpus passage to the model independently, while in the HF example we feed the entire corpus at once.
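
To illustrate why that difference could matter, here is a minimal sketch with made-up start logits (not values from NeuralQA or HF). It assumes the reported probability is a softmax over the tokens of whatever context is passed in, which I have not verified in either code base. Under that assumption, scores computed per passage are normalized separately and are not directly comparable across passages:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical start logits per token (illustrative numbers only).
passage_a = [4.0, 5.0, 6.0]   # last token stands in for "unlocked"
passage_b = [0.2, 4.0, 0.1]   # middle token stands in for "locked"

# Per-passage scoring: each passage is normalized on its own.
per_a = max(softmax(passage_a))
per_b = max(softmax(passage_b))

# Joint scoring: both passages share a single normalization.
joint = softmax(passage_a + passage_b)
joint_a = max(joint[:len(passage_a)])
joint_b = max(joint[len(passage_a):])

print(f"per-passage: a={per_a:.3f} b={per_b:.3f}")
print(f"joint:       a={joint_a:.3f} b={joint_b:.3f}")
```

With these numbers, passage b wins when each passage is scored alone, because its best token has little competition inside its own passage, while passage a wins under the joint normalization, because its raw logit is higher. Whether this is what actually happens here is a guess on my part.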

Any ideas on why this is happening?

jvence · 2020-10-08T15:35:20Z

Could this be related to #39?

victordibia · 2020-10-08T16:56:41Z

@jvence,

Yup, it is definitely related to #39. The solution will be to rewrite that piece using the HF approach.
It's part of some work to convert the entire lib to use PyTorch. See #53.
Hoping to have some updates in the coming week or so.

jvence · 2020-10-09T09:17:41Z

Yes, further testing with multiple models confirms that the results given by NeuralQA are way off from the ones returned by the HF model. I hope this can be resolved soon, as it's critical to us. Thank you.

jvence · 2020-10-29T15:15:14Z

Hi @victordibia, just checking in to see if there's any update on this. It seems like a pretty critical issue. Thanks

jvence · 2020-12-08T01:40:09Z

@victordibia Is this project still maintained? We have not heard from you for a while. Hope everything is ok.

jvence · 2021-05-03T12:29:10Z

@victordibia It's a shame that this is no longer maintained. What are your plans vis-a-vis this project?
