Hi, when using the inference capabilities of the library I came across some weird behaviour. More concretely, I was training on the UMLS dataset with the TransE model and obtained very high results for hits@10 (0.5651) and filtered hits@10 (0.9713). However, when using the infer_tails() method on some triples taken from the test set, I noticed that the correct tails were nowhere near the top 10; on the contrary, they were always near the bottom 10 values.
So I decided to look a bit more into it. When I examined the metric calculator, more specifically the get_tail_rank() and get_head_rank() methods, I noticed that the lists of tail and head candidates were being traversed from last to first:
trank = 0
ftrank = 0
for j in range(len(tail_candidate)):
    val = tail_candidate[-j - 1]  # walk the candidate list from the end
    if val != t:
        trank += 1
        ftrank += 1
        if val in self.hr_t[(h, r)]:
            ftrank -= 1  # filtered rank ignores other known-true tails
    else:
        break
return trank, ftrank
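To make sure I read that loop correctly, here is a small standalone toy version (my own code, not taken from the library): the candidate list is assumed to be ordered from worst score to best score, and the loop walks it from the end until it reaches the true tail t, counting the candidates ranked above it and discounting other known-true tails for the filtered rank.

# Standalone toy reproduction of the loop above, not library code.
def toy_tail_rank(tail_candidates, t, known_true_tails):
    trank = 0   # raw rank: candidates scored better than t
    ftrank = 0  # filtered rank: same, but ignoring other known-true tails
    for j in range(len(tail_candidates)):
        val = tail_candidates[-j - 1]   # walk from the end of the list
        if val != t:
            trank += 1
            ftrank += 1
            if val in known_true_tails:
                ftrank -= 1
        else:
            break
    return trank, ftrank

# Candidates ordered from highest score to lowest (worst to best for TransE):
print(toy_tail_rank([7, 3, 9, 2, 5], t=9, known_true_tails={5, 9}))  # -> (2, 1)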
This made sense, since tail_candidate is obtained by calling test_tail_rank() with topk=total_entities:
self.test_tail_rank(h_tensor, r_tensor, self.config.tot_entity)
a function that returns:
_, rank = torch.topk(preds, k=topk)
The rank is a list of entity indices ordered from the highest "pred" value to the lowest, and since this "pred" value comes from the scoring function (h + r - t in the case of TransE), the lower values are the ones more likely to be the correct link. Hence I understood why the list was being traversed from last to first.
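A quick standalone check of that ordering with dummy scores (assuming a distance-style score where smaller means more plausible, as with TransE):

import torch

# Dummy distance-style scores for 5 entities; smaller = more plausible.
preds = torch.tensor([4.2, 0.3, 2.7, 0.9, 3.1])

# torch.topk sorts from the LARGEST value down, so the most plausible
# entities end up at the END of the returned index list.
_, rank = torch.topk(preds, k=preds.shape[0])
print(rank)      # tensor([0, 4, 2, 3, 1]) -- worst to best
print(rank[-1])  # tensor(1) -- the entity with the smallest distance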
However, the infer_tails() and infer_heads() methods also call test_tail_rank() and test_head_rank(), but they do not reverse the list, so the user is returned the top X least likely predicted tails/heads instead of the top X most likely predictions.
This leads me to think that this is a bug, or alternatively that I am missing some factor in how this inference capability is meant to be used.
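If it is a bug, I would expect the fix (or a user-side workaround) to look something like the sketch below. This is written with dummy scores rather than the library's actual objects, since I am not sure where the change belongs: either negate the scores before calling torch.topk, or take the tail end of the full ranking and reverse it.

import torch

preds = torch.tensor([4.2, 0.3, 2.7, 0.9, 3.1])  # dummy distance scores
k = 3

# Option 1: negate the scores so topk directly returns the most plausible entities.
_, best_first = torch.topk(-preds, k=k)
print(best_first)          # tensor([1, 3, 2])

# Option 2: rank everything as the library does, then reverse the last k entries.
_, rank = torch.topk(preds, k=preds.shape[0])
print(rank[-k:].flip(0))   # tensor([1, 3, 2]) -- same result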
Sorry for the long post,
Best regards,
Rodrigo Pereira
Is there any further evidence that can be shared here? For example, the top X predicted tails/heads on UMLS, as well as the true most likely tails/heads that end up being reported as least likely.