The following example shows that the noun phrase "phospholipase C (PLC) δ1" is not extracted as a single noun chunk: it is split into "phospholipase C (PLC" and "δ1". This usually happens when the text contains parentheses. Can this bug be fixed systematically?
$ cat main2.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:
import spacy
nlp = spacy.load('en', disable=['tokenizer', 'ner', 'textcat'])
# 'tagger' and 'parser' cannot be disabled here, because noun_chunks needs their output.
doc = nlp(u'We previously revealed that the expression of phospholipase C (PLC) δ1, one of the most basal PLCs, is down-regulated in colon adenocarcinoma, and that the KRAS signaling pathway suppresses PLCδ1 expression.')
print(list(doc.noun_chunks))
$ ./main2.py
[We, the expression, phospholipase C (PLC, δ1, the most basal PLCs, colon adenocarcinoma, the KRAS, PLCδ1 expression]
The noun chunks depend on the part-of-speech tags and dependency parse, so this issue likely comes down to incorrect predictions made by the tagger or parser.
I'm merging this with #3052. We've now added a master thread for incorrect predictions and related reports – see the issue for more details.
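Until the predictions improve, one possible workaround is to strip parenthesised abbreviations from the text before passing it to `nlp`. The sketch below is not part of spaCy: `strip_parentheticals` is a hypothetical helper, and it assumes the parenthesised content (such as "(PLC)") can be dropped without changing the meaning, and that parentheses are not nested.

```python
import re

# Hypothetical helper, not a spaCy API: matches a non-nested parenthesised
# span together with any whitespace directly in front of it, e.g. " (PLC)".
_PARENTHETICAL = re.compile(r"\s*\([^()]*\)")


def strip_parentheticals(text):
    """Return `text` with non-nested parenthesised spans removed."""
    return _PARENTHETICAL.sub("", text)


text = "the expression of phospholipase C (PLC) δ1, one of the most basal PLCs"
cleaned = strip_parentheticals(text)
print(cleaned)
```

The cleaned string can then be parsed as usual, for example with `nlp(cleaned)`. This only sidesteps the problem for simple abbreviations; it does not change how the tagger or parser handles parentheses.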