nagisa v0.2.9
nagisa 0.2.9 incorporates the following changes:
- Fix a bottleneck in part-of-speech tagging caused by 'list and append'; resolved by using 'set and add'
Until now, the following code in tagger.py became slower as the amount of analyzed text increased. The append call modifies the list stored in self._word2postags, so that list grows each time the same word is tagged again.
tids = []
for w in words:
    if w in self._word2postags:
        w2p = self._word2postags[w]
    else:
        w2p = [0]
    if self.use_noun_heuristic is True:
        if w.isalnum() is True:
            if w2p == [0]:
                w2p = [self._pos2id[u'名詞']]
            else:
                # bottleneck is here!
                w2p.append(self._pos2id[u'名詞'])
                w2p = list(set(w2p))
    tids.append(w2p)
Changing to the following code resolved the slowdown.
tids = []
for w in words:
    w2p = set(self._word2postags.get(w, [0]))
    if self.use_noun_heuristic and w.isalnum():
        if 0 in w2p:
            w2p.remove(0)
        w2p.add(2)  # nagisa.tagger._pos2id["名詞"] = 2
    tids.append(list(w2p))
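The difference between the two versions can be reproduced outside nagisa with a small, self-contained sketch. The functions `tag_ids_old` and `tag_ids_new`, the constant `NOUN_ID`, and the sample tables are hypothetical stand-ins rather than nagisa's API, and the noun heuristic is assumed to be enabled. The sketch only illustrates how appending to the stored list makes it grow on every call, while building a fresh set leaves it unchanged.

```python
# Hypothetical stand-in for the tag id of 名詞 (2, per the comment in the new code).
NOUN_ID = 2


def tag_ids_old(words, word2postags):
    """Old logic: appends to the list object stored in word2postags."""
    tids = []
    for w in words:
        w2p = word2postags[w] if w in word2postags else [0]
        if w.isalnum():
            if w2p == [0]:
                w2p = [NOUN_ID]
            else:
                w2p.append(NOUN_ID)  # mutates the shared list in the dict
                w2p = list(set(w2p))
        tids.append(w2p)
    return tids


def tag_ids_new(words, word2postags):
    """New logic: works on a fresh set and leaves the dict untouched."""
    tids = []
    for w in words:
        w2p = set(word2postags.get(w, [0]))
        if w.isalnum():
            if 0 in w2p:
                w2p.remove(0)
            w2p.add(NOUN_ID)
        tids.append(list(w2p))
    return tids


# Tag the same word 1000 times with each version.
old_table = {"abc": [5]}
new_table = {"abc": [5]}
for _ in range(1000):
    tag_ids_old(["abc"], old_table)
    tag_ids_new(["abc"], new_table)

old_len = len(old_table["abc"])  # one extra entry per call
new_len = len(new_table["abc"])  # unchanged
print(old_len, new_len)  # 1001 1
```

With the old logic, every later `set(w2p)` has to deduplicate an ever longer list, which is consistent with the slowdown described above.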
- Fix the dash-separated 'description-file' error by using 'description_file' in setup.cfg
[metadata]
description_file = README.md
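As a rough way to check a configuration for remaining dash-separated keys, the standard-library `configparser` module can read the `[metadata]` section. The helper `find_dashed_keys` and the sample `SETUP_CFG` text below are hypothetical and not part of nagisa or setuptools. The sample deliberately keeps the old key to show what the check reports.

```python
import configparser

# Sample setup.cfg text with the old dash-separated key (illustrative only).
SETUP_CFG = """
[metadata]
description-file = README.md
"""


def find_dashed_keys(cfg_text, section="metadata"):
    """Return the keys in `section` whose names contain a dash."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        return []
    return [key for key in parser[section] if "-" in key]


dashed = find_dashed_keys(SETUP_CFG)
for key in dashed:
    # Suggest the underscore form for each dash-separated key.
    print(f"rename {key!r} to {key.replace('-', '_')!r}")
# rename 'description-file' to 'description_file'
```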
- Add Python wheels (3.6, 3.7, 3.8, 3.9, 3.10, 3.11) to PyPI for Linux
- Add Python wheels (3.6, 3.7, 3.8, 3.9, 3.10) to PyPI for macOS
- Add Python wheels (3.6, 3.7, 3.8) to PyPI for Windows