Release nagisa v0.2.9 · taishi-i/nagisa

nagisa 0.2.9 incorporates the following changes:

Fix a bottleneck in part-of-speech tagging caused by appending to a list ('list and append'); resolved by using a set ('set and add')

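Until now, the following code in tagger.py became slower as more text was analyzed. When a word is already in self._word2postags, w2p refers to the list stored in that dictionary, so each append adds another entry to the stored list and it keeps growing across calls.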

tids = []
for w in words:
    if w in self._word2postags:
        w2p = self._word2postags[w]
    else:
        w2p = [0] 
    if self.use_noun_heuristic is True:
        if w.isalnum() is True:
            if w2p == [0]:
                w2p = [self._pos2id[u'名詞']] 
            else:
                # bottleneck is here!
                w2p.append(self._pos2id[u'名詞']) 
    w2p = list(set(w2p))
    tids.append(w2p)

The following code resolves the slowdown by building a new set for each word instead of appending to the stored list.

tids = []
for w in words:
    w2p = set(self._word2postags.get(w, [0]))
    if self.use_noun_heuristic and w.isalnum():
        if 0 in w2p:
            w2p.remove(0)
        w2p.add(2)  # nagisa.tagger._pos2id["名詞"] = 2 
    tids.append(list(w2p))
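
The difference between the two versions can be reproduced outside nagisa with a small sketch. Everything here is illustrative: `tag_ids_old`, `tag_ids_new`, and the one-entry tables are hypothetical stand-ins for the tagger's internals, the noun heuristic is assumed to be enabled, and the tag id 5 is made up. Only the noun id 2 is taken from the comment in the code above.

```python
# Hypothetical stand-ins for the tagger's internal tables (not nagisa's real data).
NOUN_ID = 2  # the release notes state that _pos2id["名詞"] is 2


def tag_ids_old(words, word2postags):
    """Pre-0.2.9 logic: appends to the list stored in the dictionary."""
    tids = []
    for w in words:
        if w in word2postags:
            w2p = word2postags[w]  # alias of the stored list, not a copy
        else:
            w2p = [0]
        if w.isalnum():
            if w2p == [0]:
                w2p = [NOUN_ID]
            else:
                w2p.append(NOUN_ID)  # mutates the shared list on every call
        tids.append(list(set(w2p)))
    return tids


def tag_ids_new(words, word2postags):
    """0.2.9 logic: works on a fresh set, so the dictionary is never mutated."""
    tids = []
    for w in words:
        w2p = set(word2postags.get(w, [0]))
        if w.isalnum():
            if 0 in w2p:
                w2p.remove(0)
            w2p.add(NOUN_ID)
        tids.append(list(w2p))
    return tids


old_table = {"Python": [5]}  # 5 is an arbitrary, made-up tag id
new_table = {"Python": [5]}
for _ in range(1000):
    tag_ids_old(["Python"], old_table)
    tag_ids_new(["Python"], new_table)

old_len = len(old_table["Python"])  # grows by one entry per call
new_len = len(new_table["Python"])  # stays at its original size
print(old_len, new_len)
```

In the old version the stored list gains one duplicate entry per call, so the later list(set(w2p)) has more and more work to do; in the new version the stored list is left unchanged.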

Fix the dash-separated 'description-file' key error in setup.cfg by renaming the key to 'description_file'

[metadata]
description_file = README.md

Add Python wheels (3.6, 3.7, 3.8, 3.9, 3.10, 3.11) to PyPI for Linux
Add Python wheels (3.6, 3.7, 3.8, 3.9, 3.10) to PyPI for macOS
Add Python wheels (3.6, 3.7, 3.8) to PyPI for Windows
