ValueError: end_index must be non-negative (again) #32

jtlz2 · 2020-06-26T07:29:27Z

This presents just as in #13. See below to reproduce. Awesome module, thanks!

Version info:

Python 2.7.16 |Anaconda custom (64-bit)| (default, Aug 22 2019, 10:59:10)
fuzzysearch.__version__ = 0.7.2

import fuzzysearch
fuzzysearch.find_near_matches('ABC 0123456', 'ABC', max_l_dist=1).next()

Traceback:


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-40-9ccf0e63dac4> in <module>()
----> 1 fuzzysearch.find_near_matches('ABC 0123456', 'ABC', max_l_dist=1).next()

/anaconda2/lib/python2.7/site-packages/fuzzysearch/__init__.pyc in find_near_matches(subsequence, sequence, max_substitutions, max_insertions, max_deletions, max_l_dist)
     55     search_class = choose_search_class(search_params)
     56     matches = search_class.search(subsequence, sequence, search_params)
---> 57     return search_class.consolidate_matches(matches)
     58
     59

/anaconda2/lib/python2.7/site-packages/fuzzysearch/levenshtein.pyc in consolidate_matches(cls, matches)
    159     @classmethod
    160     def consolidate_matches(cls, matches):
--> 161         return consolidate_overlapping_matches(matches)
    162
    163     @classmethod

/anaconda2/lib/python2.7/site-packages/fuzzysearch/common.pyc in consolidate_overlapping_matches(matches)
    186 def consolidate_overlapping_matches(matches):
    187     """Replace overlapping matches with a single, "best" match."""
--> 188     groups = group_matches(matches)
    189     best_matches = [get_best_match_in_group(group) for group in groups]
    190     return sorted(best_matches)

/anaconda2/lib/python2.7/site-packages/fuzzysearch/common.pyc in group_matches(matches)
    162 def group_matches(matches):
    163     groups = []
--> 164     for match in matches:
    165         overlapping_groups = [g for g in groups if g.is_match_in_group(match)]
    166         if not overlapping_groups:

/anaconda2/lib/python2.7/site-packages/fuzzysearch/levenshtein.pyc in search(cls, subsequence, sequence, search_params)
    154     def search(cls, subsequence, sequence, search_params):
    155         for match in find_near_matches_levenshtein(subsequence, sequence,
--> 156                                                    search_params.max_l_dist):
    157             yield match
    158

/anaconda2/lib/python2.7/site-packages/fuzzysearch/levenshtein_ngram.pyc in find_near_matches_levenshtein_ngrams(subsequence, sequence, max_l_dist)
    175         start_index = max(0, ngram_start - max_l_dist)
    176         end_index = min(seq_len, seq_len - subseq_len + ngram_end + max_l_dist)
--> 177         for index in search_exact(subsequence[ngram_start:ngram_end], sequence, start_index, end_index):
    178             # try to expand left and/or right according to n_ngram
    179             dist_right, right_expand_size = _expand(

/anaconda2/lib/python2.7/site-packages/fuzzysearch/search_exact.pyc in search_exact(subsequence, sequence, start_index, end_index)
     69         try:
     70             return search_exact_byteslike(subsequence, sequence,
---> 71                                           start_index, end_index)
     72         except (TypeError, UnicodeEncodeError):
     73             return _search_exact(subsequence, sequence, start_index, end_index)

ValueError: end_index must be non-negative
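
The index arithmetic shown at lines 175-176 of the traceback can be checked by hand for this input. Per the signature in the traceback, the first positional argument is `subsequence` and the second is `sequence`, so the call above searches for the 11-character string in the 3-character one. The following is a minimal sketch of that arithmetic only, not the library's code: the helper name `ngram_search_window` is hypothetical, and the way the subsequence is split into n-grams is an assumption, since the traceback does not show it. It suggests how a negative `end_index` could arise, but does not by itself establish the root cause.

```python
def ngram_search_window(seq_len, subseq_len, ngram_start, ngram_end, max_l_dist):
    """Reproduce the two index computations shown at lines 175-176 of the traceback.

    Hypothetical helper for illustration; not part of fuzzysearch.
    """
    start_index = max(0, ngram_start - max_l_dist)
    end_index = min(seq_len, seq_len - subseq_len + ngram_end + max_l_dist)
    return start_index, end_index


# Arguments in the order used in the report: (subsequence, sequence).
subsequence = 'ABC 0123456'
sequence = 'ABC'
max_l_dist = 1

subseq_len = len(subsequence)
seq_len = len(sequence)

# ASSUMPTION: the subsequence is split into (max_l_dist + 1) n-grams of equal
# length. The traceback does not show how ngram_start/ngram_end are chosen.
ngram_len = subseq_len // (max_l_dist + 1)

windows = [
    ngram_search_window(seq_len, subseq_len,
                        ngram_start, ngram_start + ngram_len, max_l_dist)
    for ngram_start in range(0, subseq_len - ngram_len + 1, ngram_len)
]

for start_index, end_index in windows:
    print(start_index, end_index, 'negative end_index' if end_index < 0 else '')
```

Under these assumptions the first window has a negative `end_index`, because `seq_len - subseq_len` is negative whenever the subsequence is longer than the sequence.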

taleinat · 2020-06-26T12:48:56Z

> Awesome module, thanks!

Thanks for the kind words, I'm happy you're finding it useful! It would be great to hear what you're using it for.

taleinat · 2020-06-26T12:50:18Z

@jtlz2, which platform are you running this on? Windows / Linux / macOS, which exact version, 32 or 64 bit?

taleinat · 2020-06-26T12:57:55Z

@jtlz2, could you try running the same code, with bytes objects rather than strings? I.e.:

fuzzysearch.find_near_matches(b'ABC 0123456', b'ABC', max_l_dist=1).next()

jtlz2 · 2020-06-26T13:07:13Z

@taleinat Apologies - macOS 10.13.6.

We are trialling it for OCR post-processing.

I get the same error when using bytes as you suggest (ValueError at line 71 of search_exact.py).

Thanks again!

taleinat · 2020-06-27T08:07:39Z

@jtlz2, I've started working on this. It seems like a problem with the native (C) extensions.

In the meantime, you can install fuzzysearch without the native extensions by fetching a source archive, unpacking it, and running python setup.py install --noexts.

taleinat · 2020-06-28T07:19:04Z

@jtlz2, I've fixed what appears to be the source of this issue. The fix is available in version 0.7.3 which I've just released. Please let me know if it resolves this issue for you!

jtlz2 · 2020-07-02T12:22:54Z

@taleinat Still get the same problem in 0.7.3 :\

taleinat · 2020-07-03T12:30:19Z

> Still get the same problem in 0.7.3 :\

☹️

This seems to be related to the Anaconda distribution somehow, as it only appears to happen with it, but not with Python from python.org or built from the main git repo. I'll have to investigate further when I have more time.
