Implement rfind() for unicode based on CPython #7

densmirn · 2019-09-26T14:04:09Z

This PR is a continuation of numba#4611.

shssf · 2019-09-30T22:03:33Z

numba/unicode.py

@@ -382,6 +382,15 @@ def _find(substr, s):
    return -1


+@njit(_nrt=False)
+def _rfind(substr, s):


Maybe it would be better to put this code into unicode_rfind directly?

I agree, it makes sense.

Vyacheslav-Smirnov

Please wrap all lines to fewer than 80 characters.
Tests are needed for the types of start, end and the substring, and for start/end passed as Optional types.
Examples: https://github.com/numba/numba/blob/master/numba/tests/test_unicode.py#L447-L492
and https://github.com/numba/numba/blob/master/numba/tests/test_unicode.py#L843-L905

Vyacheslav-Smirnov · 2019-10-01T09:48:03Z

numba/tests/test_unicode.py

@@ -6,7 +6,7 @@
 from __future__ import print_function

 import sys
-from itertools import permutations
+from itertools import permutations, product


I suggest importing product on a separate line, before this line.

Why don't you like this import approach?

It is already imported the way I described in the latest merge to master:
https://github.com/numba/numba/blob/master/numba/tests/test_unicode.py#L9-L10

OK, I will make the change, but several names from one module are usually imported on a single line, separated by commas.

Vyacheslav-Smirnov · 2019-10-01T09:51:50Z

numba/tests/test_unicode.py

+    return x.rfind(y, start, end)
+
+
+def rfind_with_start_only_usecase(x, y, start):


Suggested change

-def rfind_with_start_only_usecase(x, y, start):
+def rfind_with_start_usecase(x, y, start):

Why does it make sense?

Vyacheslav-Smirnov · 2019-10-01T09:52:17Z

numba/tests/test_unicode.py

+
+
+def rfind_with_start_only_usecase(x, y, start):
+    return x.rfind(y, start)


Could you please reorder the functions as rfind_usecase -> rfind_with_start_usecase -> rfind_with_start_end_usecase?

Yes I could.

Vyacheslav-Smirnov · 2019-10-01T09:59:39Z

numba/tests/test_unicode.py

+            ('_a' + '\U00100304' * 100, ['_a']),
+            ('_\u0102' + '\U00100304' * 100, ['_\u0102']),
+        ]
+        for s, subs in subs + cpython_subs:


Are you sure that subs does not conflict with subs from line 356?
I suggest renaming it to something else (sub or sub_list, for example).

I agree with the comment and will fix it.
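
To illustrate the naming concern, here is a minimal sketch in plain Python. The sample data is hypothetical and not taken from the PR; only the loop pattern mirrors the reviewed line. The iterable is built once before the loop starts, so iteration works, but the loop target rebinds the outer name.

```python
# Hypothetical data; only the naming pattern mirrors the reviewed loop.
subs = [('abc', ['a', 'bc'])]
cpython_subs = [('xyz', ['z'])]

seen = []
# `subs + cpython_subs` is evaluated once, before the first iteration,
# so the loop visits every pair. The loop target `subs`, however,
# rebinds the outer name on each iteration.
for s, subs in subs + cpython_subs:
    seen.append((s, subs))

# After the loop, `subs` no longer refers to the original list of pairs;
# it now holds the substring list of the last pair.
subs_after_loop = subs
```

Renaming the loop target (for example to sub_list) keeps the original list available after the loop and avoids surprises if it is reused later in the test.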

Vyacheslav-Smirnov · 2019-10-01T10:04:19Z

numba/tests/test_unicode.py

+                for start, end in product(range(-20, 20), range(-20, 20)):
+                    msg = 'Results of interpreted and compiled "{}".rfind("{}", {}, {}) should be equal'
+                    self.assertEqual(pyfunc(s, sub_str, start, end), cfunc(s, sub_str, start, end),
+                                     msg=msg.format(s, sub_str, start, end))


Tests with None indices are missing.

I agree with the comment and will add that.
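
For context, CPython's str.rfind accepts None for start and end and treats it as an omitted bound. One possible way to fold None into the index sweep is sketched below in plain Python. The helper name rfind_cases and the default range are assumptions made for illustration; the expected values come from the interpreter's own str.rfind, which a compiled implementation would be compared against.

```python
from itertools import product


def rfind_cases(s, sub, lo=-3, hi=3):
    """Yield (start, end, expected) triples, including None indices.

    Hypothetical helper: `expected` is computed by CPython's str.rfind.
    """
    indices = [None] + list(range(lo, hi))
    for start, end in product(indices, indices):
        yield start, end, s.rfind(sub, start, end)


# 7 candidate values for each bound give 49 (start, end) combinations.
cases = list(rfind_cases('abcabc', 'bc'))
```

In the test from the PR, each triple would feed an assertEqual between pyfunc and cfunc rather than a precomputed value.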

Vyacheslav-Smirnov · 2019-10-01T12:51:11Z

numba/unicode.py

+@overload_method(types.UnicodeType, 'rfind')
+def unicode_rfind(s, substr):
+    if not isinstance(substr, types.UnicodeType):
+        return None


Shouldn't it raise TypeError if the substring is not a UnicodeType?

I agree, it should.

Vyacheslav-Smirnov · 2019-10-01T13:00:26Z

numba/unicode.py

@@ -531,6 +531,20 @@ def find_impl(a, b):
        return find_impl


+@overload_method(types.UnicodeType, 'rfind')
+def unicode_rfind(s, substr):


The optional start and end arguments are missing.
Their types also need to be checked, as is done for count: https://github.com/numba/numba/blob/master/numba/unicode.py#L903-L909

I will do it.
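
As a reference for the expected behaviour, the following is a pure-Python sketch of str.rfind semantics with optional start and end. It is not the Numba implementation: the name rfind_reference is hypothetical, and the type check is stricter than CPython's, which also accepts any object implementing __index__. The index handling follows the same clamping rules as slicing.

```python
def rfind_reference(s, substr, start=None, end=None):
    """Pure-Python sketch of str.rfind semantics (not the Numba code)."""
    if not isinstance(substr, str):
        raise TypeError('must be str, not ' + type(substr).__name__)
    for name, value in (('start', start), ('end', end)):
        if value is not None and not isinstance(value, int):
            raise TypeError(name + ' must be an integer or None')

    length = len(s)
    # None means "use the default bound".
    if start is None:
        start = 0
    if end is None:
        end = length
    # Clamp out-of-range bounds and wrap negative ones, as slicing does.
    if end > length:
        end = length
    elif end < 0:
        end = max(end + length, 0)
    if start < 0:
        start = max(start + length, 0)

    sub_len = len(substr)
    # No match is possible if the window is shorter than the substring.
    if end - start < sub_len:
        return -1
    # Scan candidate positions from right to left.
    for i in range(end - sub_len, start - 1, -1):
        if s[i:i + sub_len] == substr:
            return i
    return -1
```

A function like this can serve as an oracle when sweeping start and end over negative, positive and None values.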

Vyacheslav-Smirnov · 2019-10-01T13:03:07Z

numba/unicode.py

+        return None
+
+    def rfind_impl(s, substr):
+        # Naive, slow string matching for now


Remove this comment

Vyacheslav-Smirnov · 2019-10-01T13:03:39Z

numba/unicode.py

+    if not isinstance(substr, types.UnicodeType):
+        return None
+
+    def rfind_impl(s, substr):


The optional start and end arguments are missing here as well.

I will add the arguments.

Vyacheslav-Smirnov · 2019-10-01T13:11:19Z

numba/tests/test_unicode.py

+
+        for s in UNICODE_EXAMPLES:
+            subs = ['', 'xx', s[:-2], s[3:], s]
+            for sub_str in subs:


Perhaps use ['', 'xx', s[:-2], s[3:], s] directly in the loop definition?

shssf · 2019-10-02T00:27:32Z

Please squash the commits.

shssf reviewed Sep 30, 2019

Vyacheslav-Smirnov suggested changes Oct 1, 2019

Implement str.rfind()

f32c2db
