I realize that dealing with Python 2-to-3 unicode issues was painful, but we ought to know whether we're dealing with a bytes or a str based on the context and avoid using smart_str() as a cure-all. Instead, let's use encode('utf-8') or decode('utf-8')—or nothing at all—where possible.
Consider this example: we can avoid smart_str(f.read()) by opening the file with the default mode of r instead of rb (line 167).
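A minimal sketch of the difference between the two modes, using a throwaway temporary file as a stand-in for the test fixture (the file name and contents are made up for illustration):

```python
import os
import tempfile

# Hypothetical stand-in for the XML fixture that the test reads.
with tempfile.NamedTemporaryFile(
    'w', suffix='.xml', delete=False, encoding='utf-8'
) as tmp:
    tmp.write('<data>café</data>')
    path = tmp.name

# Binary mode hands back bytes, which then need decoding (or smart_str()).
with open(path, 'rb') as f:
    raw = f.read()

# Text mode ('r' is the default) hands back a str directly.
with open(path, encoding='utf-8') as f:
    text = f.read()

os.remove(path)

print(type(raw).__name__, type(text).__name__)
```

Passing encoding='utf-8' explicitly is a choice made for this sketch; without it, text mode falls back to the platform's default encoding.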
The content of a Django HttpResponse is always a bytes, but in this case (line 178), the data attribute of a Django REST Framework Response is "The unrendered, serialized data of the response." Let's look to see how this is generated, especially since we're ending up with "b'Received empty submission", which looks like str() or repr() was called on a bytes instead of decode()-ing it.
The create() method of the XFormSubmissionApi viewset bypasses the serializer and goes directly to error_response() when something bad happens.
Here we have two more calls to smart_str(). It seems like error could be a variety of things, but if we're assuming it has a content attribute (line 214), then it's probably an instance of HttpResponse or its subclass. In this case, it is an OpenRosaResponseBadRequest. Since HttpResponse.content is always a bytes, we can unambiguously call decode('utf-8') on it instead of smart_str().
On line 217, we actually know that error_msg is one of two things: _("Unable to create submission."), which is an instance of str, or the first matching group from xml_error_re.search(). Since xml_error_re is a string pattern, the argument to xml_error_re.search() cannot be anything but a str, and everything it returns will be a str as well. Demonstration:
>>> import re
>>> re.compile('fun').search(b'funny')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot use a string pattern on a bytes-like object
In all cases, error_msg is a str, and smart_str() is unnecessary.
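The other half of that claim, that the groups of a string pattern come back as str, can be checked the same way. This is only a sketch: the pattern and the response body below are hypothetical, and the real xml_error_re in kobocat may look different.

```python
import re

# Hypothetical pattern; the real xml_error_re may differ.
xml_error_re = re.compile(r'<message[^>]*>(.*)</message>', re.DOTALL)

# Made-up response body, already decoded to str.
body = (
    '<OpenRosaResponse>'
    '<message nature="">Received empty submission</message>'
    '</OpenRosaResponse>'
)

# A str pattern searched against a str yields str groups.
match = xml_error_re.search(body)
error_msg = match.group(1)
print(type(error_msg).__name__, error_msg)

# A bytes pattern, by contrast, only accepts bytes and yields bytes groups.
bytes_re = re.compile(rb'<message[^>]*>(.*)</message>', re.DOTALL)
bytes_match = bytes_re.search(body.encode('utf-8'))
print(type(bytes_match.group(1)).__name__)
```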
This still doesn't explain where the weird "b'Received empty submission" comes from. For that, we have to turn to OpenRosaResponse.
This is quite naughty, because HttpResponse.content should always be a bytes. The superclass does its part to ensure that, but then we run the bytes that it gave us through string interpolation with %s, which "converts any Python object using str()". Demonstration:
>>> '''wow %s''' % b'yeah'
"wow b'yeah'"
What's the best way to proceed here? We could self.content.decode('utf-8'), do the string interpolation, and then re-encode the whole result, but that's yucky. This seems like a job for plain ol' concatenation: after all, bytes are just "immutable sequences of single bytes". Something like this would be fine:
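A minimal sketch of the concatenation idea, assuming a hypothetical wrap_message() helper and made-up XML wrapper fragments; the actual markup in OpenRosaResponse may differ:

```python
# Hypothetical wrapper fragments; the real XML in OpenRosaResponse may differ.
PREFIX = b'<OpenRosaResponse><message>'
SUFFIX = b'</message></OpenRosaResponse>'


def wrap_message(content: bytes) -> bytes:
    """Wrap already-encoded content in XML without ever leaving bytes."""
    return PREFIX + content + SUFFIX


# Stands in for the bytes that HttpResponse.content gives back.
content = b'Received empty submission'

wrapped = wrap_message(content)

# The decode / interpolate / re-encode round trip produces the same bytes.
round_trip = (
    '<OpenRosaResponse><message>%s</message></OpenRosaResponse>'
    % content.decode('utf-8')
).encode('utf-8')

# Interpolating the bytes straight into a str is what leaves the stray b'...'.
broken = '<message>%s</message>' % content

print(wrapped)
print(broken)
```

Both correct variants yield identical bytes, so concatenation gets the same result with no decode and re-encode step.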
Code referenced above, all at commit 67cdfa5:
kobocat/onadata/apps/api/tests/viewsets/test_xform_submission_api.py, lines 166 to 179 (the test that calls smart_str(f.read()))
kobocat/onadata/apps/api/viewsets/xform_submission_api.py, lines 204 to 219 (error_response())
kobocat/onadata/libs/utils/logger_tools.py, lines 555 to 564 (OpenRosaResponse)