The string detection metrics are a little off in general.
The functions in str_search.c use several metrics that are not described any further, and some of them are definitely not correct (although they probably work in the context they are used in).
And rz_str_guess_encoding_from_buffer doesn't check for ibm037 or non-Unicode encodings, and it has the problem mentioned above.
To add to this: string encoding detection is also a nice fit for the knowledge base, since different encodings have overlapping characters. It would be nice to be able to switch seamlessly, define the expected encoding once, or detect the expected encoding according to some statistics. But I think this is a single module on top of the knowledge base.
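A rough sketch of the "define once, otherwise detect from statistics" idea, to make the suggestion concrete. Everything here is hypothetical: the `Encoding` enum, `guess_by_stats` and `choose_encoding` are illustrative names and not rizin's API, and the NUL-position statistic is only one naive heuristic among many possible ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding set, not rizin's own enum. */
typedef enum { ENC_GUESS, ENC_ASCII, ENC_UTF16LE, ENC_UTF16BE, ENC_OTHER } Encoding;

/* Naive statistic: count NUL bytes at even/odd offsets and bytes >= 0x80. */
static Encoding guess_by_stats(const uint8_t *buf, size_t len) {
	assert(buf != NULL || len == 0);
	size_t zero_even = 0, zero_odd = 0, high = 0;
	for (size_t i = 0; i < len; i++) {
		if (buf[i] == 0) {
			if (i % 2 == 0) {
				zero_even++;
			} else {
				zero_odd++;
			}
		} else if (buf[i] >= 0x80) {
			high++;
		}
	}
	size_t pairs = len / 2;
	/* Mostly-Latin UTF-16 text has a NUL in every other byte. */
	if (pairs > 0 && zero_even == 0 && zero_odd * 2 > pairs) {
		return ENC_UTF16LE;
	}
	if (pairs > 0 && zero_odd == 0 && zero_even * 2 > pairs) {
		return ENC_UTF16BE;
	}
	if (high == 0 && zero_even + zero_odd == 0) {
		return ENC_ASCII;
	}
	/* Anything else needs a real decoder (UTF-8, ibm037, ...). */
	return ENC_OTHER;
}

/* An encoding defined once by the user wins over any guess. */
static Encoding choose_encoding(Encoding expected, const uint8_t *buf, size_t len) {
	return expected != ENC_GUESS ? expected : guess_by_stats(buf, len);
}
```

With this shape, the guessing statistic can be swapped out without touching callers, and a user who knows the encoding never hits the guessing path at all.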
Work environment
rizin 0.8.0 @ linux-x86-64
commit: 73d85d2
Expected behavior
Detect and display the string (hex f0 9f 9f aa f0 9f 9f aa 00, decoded 🟪🟪) as UTF8
Actual behavior
The string is detected as UTF16BE. It would also be parsed incorrectly if it actually were UTF16, but that is a separate bug.
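For reference, a minimal sketch of a strict UTF-8 well-formedness check that accepts the bytes from this report. `is_valid_utf8` is a hypothetical helper written for illustration; it is not what str_search.c or rz_str_guess_encoding_from_buffer currently do. The point is only that these bytes pass a strict UTF-8 check, so a detector could reasonably try UTF-8 before falling back to UTF-16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if buf[0..len) is well-formed UTF-8. */
static bool is_valid_utf8(const uint8_t *buf, size_t len) {
	assert(buf != NULL || len == 0);
	size_t i = 0;
	while (i < len) {
		uint8_t b = buf[i];
		size_t n;     /* number of continuation bytes expected */
		uint32_t cp;  /* code point being decoded */
		uint32_t min; /* smallest code point allowed for this length */
		if (b < 0x80) {
			i++; /* plain ASCII byte (a NUL terminator also lands here) */
			continue;
		} else if ((b & 0xE0) == 0xC0) {
			n = 1; cp = b & 0x1F; min = 0x80;
		} else if ((b & 0xF0) == 0xE0) {
			n = 2; cp = b & 0x0F; min = 0x800;
		} else if ((b & 0xF8) == 0xF0) {
			n = 3; cp = b & 0x07; min = 0x10000;
		} else {
			return false; /* stray continuation byte or invalid lead byte */
		}
		if (len - i - 1 < n) {
			return false; /* sequence truncated by end of buffer */
		}
		for (size_t k = 1; k <= n; k++) {
			if ((buf[i + k] & 0xC0) != 0x80) {
				return false; /* not a continuation byte */
			}
			cp = (cp << 6) | (buf[i + k] & 0x3F);
		}
		/* Reject overlong forms, surrogates, and values past U+10FFFF. */
		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
			return false;
		}
		i += n + 1;
	}
	return true;
}
```

Each f0 9f 9f aa group decodes to U+1F7EA, which is within the valid range for a four-byte sequence, so the whole buffer is accepted.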
Steps to reproduce the behavior
ELF AMD64