Add support for specific fixed encoding in ANSI functions #45
Comments
Just to note, one such ODBC driver that always uses UTF-8 char* for ANSI ODBC functions is
SQLite works internally in UTF-8, so the preferred way to use it is via the ANSI ODBC API. Using the Unicode ODBC API means working in UTF-16 mode. The preferred Unicode encoding on Linux is UTF-8 (and always was), so most applications use UTF-8. So your suggestion is basically to convert strings from the application's native UTF-8 encoding to UTF-16, then pass the UTF-16 strings to the iODBC manager via the Unicode API, which passes them to the SQLite ODBC driver via the Unicode API. The SQLite ODBC driver's Unicode API then converts the UTF-16 string back to UTF-8 and passes it to the driver's ANSI API, which passes it to the SQLite database implementation. So there are two useless conversions involved: UTF-8 --> UTF-16 and UTF-16 --> UTF-8. I think it is always better to pass the UTF-8 string directly and avoid useless conversions at different layers.
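To make the cost of that round trip concrete, here is a minimal sketch of the two conversions. The helper names (utf8_to_utf16, utf16_to_utf8, utf8_roundtrip_is_identity) are hypothetical and are not taken from iODBC or the SQLite ODBC driver; the code assumes well-formed UTF-8 input and does no validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode well-formed UTF-8 into UTF-16 code units.
 * Returns the number of units written, not counting the terminator. */
size_t utf8_to_utf16(const char *in, uint16_t *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;

    assert(in != NULL && out != NULL && cap > 0);
    while (*s != 0) {
        uint32_t cp;
        int extra;

        /* The lead byte determines how many continuation bytes follow. */
        if (*s < 0x80)      { cp = *s;        extra = 0; }
        else if (*s < 0xE0) { cp = *s & 0x1F; extra = 1; }
        else if (*s < 0xF0) { cp = *s & 0x0F; extra = 2; }
        else                { cp = *s & 0x07; extra = 3; }
        s++;
        while (extra-- > 0 && *s != 0)
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);

        if (cp >= 0x10000) {            /* needs a surrogate pair */
            if (n + 2 >= cap) break;    /* keep room for the terminator */
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap) break;
            out[n++] = (uint16_t)cp;
        }
    }
    out[n] = 0;
    return n;
}

/* Hypothetical helper: encode UTF-16 code units back into UTF-8.
 * Returns the number of bytes written, not counting the terminator. */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && cap > 0);
    while (*in != 0) {
        uint32_t cp = *in++;
        size_t len;

        /* Combine a high surrogate with the low surrogate that follows. */
        if (cp >= 0xD800 && cp < 0xDC00 && *in >= 0xDC00 && *in < 0xE000)
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);

        len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + len >= cap) break;      /* keep room for the terminator */

        if (len == 1) {
            out[n++] = (char)cp;
        } else if (len == 2) {
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

/* Returns 1 if UTF-8 -> UTF-16 -> UTF-8 reproduces the input bytes.
 * Inputs that do not fit the fixed buffers are truncated and return 0. */
int utf8_roundtrip_is_identity(const char *s)
{
    uint16_t wide[256];
    char back[1024];

    utf8_to_utf16(s, wide, sizeof wide / sizeof wide[0]);
    utf16_to_utf8(wide, back, sizeof back);
    return strcmp(s, back) == 0;
}
```

For well-formed input that fits the buffers, the bytes that come out are the bytes that went in, so both passes are pure overhead whenever the application and the driver already agree on UTF-8.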
I understand that adding another encoding library and using it may not be simple. That is why I opened this feature request: it would be nice to avoid re-encoding when it is not needed.
I quickly looked at this code and, if I understood correctly, the Unicode API has a switch to supply UTF-8 strings via SQLWCHAR*. But this is not widely supported. Most ODBC drivers expect either UTF-16 or UTF-32 buffers in the Unicode SQLWCHAR* API, not UTF-8. This is also because UTF-8 strings are null-terminated strings stored in the char* type, which in most cases maps to the ANSI API on Unix systems.
Anyway, SQLite ODBC is not the only driver that works in this mode. I mentioned it as a good example: most developers know it, it can be easily tested (to check how it works), and it is open source, so anybody can check how it is really implemented. But I have another example of an ODBC driver that falls into this fixed-encoding category: the Vertica ODBC driver. Vertica is a commercial proprietary database, and its ODBC driver also ignores locale settings (*). So, for example, when the locale is set to Latin1, it still expects the ODBC manager to pass UTF-8 strings. Because it is proprietary, this behavior cannot be changed, and it is not even documented. ODBC is the only way to use Vertica from a C/C++ application.
(*) One exception: when the locale is set to 7-bit ASCII ("C" or "POSIX"), the driver respects it and works only in 7-bit mode.
After this, the iODBC driver manager will convert all Unicode data between the App Unicode CodePage and the Driver Unicode CodePage, so you could use UTF-8 Unicode calls (for example) with all Unicode ODBC drivers: UTF-8, UTF-16, UCS-4, etc.
@pali
It is really not a good idea to call
To all functions that do conversion, iODBC already passes a structure. Would it be really hard to extend this? But this is just my conclusion from inspecting the current iODBC code.
I see that some ODBC drivers support So e.g.
Currently, ANSI functions use the char* type for passing string arguments. On Linux builds, a char* value is interpreted as encoded according to the current locale settings, more precisely according to what was passed to the setlocale(LC_CTYPE, ...) call. By default, when the application does not call setlocale at all, 7-bit ASCII is configured as the current locale and environment variables are ignored.
But some ODBC drivers expect char* values in ANSI functions to always be encoded in UTF-8, independently of the current locale settings (setlocale()).
So it would be nice if the iODBC manager provided an API to set an explicit encoding that would be used for any conversion from char* to SQLWCHAR* and vice versa. This would give better support for drivers that expect a fixed encoding (e.g. UTF-8) in ANSI functions.