If a SQL text file is encoded as ANSI (as opposed to UTF-8 or similar), the newer Go version of sqlcmd will not correctly parse non-ASCII characters.
For example, a file may contain non-breaking spaces (character 160), which T-SQL generally treats identically to a normal space. In ANSI Windows-1252, this character is encoded as the single byte 0xA0.
The Go version of sqlcmd appears to assume all files are UTF encoded: it treats such a character as unknown and replaces it with Unicode character 65533 (U+FFFD, the replacement character). That is consistent with assuming UTF-8, since the single byte 0xA0 is not valid UTF-8.
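To illustrate the point about 0xA0, here is a minimal Go sketch using only the standard library. It is not taken from the sqlcmd source; the helper name `firstInvalidByte` is made up for this example. It only shows how Go's UTF-8 handling reacts to the byte sequence described above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidByte is a hypothetical helper: it returns the offset of the
// first byte that does not start a valid UTF-8 sequence, or -1 if the
// whole input is valid UTF-8.
func firstInvalidByte(data []byte) int {
	for i := 0; i < len(data); {
		r, size := utf8.DecodeRune(data[i:])
		// DecodeRune reports an invalid encoding as (RuneError, 1).
		// A correctly encoded U+FFFD would have size 3 instead.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	// The bytes Notepad writes for SELECT{NBSP}CURRENT_TIMESTAMP
	// when saving as ANSI (Windows-1252).
	data := append([]byte("SELECT"), 0xA0)
	data = append(data, "CURRENT_TIMESTAMP"...)

	fmt.Println("valid UTF-8:", utf8.Valid(data))
	fmt.Println("first invalid offset:", firstInvalidByte(data))

	// Ranging over a string decodes it as UTF-8; each invalid byte
	// is yielded as U+FFFD (decimal 65533).
	for _, r := range string(data) {
		if r == utf8.RuneError {
			fmt.Printf("decoded as U+%04X (%d)\n", r, r)
		}
	}
}
```

A lone 0xA0 is a UTF-8 continuation byte with no lead byte before it, which is why the decoder rejects it.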
The attached file is a simple example txt file saved as ANSI in Windows Notepad, containing "SELECT{Non-breaking-space}CURRENT_TIMESTAMP"
It can be run in sqlcmd with a command like:
sqlcmd -i testfile.txt
The original ODBC version of sqlcmd has no problem running the above file, returning the expected timestamp.
The Go version, however, fails:
"Could not find stored procedure 'SELECT�CURRENT_TIMESTAMP'."
The behavior of the Go sqlcmd should either match the ODBC behavior, or the lack of support for ANSI-encoded text files should be documented as one of the "Breaking changes from sqlcmd (ODBC)".
Thanks for opening the issue. This is related to #111.
ODBC SqlCmd treats non-Unicode/non-UTF8 files as "system code page encoded" and converts them to UTF16 on read using the Win32 API MultiByteToWideChar, at least on Windows. I am not sure what their Linux version does.
There's not much support in the Go dev community for code pages and we encourage folks who develop cloud-first applications that run on Linux etc to use UTF8 or UTF16 encoded files instead of relying on ambient properties like the system code page.
I do want to support the code page conversions but we just haven't had the time to do the work yet. I will update the README appropriately.
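For reference, a rough sketch of what such a fallback could look like in Go, using only the standard library. This is not the sqlcmd implementation. It assumes the fallback code page is always Windows-1252, whereas a real version would have to look up the system code page. The names `decodeWindows1252` and `readScript` are made up for the example, and the bytes Windows-1252 leaves undefined are mapped to U+FFFD here, which may differ from what MultiByteToWideChar does.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cp1252High maps bytes 0x80-0x9F to their Windows-1252 code points.
// The five bytes that Windows-1252 leaves undefined are mapped to U+FFFD
// (a choice made for this sketch).
var cp1252High = [32]rune{
	0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
	0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
	0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
	0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178,
}

// decodeWindows1252 converts Windows-1252 bytes to a Go (UTF-8) string.
// Outside 0x80-0x9F, each byte value equals its Unicode code point.
func decodeWindows1252(data []byte) string {
	runes := make([]rune, 0, len(data))
	for _, b := range data {
		if b >= 0x80 && b <= 0x9F {
			runes = append(runes, cp1252High[b-0x80])
		} else {
			runes = append(runes, rune(b))
		}
	}
	return string(runes)
}

// readScript keeps valid UTF-8 input as-is and otherwise falls back to
// the assumed code page. UTF-16 and BOM detection are left out.
func readScript(data []byte) string {
	if utf8.Valid(data) {
		return string(data)
	}
	return decodeWindows1252(data)
}

func main() {
	// SELECT{NBSP}CURRENT_TIMESTAMP as saved by Notepad in ANSI.
	data := append([]byte("SELECT"), 0xA0)
	data = append(data, "CURRENT_TIMESTAMP"...)

	// %+q escapes non-ASCII characters so the NBSP is visible.
	fmt.Printf("%+q\n", readScript(data))
}
```

The golang.org/x/text/encoding/charmap package provides ready-made decoders for this and other code pages, so the hand-written table is only there to keep the sketch dependency-free. The "valid UTF-8, else code page" check is a heuristic: a code page file whose bytes happen to form valid UTF-8 would be read as UTF-8.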
this content is relevant for ODBC SqlCmd on Linux and may guide our implementation.
I don't know offhand what the Go method to detect "current locale" is.