Character encoding problem #7

Open
thomaz-yuji opened this issue Oct 20, 2023 · 3 comments

Comments

thomaz-yuji commented Oct 20, 2023

While trying to use the interbase Python package, I ran into errors related to character encoding, probably because the data is stored as Latin-1. The error below is just one of many:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 23: invalid continuation byte

In Latin-1, 0xC7 is the Ç character (capital letter C with cedilla).
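
A quick way to check this reading with only the standard library is sketched below. The sample bytes are made up for illustration (they are not taken from the database), but they reproduce the same kind of failure: a 0xC7 byte followed by a non-continuation byte is rejected by the UTF-8 codec, while the same bytes decode cleanly as Latin-1.

```python
# Hypothetical sample: the word "AÇÃO" as it would be stored in a Latin-1 column.
raw = b"A\xc7\xc3O"

# Decoding as UTF-8 fails because 0xC7 announces a two-byte sequence,
# but the next byte (0xC3) is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print(exc)

# Decoding as Latin-1 maps every byte to exactly one character.
text = raw.decode("latin-1")
print(text)
```

Note that `latin-1` is Python's codec name; the matching InterBase character set name is `ISO8859_1`.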

Also, some other errors:

  File "C:\dev\.venv\Lib\site-packages\pandas\io\sql.py", line 2079, in read_query
    columns = [col_desc[0] for col_desc in cursor.description]
                                           ^^^^^^^^^^^^^^^^^^
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3290, in __get_description
    return self._ps.description
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 2202, in __get_description
    precision = (self.cursor._connection._determine_field_precision(sqlvar))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 1096, in _determine_field_precision
    self.__ic.execute("SELECT FIELD_SPEC.RDB$FIELD_PRECISION"
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3411, in execute
    self._ps._execute(parameters)
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3130, in _execute
    raise exception_from_status(DatabaseError, self._isc_status,
interbase.ibcore.DatabaseError: ("Error while executing SQL statement:\n- SQLCODE: -804\n- b'Dynamic SQL Error'\n- b'SQL error code = -804'\n- b'Incorrect values within SQLDA structure'", -804, 335544569)
Exception ignored in: <function Connection.__del__ at 0x00000181BCEB3380>
Traceback (most recent call last):
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 1638, in __del__
    self.__close()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 975, in __close
    self.__ic.close()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3376, in close
    self._ps.close()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3205, in close
    self._free_handle()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3061, in _free_handle
    raise exception_from_status(DatabaseError, self._isc_status,
interbase.ibcore.DatabaseError: ("Error while releasing SQL statement handle:\n- SQLCODE: -501\n- b'Dynamic SQL Error'\n- b'SQL error code = -501'\n- b'Attempt to reclose a closed cursor'", -501, 335544569)
Exception ignored in: <function PreparedStatement.__del__ at 0x00000181BCEBD440>

Is that one probably caused by an incompatibility between interbase and pandas' read_sql?

My code was based on pyodbc, and now I'm trying to switch to interbase.

Any hints?

wuarmin commented Jan 30, 2024

I have a similar issue. If I set the connection charset to "ISO8859_1", which is correct in my case:

        self.conn = interbase.connect(
            host=host,
            user=username,
            password=passwd,
            charset="ISO8859_1",
            database="c:/dbs/mydb.ib",
            ib_library_name="/opt/interbase/lib/libgds.so"
        )

I get:

interbase.ibcore.DatabaseError: ("Cursor.fetchone:\n- SQLCODE: -802\n- b'arithmetic exception, numeric overflow, or string truncation'\n- b'Cannot transliterate character between character sets'", -802, 335544321)

If I set the charset to None, it fails on line 365 in ibcore.py:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 18: invalid continuation byte

because the string is encoded in ISO8859_1, not UTF-8.

For now I have to use the following workaround:

        self.conn = interbase.connect(
            host=host,
            user=username,
            password=passwd,
            charset="ISO8859_1",
            database="c:/dbs/mydb.ib",
            ib_library_name="/opt/interbase/lib/libgds.so"
        )

        # monkey patch interbase to return bytes instead of strings
        def b2u(st, charset):
            "Decode to unicode if charset is defined. For conversion of result set data."
            return st
        interbase.ibcore.b2u = b2u

and decode the column values on my side.
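
The "decode on my side" step could look roughly like the sketch below. It assumes the patched driver now hands back raw `bytes` for text columns; `decode_row` is a hypothetical helper written for this illustration, not part of interbase, and the sample row is made up.

```python
from typing import Any, Sequence, Tuple


def decode_row(row: Sequence[Any], encoding: str = "latin-1") -> Tuple[Any, ...]:
    """Decode every bytes value in a fetched row; leave other types untouched."""
    return tuple(
        value.decode(encoding) if isinstance(value, (bytes, bytearray)) else value
        for value in row
    )


# Hypothetical row, shaped like what a fetch might return after the monkey patch.
sample_row = (1, b"M\xe4rz", None, 3.5)
decoded = decode_row(sample_row)
print(decoded)
```

With a real cursor this would presumably be applied per row, for example `[decode_row(r) for r in cursor.fetchall()]`. Python's `latin-1` codec corresponds to the `ISO8859_1` charset name used in the connection.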

Hey 👋 @lmbelo, maybe you can help with this?
Thanks

Member

lmbelo commented Apr 11, 2025

Hello @thomaz-yuji and @wuarmin, is this still an issue?

wuarmin commented Apr 12, 2025

Hey, my workaround is still active
