Random Segmentation Faults #305

Open
JCoder01 opened this issue Apr 17, 2021 · 6 comments

@JCoder01

JCoder01 commented Apr 17, 2021

I'm reading from MS SQL Server using the Microsoft ODBC driver, version 17.
I run in a container based on the miniconda3 image and build a conda env with:
python 3.7
turbodbc 4.1.2 (conda-forge)
pyarrow 3.0 (conda-forge)

I read the data out of the cursor using cursor.fetchallnumpy(). Randomly, I see a segfault. I tried building turbodbc from source rather than pulling the package from conda-forge and got the same result. I can run the same query 50 times in a loop and it crashes only occasionally. Enabling the fault handler points to cursor.close(), where self.impl._reset() is called.
Any ideas on how to further troubleshoot?
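
For anyone trying to reproduce this, a minimal sketch of how the fault handler can be switched on from Python's standard library. The helper name `enable_crash_tracebacks` is made up for illustration; only the `faulthandler` calls are real stdlib API.

```python
import faulthandler
import sys


def enable_crash_tracebacks(file=None):
    """Dump the Python traceback of every thread when a fatal signal arrives."""
    # `file` needs a real file descriptor (fileno()); fall back to stderr.
    target = file if file is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this once at server start-up should be equivalent to running with `python -X faulthandler` or setting `PYTHONFAULTHANDLER=1`. It only reports the Python frames; the native frames inside the extension module are not shown.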

@xhochy
Collaborator

xhochy commented Apr 17, 2021

Can you please post the traceback?

@JCoder01
Author

JCoder01 commented Apr 18, 2021

Sure, there isn't much to it though. A little bit of added information is that this happens as part of an Arrow Flight server, so I'm not sure if the threading that takes place inside the flight server is in some way causing a problem with the threads being used by Turbodbc.

flight-server_3  | Thread 0x00007f3085b0e700 (most recent call first):
flight-server_3  |   File "/opt/conda/envs/arrow-flight/lib/python3.7/site-packages/turbodbc/cursor.py", line 380 in close
flight-server_3  |   File "/usr/local/flight_server.py", line 153 in _turbodbc
flight-server_3  |   File "/usr/local/flight_server.py", line 64 in do_get

@JCoder01
Author

Originally, I had use_async_io=True in the connection_options. If I remove that, the segfaults go away. I'll try to test turbodbc outside the Flight server, but it increasingly looks to me like the combination of the two packages is causing the issue.

@JCoder01
Author

Yes, things run fine outside of Flight, so it is related to the interaction between the two libraries. I'm at a bit of a loss as to how to troubleshoot further, though.
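
One possible next step, sketched under the assumption that the crash depends on the query running on a non-main thread (as it would inside a Flight handler): call the same function from a small thread pool, without Flight involved. `stress` and `task` are hypothetical names; `task` would be whatever wraps the turbodbc query.

```python
from concurrent.futures import ThreadPoolExecutor


def stress(task, runs=50, workers=4):
    """Call task(i) `runs` times from `workers` threads; return (results, errors)."""
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task, i) for i in range(runs)]
        for future in futures:
            try:
                results.append(future.result())
            except Exception as exc:
                # Only Python-level failures land here; a segfault kills the process.
                errors.append(exc)
    return results, errors
```

If this crashes with use_async_io=True and not with it removed, Flight is probably not needed to trigger the problem, which would make for a much smaller reproducer.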

@xhochy
Collaborator

xhochy commented Apr 21, 2021

I'm a bit confused about the /usr/local part in /usr/local/flight_server.py. Is that your code? Otherwise this would be spooky as all dependencies should be part of the conda environment.

@JCoder01
Author

Yes, everything in /usr/local/flight_server.py is my code. _turbodbc is a method there:

def _turbodbc(self, connection_string, query, params):
        from turbodbc import connect, make_options
        from turbodbc.exceptions import DatabaseError
        connected = False
        connection_options = make_options(
            use_async_io=True,
            varchar_max_character_limit=2000000
        )
        try:
            conn = connect(
                turbodbc_options=connection_options, connection_string=connection_string
            )
        except DatabaseError:
            time.sleep(2)
            conn = connect(
                turbodbc_options=connection_options, connection_string=connection_string
            )
        is_committed = False
        data = None
        cursor = conn.cursor()
        try:
            cursor.execute(query, parameters=params)
            if cursor.description:
                self.logger.debug(query)
                data = cursor.fetchallnumpy()
            else:
                self.logger.debug('empty result set')
            cursor.close()
            conn.commit()
            is_committed = True
        except Exception as e:
            print(e)
            print(query)
            cursor.close()
            if not is_committed:
                conn.rollback()
            raise
        finally:
            conn.close()
        return pa.Table.from_pydict(data or {})
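
While troubleshooting, it may help to rule out the double cursor.close() path in the method above (close() is called in the try block and again in the except block). Below is a sketch of the same flow with each close() guaranteed to run exactly once. It is not claimed to fix the segfault. `run_query` and `connect_fn` are hypothetical names, and the code assumes only the cursor/connection methods already used above.

```python
from contextlib import closing


def run_query(connect_fn, query, params=None):
    """Run one query; return cursor.fetchallnumpy(), or None if no result set."""
    with closing(connect_fn()) as conn:  # conn.close() runs exactly once
        try:
            with closing(conn.cursor()) as cursor:  # cursor.close() runs once
                cursor.execute(query, parameters=params)
                data = cursor.fetchallnumpy() if cursor.description else None
            conn.commit()
        except Exception:
            # Roll back on any failure, then let the caller see the error.
            conn.rollback()
            raise
    return data
```

With turbodbc, `connect_fn` could be something like `lambda: connect(turbodbc_options=connection_options, connection_string=connection_string)`, reusing the names from the method above.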
