session.verify got overwritten by env var #35

Open
rizhansas opened this issue Jun 26, 2024 · 0 comments

Problem:

In our Airflow worker pod, we set the environment variable REQUESTS_CA_BUNDLE. As a result, the SAS Studio Flow operator fails to honor the extra field of the Airflow Connection, {"ssl_certificate_verification": false}, which is supposed to skip certificate verification.

As the log below shows, the hook confirms that TLS verification is turned off and even obtains the access token from SAS Logon (Get oauth token). It then fails when talking to the SAS Studio REST endpoint.

[2024-06-26, 17:08:58 UTC] {sas.py:52} INFO - TLS verification is turned off
[2024-06-26, 17:08:58 UTC] {sas.py:62} INFO - Creating session for connection named sas_default to host https://d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com/
[2024-06-26, 17:08:58 UTC] {sas.py:82} INFO - Get oauth token (see README if this crashes)
[2024-06-26, 17:08:59 UTC] {sas_studioflow.py:90} INFO - Generate code for Studio Flow: /Users/miadmin/TestFlow.flw
[2024-06-26, 17:08:59 UTC] {logging_mixin.py:188} INFO - Code Generation for Studio Flow without Compute session
[2024-06-26, 17:08:59 UTC] {taskinstance.py:441} ▼ Post task execution logs
[2024-06-26, 17:08:59 UTC] {taskinstance.py:2905} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1060, in _validate_conn
    conn.connect()
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib64/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib64/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/lib64/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/sas/.local/lib/python3.8/site-packages/requests/adapters.py", line 564, in send
    resp = conn.urlopen(
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 801, in urlopen
    retries = retries.increment(
  File "/home/sas/.local/lib/python3.8/site-packages/urllib3/util/retry.py", line 594, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/operators/sas_studioflow.py", line 91, in execute
    code = _generate_flow_code(
  File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/operators/sas_studioflow.py", line 199, in _generate_flow_code
    response = session.post(uri, json=req)
  File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/hooks/sas.py", line 112, in <lambda>
    session.post = lambda *args, **kwargs: requests.Session.post(  # type: ignore
  File "/home/sas/.local/lib/python3.8/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "/home/sas/.local/lib/python3.8/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/sas/.local/lib/python3.8/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/sas/.local/lib/python3.8/site-packages/requests/adapters.py", line 595, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/sas/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
  File "/home/sas/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
    return execute_callable(context=context, **execute_callable_kwargs)
  File "/home/sas/.local/lib/python3.8/site-packages/airflow/models/baseoperator.py", line 400, in wrapper
    return func(self, *args, **kwargs)
  File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/operators/sas_studioflow.py", line 124, in execute
    raise AirflowException(f"SASStudioFlowOperator error: {str(e)}")
airflow.exceptions.AirflowException: SASStudioFlowOperator error: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
[2024-06-26, 17:08:59 UTC] {taskinstance.py:1206} INFO - Marking task as FAILED. dag_id=MySASStudioFlowOperatorDAG, task_id=sas_studio_test_flow, run_id=manual__2024-06-26T17:08:55.695486+00:00, execution_date=20240626T170855, start_date=20240626T170858, end_date=20240626T170859
[2024-06-26, 17:08:59 UTC] {standard_task_runner.py:110} ERROR - Failed to execute job 6 for task sas_studio_test_flow (SASStudioFlowOperator error: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)'))); 14161)
[2024-06-26, 17:08:59 UTC] {local_task_job_runner.py:240} INFO - Task exited with return code 1
[2024-06-26, 17:08:59 UTC] {taskinstance.py:3498} INFO - 0 downstream tasks scheduled from follow-on schedule check
[2024-06-26, 17:08:59 UTC] {local_task_job_runner.py:222} ▲▲▲ Log group end

Root Cause

The 1st REST call explicitly passes the boolean value verify to the requests.post function, so it works as expected.

response = requests.post(
    f"{self.host}/SASLogon/oauth/token",
    data=payload,
    verify=self.cert_verify,
    headers=my_headers,
    timeout=http_timeout
)

The 2nd REST call does not pass verify to the requests.* function; it relies on Session.verify instead.

# set to false if using self-signed certs
session.verify = self.cert_verify
# prepend the root url for all operations on the session, so that consumers can just provide
# resource uri without the protocol and host
root_url = self.host
session.get = lambda *args, **kwargs: requests.Session.get(  # type: ignore
    session, urllib.parse.urljoin(root_url, args[0]), *args[1:], **kwargs
)
session.post = lambda *args, **kwargs: requests.Session.post(  # type: ignore
    session, urllib.parse.urljoin(root_url, args[0]), *args[1:], **kwargs
)
session.put = lambda *args, **kwargs: requests.Session.put(  # type: ignore
    session, urllib.parse.urljoin(root_url, args[0]), *args[1:], **kwargs
)
session.delete = lambda *args, **kwargs: requests.Session.delete(  # type: ignore
    session, urllib.parse.urljoin(root_url, args[0]), *args[1:], **kwargs
)
return session
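
The difference between the two calls can be reproduced without a network connection. The sketch below is not taken from the provider; it calls Session.merge_environment_settings from the requests library directly, which is the step where the environment is merged with the session settings. The CA bundle path and the host sas.example.com are placeholders chosen for illustration.

```python
import os

import requests

# Assumption: a placeholder path; the file does not need to exist for this check.
os.environ["REQUESTS_CA_BUNDLE"] = "/tmp/ca-bundle.pem"

session = requests.Session()
# Mirrors what the hook does when ssl_certificate_verification is false.
session.verify = False

# Hypothetical host, used only to build a URL; no request is sent.
url = "https://sas.example.com/studioDevelopment/code"

# verify not passed on the call (None): the env var is picked up and
# wins over session.verify during the merge.
implicit = session.merge_environment_settings(url, {}, None, None, None)

# verify=False passed on the call: the env var is not consulted.
explicit = session.merge_environment_settings(url, {}, None, False, None)

print("verify omitted:", implicit["verify"])
print("verify=False  :", explicit["verify"])
```

With the environment variable set, the first result should carry the bundle path even though session.verify is False, while the second stays False. That matches the log above: the token call succeeds and the Studio call fails certificate verification.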

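This is a long-standing issue in the Python requests library: when REQUESTS_CA_BUNDLE is set and verify is not passed on the individual call, the environment variable takes precedence over Session.verify. It has been known for years, yet people still waste hours on this overwrite. It would be better to fix it in our code, or at least make the two REST calls behave consistently (either both fail or both succeed).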
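
One possible way to make the second call consistent with the first is sketched below. It is only an illustration, not the provider's actual code: RootUrlSession is a hypothetical class name, and sas.example.com is a placeholder host. The idea is to replace the lambda wrappers with a Session subclass that prepends the root URL and passes verify on every request, so the environment variable is not consulted when verification is disabled.

```python
import urllib.parse

import requests


class RootUrlSession(requests.Session):
    """Hypothetical replacement for the lambda wrappers in the hook."""

    def __init__(self, root_url, cert_verify):
        super().__init__()
        self.root_url = root_url
        # set to false if using self-signed certs
        self.verify = cert_verify

    def request(self, method, url, **kwargs):
        # Pass verify on every call unless the caller already supplied it.
        # An explicit False is kept as-is by requests, whereas an omitted
        # value would be replaced by REQUESTS_CA_BUNDLE.
        kwargs.setdefault("verify", self.verify)
        # Prepend the root URL so callers can pass a bare resource URI.
        full_url = urllib.parse.urljoin(self.root_url, url)
        return super().request(method, full_url, **kwargs)


# Assumed usage; get, post, put and delete all route through request().
session = RootUrlSession("https://sas.example.com/", cert_verify=False)
# response = session.post("/studioDevelopment/code", json=req)
```

Setting session.trust_env = False would also stop requests from reading REQUESTS_CA_BUNDLE, but it disables proxy environment variables and .netrc lookup as well, so passing verify explicitly seems the narrower change.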
