Fix dcp-diag tool ingest authentication #372

Closed
aaclan-ebi opened this issue Mar 28, 2019 · 8 comments
Labels
  • bug: Something isn't working
  • emerged: We couldn't predict this would emerge during the sprint and it needed immediate work
  • t-shirt medium: A task that can't have a numeric estimate but is assessed as medium

Comments

@aaclan-ebi
Collaborator

aaclan-ebi commented Mar 28, 2019

https://humancellatlas.slack.com/archives/C6YTC4NJW/p1553733826258300

Steps To Reproduce:

  1. install dcp-diag tool
  2. analyze-submission -d prod 5bfe4d3a9460a300074eebc8

Actual Behavior:

The analyze-submission feature in the dcp-diag tool returns the following error:

PHASE 1: Get submission primary bundle list from Ingest:
	Retrieving submission...Traceback (most recent call last):
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/bin/analyze-submission", line 576, in <module>
    AnalyzeSubmission()
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/bin/analyze-submission", line 501, in __init__
    self._get_submission_project_and_primary_bundle_list_from_ingest()
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/bin/analyze-submission", line 548, in _get_submission_project_and_primary_bundle_list_from_ingest
    finder = Finder.factory(finder_name="ingest", deployment=self.deployment)
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/lib/python3.6/site-packages/dcp_diag/finders/finder.py", line 13, in factory
    return finder(deployment=deployment, **args)
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/lib/python3.6/site-packages/dcp_diag/finders/ingest_finder.py", line 14, in __init__
    self.ingest = IngestApiAgent(deployment=deployment)
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/lib/python3.6/site-packages/dcp_diag/component_agents/ingest_api_agent.py", line 10, in __init__
    auth_headers=IngestAuthAgent().make_auth_header())
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/lib/python3.6/site-packages/dcp_diag/component_agents/ingest_auth_agent.py", line 26, in __init__
    self.auth_token = self._get_auth_token()
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/lib/python3.6/site-packages/dcp_diag/component_agents/ingest_auth_agent.py", line 45, in _get_auth_token
    response.raise_for_status()
  File "/Users/mfreeberg/miniconda3/envs/hca_py3/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://danielvaughan.eu.auth0.com/oauth/token

Expected Behavior:

No authentication error

@aaclan-ebi aaclan-ebi added t-shirt medium A task that can't have a numeric estimate but is assessed as medium bug Something isn't working labels Mar 28, 2019
@aaclan-ebi
Collaborator Author

aaclan-ebi commented Mar 28, 2019

Possible solutions:

Option 1: Do not supply a token in GET requests.

  • Currently, only the create-submission endpoint is authenticated in Ingest, and the dcp-diag tool only performs GET requests at the moment. The other endpoints don't require a token, but if one is supplied and it is invalid, the request is rejected with an HTTP 401 error.

Pros: Much quicker than option 2.
Cons: Only a temporary solution, as we may want to authenticate other endpoints later.
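
Option 1 could be sketched roughly as follows. This is only an illustration and not the actual dcp-diag code: `build_headers` is a hypothetical helper, and the `Accept` value is an assumption. The idea is that the `Authorization` header is left out entirely unless a token is explicitly provided, so read-only requests never carry a stale or invalid token.

```python
from typing import Dict, Optional


def build_headers(auth_token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, attaching a token only when one is given.

    Hypothetical helper for illustration; not part of dcp-diag.
    """
    # The Accept value is an assumption made for this sketch.
    headers = {"Accept": "application/json"}
    if auth_token:
        # Only authenticated calls (e.g. creating a submission) need this.
        headers["Authorization"] = f"Bearer {auth_token}"
    return headers
```

A GET call would then pass `headers=build_headers()` to `requests.get`, while an authenticated call would pass `build_headers(token)`.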

Option 2: Use DCP Fusillade Auth for user authentication

  • Investigate how DSS user authentication is done. DSS uses Fusillade user authentication, where the user is redirected to the browser to complete an OAuth login.

Pros: This is the proper solution in my opinion.
Cons: There is a learning curve before we can implement it.
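
For orientation, the browser-redirect step of a generic OAuth 2.0 authorization-code flow might look like the sketch below. This is not Fusillade's actual API: the function name, the endpoint, the client ID, the redirect URI, and the scope are all placeholders, and how DSS really does this would still need to be investigated.

```python
import secrets
from urllib.parse import urlencode


def build_authorize_url(authorize_endpoint, client_id, redirect_uri,
                        scope="openid email"):
    """Return (url, state) for a generic OAuth 2.0 authorization-code request.

    All arguments are placeholders; the real values would come from the
    auth provider's configuration.
    """
    # Random state value, echoed back by the provider to guard against CSRF.
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state
```

A CLI tool would typically open the returned URL with `webbrowser.open(url)`, wait for the redirect on a local callback, check that the returned `state` matches, and then exchange the code for a token.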

X Option 3: Provide a GCP service account to wranglers

  • Wranglers would store a GCP service account file locally and specify its location in the environment variable GOOGLE_APPLICATION_CREDENTIALS.

Cons: Very insecure.

@justincc
Contributor

In my opinion, definitely not option 3. We have many other priority tasks, so I suggest going with option 1 and kicking the can down the road until option 2 actually becomes necessary.

@justincc justincc added the emerged We couldn't predict this would emerge during the sprint and it needed immediate work label Apr 2, 2019
@aaclan-ebi
Collaborator Author

Released a new version of the dcp-diag tool:

dcp-diag==1.0.1

https://github.com/HumanCellAtlas/dcp-diag/blob/master/Changes.md#changes-for-v101-2019-04-02

Verified that the error no longer occurs:

19:50 $ analyze-submission -d prod 5bfe4d3a9460a300074eebc8
Using deployment: prod

PHASE 1: Get submission primary bundle list from Ingest:
        Retrieving submission...done.
        Submission ID: 5bfe4d3a9460a300074eebc8
        Project UUID: 0c7bbbce-3c70-4d6b-a443-1b92c1f205c8
        Retrieving submission's primary bundle list...done.
        Ingest created 25 bundles.

PHASE 2: Checking bundles are present in DSS:
        Checking for bundle manifests: AWS: 25/25 GCP: 25/25...done.
        25 bundle are present in AWS
        25 bundle are present in GCP

PHASE 3: Check DSS for primary bundles with this project UUID:
        Searching DSS...done.
        In AWS DSS, 25 primary bundles are indexed by project
        In GCP DSS, 25 primary bundles are indexed by project

PHASE 4: No auth information provided, skip checking Secondary Analysis for workflows.

PHASE 5: Check DSS for secondary bundles:
        Searching for secondary bundles: AWS: 25/25 GCP: 25/25...done.
        In AWS there are 25 primary bundles with 0 results bundles
        In GCP there are 25 primary bundles with 0 results bundles

PHASE 6: Check Azul for primary bundles:
        Counting bundles in webservice...done.
        In Azul, 25 primary bundles are indexed

PHASE 7: Check Azul for secondary bundles:
        Counting secondary bundles in webservice...done.
        In Azul there are 25 primary bundles with 0 results bundles

PHASE 8: Save state:
        Saving state in 5bfe4d3a9460a300074eebc8.json...done.

@zperova @hewgreen this is now fixed. Please upgrade your version of dcp-diag:

pip install dcp-diag --upgrade

Thanks!

@zperova

zperova commented Apr 2, 2019

@aaclan-ebi thanks! I will upgrade now and try it for my bundle in integration.

@zperova

zperova commented Apr 2, 2019

@aaclan-ebi I can run what you have tested with no problem, but when I try to test my submission in integration I get the following error:

(metadata-schema) C02X488AJHD2:metadata-schema zina$ analyze-submission -d int 5c9b50a6436bf1000897212c
Using deployment: int

PHASE 1: Get submission primary bundle list from Ingest:
	Retrieving submission...Traceback (most recent call last):
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/anaconda3/envs/metadata-schema/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connectionpool.py", line 850, in _validate_conn
    conn.connect()
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connection.py", line 284, in connect
    conn = self._new_conn()
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x10f41f320>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.ingest.int.data.humancellatlas.org', port=443): Max retries exceeded with url: /submissionEnvelopes/5c9b50a6436bf1000897212c (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x10f41f320>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/envs/metadata-schema/bin/analyze-submission", line 677, in <module>
    AnalyzeSubmission()
  File "/anaconda3/envs/metadata-schema/bin/analyze-submission", line 592, in __init__
    self._get_submission_project_and_primary_bundle_list_from_ingest()
  File "/anaconda3/envs/metadata-schema/bin/analyze-submission", line 650, in _get_submission_project_and_primary_bundle_list_from_ingest
    submission = finder.find(f"subm_id={self.state.submission_id}")
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/dcp_diag/finders/ingest_finder.py", line 27, in find
    return SubmissionEnvelope.load_by_id(submission_id=field_value, ingest_api_agent=self.ingest)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/dcp_diag/component_entities/ingest_entities.py", line 67, in load_by_id
    data = ingest_api_agent.get(f"/submissionEnvelopes/{submission_id}")
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/dcp_diag/component_agents/hateoas_agent.py", line 52, in get
    response = requests.get(url, headers=self.headers)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/anaconda3/envs/metadata-schema/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.ingest.int.data.humancellatlas.org', port=443): Max retries exceeded with url: /submissionEnvelopes/5c9b50a6436bf1000897212c (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x10f41f320>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

@aaclan-ebi
Collaborator Author

@zperova, please use integration instead of int as the deployment name:

analyze-submission -d integration 5c9b50a6436bf1000897212c

Filed an issue in the dcp-diag repo for now.
HumanCellAtlas/dcp-diag#18
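
One way the tool could reject an unknown deployment name up front, instead of failing later with a DNS error, is to validate it while parsing arguments. The sketch below is an assumption about how that might look, not the actual analyze-submission code: `parse_args`, `KNOWN_DEPLOYMENTS`, and the `--deployment` long option are hypothetical, and the list only contains the two deployment names mentioned in this thread.

```python
import argparse

# Only the names that appear in this thread; the real list may be longer.
KNOWN_DEPLOYMENTS = ("integration", "prod")


def parse_args(argv=None):
    """Parse CLI arguments, rejecting unknown deployment names early."""
    parser = argparse.ArgumentParser(prog="analyze-submission")
    parser.add_argument(
        "-d", "--deployment",
        choices=KNOWN_DEPLOYMENTS,  # argparse exits with an error otherwise
        required=True,
        help="deployment to check, e.g. integration or prod",
    )
    parser.add_argument("submission_id", help="Ingest submission ID")
    return parser.parse_args(argv)
```

With this in place, `-d int` would produce an "invalid choice" usage error listing the accepted names rather than a connection traceback.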

@zperova

zperova commented Apr 3, 2019

@aaclan-ebi good point! it works.
