Skip to content

BUGFIX: prevent gaia from querying server at import time #3344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

keflavich
Copy link
Contributor

see #3343.

I believe we had some discussion about this point in another issue as well, but I want to reaffirm that we have a hard rule that no remote queries should be performed at import time for exactly the reasons highlighted in the linked issue.

Copy link

codecov bot commented Jun 11, 2025

Codecov Report

Attention: Patch coverage is 54.76190% with 19 lines in your changes missing coverage. Please review.

Project coverage is 70.04%. Comparing base (1becbba) to head (d40dabd).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
astroquery/esa/euclid/core.py 7.69% 12 Missing ⚠️
astroquery/gaia/core.py 69.23% 4 Missing ⚠️
astroquery/esa/hubble/core.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3344      +/-   ##
==========================================
- Coverage   70.08%   70.04%   -0.05%     
==========================================
  Files         232      232              
  Lines       19890    19878      -12     
==========================================
- Hits        13940    13923      -17     
- Misses       5950     5955       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bsipocz
Copy link
Member

bsipocz commented Jun 11, 2025

This is not a bug but a feature, I remember we went through whether we should have this enable by default or have it as an opt in during the review process.

cc @jespinosaar

So, I would rather suggest the user to opt out explicitly.

@bsipocz
Copy link
Member

bsipocz commented Jun 11, 2025

So, maybe this needs to be better documented, but I certainly not call it a bug.

@keflavich
Copy link
Contributor Author

The bug is that the user cannot opt out because the option is set by default and called at import time.

@keflavich
Copy link
Contributor Author

See also:
#3196
which we just left unaddressed. It's the same issue, and clearly is affecting many users.

@keflavich
Copy link
Contributor Author

There was some discussion about this here: #2376 (comment), but it didn't go into it.

The problem is not the verbosity, it's the attempt to make a remote connection "without permission". I view this as a violation of the expectation that importing should not produce queries.

This is pervasive throughout the ESA packages, and it's presently hidden because of the try/except clause in get_status_messages.

The problem is exacerbated by the fact that there is no user-accessible way to set the timeout, so if the server's down, the user may have to wait a full minute (or whatever the default timeout is) for imports.

except IndexError:
print("Archive down for maintenance")

except OSError:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception catching is a particular problem: it hid from the tests that there is a remote query being performed, since pytest-remotedata raises an OSError

@keflavich
Copy link
Contributor Author

this issue also affects our tests: every time we've run ESA tests, it has spammed a lot of server queries. The except OSError catch made the tests pass, but they should have raised exceptions.

@keflavich
Copy link
Contributor Author

@jespinosaar I would like your input on this - I'm proposing a lot of changes with this PR.

If there is a need to show_messages during the tests (especially Euclid), then we should mock the connection handler in all of the non-remote test suites.

The OSError raised by pytest-remotedata should never be caught: it exists only for testing purposes, and catching those exceptions hides real test failures.

There are two straightforward options that I've implemented differently so far:

  • In gaia, I just set show_server_message=False by default, but this means that users instantiating Gaia classes won't see the message
  • In the others, I just set the import-time message check to be off (i.e., the last line of the file)

I'm open to opinions on which approach is preferred. This PR should not be merged until we have decided which way to go, though.

@bsipocz
Copy link
Member

bsipocz commented Jun 11, 2025

maybe have this as a config item? So the users can globally override the default more easily.

@keflavich
Copy link
Contributor Author

I like the idea of using a config item. The default should be 'no remote queries', but I'm happy enough if the first line of the documentation suggests changing that.

@jespinosaar
Copy link
Contributor

Hi all, many thanks for looking into this. This method is intended to show warning messages automatically stored in our server. So, we can configure them to provide messages to users we can customize (e.g. if a maintenance is planned, if the server is down...). This is why this is executed during import (in the init).

If this is causing you a lot of problems when testing, we can think on either on mock in in all tests (and modify our development procedure to ensure we are covering this case) or maybe change the approach. Instead of executing this message during runtime, we could think on call this method if one of the methods fails, checking first if there is a message the user should be aware of and include it in the error trace.

In case we decide to remove it from the initialization as well, I would propose to remove the show_messages parameter and the associated code, as this won't be necessary anymore. We can then include in the documentation a comment saying that users can verify if the TAP is working as expected using get_status_message() method.

As this is a decision to be made with different missions and stakeholders, I am including @cosmoJFH as well in this discussion.

@jespinosaar
Copy link
Contributor

Or we can show_messages=False and advertise in the documentation to use the initialization with show_messages=True if they want to verify the status before using it. Then, we can keep it as a parameter in the init or as a config item, whatever you prefer

@keflavich
Copy link
Contributor Author

Instead of executing this message during runtime, we could think on call this method if one of the methods fails, checking first if there is a message the user should be aware of and include it in the error trace.

This is a very good idea, as I think it captures the intended purpose: to warn users if the error they're seeing is a temporary one.

@jespinosaar
Copy link
Contributor

Instead of executing this message during runtime, we could think on call this method if one of the methods fails, checking first if there is a message the user should be aware of and include it in the error trace.

This is a very good idea, as I think it captures the intended purpose: to warn users if the error they're seeing is a temporary one.

Yes, but this would imply a change in all our methods then. So, maybe, we can accept this PR and, in the meantime, update the missions where this is being used. But I would like to hear about @cosmoJFH opinion before deciding. I am more involved in charge of Astronomy missions and he is supporting the Survey ones.

@keflavich
Copy link
Contributor Author

OK, that sounds like a good plan.

For this PR, though - I think the least acceptable change would be to set show_messages=False in the import-time initialization, but leave the default to True.

For tests, instead of mocking, we can just set show_messages=False in all of the non-remote queries, as I've done. Is that good for you?

@bsipocz bsipocz added this to the v0.4.11 milestone Jun 12, 2025
@bsipocz bsipocz added the gaia label Jun 12, 2025
@jespinosaar
Copy link
Contributor

OK, that sounds like a good plan.

For this PR, though - I think the least acceptable change would be to set show_messages=False in the import-time initialization, but leave the default to True.

For tests, instead of mocking, we can just set show_messages=False in all of the non-remote queries, as I've done. Is that good for you?

For me it is ok. We will discuss internally how we change it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants