Rosetta #2

Open
Erard opened this issue Jul 3, 2023 · 15 comments

Comments

@Erard
Member

Erard commented Jul 3, 2023

A search for "Rosetta" won't find the value used in the ESA / NASA archives, "international rosetta mission", which is of course annoying. This value should be included in the alias list.

@Erard
Member Author

Erard commented Jul 3, 2023

As a general rule, all values of instrument_host_name in the PDS / PSA should be included

@BaptisteCecconi
Member

Where can we find the list? (to help @LauraD12)

@Erard
Member Author

Erard commented Jul 3, 2023

No idea. An entry point is https://pds.nasa.gov/tools/dd-search/
There used to be a doc providing a list of values for PDS3 - Google is your friend

@BaptisteCecconi
Member

The PDS dd-search interface is a good first step. We probably need the same for ESA and JAXA at least.

@Erard
Member Author

Erard commented Jul 3, 2023

They should all use the same keywords / values - except perhaps for PDS3 archives stored only outside the US.
One open question is the use of mission_name vs instrument_host_name in PDS3 (the latter designates a sub-element, e.g. lander / orbiter, but also individual telescopes). I'm unsure whether we want only mission_name or both - we need to go back to our old notes.

@Erard
Member Author

Erard commented Sep 6, 2023

Another source of info that should be implemented is the PSA doc (from p93):
https://www.cosmos.esa.int/web/psa/psa-user-guide

@BaptisteCecconi
Member

BaptisteCecconi commented Sep 6, 2023

The list is available on page 9 of that document (@Erard, can you confirm this is the list you have in mind?) and is the following (I also give the URL to the resolver):

NB1: I excluded the last one: GROUND-BASED
NB2: for EXOMARS 2016, the resolver should be improved so that the correct name comes first in its results.

@Erard
Member Author

Erard commented Sep 6, 2023

p. 9 is the short version. On p. 93 an extended version provides:

  • mission_name
  • instrument_host_name (different from mission, e.g. lander / orbiter)
  • instrument_host_id

I think we want all these values, although they are the ones used for CQL queries (not necessarily identical to the PDS values, from what I see)

In addition, it gives a list of instruments in the PSA, which is nice to have

@Erard
Member Author

Erard commented Sep 6, 2023

NB2: for EXOMARS 2016, the resolver should be improved to have the correct name in the first rank in the resolver.

It is much worse for Mars Express ;(

@BaptisteCecconi
Member

I think there is a misunderstanding... The goal of the resolver is to propose a ranked list of results. If the first item of the list is the right one, I consider it a success. So for Mars Express, there are plenty of results, but the first one is correct. This is not the case for Exomars.

@Erard
Member Author

Erard commented Sep 6, 2023

Well, yes and no:
it only works if no other key provides it as a first item - otherwise we can't use this to resolve aliases.
Meaning that we need to check the first item of every entry
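
A minimal sketch of that check, assuming we already have, for each query string, the name we expect and the ranked list the resolver returned. The function name, the data layout and the sample values are illustrative assumptions, not actual resolver output:

```python
from collections import defaultdict


def check_first_ranks(expected, ranked_results):
    """Check the first-ranked result of every query.

    expected:       query string -> name we expect in first rank
    ranked_results: query string -> ranked list of names from the resolver
    Returns (wrong_first, ambiguous):
      wrong_first: queries whose first result is not the expected name
      ambiguous:   first-ranked names shared by queries that expect
                   different names, so they cannot resolve an alias
    """
    wrong_first = {}
    claimed_by = defaultdict(list)
    for query, canonical in expected.items():
        results = ranked_results.get(query, [])
        first = results[0] if results else None
        if first != canonical:
            wrong_first[query] = first
        if first is not None:
            claimed_by[first].append(query)

    ambiguous = {}
    for first, queries in claimed_by.items():
        if len({expected[q] for q in queries}) > 1:
            ambiguous[first] = sorted(queries)
    return wrong_first, ambiguous


# Made-up sample data, for illustration only.
expected = {"mars express": "mars-express", "exomars 2016": "exomars-2016"}
ranked = {
    "mars express": ["mars-express", "mars-odyssey"],
    "exomars 2016": ["mars-express", "exomars-2016"],
}
wrong_first, ambiguous = check_first_ranks(expected, ranked)
```

With this sample, `wrong_first` flags the "exomars 2016" query and `ambiguous` shows that "mars-express" is returned first for two different entries.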

@BaptisteCecconi
Member

This is the same for the SSODNET name resolver: from a user input, you get a list of names, and the user selects the correct one.

We may be able to refine the resolver ranking algorithm, but first we want to make sure that the first result is correct. Then we will refine.

@Erard
Member Author

Erard commented Sep 6, 2023

Not really: in the portal SSODnet is used to return all known aliases, which are included in a single ADQL query.
In my opinion this is the main point in having a resolver.
Plus, if we only use the first item, we need to be able to disable the conversion: If I'm asking for international-Rosetta-mission and get Rosetta I often need to overwrite it - otherwise I'd never find the instrument archive.
And again, the added value is debatable.
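
To illustrate the "all aliases in a single ADQL query" usage, here is a rough sketch. The table name `my_service.epn_core` is a placeholder, the choice of `instrument_host_name` as the column is an assumption for the example, and the function is not the portal's actual code:

```python
def adql_for_aliases(aliases, table="my_service.epn_core",
                     column="instrument_host_name"):
    """Build one ADQL query matching any of the given aliases.

    Exact-match only. Single quotes are doubled so that a name
    containing a quote cannot break the string literal.
    """
    unique = list(dict.fromkeys(aliases))  # drop duplicates, keep order
    if not unique:
        raise ValueError("need at least one alias")
    quoted = ", ".join("'" + a.replace("'", "''") + "'" for a in unique)
    return f"SELECT * FROM {table} WHERE {column} IN ({quoted})"


query = adql_for_aliases(["Rosetta", "international rosetta mission"])
```

Because every alias is kept in the query, the value actually stored by the archive is still matched, which is the point made above about not replacing the user input with a single preferred name.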

@BaptisteCecconi
Member

There is a real misunderstanding... (or I completely miss the point...).

There are two prototype queries on the obs-facility database:

  • the resolve?q= query, which should be used by the providers to find the term to put in the epn_core table, or by the user to find the term to be selected in the search interface.
  • the aliases?label= query, which is used to find all the known aliases for a term.

The aliases?label= query has to be used for now, since the providers are not using the standard terms. Eventually, however, we may (should) require the use of terms from the obs-facility vocabulary, and then the aliases?label= query will not be so useful anymore.
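
For reference, a small sketch of how the two query types could be built on the client side. Only the `resolve?q=` and `aliases?label=` parts come from the description above; the base URL is a placeholder and the helper names are made up for the example:

```python
from urllib.parse import urlencode

# Placeholder: the real base URL of the obs-facility service is not given here.
BASE_URL = "https://example.org/obs-facility"


def resolve_url(text, base=BASE_URL):
    """URL of a resolve?q= query: free text in, ranked list of terms out."""
    return f"{base}/resolve?{urlencode({'q': text})}"


def aliases_url(label, base=BASE_URL):
    """URL of an aliases?label= query: one term in, its known aliases out."""
    return f"{base}/aliases?{urlencode({'label': label})}"


print(resolve_url("international rosetta mission"))
print(aliases_url("rosetta"))
```

The response format of either query is not described in this thread, so the sketch stops at building the URLs.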

@Erard
Member Author

Erard commented Sep 6, 2023

We certainly need a real discussion to clarify the objective of this activity - short answer for now:

  • the alias query is very noisy at this point, that was my concern
  • I don't think we can impose a main value to data providers - especially space agencies or telescopes

The situation is identical to target_name: each object has many names which are all relevant and legitimate in their context. What we need for global / blind queries is a set of equivalent values for the same concept/object in various contexts. The only way out is to work on a centralized db of metadata (ElasticSearch) and convert all values to a unique string. But this would only work for EPN-TAP, not the whole VO.
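
As a rough illustration of "convert all values to a unique string", assuming the equivalence groups are already known: the normalization rule, the function names and the sample group below are assumptions for the sketch, not an agreed design:

```python
import re


def normalize(value):
    """Fold case and treat runs of spaces, hyphens and underscores as one space."""
    return re.sub(r"[\s_\-]+", " ", value.strip().casefold())


def build_index(groups):
    """groups: unique string -> list of equivalent values found in archives."""
    index = {}
    for canonical, aliases in groups.items():
        for name in (canonical, *aliases):
            key = normalize(name)
            # The same value must not point to two different entries.
            if index.setdefault(key, canonical) != canonical:
                raise ValueError(f"{name!r} maps to two different entries")
    return index


def to_canonical(value, index):
    """Return the unique string for a value, or None if it is unknown."""
    return index.get(normalize(value))


# Illustrative group built from the values quoted in this thread.
index = build_index({"rosetta": ["international rosetta mission"]})
print(to_canonical("international-Rosetta-mission", index))  # rosetta
```

Such a mapping only helps where the converted values are stored centrally; the original archive values still have to be kept alongside it to query the archives themselves.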
