Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎨 Confusing name: Curate.add_validated_from #1860

Closed
falexwolf opened this issue Aug 28, 2024 · 13 comments · Fixed by #2081
Closed

🎨 Confusing name: Curate.add_validated_from #1860

falexwolf opened this issue Aug 28, 2024 · 13 comments · Fixed by #2081
Assignees

Comments

@falexwolf
Copy link
Member

The problem with the below is that values in a public ontology are not validated in the LaminDB definition. Hence, the name of the method is highly confusing.

image

How shall this method be called?

@falexwolf falexwolf changed the title Confusing name add_validated_from 🎨 Confusing name: Curate.add_validated_from Aug 28, 2024
@falexwolf
Copy link
Member Author

This here is still a quite basic thing that we should get resolved, shouldn't we? I think you wanted to work on it at some point @Zethson?

@Zethson
Copy link
Member

Zethson commented Sep 27, 2024

I'll think about it next week but if we change the default of Curator to also validate against records without a Source record even if a Source is passed, the comment above would certainly be false and the name fine.

@falexwolf
Copy link
Member Author

falexwolf commented Oct 16, 2024

We need a big clear simple box that defines what “validated” means. This is urgent as it’s so basic.

In my mind “validated” == “exists in registry or an underlying ontology”.

If this is what “validated” is, curator.add_validated_from should not be user-facing but records that are not yet in the ontology source should be added to registry upon the first call of .validate() that touches them.

What’s important about this “auto-import-from-the-ontology-source” is that the definition gets easier in practice: "validated" == "record exists in registry". This means the whole complexity of the ontology source is no longer important as soon as the record is imported.

It’s important that we have an easy practical definition of “validated” that’s free of “validating against source” and “validating against another instance” both of which are complicated things.

@Zethson
Copy link
Member

Zethson commented Oct 16, 2024

In my mind “validated” == “contained in registry or an underlying ontology source”.

Yes that's also where @sunnyosun and me ended up. However, I think the current behavior when sources is passed to Curate would break this assumption because it would only validate against the specific ontology source and NOT the remainder of the registry (basically values that have no source). This needs to be changed for consistency.

+1 the rest of what you wrote.

@falexwolf
Copy link
Member Author

Can we make this a priority then so that it's in a LaminDB release ASAP? Asking both of you @Zethson @sunnyosun

@Zethson
Copy link
Member

Zethson commented Oct 16, 2024

While I could figure it out, I think @sunnyosun knows this part better and is probably substantially faster? @sunnyosun I'm happy to look into it if you want me to. Just let me know.

@sunnyosun
Copy link
Member

What about we run add_validated_from("all") under the hood of .validate()? Essentially all validated records will be saved before running validation. This way we make add_validated_from no longer user facing.

@Zethson
Copy link
Member

Zethson commented Oct 16, 2024

That's how I understood @falexwolf proposal, yes.

@falexwolf
Copy link
Member Author

What about we run add_validated_from("all") under the hood of .validate()? Essentially all validated records will be saved before running validation. This way we make add_validated_from no longer user facing.

Yes, that's what I meant. I'd also make it private then (with deprecation time).

@sunnyosun
Copy link
Member

What about we run add_validated_from("all") under the hood of .validate()? Essentially all validated records will be saved before running validation. This way we make add_validated_from no longer user facing.

Yes, that's what I meant. I'd also make it private then (with deprecation time).

I think we can just remove it, because users will not be asked call this method anymore.

@falexwolf
Copy link
Member Author

Great! That's going to simplify things!

@falexwolf
Copy link
Member Author

I'm looking forward to this getting simpler:
image

@falexwolf
Copy link
Member Author

And this:
image

@sunnyosun sunnyosun linked a pull request Oct 18, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants