Add a "nonphysical" keyword to Rearrangement and Cell #769
Comments
Ah, okay, I understand better what you are saying now. Creating inferred Rearrangements/Cells would be a nice way to re-use the schema, yes, so crazy it might work! However, it creates the situation that a fake Yes, this is particularly tricky and goes beyond just the idea of supporting "simulated" data sets. I'll ponder on this awhile, but my initial thought is that these inferred things need to be in their own "collections" separate from the other data, yet tied to it using an independent identifier.
My hope is that if we create a way to have a simulated repertoire, it could be relatively easily extended to a "fake" (inferred?) repertoire, as well. But I'm not as optimistic as @javh =P so I'm guessing it'll get pretty hairy.
But you still want it to be connected to a real repertoire with the experimental protocol, right? Because if I'm understanding properly, you are still doing a (say) single-cell experiment, which is described in a That's slightly different from a simulated dataset where essentially everything is "fake"
Yes and yes. So unlike a simulated repertoire, those fields wouldn't be nulled.
There is an "easy" solution but it unfortunately creates significant churn. That is, add an identifier. Just like We want to avoid breaking the existing tool chain, so that implies that the inferred rearrangements/cells need to have a different
I don't think that would work, anyway:
Ok, I missed that. So I guess when you say
Closely related to #201, obviously, but I'm actually more thinking about #317 and efforts to simplify the `Clone` schema. For #201, all `Rearrangement`s/`Cell`s in a `Repertoire` would be nonphysical, which is why I suggested a `Repertoire`-level `is_simulated` keyword.

However, in the `Clone` space we have inferred intermediates/ancestors, which I guess would either be part of the same `Repertoire` as the observed `Rearrangement`s/`Cell`s they are based on, or maybe not part of a `Repertoire` at all.

Currently, we handle this by siloing them into the `Clone` schema, either directly in `Clone` (using fields like `v_call`, `germline_alignment`, etc) or by converting them into `Node` objects (which in turn requires `Tree` to be an object instead of just a field). That's what's making #317 hard, because we've set up `Clone` and `Node` to mimic `Rearrangement`s and now we also want them to be able to mimic `Cell`s.

If we instead store the inferred intermediates/ancestors as bona fide but `nonphysical` `Rearrangement`s/`Cell`s, then `Clone` can just have a generic array of members and the problem goes away. So crazy it just might work?
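
To make the idea concrete, here is a rough sketch of what this could look like, using plain Python dicts in place of real records. The `nonphysical` flag and the `members` array are the proposal itself, not fields that exist in the schema today, and the IDs, the V call, and the `split_members` helper are made up for illustration.

```python
# Sketch only: `nonphysical` and `members` are hypothetical fields from this
# proposal; they are not part of the current schema.

# One observed rearrangement and one inferred ancestor, stored side by side.
rearrangements = [
    {"sequence_id": "seq1", "clone_id": "clone1",
     "v_call": "IGHV1-2*02", "nonphysical": False},
    {"sequence_id": "anc1", "clone_id": "clone1",
     "v_call": "IGHV1-2*02", "nonphysical": True},
]

# The clone carries only a generic list of member identifiers.
clone = {"clone_id": "clone1", "members": ["seq1", "anc1"]}


def split_members(clone, rearrangements):
    """Resolve clone members and separate observed from inferred records."""
    by_id = {r["sequence_id"]: r for r in rearrangements}
    members = [by_id[m] for m in clone["members"]]
    # A missing flag is treated as physical, so existing records keep working.
    physical = [r["sequence_id"] for r in members
                if not r.get("nonphysical", False)]
    inferred = [r["sequence_id"] for r in members
                if r.get("nonphysical", False)]
    return physical, inferred


physical_ids, inferred_ids = split_members(clone, rearrangements)
print(physical_ids, inferred_ids)
```

Defaulting a missing flag to "physical" is one way to avoid breaking tools that never write the field, though tools that read every record without checking the flag would still count the inferred ones as observed.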