More general mechanics for row grouping (and filtering)? #853
Replies: 8 comments 24 replies
-
One more detail to consider: This is essentially a simple window function query. However, there are more complicated window function queries where the group associated with a row for the purposes of applying a function to that row (e.g., a filter) changes based on the row itself. For example, you could have a filter: "return only rows where the My point isn't that we shouldn't pursue this as a first generalization of the concept of "filtering", but that our portrayal in the API shouldn't rely on group ID concepts (the implementation probably should, though, in the cases you noted). |
Beta Was this translation helpful? Give feedback.
-
@dmos62: I see what you're describing and it makes sense but I'm not entirely sure what you're proposing to implement so I'm having a hard time providing any useful feedback. It would be helpful if you could write up something concrete that you'd like to take action on and what outstanding questions you'd like feedback on. A general thought: We don't need very general filtering on groupings of rows yet – we are just trying to solve how to filter for duplicate rows. I'd prefer a solution that prioritizes the immediate problem but is extendable to future problems (even if it will involve some refactoring later). |
Beta Was this translation helpful? Give feedback.
-
@kgodey you're right, this proposal is abstract at this stage. From filtering side of things, which I'm most familiar with, things seem to be trivial: it's about having access to those temporary columns when exposing filtering options to frontend, when accepting them, and when actually filtering. It's on grouping side of things that group-attributes need to be calculated per-row and made accessible to the rest of Mathesar (or just filtering in this case), and I'm fuzzy about how that could be done in concert with the current grouping system. @mathemancer, could you provide some input? |
Beta Was this translation helpful? Give feedback.
-
I think we should hold off on generalizing this until we have more use cases designed/planned for implementation. Right now, we only have the duplicates filter and it is difficult to anticipate future needs and architecture based on one use case. |
Beta Was this translation helpful? Give feedback.
-
@dmos62 If you check my Does that help? Or make things more complicated? Or did I miss the point of the question? |
Beta Was this translation helpful? Give feedback.
-
I'll list some semi-conceptual problems I'm seeing and try and propose a concrete solution after:
In summary, I came to the realisation that if we used views as the primary way to do sorting, filtering and grouping, that would solve above problems in an elegant and user-empowering way. I say primary, because we could still use current What would the workflow look like?Suppose you want to filter some data to only see duplicate rows (based on column set A). You create a view whose definition contains following directives:
We could package a string of operations like this in some way that a novice user could apply it or look it up easily when creating or altering a view, but it would then unpackage into his view definition without hiding the underlying mechanics. I'm imagining a searchable collection of queries, a bit like a curated stackoverflow.com search. Advantages:
Disadvantages or neutral notes:
ChallengesTechnically, this doesn't seem challenging. Relevant UX and the concepts surrounding view definitions are the areas where we're still fuzzy, I think:
@kgodey @mathemancer @silentninja @ghislaineguerin @seancolsen @pavish could you provide feedback? Context:
Edit: creation of views seems to be relatively undefined in the design specs https://wiki.mathesar.org/design/specs/create-edit-delete-views |
Beta Was this translation helpful? Give feedback.
-
I'm responding to #853 (comment), doing it in a new comment since the previous thread is already long enough that all replies aren't loaded without a further click. Some thoughts in no particular order:
|
Beta Was this translation helpful? Give feedback.
-
I'm not sure where to go from here in this conversation. @dmos62, could you clarify what action you'd like to see coming out of this discussion? |
Beta Was this translation helpful? Give feedback.
-
We noticed that there are situations where we'd like to filter not on rows, but on information derived from groupings of rows (group attributes). Example of such group-attribute-based filters are: "is this row a duplicate", "is this row unique", "sum of column A in this group is larger than X". Also, we'd like to compose such filters with the simple filters:
or(empty(col1), unique(col2))
.Basically, group-based filters deconstruct to the following algorithm:
What this gets us is a unified way to both do row grouping that @mathemancer is working on, and do group-attribute-based filtering that I'm working on, without duplicated logic. Plus, it's pretty powerful in terms of expressiveness.
I can see two things that require more definition:
Relevant discussions:
Beta Was this translation helpful? Give feedback.
All reactions