Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OR filter support for all engines #392

Open
pranavbhole opened this issue Jan 18, 2019 · 8 comments
Open

OR filter support for all engines #392

pranavbhole opened this issue Jan 18, 2019 · 8 comments

Comments

@pranavbhole
Copy link
Contributor

No description provided.

@ryankwagner
Copy link
Collaborator

ryankwagner commented Feb 12, 2019

OrFilter concept query:

"fields":
  [
    { "field": "Homeroom Teacher Name" },
    { "field": "Student ID" },
    { "field": "Class ID" },
    { "field": "Class Name" },
    { "field": "Grade" }
  ],
"filters": [
  { "operator": "=", "fieldName": "Homeroom Teacher Name", "value": "Ms. Olsen" },
  { "operator": "OR", 
    "filters": 
    [
      {"operator": "=", "fieldName": "Student ID", "value": "12345"},
      {"operator": ">", "fieldName": "Grade", "value": "90"},
      {"operator": "<", "fieldName": "Number of Classes", "value": "4"}
    ]
  }
] 

The concept behind this FilterOperation is that it will become a new type of FilterOperation, called CompoundFilterOperation or CombinedFilterOperation.

Rather than having a concept of 'from' and 'to' or 'compareTo', it will contain an operator (or) and a set of input Filters to render.

In this way, assuming we independently decide to implement a new type of nested combined filter (such as AND), that new type can be passed within the list.

This new filter will be phased in as such:

  1. Create and implement CombinedFilter (or CombinedForcedFilter, as is being done with MultiField) without any renderers, only testing the operation itself directly.
  2. Implement OrFilter as a type of CombinedFilter (or ForcedFilter) on OracleEngine.
  3. Implement OrFilter on Hive & Presto Engines.
  4. Implement OrFilter on DruidEngine.

This phased development should serve to simplify the work being done & the review process such that errors can't slip in, as with large requests.

@pranavbhole
Copy link
Contributor Author

Good use case, just few additions to design. CompoundFilterOperation or CombinedFilterOperation filter list should support all the existing filters. Input Reporting Request looks good to me.

@ryankwagner
Copy link
Collaborator

At the moment, the design looks like it should be able to support all existing, as well as nesting on itself for future usage that is to say: ( = OR = OR ( = AND =) OR =)

As well as any existing FilterOperations. It should be entirely agnostic of what sort of FilterOperation passes into it.

@patelh
Copy link
Collaborator

patelh commented Feb 12, 2019

The current implementation assumes all items in the filter list are AND'd. We would want to keep the existing functionality intact. However, the user has the option to use logical operators as necessary (AND, OR, NOT).

Here are some scenarios we should cover:

  1. FilterA AND FilterB AND FilterC
    • supported using existing behavior of providing a list of filters
    • supported using an AND logical operator and providing the list of filters to the operator
  2. FilterA OR FilterB AND FilterC
    • supported using an OR logical operator with the list of filters being FilterA and an AND logical operator, e.g. OrFilter(List(FilterA, AndFilter(List(FilterB, FilterC))))
  3. FilterA AND (FilterB OR FilterC)
    • supported using an AND logical operator with the list of filters being FilterA and an OR logical operator, e.g. AndFilter(List(FilterA, OrFilter(List(FilterB, FilterC))))
  4. FilterA AND (FilterB OR (NOT FilterC))
    • supported using an AND logical operator with the list of filters being FilterA and an OR logical operator, e.g. AndFilter(List(FilterA, OrFilter(List(FilterB, NotFilter(FilterC)))))

@ryankwagner
Copy link
Collaborator

I understand the premise of making this a larger change that allows for top-level AND, OR, NOT however that sort of change would be quite large. It would mean breaking default behavior to allow for top-level OR, AND, NOT.

The proposal here is backwards compatible by giving this new OR filter the same presedence as existing operators. In this case,

{ "operator": "OR", 
    "filters": 
    [
      {"operator": "=", "fieldName": "Student ID", "value": "12345"},
      {"operator": ">", "fieldName": "Grade", "value": "90"},
      {"operator": "<", "fieldName": "Number of Classes", "value": "4"}
    ]
  }

Using strictly this operator on all fields would generate a top-level OR operation without any further top-level filters. Likewise, AND and NOT can be created with the same functionality in mind. As nesting is a possibility, a stretch functionality would look something like:

"filters": [
  {"operator": "OR",
    "filters": [
      {"operator": "AND",
        "filters": [
          {"field": "Student Name", "operator": "=", "value": "Bob"},
          {"field": "Student ID", "operator": "<", "value": "100}
        ]
      },
      {"field": "Class Name", "operator": "<>", "value": "Biology 100"}
    ]
  }
]
((('Student Name' == 'Bob') AND ('Student ID' < 100)) 
OR 
 ('Class Name' NOT 'Biology 100'))

@patelh
Copy link
Collaborator

patelh commented Feb 12, 2019

@ryankwagner What you are proposing is the same, top level filter which is a logical operator. That's what I also said, just that the existing functionality of not having top level logical operator be preserved. E.g. if top level filter is not a logical operator, then you can just assume all filters are AND'd together, e.g. by default produce AndFilter(filters)...

@ryankwagner
Copy link
Collaborator

PR #415 outlines the steps that will be taken in OrFilter support more concretely.

This issue can't be cleanly solved in one Pull Request, and will likely take place over 5. Please add comments/suggested edits on this first step, as it will be the basis for future filter work.

@ryankwagner
Copy link
Collaborator

ryankwagner commented Feb 22, 2019

TODO List:

  • Add OrFilter Operation & rename Pre Rendered Filter Operations. ( PR Add OrFilter operations. (Issue 392) #415 )
  • Add helper functions inside RequestModel & QueryGenerator for multi-type filter handling. This includes Filter Field validation (check for dim/fact column existence & throw proper error on duplicates), as well as filter rendering into where or having clauses (and analog in Druid). ( In Progress )
  • Implement Helper functions into RequestModel & Query Generator Engines, replacing existing "filter.field" functionality.
  • Formally add OrFilter support into Query Engines.

Future TODOs (Features out of scope):

  • Add support for nesting AND/OR filters.
  • Add AND/OR filter types to Cost function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants
@patelh @pranavbhole @ryankwagner and others