SceneQueryRunner: decouple time range comparisons #587

Merged
sd2k merged 15 commits into main from decouple-comparisons-from-queryrunner
Jun 5, 2024

Conversation

sd2k
Contributor

@sd2k sd2k commented Feb 9, 2024

Prior to this commit, SceneQueryRunner had special handling for
SceneTimeRangeCompare objects, explicitly searching for them in the
scene graph, adding additional queries, and transforming resulting
queries. This made it difficult to re-use the general behaviour of
'running additional queries' in other objects.

This commit introduces a new interface, SceneRequestAdder, which
can be implemented to inform the query runner that it should run
additional requests (returned by getExtraRequests) and transform the
results in some fashion.

Instead of searching the graph for SceneTimeRangeCompare objects,
the query runner searches for implementors of SceneRequestAdder
and uses those instead. The specifics of how it searches for these
are a little bit fuzzy and should probably be improved: it walks up the
graph until it finds at least one adder at the current level or in
any children of the current level; adds any others at that level
or in the children; then returns.
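
The search described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SceneNode`, `ExtraRequest`, `collectAdders` and `findAdders` are hypothetical stand-ins, and only the names `SceneRequestAdder` and `getExtraRequests` come from the description.

```typescript
// Hypothetical, simplified stand-in for a scene object in the graph.
interface SceneNode {
  parent?: SceneNode;
  children: SceneNode[];
}

// Hypothetical request shape; the real request type is not shown in this thread.
interface ExtraRequest {
  refId: string;
}

// Assumed shape: the description only names the interface and `getExtraRequests`.
interface SceneRequestAdder extends SceneNode {
  getExtraRequests(): ExtraRequest[];
}

function isRequestAdder(node: SceneNode): node is SceneRequestAdder {
  return typeof (node as Partial<SceneRequestAdder>).getExtraRequests === 'function';
}

// Collect every adder at `node` and in all of its descendants.
function collectAdders(node: SceneNode, found: SceneRequestAdder[] = []): SceneRequestAdder[] {
  if (isRequestAdder(node)) {
    found.push(node);
  }
  for (const child of node.children) {
    collectAdders(child, found);
  }
  return found;
}

// Walk up from `start` and stop at the first level where at least one adder
// exists at that level or below it; adders further up are never considered.
function findAdders(start: SceneNode): SceneRequestAdder[] {
  let current: SceneNode | undefined = start;
  while (current) {
    const found = collectAdders(current);
    if (found.length > 0) {
      return found;
    }
    current = current.parent;
  }
  return [];
}
```

One consequence of stopping at the first matching level is that an adder placed higher in the graph is ignored once a closer one is found, which may be part of why the search is described as fuzzy.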

SceneTimeRangeCompare has been refactored to make use of this new
interface. I also have a separate object that implements it, and it
works well, including when both are enabled.

Relates to this Slack thread.

@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch from d35c5f3 to 7f75cef Compare February 9, 2024 22:00
@sd2k sd2k marked this pull request as ready for review February 11, 2024 17:19
@sd2k sd2k requested a review from torkelo February 14, 2024 10:59
@torkelo
Member

torkelo commented Feb 14, 2024

@sd2k I think this looks interesting, sorry for the late review. There's so much going on now with the dashboard => scenes migration and a customer project. We will try to review this in depth shortly. I think with some minor refactoring and name changes it looks promising overall and close, but I've only done a very quick code scan.

@sd2k
Contributor Author

sd2k commented Feb 14, 2024

@sd2k I think this looks interesting, sorry for the late review. There's so much going on now with the dashboard => scenes migration and a customer project. We will try to review this in depth shortly. I think with some minor refactoring and name changes it looks promising overall and close, but I've only done a very quick code scan.

No worries, it's not urgent, just wanted to make sure it didn't fall too far off the radar 👍

I'll try and add an example of how this can be used somewhere so it's a bit less abstract.

@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch from 7f75cef to e12a606 Compare February 16, 2024 10:39
@sd2k
Contributor Author

sd2k commented Feb 16, 2024

FYI I've added an example of using this to the scenes-baseliner branch, along with a demo. It should be easy to run, but you'll need a version of Grafana with grafana/grafana#82299 merged in to load the plugin correctly, since it depends on a WASM module.

I don't think that'd necessarily make it into the core repo since it's quite heavyweight but it's a decent example!

Also a live demo of a custom application observability build with this enabled.

@sd2k
Contributor Author

sd2k commented Mar 12, 2024

One thing I bumped into while using this in other projects was that I can't see a way to turn off the extra SceneRequestAdders per-query runner. For example, in the live demo above the ScenesBaseliner runs for the various histogram panels as well as for the time series panels, but it can't really do much in those cases.

I imagine the solution would be to have a separate embedded scene with the baseline controls or something, but it might be nice to be able to say 'this SceneRequestAdder should only run for these queries' somehow 🤔

Edit: I just rebased on main and can see that #650 added the ability to opt out of time range comparisons; it's a tiny bit special-casey for now but I imagine we could do something similar for other RequestAdders (e.g. baselines: false).

@sd2k
Contributor Author

sd2k commented Mar 21, 2024

Hey @torkelo, do you think you (or someone else) will have time to take a look at this in the next week or so? Happy to walk through it if you like.

@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch from e12a606 to 4cc87f3 Compare March 21, 2024 20:13
@torkelo
Member

torkelo commented Mar 22, 2024

@sd2k sorry, it's a bit of a bad time now with G11 / GrafanaCon crunch.

@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch 4 times, most recently from 8b74dd9 to 8154573 Compare April 24, 2024 13:52
@sd2k
Contributor Author

sd2k commented Apr 24, 2024

I've tried to come up with some better names for the things I wasn't happy with ('adder' because it kinda sucks, and 'transform' because it's overused). The new names are maybe slightly better?

The best way to test this out, if anyone's interested, is to check out the Baselines demo in the demo app in the downstream scenes-baseliner branch which makes use of this PR to provide a second control similar to the time range comparison.

@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch from 8154573 to 79d35df Compare May 2, 2024 08:38
sd2k and others added 2 commits May 16, 2024 11:41
@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch from 79d35df to 4f58dee Compare May 16, 2024 10:41
Not sure these names are much better but they're possibly less ambiguous.
@sd2k sd2k force-pushed the decouple-comparisons-from-queryrunner branch from 4f58dee to 9aa6700 Compare May 16, 2024 10:44
sd2k added a commit that referenced this pull request May 24, 2024
PR #587 is quite a large change and introduces more concepts to scenes, so
isn't easy to justify merging right now. As a workaround, I'd like to be
able to subclass and override some methods of the SceneQueryRunner to
implement some similar behaviour in a different library, but I need
various fields and methods to be accessible to the subclass, which this
PR does by making them protected rather than private.

Note: I can _almost_ just completely copy/paste the class into my own
code rather than extending it, but the [`getQueriesForVariables`] function
looks for all _instances_ of SceneQueryRunner, which my copy/pasted class
would not match (unlike a subclass, which should), so I think some
ad-hoc variable support would break :(

`getQueriesForVariables`: https://github.com/grafana/scenes/blob/07c5dc66746d36208cd121b938883db8ef2243f7/packages/scenes/src/variables/utils.ts#L88-L94
Member

@dprokop dprokop left a comment

@sd2k - I'm really sorry for the late review, but your PR (#745) made me realise I forgot about this one.

Overall - I'm in favor of proceeding with this one, as opposed to #745, which literally invites inheritance over composition.

I can't see any reason why we would not accept this refactor; it's a good idea that I had been thinking about when first implementing the time range comparison.

The minor thing I'm missing is a generalised test for supplementary requests: all tests right now are performed in the context of time range comparison, so they don't necessarily show the idea behind supplementary requests when going through the spec.

Also, please consider my rename suggestion.

packages/scenes/src/querying/SceneQueryRunner.ts Outdated Show resolved Hide resolved
packages/scenes/src/querying/SceneQueryRunner.ts Outdated Show resolved Hide resolved
packages/scenes/src/querying/SceneRequestAdder.ts Outdated Show resolved Hide resolved
@sd2k
Contributor Author

sd2k commented May 24, 2024

@sd2k - I'm really sorry for the late review, but your PR (#745) made me realise I forgot about this one.

Overall - I'm in favor of proceeding with this one, as opposed to #745, which literally invites inheritance over composition.

Agreed 👍 I think the main concerns I had were the introduction of even more interfaces and concepts to the library but I'll make sure they're documented.

I can't see any reason why we would not accept this refactor; it's a good idea that I had been thinking about when first implementing the time range comparison.

The minor thing I'm missing is a generalised test for supplementary requests: all tests right now are performed in the context of time range comparison, so they don't necessarily show the idea behind supplementary requests when going through the spec.

Understood, I'll try and add something to this effect.

Also, please consider my rename suggestion.

💯 - I much prefer yours!


One side note: in another branch I've been exploring actually using this functionality with some implementations of SceneRequestSupplementer that run some ML algorithms on the secondary requests. The controls for the components allow the user to adjust things like the confidence interval and other hyperparameters of the ML algorithms. When the user changes those, we need to rerun the processors, but we don't always need to rerun the queries.

My solution to this has been to change the shouldRerun method to return an object like { query: boolean; processor: boolean }, which the implementation can use to tell the query runner whether it should rerun the query and the processor or just the processor. This is pretty essential for some UXs like this:

[Video attachment: 2024-05-24.11-42-17.mp4]

since otherwise the query would need to be reissued every time the processor wanted to rerun.

I mention this because I don't want to lock us into a specific API in SceneRequestSupplementer if/when I try to upstream those changes (in a separate PR). The options I see are:

  1. Keep the current signature (shouldRerun(prev, next): boolean) in this PR, and don't ever allow implementations to specify which of query/processor to run
  2. Keep the current signature (shouldRerun(prev, next): boolean) in this PR, and if we allow queries/processors to run separately, release an API-incompatible change by changing the signature to shouldRerun(prev, next): { query: boolean; processor: boolean; }
  3. Keep the current signature (shouldRerun(prev, next): boolean) in this PR, and if we allow queries/processors to run separately, release an API-compatible change by changing the signature to shouldRerun(prev, next): boolean | { query: boolean; processor: boolean; }, which is a bit more complex but possibly easier for basic use cases
  4. Switch the signature to either shouldRerun(prev, next): { query: boolean; processor: boolean; } or shouldRerun(prev, next): boolean | { query: boolean; processor: boolean; } now.
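
To make the API-compatible variant in option 3 more concrete, here is a minimal sketch of how a query runner could accept both return shapes. `ProviderState`, `RerunDecision` and `normalizeRerun` are hypothetical names, and treating a plain boolean as "rerun both or neither" is an assumption rather than something settled in this thread.

```typescript
// Hypothetical provider state; the real state type is not shown in this thread.
interface ProviderState {
  query: string;
  confidence: number;
}

type RerunDecision = { query: boolean; processor: boolean };

// The union signature from option 3: existing boolean implementations still type-check.
type ShouldRerun = (prev: ProviderState, next: ProviderState) => boolean | RerunDecision;

// Assumption: a plain boolean means "rerun both the query and the processor, or neither".
function normalizeRerun(result: boolean | RerunDecision): RerunDecision {
  return typeof result === 'boolean' ? { query: result, processor: result } : result;
}

// A basic implementation that only returns a boolean.
const basic: ShouldRerun = (prev, next) => prev.query !== next.query;

// A granular implementation: changing the confidence interval only reruns the processor.
const granular: ShouldRerun = (prev, next) => ({
  query: prev.query !== next.query,
  processor: prev.query !== next.query || prev.confidence !== next.confidence,
});
```

With this shape, the caller only ever deals with a `RerunDecision`, so the extra complexity of the union stays inside one helper.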

Sorry for the long response, I've spent quite a bit of time working on this other branch the last few days! 😄

We need to use a map during the search to avoid including duplicates of
any given type of `SceneRequestSupplementer`, but there's no need to
return the map itself; we only ever care about the values.
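
A rough sketch of that idea, for illustration only: `Supplementer`, `Baseliner`, `TimeCompare` and `uniqueByType` are hypothetical names, and keying the map on the constructor with the first instance winning is an assumption about how "type" is determined.

```typescript
// Hypothetical base class standing in for a supplementer implementation.
abstract class Supplementer {
  label: string;
  constructor(label: string) {
    this.label = label;
  }
}

class Baseliner extends Supplementer {}
class TimeCompare extends Supplementer {}

// Keep at most one supplementer per concrete class. The map is only an
// internal detail of the search; callers receive just the values.
function uniqueByType(candidates: Supplementer[]): Supplementer[] {
  const byType = new Map<Function, Supplementer>();
  for (const candidate of candidates) {
    // Assumption: the first instance found for a given class wins.
    if (!byType.has(candidate.constructor)) {
      byType.set(candidate.constructor, candidate);
    }
  }
  // Map iteration follows insertion order, so discovery order is preserved.
  return Array.from(byType.values());
}
```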
@sd2k
Contributor Author

sd2k commented May 24, 2024

I've done the renaming, simplified a couple of things, added more comments and added a couple of tests for the queryrunner's handling of SupplementaryRequestProvider 👍

I see. But I'm wondering if you really need a processor for this implementation. What if you implemented the processor as a custom transformation (grafana.com/developers/scenes/transformations#add-custom-transformations) and attached a behaviour to it that would react to changes in the controls and call reprocessTransformations? It's a bit more complex, but it would keep the interface of the SupplementaryRequestProvider simpler and actually focused on query execution.

This may be possible, I've not used behaviors much so would need to investigate. In particular I'm not sure how I'd link together the behaviors, controls, transformations and supplementary queries 🤔

FWIW, the nice thing about using the processor is being able to package up the controls, the extra queries, and the processing all in one reusable component so that downstream devs can just do this:

controls: [
  new SceneBaseliner({}),
  new SceneChangepointDetector({}),
],

and everything just works, similar to the time range comparisons.

sd2k added a commit that referenced this pull request May 24, 2024
…un separately

Followup from [this comment](#587 (comment))
- this adds the option for SupplementalRequestProviders to state that they
only wish to have their processor rerun (probably with different state),
rather than the QueryRunner rerunning both the query _and_ the processor.
@sd2k
Contributor Author

sd2k commented May 24, 2024

I added a draft of how an alternate API for shouldRerun could look with the reprocessing handled by the query runner, in case it's interesting 🙂

Member

@torkelo torkelo left a comment

@sd2k @dprokop yeah, I took a look at this in more detail now. I think its direction looks pretty good and it does not add that much new complexity; let's try to get this into a mergeable state.

Added some rename suggestions, not 100% sure about them so happy for alternatives / thoughts

packages/scenes/src/querying/SceneQueryRunner.ts Outdated Show resolved Hide resolved
packages/scenes/src/querying/SceneQueryRunner.ts Outdated Show resolved Hide resolved
@sd2k
Contributor Author

sd2k commented May 28, 2024

I've addressed those issues (went with ExtraQueryProcessor for the processor name but can update it if you like), let me know if you have further comments 👍

// some more advanced processing such as fitting a time series model on the secondary data.
//
// See the docs for `extraQueryProcessingOperator` for more information.
export type ExtraQueryDataProcessor = (primary: PanelData, secondary: PanelData) => PanelData;
Contributor Author

Don't think it's suitable for this PR but if I want to offload my processing to a web worker this is going to need to return a Promise instead...

Suggested change
- export type ExtraQueryDataProcessor = (primary: PanelData, secondary: PanelData) => PanelData;
+ export type ExtraQueryDataProcessor = (primary: PanelData, secondary: PanelData) => Promise<PanelData>;

Member

If we don't introduce this now, it's going to be a breaking change in the future. I think it may be worth doing this straight away, wdyt @torkelo ?

Contributor Author

@sd2k sd2k Jun 3, 2024

I was thinking we could change it to export type ExtraQueryDataProcessor = (primary: PanelData, secondary: PanelData) => PanelData | Promise<PanelData>; in future but it is clunky, I agree...
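
For illustration, a caller could handle the `PanelData | Promise<PanelData>` union along these lines. This is only a sketch: `PanelDataLike`, `MaybeAsyncProcessor`, `diff` and `runProcessor` are hypothetical names, and `PanelDataLike` is a reduced stand-in for the real `PanelData` type.

```typescript
// Hypothetical, reduced stand-in for PanelData.
interface PanelDataLike {
  series: number[];
}

// The union return type floated above: implementations may be sync or async.
type MaybeAsyncProcessor = (
  primary: PanelDataLike,
  secondary: PanelDataLike
) => PanelDataLike | Promise<PanelDataLike>;

// Example processing step: element-wise difference between primary and secondary.
function diff(primary: PanelDataLike, secondary: PanelDataLike): PanelDataLike {
  return { series: primary.series.map((value, i) => value - (secondary.series[i] ?? 0)) };
}

// `await` accepts both plain values and promises, so the caller does not need
// to know which kind of processor it was given.
async function runProcessor(
  processor: MaybeAsyncProcessor,
  primary: PanelDataLike,
  secondary: PanelDataLike
): Promise<PanelDataLike> {
  return await processor(primary, secondary);
}

const syncProcessor: MaybeAsyncProcessor = (primary, secondary) => diff(primary, secondary);
const asyncProcessor: MaybeAsyncProcessor = async (primary, secondary) => diff(primary, secondary);
```

The clunky part is that every caller becomes asynchronous as soon as one implementation might be, which is the same cost as switching the type to a `Promise` outright.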

Member

Can it be an observable (like the current post processing) that will be much easier to slot in to the rxjs pipeline

Member

yeah, +1 for that!

Contributor Author

I've changed the type to export type ExtraQueryDataProcessor = (primary: PanelData, secondary: PanelData) => Observable<PanelData>; in bdab9f4, is that what you meant? 🤔 Think I got the types right but I might have misunderstood.

Member

i think that's exactly it @sd2k

Member

@dprokop dprokop left a comment

This is looking good now, the rename to Extra... is 👌

@sd2k
Contributor Author

sd2k commented Jun 4, 2024

@torkelo are there any changes you'd like to see to this before merging?

Member

@torkelo torkelo left a comment

Looks great, thanks for sticking with this slow PR review process, hopefully the next one will not take months to get approved!

❤️

@sd2k sd2k merged commit 207b3f5 into main Jun 5, 2024
3 checks passed
@sd2k sd2k deleted the decouple-comparisons-from-queryrunner branch June 5, 2024 10:03
sd2k added a commit that referenced this pull request Jun 5, 2024
…rately

Followup from [this comment](#587 (comment))
- this adds the option for ExtraQueryProviders to state that they
only wish to have their processor rerun (probably with different state),
rather than the QueryRunner rerunning both the query _and_ the processor.

This is really useful for some ML-based providers which need to run an
extra query then transform the results, and also include interactivity
such as a slider, but _don't_ need to rerun the query as part of the
interactivity - just the processing.

There are some downsides here, most notably the extra complexity:

- the `ExtraQueryProvider` interface is more flexible but more complex
- the `SceneQueryRunner` needs another subscription and
  `ReplaySubject` in order to be able to re-send the latest
  unprocessed data to the processors again
  - I think this will also increase memory usage?

but also it feels a bit like this is already being done by
transformations somehow...

I'm probably missing something
sd2k added a commit that referenced this pull request Jun 5, 2024
…rately

Followup from [this comment](#587 (comment))
- this adds the option for ExtraQueryProviders to state that they
only wish to have their processor rerun (probably with different state),
rather than the QueryRunner rerunning both the query _and_ the processor.

This is really useful for some ML-based providers which need to run an
extra query then transform the results, and also include interactivity
such as a slider, but _don't_ need to rerun the query as part of the
interactivity - just the processing.

There are some downsides here, most notably the extra complexity:

- the `ExtraQueryProvider` interface is more flexible but more complex
- the `SceneQueryRunner` needs another subscription and
  `ReplaySubject` in order to be able to re-send the latest
  unprocessed data to the processors again
  - I think this will also increase memory usage?

Perhaps there's a way to do this using transformations instead?
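
The trade-off described in this commit message can be sketched without rxjs. In this simplified illustration a plain cached field stands in for the `ReplaySubject` mentioned above; `Frame`, `RawResult`, `Processor` and `ReprocessableRunner` are hypothetical names and not part of the scenes library.

```typescript
// Hypothetical, reduced stand-ins for query results.
interface Frame {
  values: number[];
}

interface RawResult {
  primary: Frame;
  secondary: Frame;
}

type Processor = (raw: RawResult) => Frame;

class ReprocessableRunner {
  queryCount = 0;
  // Holding on to the latest unprocessed result is what makes processor-only
  // reruns possible; it is also the extra memory cost mentioned above.
  private latestRaw: RawResult | undefined;
  private fetchRaw: () => RawResult;
  private processor: Processor;

  constructor(fetchRaw: () => RawResult, processor: Processor) {
    this.fetchRaw = fetchRaw;
    this.processor = processor;
  }

  // Full rerun: issue the query again, then process the fresh result.
  runQuery(): Frame {
    this.queryCount += 1;
    this.latestRaw = this.fetchRaw();
    return this.processor(this.latestRaw);
  }

  // Processor-only rerun: reuse the cached raw result with a new processor,
  // for example after the user moves a slider.
  reprocess(processor: Processor): Frame {
    this.processor = processor;
    if (this.latestRaw === undefined) {
      return this.runQuery();
    }
    return this.processor(this.latestRaw);
  }
}
```

Calling `reprocess` with a differently configured processor leaves `queryCount` unchanged, which is the behaviour the interactive controls need.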
@grafanabot
Contributor

🚀 PR was released in v4.26.1 🚀
