Skip to content

Add earliest and latest in aggregation and window function #3640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

xinyual
Copy link
Contributor

@xinyual xinyual commented May 20, 2025

Description

This pr add latest/earliest as two aggregation function, it will calculate the earliest/latest time for the time related column. Since we support time related type as string (timestamp/time/date), currently we directly reuse max/min to do it.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
#3639

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

xinyual added 4 commits May 20, 2025 13:39
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
@penghuo
Copy link
Collaborator

penghuo commented May 20, 2025

  • update docs please.
  • what is expectation of earliest and latest? Do we support it in where command?

Copy link
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SPL doc, these functions are applied to _raw field, but in our impl it's just an alias of min/max?

@LantaoJin LantaoJin added the calcite calcite migration releated label May 22, 2025
Signed-off-by: xinyual <[email protected]>
@xinyual
Copy link
Contributor Author

xinyual commented May 22, 2025

  • update docs please.
  • what is expectation of earliest and latest? Do we support it in where command?

Add the doc.
This PR only add them as the aggregation. Will raise another PR as the boolean function.

@xinyual
Copy link
Contributor Author

xinyual commented May 22, 2025

In SPL doc, these functions are applied to _raw field, but in our impl it's just an alias of min/max?

In aggregation, they just calculate the latest/earliest time. Since we implement time related column as string, we can directly reuse min/max for them. Also in SPL, _raw is regarded as a string field.

@penghuo
Copy link
Collaborator

penghuo commented May 22, 2025

Please fix IT

xinyual added 2 commits May 23, 2025 13:08
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
@xinyual
Copy link
Contributor Author

xinyual commented May 27, 2025

Please fix IT

Already fix it by fixing the doc.

@xinyual
Copy link
Contributor Author

xinyual commented May 28, 2025

@penghuo @dai-chen I also implement earliest/latest as two condition function. Please review it.

Copy link
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure the cases you added could pass in doc-test with calcite enabled in local then change the os> to PPL>

xinyual added 7 commits May 29, 2025 17:39
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
Signed-off-by: xinyual <[email protected]>
Comment on lines +639 to +640
"1970-01-18 20:22:32",
"2018-08-19 00:00:00"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid the current implementation may be wrong, the current implementation treat the earliest and latest as window function with default window frame. IMO, the behavior is odd. Please check again if they require to use other window frame instead default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
calcite calcite migration releated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants