feat(sql): ASOF and LT JOIN TOLERANCE support #5713
Open: jerrinot wants to merge 63 commits into master from jh_asof_tolerance
and the slave table does not contain a given symbol at all
todo: more tests
todo: configurable threshold
[PR Coverage check] 😍 pass: 489 / 491 (99.59%)
Labels
Enhancement
Enhance existing functionality
Java
Improvements that update Java code
Performance
Performance improvements
SQL
Issues or changes relating to SQL execution
Currently, ASOF JOIN matches records from the right table with timestamps that are equal to or earlier than the timestamp in the left table. This PR addresses the feature request to limit how far back in time the join should look for a match.
This enhancement adds a TOLERANCE clause to the ASOF and LT JOIN syntax. The TOLERANCE parameter accepts a time interval value (e.g., 2s, 100ms, 1d). When specified, a record from the left table t1 at t1.ts will only be joined with a record from the right table t2 at t2.ts if t2.ts <= t1.ts AND t1.ts - t2.ts <= tolerance_value.
This provides more fine-grained control over ASOF joins, particularly useful in scenarios with sparse data where a simple "equal or earlier" match might pick records that are too distant in time to be relevant.
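As a rough sketch of how the clause might read in a keyed join: the tables trades and quotes and the shared sym column are hypothetical names chosen for illustration, and the placement of TOLERANCE after the ON key list is an assumption about the final syntax.

```sql
-- Hypothetical tables: trades and quotes, each with a designated timestamp
-- and a shared sym column (names are illustrative, not from the PR).
-- Each trade is matched with the latest quote for the same sym whose
-- timestamp is at or before the trade and no more than 2 seconds older.
SELECT *
FROM trades
ASOF JOIN quotes ON (sym) TOLERANCE 2s;
```

A trade with no quote inside that 2-second window should behave like any other unmatched ASOF JOIN row, with the quotes columns left empty.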
Or without keys:
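A non-keyed form might look like the following. The tables trades and quotes are hypothetical, and the match is then based on timestamps alone.

```sql
-- Hypothetical tables: trades and quotes (illustrative names).
-- No ON clause: each trade is matched with the latest quote of any kind
-- that is at most 100 milliseconds older than the trade.
SELECT *
FROM trades
ASOF JOIN quotes TOLERANCE 100ms;
```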
Performance impact
Specifying TOLERANCE can also improve performance. ASOF JOIN execution plans often scan backward in time on the right table to find a matching entry for each left-table row. TOLERANCE allows these scans to terminate early, once a right-table record is older than the left-table record by more than the specified tolerance, thus avoiding unnecessary processing of more distant records.
💥 Breaking Change: TOLERANCE as a new keyword in JOIN
This change introduces TOLERANCE as a new keyword specifically within the JOIN clause. This may break existing queries where TOLERANCE was used as an unquoted table alias for the right-hand table in a JOIN.
Example of affected query:
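A hypothetical query of this shape is sketched here, together with the double-quoted form that should keep parsing. The table names trades and quotes are assumptions for illustration only.

```sql
-- Hypothetical tables: trades and quotes (illustrative names).
-- Before this change, "tolerance" here is parsed as an alias for quotes.
-- After this change, the parser reads it as the TOLERANCE keyword, finds
-- no interval value after it, and rejects the query.
SELECT * FROM trades ASOF JOIN quotes tolerance;

-- Double-quoting keeps the word an identifier, so it stays a valid alias.
SELECT * FROM trades ASOF JOIN quotes "tolerance";
```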
Reason for breakage:
After this change, TOLERANCE in the position above is interpreted as the new keyword. The query will fail because the parser expects an interval value (e.g., 2s) to follow the TOLERANCE keyword, which is missing in the example.
Solution:
To use TOLERANCE as a table alias in this context, it must now be enclosed in double quotes, as per standard SQL for identifiers that are also keywords.
Additional optimizations
This PR introduces a performance enhancement for specific keyed ASOF JOIN scenarios, particularly when the join key is of the SYMBOL type.
The optimization works as follows: when processing a row from the left_table, if its particular SYMBOL key value is entirely absent in the right_table's corresponding symbol_key column (meaning no records in right_table share this key value), the improved execution plan can now detect this. By "exiting early" from the search for this non-existent key, the overall query performance can be significantly improved, especially in cases with many such missing keys.
Closes #5562
Documentation PR: questdb/documentation#195
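For reference, the keyed scenario targeted by the SYMBOL optimization described under "Additional optimizations" might be written as follows. It reuses the left_table, right_table, and symbol_key names from that description; the exact query shape is an assumption.

```sql
-- Names taken from the description above; the query shape is assumed.
-- symbol_key is a SYMBOL column present in both tables. A left_table row
-- whose symbol_key value never occurs in right_table can be given up on
-- early instead of scanning right_table for a match that cannot exist.
SELECT *
FROM left_table
ASOF JOIN right_table ON (symbol_key);
```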