-
Notifications
You must be signed in to change notification settings - Fork 583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for PostgreSQL/Redshift geometric operators #1723
Conversation
src/ast/operator.rs
Outdated
/// `+` Addition, also Translation (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' + point '(2.0,0)'` | ||
Plus, | ||
/// Minus, e.g. `a - b` | ||
/// `-` Subtraction, Translation (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' - point '(2.0,0)'` | ||
Minus, | ||
/// Multiply, e.g. `a * b` | ||
/// `*` Multiplication, also Scaling (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' * point '(2.0,0)'` | ||
Multiply, | ||
/// Divide, e.g. `a / b` | ||
/// `/` Division, also Scaling (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(2,2))' / point '(2.0,0)'` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we skip this diff? I'm thinking since these operators are fundamental and apply to various scenarios, no need to mention specific dialect semantics given neither the parser nor AST handles them specially as a result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/ast/operator.rs
Outdated
/// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator), e.g. `# path '((1,0),(0,1),(-1,0))'` | ||
NumOfPoints, | ||
/// `@-@` Length or circumference (PostgreSQL/Redshift geometric operator), e.g. `@-@ path '((0,0),(1,0))'` | ||
LenOfCircumference, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the doc says length or circumference, is the Of
a typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a typo, fixed
src/ast/operator.rs
Outdated
@@ -53,6 +53,16 @@ pub enum UnaryOperator { | |||
PGAbs, | |||
/// Unary logical not operator: e.g. `! false` (Hive-specific) | |||
BangNot, | |||
/// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator), e.g. `# path '((1,0),(0,1),(-1,0))'` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include a link to the docs where the operators are described? that would be helpful for folks that come across it and need to find more context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/ast/operator.rs
Outdated
// `#` PostgreSQL Bitwise XOR, also Intersection (PostgreSQL/Redshift geometric operator), e.g. `box '((1,-1),(-1,1))' # box '((1,1),(-2,-2))'` | ||
PGBitwiseXor, | ||
/// Bitwise shift left, e.g. `a << b` (PostgreSQL-specific) | ||
/// `<<` PostgreSQL Bitwise Shift Left, also Left of? (PostgreSQL/Redshift geometric operator), e.g. circle '((0,0),1)' << circle '((5,0),1)' | ||
PGBitwiseShiftLeft, | ||
/// Bitwise shift right, e.g. `a >> b` (PostgreSQL-specific) | ||
/// `>>` PostgreSQL Bitwise Shift Right, Right of? (PostgreSQL/Redshift geometric operator), e.g. circle '((5,0),1)' >> circle '((0,0),1)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually same with these, I think just mentioning the operator suffices, without going into semantics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/ast/operator.rs
Outdated
@@ -253,6 +268,40 @@ pub enum BinaryOperator { | |||
/// Specifies a test for an overlap between two datetime periods: | |||
/// <https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#overlaps-predicate> | |||
Overlaps, | |||
/// `##` Point of closest proximity (PostgreSQL/Redshift geometric operator), e.g. `point '(0,0)' ## lseg '((2,0),(0,2))'` | |||
PointOfClosestProximity, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm actually thinking about it, I feel like the naming of the operators might become problematic, e.g. if another dialect supports ##
operator then calling it PointOfClosestProximity
would be confusing vs if the operator was called DoubleHash
dialect agnostic. Would it make sense to name the operators in such a manner? i.e. by their characters or if theres a well known description that isnt describing the pg behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/parser/mod.rs
Outdated
if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) | ||
&& GEOMETRIC_TYPES.contains(&w.keyword) | ||
{ | ||
self.parse_geometric_type(w.keyword) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this be its own branch within parse_expr_prefix_by_reserved_word
instead?
It looks like we should be able to do something like
kw if self.dialect.supports_pg_geometric_types() && PG_GEOMETRIC_TYPES.contains(kw) => {
self.parse_geometric_expr(kw)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/parser/mod.rs
Outdated
let token_with_span = self.next_token(); // Get the full TokenWithSpan | ||
let value = match token_with_span.token { | ||
// Extract the Token field | ||
Token::SingleQuotedString(s) => Value::SingleQuotedString(s), | ||
_ => return self.expected("SingleQuotedString", token_with_span), // Pass full TokenWithSpan | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let token_with_span = self.next_token(); // Get the full TokenWithSpan | |
let value = match token_with_span.token { | |
// Extract the Token field | |
Token::SingleQuotedString(s) => Value::SingleQuotedString(s), | |
_ => return self.expected("SingleQuotedString", token_with_span), // Pass full TokenWithSpan | |
}; | |
let token_with_span = self.next_token(); | |
let value = match token_with_span.token { | |
Token::SingleQuotedString(s) => Value::SingleQuotedString(s), | |
_ => return self.expected("SingleQuotedString", token_with_span), | |
}; |
thinking we can drop the comments in this case since the code is as descriptive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/parser/mod.rs
Outdated
| tok @ Token::AtAt | ||
| tok @ Token::QuestionMarkDash | ||
| tok @ Token::QuestionPipe | ||
if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replace these with a dialect method? the dialect_of
has been deprecated, we can introduce a method like supports_pg_geometric_types()
or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/test_utils.rs
Outdated
pub fn only_psql_redshift() -> TestedDialects { | ||
TestedDialects { | ||
dialects: vec![ | ||
Box::new(PostgreSqlDialect {}), | ||
Box::new(RedshiftSqlDialect {}), | ||
], | ||
options: None, | ||
recursion_limit: None, | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by switching to the dialect method, it would be possible to reuse all_dialects_where
in place of this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code to use all_dialects_where
as to your suggestion
tests/sqlparser_postgres.rs
Outdated
//"&<", // (is strictly to the left of) // Same as OverlapsToLeft, causes duplicate failure | ||
//"&>", // (is strictly to the right of) // Same as OverlapsToRight, causes duplicate failure | ||
"|=|", // distance between A and B trajectories at their closest point of approach | ||
"<<#>>", // n-D distance between A and B bounding boxes | ||
"|>>", // A's bounding box is strictly above B's. | ||
"~=", // bounding box is the same | ||
//"|>>", // A's bounding box is strictly above B's. // Same as IsStrictlyAbove, causes duplicate failure | ||
//"~=", // bounding box is the same // Same as SameAs, causes duplicate failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I followed why these are commented out, what do we mean by causes duplicate failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commented-out operators are already defined under different names in the code,
so they can no longer be considered custom operators.
If we include these again under their original symbols, it causes a duplicate match failure.
so I have now completely removed them.
b297ccb
to
01f6646
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @benrsatori! Left some comments, the changes look good to me overall
@@ -912,6 +927,10 @@ pub trait Dialect: Debug + Any { | |||
fn supports_array_typedef_size(&self) -> bool { | |||
false | |||
} | |||
/// Returns true if the dialect supports geometric types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a link to the postgres docs here (and maybe an example sql)? it would help folks with context on what we mean by geometric types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/parser/mod.rs
Outdated
@@ -3079,6 +3128,55 @@ impl<'a> Parser<'a> { | |||
Token::QuestionAnd => Some(BinaryOperator::QuestionAnd), | |||
Token::QuestionPipe => Some(BinaryOperator::QuestionPipe), | |||
Token::CustomBinaryOperator(s) => Some(BinaryOperator::Custom(s.clone())), | |||
Token::DoubleSharp if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for these clauses as well could we use the introduced self.dialect.supports_geometric_types()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/tokenizer.rs
Outdated
@@ -1308,6 +1356,28 @@ impl<'a> Tokenizer<'a> { | |||
_ => self.start_binop(chars, "||", Token::StringConcat), | |||
} | |||
} | |||
Some('&') if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly for these clauses it would be nice to use the self.dialect.supports_geometric_types()
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -879,3 +882,29 @@ pub enum ArrayElemTypeDef { | |||
/// `Array(Int64)` | |||
Parenthesis(Box<DataType>), | |||
} | |||
|
|||
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a short description to this struct with a link to the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/ast/operator.rs
Outdated
@@ -53,6 +53,21 @@ pub enum UnaryOperator { | |||
PGAbs, | |||
/// Unary logical not operator: e.g. `! false` (Hive-specific) | |||
BangNot, | |||
/// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator) | |||
/// see <https://www.postgresql.org/docs/9.5/functions-geometry.html> | |||
NumOfPoints, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do similar to the binary operator where we name them according to their symbols to be dialect agnostic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -386,6 +386,8 @@ pub enum DataType { | |||
/// | |||
/// [bigquery]: https://cloud.google.com/bigquery/docs/user-defined-functions#templated-sql-udf-parameters | |||
AnyType, | |||
/// geometric type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include a link to the postgres docs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
9c62ffc
to
fdc90cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @benrsatori!
cc @alamb
@benrsatori could you take a look at the CI failures? |
Hi @iffyio |
Add support for PostgreSQL and Redshift geometric operators.