Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reconsider approach to Span integration for PostgreSQL in jiff-diesel (and probably jiff-sqlx) #251

Open
BurntSushi opened this issue Feb 11, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@BurntSushi
Copy link
Owner

Right now, the Span wrapper type in jiff-diesel and jiff-sqlx only supports decoding from a PostgreSQL interval. All such Span values have, at most, non-zero months, days and/or microseconds. All other units are always zero.

Encoding is not provided currently because it has two problems with it:

  1. Since a PostgreSQL interval is only a (months, days, microseconds) and a Jiff Span has many more units than that, one has to convert a Span into a PostgreSQL interval using only months, days and microseconds. I believe this is possible to do in a way that always preserves the exact physical duration (so long as we can assume that there are always 7 days in a week, which Jiff does indeed do). That is, this conversion step never needs to make assumptions like "there are always 30 days in a month" or "there are always 24 hours in a day." This can be done, it just requires a bit of work.
  2. While in (1) we can preserve the physical duration, we cannot preserve the actual units in a Jiff Span when we use a PostgreSQL interval. For example, it would erase the difference between 2 hours and 120 minutes.

So while (1) is solvable, (2) really isn't. (2) is a concern to me because it could be potentially surprising to drop semantic information like that silently. So there is an open question for how to deal with it. Here are some choices:

  1. Don't deal with it at all. This is a status quo.
  2. Accept that people know what they're doing and do the encoding step to a PostgreSQL interval.
  3. Provide two different Span wrappers. One that integrates with PostgreSQL intervals and potentially loses information, and another wrapper that uses a plain TEXT field with the friendly duration format to non-lossily encode and decode a Span. The downside is that you don't get native PostgreSQL interval support.

I'm partial to (3). It makes the wrappers a bit more complicated, but not significantly so IMO. We could name them, SpanInterval and SpanNonLossy or something.

Ref #241 (comment)

@fiadliel
Copy link
Contributor

I'd vote for option 3 here - having a lossless option is important, but some use cases will be fine with lossy.

On how to encode the lossless option, I can think of three other encodings for PostgreSQL (varying in wacky to different degrees). Probably none of these are good enough to be used instead of text, but just to enumerate some possibilities, ordered in ability/performance level:

  1. create and popularize a PostgreSQL extension for the Temporal model (or get the database developers to add it). This is probably out of scope, but if you're looking for blue sky...
  2. encode to multiple fields (e.g. year is one Integer field, month is another). This probably needs significant work to make the DSL easy to use, and it might be confusing for a user to setup. However, it seems like possibly the most performant lossless option.
  3. encode as JSON/JSONB. Not really any upsides compared with encoding to text, but it might be interesting to see if there's any performance benefit compared with text.

@BurntSushi
Copy link
Owner Author

Aye yeah, given those options I'm definitely going to just roll with the friendly format. Thank you for your feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants