Skip to content

Conversation

Manishearth
Copy link
Member

Fixes #7049

I put some effort into making sure that type errors are returned before range errors, which makes it easier for Temporal to use this directly (otherwise it would have to hoist type errors out to be 100% spec compliant).

After discussion with @sffc, I decided not to have this check everywhere: only from_fields and from_codes have this check; calendar-specific constructors do not, and constructing from RataDie does not either. This is fine, it lets people stretch calendars to their limits if they really want.

@Manishearth Manishearth requested a review from sffc as a code owner October 10, 2025 00:19
Copy link

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

@Manishearth Manishearth changed the title Error in Date::try_from_fields for out-of-range year values. Throw error in Date::try_from_fields for out-of-range year values. Oct 10, 2025
@Manishearth Manishearth changed the title Throw error in Date::try_from_fields for out-of-range year values. Throw error in Date::try_from_fields for very large year values. Oct 10, 2025
}

#[test]
fn test_from_fields_regress_7049() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Add MonthDay and YearMonth tests, and missing Day tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to have gotten done. #7083

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, sorry, I have a local branch for this that I didn't finish up and push. I'll do that tomorrow.

Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the approach we should be taking. Rejecting years smaller than 10e6 with the justification that higher years produce overflows in our code is a really bad look. 10e6 is a small i32, if those values produce overflows, our code cannot be good.

Further, from_fields is only one of our four constructors, we also have from_iso and from_rata_die, both of which are infallible, so we cannot return a RangeError from these. While this approach might solve problems in from_fields, it will still be possible to create Dates which violate invariants through those other APIs (and because we only fuzz from_fields, it hides these issues).

IIUC this is an issue that is exclusive to the LunarChinese and Hijri calendars. We should tighten the requirements of the Rules trait, e.g. by making the constructor of the year types fallible. For Hijri we can prove that the Tabular calendar observes all our invariants, so then custom Rules can just fall back to that for the extremes to be correct. For Chinese we need to actually fix the simple approximation.

@Manishearth
Copy link
Member Author

IIUC this is an issue that is exclusive to the LunarChinese and Hijri calendars

No, it's not, the fuzzer found overflows in every calendar. I only posted some of them in #7049, but my fuzz test was finding issues for each one.

10e6 is not where our code starts producing overflows, it is the safe range we can make guarantees about, and it's the one we've exhaustively tested. I think it's not a bad look as long as we message it correctly.

@robertbastian
Copy link
Member

No, it's not, the fuzzer found overflows in every calendar. I only posted some of them in #7049, but my fuzz test was finding issues for each one.

Can that not be fixed? I'm curious where this overflow happens, and whether a limit like i32::MAX/2 would be more appropriate.

@Manishearth
Copy link
Member Author

Here's a failing Buddhist value

	FuzzInput {
	    year: -2147483648,
	    month: 134,
	    day: 46,
	    month_interpretation: Ordinal,
	    overflow_constrain: false,
	    cal: Buddhist,
	}

It's 2^31, so yeah, that's probably fixable with a wider range. I guess we can have a per-calendar range check; FromFields can specify a min and max value.

Of course, if this is per-calendar, calendar conversion would mean that you can still take a valid ISO date and convert it into an "out of range" Chinese one.

While this approach might solve problems in from_fields, it will still be possible to create Dates which violate invariants through those other APIs (and because we only fuzz from_fields, it hides these issues).

This was deliberate: the issue can be narrowly solved in from_fields but infallible calendar conversion means that it can't ever be truly eliminated from calendar code completely. I want to fix from_fields and have a narrower guarantee there, and in the future we can consider tweaking the guarantee on the other constructors.

@Manishearth
Copy link
Member Author

Manishearth commented Oct 10, 2025

From fuzzing it seems like Chinese is okay with year values bounded by i32::MAX / 8. So we could still have a cross-calendar bound there.

@Manishearth
Copy link
Member Author

Nope, UAQ still errors on 197001411.

@robertbastian
Copy link
Member

Here's a failing Buddhist value

Yeah because Buddhist is an offset of Gregorian, so the addition/subtraction will overflow. As long as it's just additions (hope we don't mulitply years), MAX/2 should solve that. Such a limit we wouldn't even have to document, or at least not justify with "we don't know how to get rid of overflows other than this".

Nope, UAQ still errors on 197001411.

That's an issue with the year data which I think I fixed in #7065?

@Manishearth
Copy link
Member Author

Manishearth commented Oct 10, 2025

I expanded the range to ±2²⁷, whcih is ±134_217_728. 28 might be possible after the UAQ fix.

@Manishearth
Copy link
Member Author

Manishearth commented Oct 11, 2025

Even with your patch I get the following panic if I reduce the 16 to an 8. So I think ±2²⁷ is the best we can do for now. We can always expand it later if we find ways of making it work.

thread '<unnamed>' (1420452) panicked at /home/manishearth/dev/icu4x/components/calendar/src/cal/hijri.rs:373:39:
attempt to multiply with overflow


	FuzzInput {
	    year: 238236467,
	    month: 51,
	    day: 51,
	    month_interpretation: CodeLeap,
	    overflow_constrain: true,
	    cal: HijriTabularTypeIIFriday,
	}

note that the used year value is year % (i32::MAX / 8), which in this case is still 238236467.

Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with your patch I get the following panic

Fixed in 2c525c5

Comment on lines 152 to 153
/// regardless of the calendar, instead returning a [`DateError::Range`]. This is to protect clients
/// from bugs due to arithmetic overflow. Currently, calendar-specific `Date::try_new_calendarname()` constructors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the overflow justification.

Suggested change
/// regardless of the calendar, instead returning a [`DateError::Range`]. This is to protect clients
/// from bugs due to arithmetic overflow. Currently, calendar-specific `Date::try_new_calendarname()` constructors
/// regardless of the calendar, instead returning a [`DateError::Range`]. Currently, calendar-specific `Date::try_new_calendarname()` constructors

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? I think it makes sense to give people an idea of what we are trying to prevent here (and it makes the order of magnitude of the choice of range relatively clear).

It means that overflow in these APIs is still considered a bug, whereas it won't necessarily be for the other ways of making Dates. Over time we can see if we wish to handle this there too, and docs can be updated accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our code should be/have been written in a way that avoids overflows. if we'd had the range from the beginning, and then used it to justify the safety of our arithmetic, we wouldn't now justify the range with the overflowing of our arithmetic.

I also find it weird to say "protect clients from bugs due to arithmetic overflow". if there's an overflow, that's our bug, not the client's bug. saying "protect clients from debug panics due to arithmetic overflows" would be more correct, but I still don't like that because I'd prefer to give the appearance that our code was written with overflows in mind in the first place

Copy link
Member Author

@Manishearth Manishearth Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it should have been written with overflows in mind in the first place, but that would still look something like this. There are only two possible solutions for overflows: either we have GIGO (usually saturation) behavior, or throw an error. We've picked the error one here. There are different ways for us to do the error thing, but it would still ultimately be "we do not accept all i32 year values".

I don't think "protect clients from bugs due to arithmetic overflow" makes it sound like it's the clients' bug. It protects client applications from having issues due to overflow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Manishearth
Copy link
Member Author

Manishearth commented Oct 13, 2025

(force push was a rebase, only the last commit "review comments" is new)

Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine for now, I plan to push the range up in #7065

@Manishearth Manishearth merged commit 0a2a4cd into unicode-org:main Oct 13, 2025
31 checks passed
@Manishearth Manishearth deleted the date-year-range branch October 13, 2025 16:07
@Manishearth
Copy link
Member Author

I have a slight preference for the range-change to be a third PR but we can figure that out separately. Fine to include in that PR for now, I won't be sure until I get a better look at it.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Post-submit review

}
None => match ordinal_month_as_u8 {
Some(month) => month,
// This is technically unreachable since it's checked early above
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Then debug assert that it is unreachable, please.

(this is feedback I gave orally but didn't write down in a GitHub comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said then I really want to be careful about debug assertions in this code, but I think this case is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let extended_year_as_year_info = fields
.extended_year
.map(|extended_year| cal.year_info_from_extended(extended_year));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Process Discussion:

This is code that I "care a lot about" because it is code that I wrote a matter of weeks ago, and it is being changed in major ways. My "care a lot about it" would decay over time, but this is still very new code that I spent a lot of time thinking about. Manish reviewed this code so I would expect that he would have been able to correctly identify that status.

I left a comment on an older version of this PR. However, after I left that comment, this function changed in a large way, so it "looks like" I was OK with the PR, even though I hadn't seen these additional changes.

The correct process would have been:

  1. Shane leaves comment on v1 of the PR
  2. Manish updates the PR
  3. Manish re-requests Shane's review (this is the step that doesn't appear to have happened)
  4. Shane leaves a re-review, which he can see in his GitHub open review requests queue which he endeavors to clear out every 24 hours (this didn't happen because it wasn't in my open review requests), and if Shane doesn't reply within a reasonable statute of limitations, then the PR could be merged despite his lack of response. This sat for 1 business day without a response, which doesn't exceed the statute of limitations.

The fail-safe would be for Manish or Robert to flag this function as one Shane cares about, notice that Shane didn't see the latest changes, and ask for explicit LGTM. It seems the fail-safe also didn't happen.

Copy link
Member Author

@Manishearth Manishearth Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manish reviewed this code so I would expect that he would have been able to correctly identify that status.

In that case I don't think I have a good mental model about the type of code you would care about if you care about this, after our discussions about this PR.

It's generally hard to know in advance if some code is something an author has spent a lot of time thinking about unless it that thinking happened through the review process.

We explicitly had an in-person discussion about the strategy to use here, and I even wrote some of this code in front of you, checking with you on some of the stages. The code here did not change that much since that discussion, with the main change being to comments. My expectation was that we were mostly aligned on the general shape of this code and it was Robert whom I would need to get some consensus with, which is what I pursued. I think c0283f3 is the only actual change, the bulk of the work here was discussion with Robert. 1c67bd3 was a refactor I did in front of you.

I think it is unreasonable to expect me to know you would want to be a blocking reviewer on this. I expected you'd want to look at it eventually, but not as a blocking review.

(In this specific case, I merged once I had Robert's review because he's also working on improving the range check situation, by merging it I enabled a lot of progress on his PR #7065 during the time we were both online)

this didn't happen because it wasn't in my open review requests

I guess when you leave a partial review it dismisses the request? Perhaps for partial reviews you should re-request review of yourself if you use that UI. This is just like re-marking an inbox item as unread/un-"done" after partially-but-not-completely consuming it. I can't know just from looking at it whether a review is a partial review or a full one.

My norm is to only re-request review of reviewers whose comments I'm addressing, unless I'm pushing a large enough change that I think I need to re-request review from others. The changes I made since your review were extremely minor, mostly comments.

Personally I don't use "open review requests" for this since I use notification filters. So I don't know the details of what does and doesn't work.

A slight mistake I made here was starting the work for #7062 (comment) as a separate thing and then not leaving a comment saying I was planning on doing that, or waiting for a response here. That was my bad: I thought I had left a comment there or or maybe said something in person and it just hadn't happened.

}

#[test]
fn test_from_fields_regress_7049() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to have gotten done. #7083

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Attempted to multiply with overflow in chinese calendar construction code

3 participants