Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] Add UInt debug_assert(value >= 0) for Int implicit conversion #3753

Open
wants to merge 11 commits into
base: nightly
Choose a base branch
from

Conversation

martinvuyk
Copy link
Contributor

@martinvuyk martinvuyk commented Nov 6, 2024

Add UInt debug_assert(value >= 0) for Int implicit conversion. This PR also introduces a new uint() -> UInt builtin function as alternative for constructing from negative values explicitly.

@martinvuyk martinvuyk requested a review from a team as a code owner November 6, 2024 18:54
@martinvuyk martinvuyk marked this pull request as draft November 6, 2024 19:00
Copy link
Collaborator

@JoeLoser JoeLoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable stop gap solution to avoid the problems caused with the implicit construction right now. WDYT @ConnorGray?

@martinvuyk martinvuyk marked this pull request as ready for review November 7, 2024 13:06
Signed-off-by: martinvuyk <[email protected]>
Signed-off-by: martinvuyk <[email protected]>
@JoeLoser
Copy link
Collaborator

JoeLoser commented Nov 9, 2024

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Nov 9, 2024
@jackos
Copy link
Collaborator

jackos commented Nov 11, 2024

Hi @martinvuyk to keep you updated, the debug_assert for the IntLiteral is all good, but there is discussion on the internal PR about the uint func for initializing a Uint. The plan was to eventually remove the equivalent int fn once we had the tools to move all construction to Int without having unwanted implicit conversions (which I think we can do now), would you be OK if we merge the debug_assert and remove the uint function from the internal PR?

@martinvuyk
Copy link
Contributor Author

Thanks for the update @jackos. I figured it's a way to express explicitly that you want to build from a negative int while allowing for APIs to get implicitly converted and python devs not having to think too much with regards to int vs uint types. IMO Int -> UInt should remain implicit. We could also make the debug assert be always on in release mode as well so the implicit conversion from negative value becomes impossible if not using the builtin uint, at the cost of a trivial amount of perf for those using the implicit conversion.

@JoeLoser JoeLoser assigned jackos and unassigned JoeLoser Nov 25, 2024
@JoeLoser JoeLoser requested review from jackos and removed request for JoeLoser November 25, 2024 18:13
@JoeLoser
Copy link
Collaborator

FYI now that we just gained support for explicit constructor, @jackos has an internal design doc that we're working toward around safer numeric types. I'll let him run with this PR, but I suspect we'll just incorporate some ideas in this PR in the bigger picture of his set of changes/design. The TLDR though is we'll be making the narrowing conversion of UInt to Int explicit (by removing the @implicit decorator).

@martinvuyk
Copy link
Contributor Author

Hi @jackos and @JoeLoser, thanks for keeping me updated, I really appreciate it. And kudos to jack, the @implicit decorator is very nice and I like the direction this is going.

I'll be a bit insistent again and stand for my opinion. Int and UInt will be two of the most used types (and converted to and fro) in the language. I think conversion from UInt <-> Int should remain implicit, code and APIs in several libraries will have wildly different conventions, the code using them will look garishly un-pythonic if it has to use constructors at every step. IMO sacrificing perf for implicit conversion (using debug_assert[assert_mode="safe"]) in exchange for safety is a solid alternative, and providing the builtin for alternative construction of negative values (and without the assert overhead) rounds it nicely IMO. I can only imagine the amount of functions in Modular code that receive Int and check for >= 0, changing that to UInt would then require explicit construction at every call-site; now imagine that pain for the whole ecosystem.

@jackos
Copy link
Collaborator

jackos commented Nov 25, 2024

Hi @martinvuyk when making it explicit there aren't many places throughout the codebase that implicitly convert a UInt to an Int and vice-versa, the main place that it happens is in the hashing alg.

Python doesn't have the concept of a uint, it's only when the programmer is using the type system outside of what a python programmer would do. There are unexpected consequences to having the implicit conversion, e.g. in operations like __radd__. For potentially unsafe operations like converting a negative Int to UInt, it shouldn't happen behind the programmer's back, and having a debug_assert fail in debug builds showing how to fix it if they intended to bitcast the sign is very easy to fix for the programmer.

I looked at modern strictly typed languages like Zig, Go, Swift, and Rust and they all have protections for this, I think it's a good tradeoff.

@martinvuyk
Copy link
Contributor Author

There are unexpected consequences to having the implicit conversion, e.g. in operations like radd.

Hadn't thought about that one, yeah that would be an issue. I've myself done += negative_num for the sake of aesthetics or to spare an if else branch 😆.

Python doesn't have the concept of a uint

My main motivation is that I don't want them to have to think too much about that difference. But it's hard while also trying to not surprise programmers and having high perf people using the same language as someone hacking together some very shady script logic.

having a debug_assert fail in debug builds showing how to fix it if they intended to bitcast the sign is very easy to fix for the programmer.

We'll keep the debug_assert in the constructor even though it's an explicit constructor? So you want to add a from_negative() method or something?

@jackos
Copy link
Collaborator

jackos commented Nov 25, 2024

But it's hard while also trying to not surprise programmers and having high perf people using the same language as someone hacking together some very shady script logic

Yeah that's where we differ from other languages, with our scalar types being SIMD types of width 1 it should be ergonomic for kernel engineers and also protect people writing simple programs. I raised if we should add the debug_assert to potentially unsafe SIMD conversions, but no consensus on that yet.

We'll keep the debug_assert in the constructor even though it's an explicit constructor? So you want to add a from_negative() method or something?

I want this to cause the debug_assert to fail:

UInt(-1)

With the error saying to do this if you intended to bitcast the sign:

UInt(from_bytes=-1)

Or a method:

UInt.from_bytes(-1)

Want to get a consensus before doing this though, there's an argument that the explicit constructor shouldn't have a debug_assert and just allow the sign to be bitcast. Also the name should be the same to allow for truncating values / overflowing e.g. in Int(UInt.MAX) so unsafe= might make more sense.

@martinvuyk
Copy link
Contributor Author

I want this to cause the debug_assert to fail:

UInt(-1)

With the error saying to do this if you intended to bitcast the sign:

UInt(from_bytes=-1)

Or a method:

UInt.from_bytes(-1)

Want to get a consensus before doing this though, there's an argument that the explicit constructor shouldn't have a debug_assert and just allow the sign to be bitcast. Also the name should be the same to allow for truncating values / overflowing e.g. in Int(UInt.MAX) so unsafe= might make more sense.

To me the debug_assert and constructor combo sounds fine. I would however change UInt(from_bytes=-1) to something along the lines of UInt(bitwise=-1).

This all starts sounding to me like we need to step back and think about .bitcast(), .cast() and constructors for numeric types as a whole. We could also go for something like

struct Int:
    ...
    fn cast[T: AnyType](self) -> T:
        constrained[DType.is_scalar[T]() or _type_is_eq[T, UInt]()]()
        ... # cast to that type

to keep the same API for Int and UInt as we have for SIMD. (DType.is_scalar() is from PR #3577 which I could split off into it's own thing, there are many places where it would allow some optimized logic).

Using an explicit cast would have no warnings or asserts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants