Clarify requirements for types in device code #733

Pennycook · 2025-02-18T15:48:03Z

The goal of this PR is to clarify:

Which types are guaranteed to be available in device code; and
What guarantees exist for consistency of types across host and device.

As part of this change I removed the table of fundamental types, which duplicated a lot of information defined in C++. Just saying that we support the types with the same meaning as C++ seems simpler. I kept the list of types, but long-term I'd prefer to be able to say that SYCL supports all the C++ fundamental types and leave it at that.

Closes #372 and closes #373.

I can't assign this to you, @tahonermann, but since you originally opened these issues I'd appreciate it if you could take a look at this proposed fix.

I've opened it as a draft because it currently doesn't build, due to a broken reference to the table I removed. That reference will ve removed if we merge #730, though, so I wanted to open a PR and get some feedback.

nliber · 2025-02-18T23:55:28Z

Still exhausted from the WG21 meeting, so hopefully this is coherent:

If a type is an extended integer type, is that guaranteed to be supported on both the host and device? Same question goes for extended floating point types.

tahonermann

Thanks, @Pennycook. Please see inline comments.

adoc/chapters/device_compiler.adoc

tahonermann · 2025-02-19T00:30:20Z

adoc/chapters/device_compiler.adoc

+All types which are available in device code must have the same size, alignment
+requirements, and representation on both host and device.


I think enumeration types should be addressed in this section as well. There is an implementation requirement as well as a user requirement. For enumerations with a fixed underlying type (e.g., enum X : char), there should be a requirement that the same fixed type be specified for host and device code. Similarly, for enumerations without a fixed underlying type, there should be a requirement that the implementation select the same underlying type for host and device code.

That makes sense to me. I've added wording very similar to what you wrote here in f4373e2.

The updates for enumeration types looks good to me.

I'm unresolving this conversation. It seems confusing to me to say that applications are required to choose the same underlying type for enumerations. Why state this specifically when there are other things an application could do that cause types to be different on host and device? For example, if an application defines members of a struct to be different on host and device, you would have a similar problem.

If we feel we need to say something about this, perhaps we could say it more generally like:

Applications that define types to be have different size, alignment requirement, or representation on host vs. device via the __SYCL_DEVICE_ONLY__ macro have undefined behavior.

Or maybe it should be "... implementation defined behavior"?

I think it is useful to specifically list __SYCL_DEVICE_ONLY__ here because that is the only way that an application possibly could define types differently. At least I can't think of any other way. You might point out that there are other macros like this in DPC++, but those are vendor extensions and are therefore out of scope of this spec.

I think it is reasonable to relax the requirement that (user specified) fixed underlying types must match so long as device copyability rules retain a requirement for the same (or at least a compatible) fixed underlying type.

I think a general note like the one @gmlueck suggests would be useful for application developers, but I didn't think that was what @tahonermann was talking about here.

The wording I added in f4373e2 is in the "Device compiler" section, and I didn't intend it to be advice for application developers. The intent was to clarify that the implementation must guarantee that enumerations have matching types. The first sentence just says that enumerations with different fixed underlying types are invalid -- that means a device compiler (and more broadly, the implementation) shouldn't introduce any of those. The second sentence is probably the more important one, because it means that a device compiler may require specific logic to ensure that it selects the same underlying type as the host compiler if an enumeration has no fixed underlying type.

With regard to explicit fixed types, there is both an implementation and a requirement on application developers.

I expect device code restrictions to be relaxed over time to allow more use of the C and C++ standard libraries in device code. Device code can already link with device-only TUs via SYCL_EXTERNAL. Maintaining a single C and C++ standard library that works for both host and device code isn't particularly realistic, especially when the host implementations are provided by, e.g., GNU C, libstdc++, Microsoft, etc... This increases the likelihood that enumeration types defined in these libraries will be differently specified for the host and the device. This imposes a requirement on the SYCL implementation to ensure such enumeration types are consistently defined if objects of those types might transition the host/device divide. We can either require all such types to be consistently defined, or we can require specific ones to be consistently defined so that those specific types satisfy device copyability rules, or we can say its all implementation-defined and code that uses such types is non-portable.

Likewise, user-defined enumeration types must be consistently defined to be copied between the host and device. For those cases, we can require all such types to be consistently defined, or we can place the requirement in the device copyable restrictions (with or without a compatible size/alignment allowance and corresponding std::underlying_type observability).

Pennycook · 2025-02-19T09:18:28Z

If a type is an extended integer type, is that guaranteed to be supported on both the host and device? Same question goes for extended floating point types.

I'm not very familiar with these extended types.

Wouldn't this be very difficult for us to guarantee? It looks like these types are implementation defined, so there might be types that are defined by "a C++ implementation" (i.e., a standard library or host compiler) but that can't be made compatible with "a SYCL implementation" (i.e., a device compiler) or a specific device.

Could we say something to the effect of: "Whether extended integer and/or extended floating-point types are available in device code is implementation-defined. However, if these types are available, the size, alignment, etc must match the host."

tahonermann · 2025-02-19T21:17:10Z

Could we say something to the effect of: "Whether extended integer and/or extended floating-point types are available in device code is implementation-defined. However, if these types are available, the size, alignment, etc must match the host."

That makes sense to me.

A list of types that cannot be used in device code is clearer than what we had before, and will help to ensure that we give due consideration to additional types introduced in future versions of ISO C++.

These are implementation-defined, so whether they are available in device code should also be implementation-defined.

This gives us a place to talk about things like: - Which C++ standard library features (e.g., type aliases) are guaranteed to work on the device; and - Whether there are any additional restrictions on C++ standard library behavior.

adoc/chapters/device_compiler.adoc

There is only one long double type. Co-authored-by: Tom Honermann <[email protected]>

TApplencourt · 2025-02-24T15:32:52Z

Just to add a useless grain of salt, a few of our users are already using long-double on GPU. They just use -mlong-double-64 and hope for the best...

So I like the

Could we say something to the effect of: "Whether extended integer and/or extended floating-point types are available in device code is implementation-defined. However, if these types are available, the size, alignment, etc must match the host."

Pennycook · 2025-02-24T15:44:25Z

Just to add a useless grain of salt, a few of our users are already using long-double on GPU. They just use -mlong-double-64 and hope for the best...

This is entirely valid under "undefined behavior". If an implementation wants to say that it supports long double, it can. If an implementation wants to say nothing but it happens to work, that's fine.

The reason I'm suggesting we don't go with "implementation-defined" is that all existing implementations would then have to update their documentation to say something about long double. We'd then merge a PR (hopefully soon) which adds something like aspect::fp128 which officially adds a way to query for long double support, and implementations would have to update their documentation again.

TApplencourt · 2025-02-24T15:48:00Z

Oh yes, sorry, It was not clear. It was 100% comment to say that I liked your clarification of new wording to use "undefined behavior"!

This is nitpicking, but as you know, IEEE doesn't require long double to be 128 bits (but users can always check their long double size on the host and then check to correct aspect on the GPU, and if the device has it be happy because that mean (by some SYCL rules that I'm sure exist but I don't know -- triviality copyable?) they can use long-double on the GPU.

nliber · 2025-02-24T17:59:30Z

It would be nice in the future if we could narrow down the possible behaviors. I can think of three possibilities:

Works
Doesn't compile
Random results

Is random results really a possibility?

tahonermann

This looks good to me, so I'm giving it my stamp of approval for whatever that is worth. Thanks @Pennycook!

Pennycook · 2025-02-25T09:18:35Z

It would be nice in the future if we could narrow down the possible behaviors. I can think of three possibilities:

Works

Doesn't compile

Random results

Is random results really a possibility?

I might be wrong about this, but I convinced myself that random results could really be a possibility with existing SYCL 2020 implementations.

Prior to this change, we didn't say anything about long double at all, which means that there was no requirement for implementations to diagnose use of long double or to ensure that the host and device compilers agreed on what long double means. Some compilers provide options to control what long double means (e.g., clang provides -mlong-double-128, -mlong-double-64 and -mlong-double-80), so I'm worried it would be very easy to get in a situation where the host and device disagree about sizeof(long double) and things fail spectacularly.

I agree that we don't want this to be the behavior forever, but I think the right way to fix things will be via a KHR or SYCL-Next feature. There are a few things to work out (e.g., whether we'd need separate aspects for fp80 and fp128, how to query which is used if a device supports both...)

gmlueck · 2025-02-27T21:12:38Z

adoc/chapters/device_compiler.adoc

+All types which are available in device code must have the same size, alignment
+requirements, and representation on both host and device.


I'm unresolving this conversation. It seems confusing to me to say that applications are required to choose the same underlying type for enumerations. Why state this specifically when there are other things an application could do that cause types to be different on host and device? For example, if an application defines members of a struct to be different on host and device, you would have a similar problem.

If we feel we need to say something about this, perhaps we could say it more generally like:

Applications that define types to be have different size, alignment requirement, or representation on host vs. device via the __SYCL_DEVICE_ONLY__ macro have undefined behavior.

Or maybe it should be "... implementation defined behavior"?

I think it is useful to specifically list __SYCL_DEVICE_ONLY__ here because that is the only way that an application possibly could define types differently. At least I can't think of any other way. You might point out that there are other macros like this in DPC++, but those are vendor extensions and are therefore out of scope of this spec.

adoc/chapters/device_compiler.adoc

gmlueck · 2025-02-27T21:21:25Z

adoc/chapters/device_compiler.adoc

@@ -204,6 +204,7 @@ Amongst other things, this restriction makes it illegal for a
    However, a function may be defined in another translation unit if the
    implementation defines the [code]#SYCL_EXTERNAL# macro as described in
    <<subsec:syclexternal>>.
+  * Use of the [code]#long double# type results in undefined behavior.


FWIW, I'd say:

Device code must not use the long double type.

Statements in this list are inconsistent. Some say that using a prohibited feature causes undefined behavior, while others say that application should not use the feature. It seems to me that we should consistently say that device code must not use these features.

You may argue that an extension might lift some of these restrictions. That's fine. An extension is allowed to lift any prohibition in the spec and assign some defined meaning. There is no need to use the "undefined behavior" wording to allow an extension to do this.

I had previously suggested that we address long double via #371. I think undefined behavior is the status quo.

Does "device code must not use these features" mean that such code is ill-formed? I think ISO standards reserve "must" for use in implication (A must be B because of C). I understand that other specifications use the term to specify requirements (e.g., RFC 2119), but I think its use can be confusing. I think "shall" communicates the intent better.

It wouldn't be a bad thing for the SYCL specification to explicitly state its requirement terminology; perhaps via reference to RFC 2119 or similar.

I had previously suggested that we address long double via #371. I think undefined behavior is the status quo.

I still like the direction proposed in #371 -- that is that we make long double an optional device feature that is tied to some new aspect. However, that is probably something that needs to wait for SYCL-Next.

I think the status quo in DPC++ is that use of long double in device code is ill-formed. I think the compiler diagnoses a compile-time error for this. (Sorry, haven't checked lately, but this is my memory.)

I think the status quo in DPC++ is that use of long double in device code is ill-formed. I think the compiler diagnoses a compile-time error for this. (Sorry, haven't checked lately, but this is my memory.)

I was curious about this so I just went to check.

By default, DPC++ throws an error because long double defaults to 128-bit, which isn't supported in SPIR-V:

test.cpp:10:51: error: 'max' requires 128 bit size 'long double' type support, but target 'spir64-unknown-unknown' does not support it 10 | ld[0] = std::numeric_limits<long double>::max();

You can circumvent that check by adding the -mlong-double-64 flag. In other words, DPC++ doesn't check for long double, it just fails if you try to use 128- or 80-bit data types.

SimSYCL also doesn't do any checks for use of long double, and because it's a header-only implementation I can't immediately think of a way they could add these checks. It's also worth noting that device code using long double always works in SimSYCL, because the code is running on the host CPU.

So I think the status quo is undefined behavior. Sometimes it's ill-formed, sometimes you can make it work (by accident), and sometimes it works.

You can circumvent that check by adding the -mlong-double-64 flag. In other words, DPC++ doesn't check for long double, it just fails if you try to use 128- or 80-bit data types.

Yes, and some people are Argonne is using this flag in production

I'm confused. What SPIR-V does DPC++ generate if you compile with -mlong-double-64?

It just treats every instance of long double as if it were double, so everything is generated using OpTypeFloat 64.

I'm not convinced it works properly, because all of the tests I've tried so far have returned nan. But the code compiles without producing any errors or warnings, and runs without producing any errors or warnings.

Oh, I see.

Yes, and some people are Argonne is using this flag in production

Why? Is it just because you have existing code that uses long double and it's easier to pass the flag than to rewrite the code to use double?

Exactly. It begs the question why it even works, and why they long double in the first place, but 🤷🏽

Perhaps Argonne needs it because they are using user-defined literals (UDLs). A request for UDL support is what lead me to file #371 in the first place.

If so, there may be a cleaner solution now that we have changed the spec to allow long double in manifestly constant-evaluated expressions. As I noted in #379, I think an application can define the the UDL operator"" as consteval, and this will cause the evaluation of UDL's to happen at compile time as a constant expression. SYCL will allow this even in device code because we lifted the restriction about long double (and all other "forbidden" C++ features) when they are used in manifestly constant-evaluated expressions.

Of course, consteval requires the application to compile in C++20 mode.

gmlueck · 2025-02-27T21:22:40Z

adoc/chapters/device_compiler.adoc

@@ -216,132 +217,25 @@ The restriction waiver in <<discarded-statement>> or
 <<device-function>>.
 ====

-[[subsec:scalartypes]]


Note that the latest version of the "main" branch has text that refers to this section identifier. (This was part of the recent vec clarification.) You will need to reword that new text somehow to avoid the reference.

gmlueck · 2025-02-28T14:10:01Z

adoc/chapters/device_compiler.adoc


+For enumerations with a fixed underlying type, the same fixed type must be
+specified for host and device.


My comment here might have been confusing. I was referring to this sentence "For enumerations with a fixed underlying type, the same fixed type must be specified for host and device."

This sentence seems to be directed to application developers. If an enumeration has a fixed underlying type, it means the source code explicitly specifies the type. Therefore, there's no possibility that the device compiler might choose some other type -- it has to use the type specified in the application source code.

Therefore, I think this sentence must be telling application developers not to use __SYCL_DEVICE_ONLY__ to specify a different underlying type in host vs. device code. This seems weirdly specific because there are other ways an application can use that macro to cause types to have different representations on host vs. device.

I see where you're coming from.

I think the issue I'm having is that I'm not sure where the device compiler ends and where the implementation begins. There is a lot of source code in headers that is not part of an application, and the device compiler may be responsible for providing the device-compatible parts of the SYCL headers. It was also recommended above that we should add a new "C++ standard library support" subsection to the "Device compiler" section, suggesting that it's the device compiler that handles any required mapping of C++ standard library features from host-compatible ones to device-compatible ones.

My logic was that if the device compiler might be (re)defining types (as part of the SYCL headers or additions to standard C++ headers), then unless we say something here an implementation may assume that it is safe to define device enums with a different underlying fixed type to the equivalent host enum.

IMO we should remove the chapter "SYCL device compiler" and merge it into "SYCL programming interface". The spec should not mandate that there is a distinct "device compiler". Some implementations (using multi pass compiler) might have a "device compiler", but other implementations might not.

Therefore, I think the spec should not try to draw a distinction between "device compiler" and "implementation". If an implementation has a device compiler, it's just part of the implementation.

I agree 100%! I can never figure out where to introduce new sections in the specification, because I don't understand where the boundaries are drawn. Merging all the implementation stuff into one section would be great.

But without a boundary between the device compiler and implementation, I think what I said holds: the implementation includes some headers, and those headers might not (probably won't) be completely identical for the host and device. Do you agree that the specification needs to point out which parts of the headers are required to be identical? If so, why wouldn't we extend this to the underlying types of enums?

Can you give an example of some specific situation you want to prevent by adding this wording to the spec? That would help me understand the purpose of this text.

Perhaps my recent response here would have been more appropriate as a reply to this comment.

The C++ standard doesn't specify many enumeration types for the standard library. Where the C++ standard does specify an enumeration and doesn't explicitly specify an underlying type, I think implementors are free to specify a fixed underlying type if they desire. That is one way that type definitions might diverge.

SYCL 2020 has exactly one match for the term "standard library" and that is a reference to the C++ standard library in a note about differences from SYCL 1.2.1. I guess it is currently undefined which portions of the C++ standard library may be used in device code. Some portions can be inferred as required to be available in device code based on usage in SYCL interfaces (e.g., std::vector). It seems like a lot of work would be required to specify what is actually required to be usable in device code.

A strong +1 for excising the term "device compiler" from the SYCL specification :)

Clarify requirements for types in device code

dac49fd

Pennycook added the editorial Some purely editorial problem label Feb 18, 2025

Pennycook requested review from nliber and gmlueck February 18, 2025 15:48

tahonermann suggested changes Feb 19, 2025

View reviewed changes

Pennycook added 3 commits February 19, 2025 10:46

Remove sentence about device compiler options

22334ab

Fix typo: descrived

d3238a4

Clarify type requirements for enums

f4373e2

Pennycook added 3 commits February 20, 2025 16:41

Replace supported types with unsupported types

e690144

A list of types that cannot be used in device code is clearer than what we had before, and will help to ensure that we give due consideration to additional types introduced in future versions of ISO C++.

Add note about the availability of extended types

ab95d12

These are implementation-defined, so whether they are available in device code should also be implementation-defined.

Add new section for standard library support

7b489a9

This gives us a place to talk about things like: - Which C++ standard library features (e.g., type aliases) are guaranteed to work on the device; and - Whether there are any additional restrictions on C++ standard library behavior.

tahonermann reviewed Feb 22, 2025

View reviewed changes

adoc/chapters/device_compiler.adoc Outdated Show resolved Hide resolved

Fix incorrect reference to long double types

6f04c82

There is only one long double type. Co-authored-by: Tom Honermann <[email protected]>

tahonermann approved these changes Feb 24, 2025

View reviewed changes

nliber approved these changes Feb 24, 2025

View reviewed changes

gmlueck reviewed Feb 27, 2025

View reviewed changes

gmlueck removed the editorial Some purely editorial problem label Feb 27, 2025

gmlueck reviewed Feb 28, 2025

View reviewed changes

		All types which are available in device code must have the same size, alignment
		requirements, and representation on both host and device.


		For enumerations with a fixed underlying type, the same fixed type must be
		specified for host and device.

Clarify requirements for types in device code #733

Are you sure you want to change the base?

Clarify requirements for types in device code #733

Conversation

Pennycook commented Feb 18, 2025

nliber commented Feb 18, 2025

tahonermann left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Pennycook commented Feb 19, 2025

tahonermann commented Feb 19, 2025

TApplencourt commented Feb 24, 2025

Pennycook commented Feb 24, 2025

TApplencourt commented Feb 24, 2025 • edited Loading

nliber commented Feb 24, 2025

tahonermann left a comment

Choose a reason for hiding this comment

Pennycook commented Feb 25, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TApplencourt commented Feb 24, 2025 •

edited

Loading