Elem::Bool support #304
base: main
Conversation
For most types, we actually don't want to perform any allocations. It seems we are using `to_elem_data` everywhere instead of `cast_slice`. For large tensors, that will be a problem.

I'm not sure we actually want to force the ability to store booleans in global memory. Users should use another type for that and perform the correct conversion.
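To make the allocation concern concrete, here is a minimal sketch (hypothetical helper names, assuming a `bytemuck::Pod` element type): the first helper copies every byte into a fresh `Vec`, while the second just reinterprets the existing buffer.

```rust
use bytemuck::Pod;

/// Allocating conversion: copies the whole buffer into a new `Vec<E>`.
/// For a multi-gigabyte tensor this doubles peak memory and costs time.
fn to_elems_copy<E: Pod>(bytes: &[u8]) -> Vec<E> {
    bytemuck::cast_slice::<u8, E>(bytes).to_vec()
}

/// Zero-copy view: reinterprets the same bytes as `&[E]`.
/// No allocation; panics if the length or alignment doesn't fit `E`.
fn view_as_elems<E: Pod>(bytes: &[u8]) -> &[E] {
    bytemuck::cast_slice(bytes)
}
```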
@@ -171,7 +171,8 @@ impl Elem {
             UIntKind::U32 => core::mem::size_of::<u32>(),
             UIntKind::U64 => core::mem::size_of::<u64>(),
         },
-        Elem::Bool => core::mem::size_of::<bool>(),
+        // Currently, bools are represented as u32 in the backend.
+        Elem::Bool => core::mem::size_of::<u32>(),
This isn't true! Bools are still booleans in the kernel; we just can't load and store them in global memory. That's why we're using `u32` in Burn to represent `BoolTensor`, but when fusing them, they never actually materialize as `u32`.
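In other words, the conversion lives only at the memory boundary. A hypothetical sketch of that boundary (these helpers are illustrative, not Burn's API):

```rust
/// Widen bools to u32 only when they are written to global memory.
fn bools_to_storage(values: &[bool]) -> Vec<u32> {
    values.iter().map(|&b| b as u32).collect()
}

/// Narrow them back when they are read out; inside the kernel they stay bools.
fn storage_to_bools(words: &[u32]) -> Vec<bool> {
    words.iter().map(|&w| w != 0).collect()
}
```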
 use crate::{
     flex32,
     ir::{Elem, FloatKind, IntKind, UIntKind},
     prelude::Numeric,
 };

 /// The base element trait for the jit backend.
-pub trait CubeElement: core::fmt::Debug + Send + Sync + 'static + Clone + bytemuck::Pod {
+pub trait CubeElement: core::fmt::Debug + Send + Sync + 'static + Clone + NoUninit {
I don't think it's wise to remove `bytemuck::Pod`. Some primitive types, like bools, just can't be loaded into global memory. And I don't think we want to force a representation for booleans. At some point we might simply encode 32 bools into a single `u32` to save space and use masking to retrieve the right value, but that would require padding. And that part should be done in Burn, like quantization.
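A minimal sketch of that packing scheme (hypothetical helpers, 32 bools per `u32` with zero-padding in the last word):

```rust
/// Pack bools 32-per-u32; the final word is zero-padded.
fn pack_bools(values: &[bool]) -> Vec<u32> {
    let mut words = vec![0u32; (values.len() + 31) / 32];
    for (i, &b) in values.iter().enumerate() {
        if b {
            words[i / 32] |= 1 << (i % 32);
        }
    }
    words
}

/// Retrieve bit `i` with a shift and a mask.
fn get_bool(words: &[u32], i: usize) -> bool {
    (words[i / 32] >> (i % 32)) & 1 == 1
}
```

This cuts boolean storage by 32x compared to one `u32` per value, at the cost of the padding and masking logic, which is why it belongs at the Burn level alongside quantization.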
fn from_elem_data(bytes: Vec<u8>) -> Vec<Self> {
    bytemuck::cast_slice(bytes.as_slice()).to_vec()
That can be very, very slow. I used unsafe in `TensorData` in Burn to avoid this.
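One way to avoid the copy, sketched here with hypothetical types rather than Burn's actual `TensorData`, is to keep the raw bytes and hand out typed views on demand:

```rust
/// Owns the raw bytes; typed access borrows instead of copying.
struct ElemData {
    bytes: Vec<u8>,
}

impl ElemData {
    /// Borrow the bytes as `&[E]` without allocating.
    /// Panics if the byte length or pointer alignment doesn't fit `E`.
    fn as_slice<E: bytemuck::Pod>(&self) -> &[E] {
        bytemuck::cast_slice(&self.bytes)
    }
}
```

The unsafe route goes further and reuses the byte allocation itself as a `Vec<E>`, which additionally requires checking that the allocation's length, capacity, and alignment match the target type.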
WIP: Use the actual `Elem::Bool` instead of rewriting bools to `u32` early on.