
Elem::Bool support #304

Draft · wants to merge 1 commit into main

Conversation

@ArthurBrussee (Contributor) commented Nov 25, 2024

WIP: Use the actual Elem::Bool instead of rewriting bools to u32 early on.

@nathanielsimard (Member) left a comment

For most types, we actually don't want to perform any allocations. It seems we are using to_elem_data everywhere instead of cast_slice. For large tensors, that will be a problem.

I'm not sure we actually want to force the ability to store booleans in global memory. Users should use another type for that and perform the correct conversion.
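To illustrate the allocation concern above, here is a minimal std-only sketch (not the actual CubeCL API): a cast_slice-style reinterpretation just views the existing buffer, while a to_elem_data-style conversion copies it element by element.

```rust
fn main() {
    let values: Vec<u32> = vec![1, 2, 3];

    // Zero-copy: view the same allocation as raw bytes (what a cast does).
    // SAFETY: u8 has alignment 1 and every byte pattern is a valid u8.
    let bytes: &[u8] = unsafe {
        core::slice::from_raw_parts(values.as_ptr() as *const u8, values.len() * 4)
    };
    assert_eq!(bytes.len(), 12);

    // Allocating: a per-element conversion copies the whole buffer instead,
    // which is the cost being flagged for large tensors.
    let copied: Vec<u8> = bytes.to_vec();
    assert_eq!(copied.len(), 12);
}
```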

@@ -171,7 +171,8 @@ impl Elem {
                 UIntKind::U32 => core::mem::size_of::<u32>(),
                 UIntKind::U64 => core::mem::size_of::<u64>(),
             },
-            Elem::Bool => core::mem::size_of::<bool>(),
+            // Currently, bools are represented as u32 in the backend.
+            Elem::Bool => core::mem::size_of::<u32>(),
Member

This isn't true! Bools are still booleans in the kernel; we just can't load and store them in global memory. That's why we're using u32 in Burn to represent BoolTensor, but when fusing them, they never actually materialize as u32.
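For reference, the host-side sizes differ, which is why the byte size of a BoolTensor buffer is computed from u32 rather than bool:

```rust
fn main() {
    // bool is 1 byte on the host, but the backend backs BoolTensor
    // storage with u32 (4 bytes), per the diff above.
    assert_eq!(core::mem::size_of::<bool>(), 1);
    assert_eq!(core::mem::size_of::<u32>(), 4);
}
```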

 use crate::{
     flex32,
     ir::{Elem, FloatKind, IntKind, UIntKind},
     prelude::Numeric,
 };

 /// The base element trait for the jit backend.
-pub trait CubeElement: core::fmt::Debug + Send + Sync + 'static + Clone + bytemuck::Pod {
+pub trait CubeElement: core::fmt::Debug + Send + Sync + 'static + Clone + NoUninit {
Member

I don't think it's wise to remove bytemuck::Pod. Some primitive types, like bools, just can't be loaded into global memory, and I don't think we want to force a representation for booleans. At some point we might simply encode 32 bools into a single u32 to save space and use masking to retrieve the right value, but that would require padding. That part should be done in Burn, like quantization.
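The packing idea mentioned above can be sketched like this (hypothetical helpers for illustration only; not Burn or CubeCL code): 32 bools fit in one u32, and masking recovers an individual value.

```rust
// Hypothetical helpers illustrating bit-packing; not part of Burn or CubeCL.
fn pack_bools(bits: &[bool]) -> u32 {
    assert!(bits.len() <= 32);
    // Bit i of the word holds bits[i].
    bits.iter()
        .enumerate()
        .fold(0u32, |word, (i, &b)| word | ((b as u32) << i))
}

fn get_bit(word: u32, index: usize) -> bool {
    // Shift the target bit to position 0 and mask everything else off.
    (word >> index) & 1 == 1
}

fn main() {
    let packed = pack_bools(&[true, false, true, true]);
    assert_eq!(packed, 0b1101);
    assert!(get_bit(packed, 0));
    assert!(!get_bit(packed, 1));
    assert!(get_bit(packed, 3));
}
```

A real implementation would also need padding when the number of bools is not a multiple of 32, which is the caveat raised in the comment.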

Comment on lines +49 to +50

    fn from_elem_data(bytes: Vec<u8>) -> Vec<Self> {
        bytemuck::cast_slice(bytes.as_slice()).to_vec()
Member

That can be very slow; I used unsafe in TensorData in Burn to avoid it.
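One way to avoid the copy, as a sketch only (it assumes the buffer's length and alignment invariants hold, and is not the actual Burn TensorData code): reinterpret the byte buffer in place instead of casting and collecting into a new Vec.

```rust
// Sketch: view a byte buffer as u32s without copying each element.
// Assumes the length is a multiple of 4 and the pointer is 4-aligned;
// not the actual Burn TensorData implementation.
fn view_as_u32(bytes: &[u8]) -> &[u32] {
    assert_eq!(bytes.len() % 4, 0);
    assert_eq!(bytes.as_ptr() as usize % core::mem::align_of::<u32>(), 0);
    // SAFETY: length and alignment were checked above, and every
    // bit pattern is a valid u32.
    unsafe { core::slice::from_raw_parts(bytes.as_ptr() as *const u32, bytes.len() / 4) }
}

fn main() {
    let words = [7u32, 8, 9];
    // A byte view that is guaranteed 4-aligned by construction.
    let bytes: &[u8] =
        unsafe { core::slice::from_raw_parts(words.as_ptr() as *const u8, 12) };
    assert_eq!(view_as_u32(bytes), &[7, 8, 9]);
}
```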

@ArthurBrussee ArthurBrussee marked this pull request as draft November 25, 2024 17:36
2 participants