-
Notifications
You must be signed in to change notification settings - Fork 63
Open
Description
Issue automatically imported from old repo: EmbarkStudios/rust-gpu#8
Old labels: t: design
Originally creatd by Jasper-Bekkers on 2020-08-18T17:33:10Z
We want to have the same types of safety on the GPU as we do on the CPU. One of those areas that make it easy to introduce race conditions is with memory barriers. HLSL and GLSL require the programmer to explicitly put them in the right locations without any formal verifiction.
This is an attempt at solving that within the Rust type system (attempt still has racy bugs and is largely incomplete).
use core::ops::Deref;
use core::ops::DerefMut;
use core::ops::Index;
use core::ops::IndexMut;
struct MyFoo {
test: GroupSharedArray<u32>,
}
struct GroupSharedArray<T>
where
T: Default + Copy,
{
data: Box<[T]>,
size: usize,
}
impl<T: Default + Copy> GroupSharedArray<T> {
fn new() -> Self {
Self {
size: 100,
data: Box::new([T::default(); 100]),
}
}
}
struct GroupSharedWriter<T> {
data: T,
}
impl<T> Deref for GroupSharedWriter<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.data
}
}
impl<T> DerefMut for GroupSharedWriter<T> {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.data
}
}
impl<T> GroupSharedWriter<T> {
fn new(data: T) -> Self {
Self { data }
}
fn barrier(self) -> GroupSharedReader<T> {
GroupSharedReader { data: self.data }
}
}
struct GroupSharedReader<T> {
data: T,
}
impl<T> Deref for GroupSharedReader<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.data
}
}
impl<T> GroupSharedReader<T> {
fn barrier(self) -> GroupSharedWriter<T> {
GroupSharedWriter { data: self.data }
}
}
impl<T: Default + Copy> Index<Uniform<u32>> for GroupSharedArray<T> {
type Output = T;
fn index(&self, index: Uniform<u32>) -> &Self::Output {
&self.data[index.data as usize]
}
}
impl<T: Default + Copy> IndexMut<Uniform<u32>> for GroupSharedArray<T> {
fn index_mut(&mut self, index: Uniform<u32>) -> &mut Self::Output {
&mut self.data[index.data as usize]
}
}
struct Uniform<T> {
data: T,
}
impl<T> Uniform<T> {
fn new(data: T) -> Self {
Self { data }
}
}
fn thread_idx() -> Uniform<u32> {
Uniform::new(0)
}
fn main() {
let mut data = GroupSharedWriter::new(MyFoo {
test: GroupSharedArray::new(),
}); // new does barrier
data.test[thread_idx()] = thread_idx().data;
data.test[thread_idx()] = 666; // should be illegal
let value = data.test[thread_idx()]; // should be illegal
//data.test[0] = thread_idx().data; // what happens in this case? it's unsafe/unsound. race cnd over contents of `test`
let other_data = data.barrier(); // `barrier` return GroupSharedReader
// some race-y cases handled correctly:
// data.test[thread_idx()] = 666; // is correctly marked illegal
// other_data.test[thread_idx()] = 666; // is correctly marked illegal
let test = other_data.data.test[thread_idx()];
}
// tl;dr
// - use move semantics to enforce barriers betwen reading and writing
// - broken right now because DerefMut and IndexMut require Deref and Index to be implemented
// - broken right now because we can write to the same location multiple times (data race)
Activity
rust-gpu-bot commentedon Nov 13, 2024
Comment from Jasper-Bekkers (CONTRIBUTOR) on 2020-08-18T17:34:50Z
The
Uniform
type currently is very limited, but it prevents neighboring threads from writing into each other's LDS buckets concurrently (because we can't do any sort of math on it). However, we should investigate if we can relax this a bit, especially when reading it should be totally fine to index into a neighbors groupshared memory.rust-gpu-bot commentedon Nov 13, 2024
Comment from Jasper-Bekkers (CONTRIBUTOR) on 2020-08-19T10:16:31Z
Examples from @h3r2tic and others using group shared memory:
id * 2
andid * 2 + 1
): https://github.com/h3r2tic/rtoy-samples/blob/6aeb26230ea3f4b98ef34d522b38fc7be865a2d6/assets/shaders/hybrid-adaptive-shadows/prefix_scan_2d_vertical.glslrust-gpu-bot commentedon Nov 13, 2024
Comment from Tobski on 2020-10-23T00:00:20Z 👍: 1
We had the following discussion in the Discord about trying to bring some of Rust's "fearless concurrency" to rust-gpu, by tweaking the way that accesses to shared memory (Buffers/UAVs, Workgroup memory, etc.) is managed when translating to SPIR-V. Just so we don't lose the conversation, I've pasted it below - can try to tease out more details later but for now we just want to be able to stow this so we don't lose it...
rust-gpu-bot commentedon Nov 13, 2024
Comment from Tobski on 2020-10-23T00:09:28Z
One additional detail that got discussed after this little back and forth was that the aim here is to get borrow checking working across GPU threads - being able to know explicitly that indexing into the buffer is safe. For instance, if you're mutably borrowing from a buffer via a unique index (say indexing by the workgroup ID per workgroup), then a compile time check should be able to tell you your access is safe. For debugging, it should also be possible to automatically insert (somewhat expensive) checks that invocations aren't racing. There's probably also a place in there somewhere for something akin to rust's explicit lifetimes to allow more complex indexing to be compile-time checked.
From the SPIR-V side the aim is to make painless things that typically developers get frustrated over or make mistakes with - acquire/release/barriers are a pain in the prominent shading languages, and experience shows developers very much tend to get them wrong. The proposal discussed above takes that decision making away from the developer and hard codes what the most useful case as a default behavior - unlike GLSL/HLSL. We may want to provide hooks for less common patterns further down the line, but for now this behaviour should cover the needs of any developer that isn't trying to experiment with synchronization patterns...
rust-gpu-bot commentedon Nov 13, 2024
Comment from Tobski on 2020-10-23T13:46:29Z
Putting this on the correct issue this time....
Ok so turning this into something a bit more concrete, I think I'd propose three new attributes that affect struct definitions declared as storage buffers/images/texel buffers (UAVs), shared memory (LDS), or anything else we somehow end up being able to touch from multiple threads:
Each of these attributes modifies what happens during borrow and drop, and memory semantics used for OpLoad/OpStore when accessing references to these structs.
Note: If none of these attributes are present it should be a compiler error to borrow a mutable reference from the struct, (unless the code is declared unsafe?).
Scope
The word before "Visible" in the attribute determines the scope at which writes to the structure are made visible to other shader invocations.
So "SubgroupVisible" indicates that other invocations in the same subgroup can see values written by other invocations, assuming correct synchronization.
This scope is used as the
%scope
operand for the instructions detailed below.Acquire on Borrow
When a reference is borrowed from the structure, an OpMemoryBarrier should be performed as follows:
Release on Drop
When a reference to the structure is dropped, an OpMemoryBarrier should be performed as follows:
Loads
When a reference to the structure is read from, an OpLoad should be generated with the following Memory Operands:
or similarly for images:
Stores
When a reference to the structure is written to, an OpStore should be generated with the following Memory Operands:
or similarly for images:
rust-gpu-bot commentedon Nov 13, 2024
Comment from khyperia (CONTRIBUTOR) on 2020-10-26T08:26:47Z
A couple issues:
OpAccessChain
to a field on the struct? All attributes for the resulting pointer would be dropped, and we would no longer be able to emit the proper memory operands for load/store.rust-gpu-bot commentedon Nov 13, 2024
Comment from Tobski on 2020-10-26T12:32:00Z
@khyperia Thanks for looking at this!
My understanding is that it's possible to override borrow/drop in the language itself, rather than in the backend, so I was imagining this be dealt with at least partially above the compiler; perhaps emitting some sort of intrinsic that the backend would turn into the right memory barriers. I was experimenting with this on the rust playground and made some decent progress in getting a struct in regular rust to output something (println! with the relevant instruction) at borrow/drop, but like a fool I hadn't saved it locally and lost it all when my browser tab crashed 🤦♀️
Ok so again I'm not necessarily expecting this to be a wholly backend issue. My hope here is that we can coax the front-end into tagging the variables (and/or the variable accesses), such that when it gets to the backend all you'd need to do is check if a relevant tag is there or not. Unlike the first issue you pointed out, I don't have a handle on how this might be done with rust's attribute system yet - so this is kind of an open issue that I'm hoping we can have more discussion on.
rust-gpu-bot commentedon Nov 13, 2024
Comment from Tobski on 2020-10-27T17:20:36Z 👍: 1
So I've been thinking about the implementation complexity, and given the newness of this project and that it's probably useful to bootstrap this in the short term, maybe there should be two phases to this? The initial phase will have dedicated acquire/release and load/store ops, and then the more complex/ergonomic thing here can be a future investigation? Unless anyone thinks that's a terrible idea I'll write this into the draft RFC I'm working on.