-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Tracking issue for vec_into_raw_parts
#65816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Should the returned pointer be a |
Add {String,Vec}::into_raw_parts Aspects to address: - [x] Create a tracking issue - rust-lang#65816
Add {String,Vec}::into_raw_parts Aspects to address: - [x] Create a tracking issue - rust-lang#65816
Should these functions be associated functions like |
Here's an example where this might have helped with avoiding UB: confio/go-rust-demo#1 (comment). There, the vector was destructed manually and |
Updated tracking issue number Added safeguards for transmute_vec potentially being factored out elsewhere Clarified comment about avoiding mem::forget Removed unneeded unstable guard Added back a stability annotation for CI Minor documentation improvements Thanks to @Centril's code review Co-Authored-By: Mazdak Farrokhzad <[email protected]> Improved layout checks, type annotations and removed unaccurate comment Removed unnecessary check on array layout Adapt the stability annotation to the new 1.41 milestone Co-Authored-By: Mazdak Farrokhzad <[email protected]> Simplify the implementation. Use `Vec::into_raw_parts` instead of a manual implementation of `Vec::transmute`. If `Vec::into_raw_parts` uses `NonNull` instead, then the code here will need to be adjusted to take it into account (issue rust-lang#65816) Reduce the whitespace of safety comments
If we view this as being the opposite of
|
I think it’s not at all obvious that deliberately making this method "weird" is a good thing. |
The method is weird either way. "This uses the vec by |
I don’t see how that makes it weird. |
But into_boxed_slice doesn't leak the memory |
Well,the documentation does a great job telling you how to avoid a leak and also leaking it is not unsafe. |
I don't disagree with any of that, but also my opinion is unchanged. It's not a big deal to have this as associated function vs method, but But again it's a totally unimportant difference either way, and we should just stabilize either form and get on with the day. |
In my opinion functions that start with |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@rust-lang/libs Any thoughts on the unresolved questions in the issue description? I’d be inclined to say:
|
@SimonSapin SGTM. The missing parallelism between definitions of |
Not quite; the slice API has a few of them:
To be fair, none of those methods have the same length vs. capacity confusion.
Most of these are opaque types designed only to carry trait impls such as |
@LegionMammal978 Fair would be to say it's just 2, variations doesn't count,
My first point was to say add a struct is common. I don't understand the rest of your point look like HS.
The crate raw-parts clearly show advantage, have a "Raw builder" API for Vec allow to help user avoid mistake using from_vec, better this allow to add a nice into_vec that also remove the potentially mistake of This mean a user can get the struct, mutate only what needed, and use it to reconstruct the vec. That a BIG plus when the feature is mean to use unsafe to have every small help you can. |
Hi everyone, just wanted to ask what the status on this issue was, and if the agreed-upon suggestion of this thread was to use the raw-parts crate for stable rust toolchains. |
Seems to work. Don't know what the tradeoffs are. |
In stable Rust, you can safely decompose a use std::mem::ManuallyDrop;
pub fn into_raw_parts<T>(vec: Vec<T>) -> (*mut T, usize, usize) {
let mut vec = ManuallyDrop::new(vec);
let length = vec.len();
let capacity = vec.capacity();
(vec.as_mut_ptr(), length, capacity)
} The main downside (aside from verbosity) is that you have to be very careful: after calling Removing the need for this incantation is the purpose of the proposed |
I'm sorry, what? UB? Can't touch? Discussing such things as UB or not-UB is out of scope for the issue. |
|
That
The signature of |
I don’t quite see how what you wrote invalidates what I wrote. So long as you don’t modify vector after taking the pointer, you can do pretty much whatever you want with the vector and the pointer remains valid. And to put it bluntly, if the following: let mut vec = vec![1];
let ptr = vec.as_mut_ptr();
let len = vec.size();
core::mem::forget(vec);
println!("{}", unsafe { *ptr }); is undefined behaviour than that’s the language defect that needs to be addressed regardless of |
Yes, I agree. The simple solution is for the stdlib to prevent the returned pointer from carrying However, this is outside the scope of the issue because Fwiw, Legion is likely alluding to https://llvm.org/docs/LangRef.html#noalias when applied to function return types. Yielding a noalias raw pointer from But again, this isn't what's currently happening today. And it also doesn't really matter for this particular API because the usage is largely the same. |
This is false regardless of fn main() {
let mut vec = vec![1];
let ptr = vec.as_mut_ptr();
let slice = vec.as_mut_slice();
println!("{}", unsafe { *ptr });
let _slice = slice; // UB
} Forming and reborrowing references is always a potentially dangerous operation under Stacked Borrows, when they are mixed with raw pointers.
Then take it up with the UCG WG in rust-lang/unsafe-code-guidelines#326 or on Zulip.
Sure, it's just that someone was asking how to perform the equivalent of |
This is UB because it creates two mutable references to the same object: one through
Which doesn’t change the fact that after calling
Well, Miri doesn’t complain about the code so I don’t think I need to bring it up since the code appears to be sound. |
That's because the current implementation does not, in fact, implement In fact, PR #94421 experimented with making this exact change; unfortunately, it cannot be demonstrated directly, since Miri at the time of the PR did not yet support recursively retagging references in fn main() {
let mut boxed_slice = vec![1].into_boxed_slice();
let ptr = boxed_slice.as_mut_ptr();
std::mem::forget(boxed_slice);
println!("{}", unsafe { *ptr }); // UB
} error: Undefined Behavior: attempting a read access using <3158> at alloc1501[0x0], but that tag does not exist in the borrow stack for this location
--> src/main.rs:5:29
|
5 | println!("{}", unsafe { *ptr }); // UB
| ^^^^
| |
| attempting a read access using <3158> at alloc1501[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of an access at alloc1501[0x0..0x4]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <3158> was created by a SharedReadWrite retag at offsets [0x0..0x4]
--> src/main.rs:3:15
|
3 | let ptr = boxed_slice.as_mut_ptr();
| ^^^^^^^^^^^^^^^^^^^^^^^^
help: <3158> was later invalidated at offsets [0x0..0x4] by a Unique retag
--> src/main.rs:4:22
|
4 | std::mem::forget(boxed_slice);
| ^^^^^^^^^^^
= note: BACKTRACE (of the first span):
= note: inside `main` at src/main.rs:5:29: 5:33
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace |
I stand corrected (and baffled at the same time). Thanks. |
The clear solution is for Vec to never be noalias because otherwise it's terrible. But this is the tracking issue for The simple solution is this, if you genuinely have trouble remembering it's length then capacity, just write your own helper. There's no need to pollute the stdlib API itself because a caller can't remember off-hand that it's len-cap, not cap-len. Now, let's stabilize the feature already. |
The discussion above is an argument for stabilizing |
Whatever it takes to get it stabilized. Now, I need to help the unsafe WG realize they shouldn't ruin Vec like Box. |
No one has ever mentioned That being said, I think it is better to return a non-null pointer in this case. It's just that this should be a |
Why would covariance be an issue here? The returned pointer is unique and represents ownership, in which case covariance should be fine, maybe even desirable. (with the very limited understanding I have on this topic, so take this with a grain of salt) And I think this is one of the (two) usecases |
There's a somewhat related discussion in #119206 (comment). |
I don't entirely think NonNull is necessary here. The stdlib already has a precedent of using raw pointers, even when they're intended to be non-null. It's important not to over-engineer and under-deliver. There's already an established convention and users can easily create a NonNull themselves. |
(I’ve discussed this with @shepmaster for a bit before writing this) Stabilization reportImplementation history
API summaryimpl<T> Vec<T, Global> {
pub fn into_raw_parts(vec: Self) -> (*mut T, usize, usize) {…}
}
impl String {
pub fn into_raw_parts(string: Self) -> (*mut u8, usize, usize) {…}
} (Proposed) answers to the open questions
I do not feel strongly about either of these and I’m ready to write a patch to change it if t-libs-api has a different opinion. I’d just like to see this stabilized, because doing this manually is error-prone and can easily lead to UB. Experience reportExample usages
I don’t think I have access to ping the team here, so I would be glad if someone did it for me. I would like to know if someone from t-libs-api feels strongly on either of the open issues, so I can change it if needed and move this forward. |
This feature is a helper method to avoid, as_ptr(), len(), capacity(), and so mistake, return a tuple and not named field will make this feature not more safe than call these 3 methods. I don't really see the strong point of "catch in test".
That on contrary more and more a thing with modern API added to std. I repeat myself, but I can only warn that a tuple with two field of same type will cause trouble as a return value of a function that doesn't give any hint about what is what. |
It is more safe, as you don’t need to suppress drop, which is easy to do incorrectly (e.g. by using
Can you name an example of a similar POD type in std?
It is the first line of its documentation. In any case, I’m happy to change it to a POD type if t-libs-api feels the same way. As I have said, I really don’t care that much about exact API of this function if it helps to get it stabilized. |
This is a chicken and egg problem. Because NonNull is underutilised in the standard library, no new methods are created to work with NonNull, which leads to NonNull being underutilised.
No. Primary use case for NonNull is to indicate that the pointer is not null. This allows encoding additional invariants in the type system.
This isn’t an argument in favour of not returning NonNull. We should have this xor the other feature. Not both of them. Having said that, I don’t have strong opinions here. I would like to more use of NonNull (and similarly better support for NonZero* but that’s a completely separate discussion), but I also see arguments for consistency with existing status quo.
What functions are you referring to? Other
For me, ergonomics of a tuple is just much better at the moment. I really don’t want to write |
@rust-lang/libs-api |
Uh oh!
There was an error while loading. Please reload this page.
#65705 adds:
Things to evaluate before stabilization
NonNull<* mut T>
?The text was updated successfully, but these errors were encountered: