Replace NullOp::SizeOf and NullOp::AlignOf by lang items. #147793
@bors try @rust-timer queue
Finished benchmarking commit (2cefd8f): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never

Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage): Results (primary -0.3%, secondary -1.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles: Results (primary -2.9%, secondary 6.8%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size: Results (primary -0.0%, secondary -0.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Bootstrap: 475.105s -> 474.369s (-0.15%)
fca4c69 to 27154a0
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
@bors try @rust-timer queue
Finished benchmarking commit (ab65cea): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never

Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage): Results (primary -1.3%, secondary -2.9%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles: Results (secondary 1.7%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size: Results (primary -0.0%, secondary -0.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Bootstrap: 473.871s -> 473.854s (-0.00%)
    error: InterpErrorInfo<'tcx>,
) -> ErrorHandled {
    let (error, backtrace) = error.into_parts();
    backtrace.print_backtrace();
This one here should stay: it's part of how `RUST_CTFE_BACKTRACE` works. By default this is a fairly trivial NOP.
Instead of having the intrinsics const evaluated and requiring an actual body for the consts, we could also hijack const eval of these lang items directly and thus never have to do the work of actually evaluating their body.
I find the intrinsics route less confusing:
- intrinsics for when the library needs to call into the compiler
- lang items for when the compiler needs to call into the library
However, if it makes a big enough perf difference, we could deviate from that pattern.
The perf regression is very small (sub-percent). Does this warrant extra investigation?
I do not think so. Results on real crates are net green.
r? @oli-obk if you've got time, or reassign
}

pub const fn size_of<T>() -> usize {
    const { intrinsics::size_of::<T>() }
No idea if it matters, but should this be `<T as SizedTypeProperties>::SIZE` rather than `const { intrinsics::size_of::<T>() }` here too? (And ditto for align.)
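To make the two spellings in this suggestion concrete, here is a minimal sketch on stable Rust. It assumes a local trait, `LocalSizedProps`, as a hypothetical stand-in for `SizedTypeProperties`, and uses `std::mem::size_of` in place of the intrinsic; it is not the code from this PR.

```rust
use std::mem;

/// Hypothetical local stand-in for `SizedTypeProperties` (illustration only).
trait LocalSizedProps: Sized {
    const SIZE: usize = mem::size_of::<Self>();
    const ALIGN: usize = mem::align_of::<Self>();
}

// Blanket impl so that every sized type gets the two constants.
impl<T> LocalSizedProps for T {}

/// Form 1: an inline const block wrapping the size query.
const fn size_via_const_block<T>() -> usize {
    const { mem::size_of::<T>() }
}

/// Form 2: read the associated constant instead.
const fn size_via_assoc_const<T>() -> usize {
    <T as LocalSizedProps>::SIZE
}

/// Same idea for alignment.
const fn align_via_assoc_const<T>() -> usize {
    <T as LocalSizedProps>::ALIGN
}

fn main() {
    // Both forms agree for any sized type.
    assert_eq!(size_via_const_block::<u64>(), size_via_assoc_const::<u64>());
    assert_eq!(size_via_assoc_const::<[u16; 3]>(), 6);
    assert_eq!(align_via_assoc_const::<u8>(), 1);
}
```

Both forms are evaluated at compile time for each instantiation, so in this sketch the choice is only about which constant the function body names.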
let ty = self.monomorphize(ty);
let layout = bx.cx().layout_of(ty);
let val = match null_op {
    mir::NullOp::SizeOf => {
(No action) I do like that this means the different backends don't need this any more, which is nice since it all needs to match what CTFE did anyway 👍
_2 = SizeOf(S);
_3 = AlignOf(S);
_4 = alloc::alloc::exchange_malloc(move _2, move _3) -> [return: bb1, unwind continue];
_2 = alloc::alloc::exchange_malloc(const <S as std::mem::SizedTypeProperties>::SIZE, const <S as std::mem::SizedTypeProperties>::ALIGN) -> [return: bb1, unwind continue];
Lovely to see this inline and no longer need the locals 👍
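For readers less familiar with why the allocator call takes exactly these two values, a rough user-level analogue is sketched below. It assumes a made-up struct `S` and a hypothetical helper `layout_from_size_and_align`, and it uses the public `std::alloc` API rather than `exchange_malloc`, which is internal.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};

/// Made-up type for illustration; not the `S` from the MIR test.
#[allow(dead_code)]
struct S {
    a: u32,
    b: u8,
}

/// Hypothetical helper: builds a layout from the size and alignment of `T`,
/// the same two values that the call above receives as constants.
fn layout_from_size_and_align<T>() -> Layout {
    Layout::from_size_align(size_of::<T>(), align_of::<T>())
        .expect("size and alignment of a Rust type form a valid layout")
}

fn main() {
    let layout = layout_from_size_and_align::<S>();
    // The standard library computes the same layout directly.
    assert_eq!(layout, Layout::new::<S>());

    // `S` is not zero-sized, so allocating with this layout is permitted.
    unsafe {
        let ptr = alloc(layout);
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        let typed = ptr.cast::<S>();
        typed.write(S { a: 7, b: 1 });
        assert_eq!((*typed).a, 7);
        dealloc(ptr, layout);
    }
}
```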
+ _4 = const 4_usize;
+ _5 = const 4_usize;
+ _6 = alloc::alloc::exchange_malloc(const 4_usize, const 4_usize) -> [return: bb1, unwind unreachable];
  _4 = alloc::alloc::exchange_malloc(const <i32 as std::mem::SizedTypeProperties>::SIZE, const <i32 as std::mem::SizedTypeProperties>::ALIGN) -> [return: bb1, unwind unreachable];
Just confirming: the reason this is still the const, rather than `4_usize`, is that this is a GVN unit test? Some other pass is going to evaluate it in the normal flow, right?
Or does it not matter because anything asking for `try_eval_target_usize` on the const will get the value either way?
Oh, that's because GVN avoids evaluating constants in place. That does not really matter: any code that needs to evaluate it can do so easily. If we need it, it's a 3-line change in GVN, but I'd rather do it in a separate PR.
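The point about the constant staying symbolic while consumers still get its value can be pictured with a toy model. Everything below (`ToyTy`, `Prop`, `ToyConst`, `eval_on_demand`) is hypothetical and deliberately much simpler than rustc's real constant representation; it only illustrates the idea of evaluating on demand.

```rust
use std::mem::{align_of, size_of};

/// Toy type universe (hypothetical, not rustc's `Ty`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyTy {
    I32,
    U8,
}

/// Which property of the type the constant names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Prop {
    Size,
    Align,
}

/// A constant is either already a value or still symbolic.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyConst {
    Val(u64),
    Unevaluated(Prop, ToyTy),
}

impl ToyConst {
    /// Consumers ask for the value; symbolic constants are computed here.
    fn eval_on_demand(self) -> u64 {
        match self {
            ToyConst::Val(v) => v,
            ToyConst::Unevaluated(prop, ty) => {
                let (size, align) = match ty {
                    ToyTy::I32 => (size_of::<i32>(), align_of::<i32>()),
                    ToyTy::U8 => (size_of::<u8>(), align_of::<u8>()),
                };
                match prop {
                    Prop::Size => size as u64,
                    Prop::Align => align as u64,
                }
            }
        }
    }
}

fn main() {
    let symbolic = ToyConst::Unevaluated(Prop::Size, ToyTy::I32);
    let folded = ToyConst::Val(4);
    // The two representations differ, but evaluate to the same value.
    assert_ne!(symbolic, folded);
    assert_eq!(symbolic.eval_on_demand(), folded.eval_on_demand());
    assert_eq!(ToyConst::Unevaluated(Prop::Align, ToyTy::U8).eval_on_demand(), 1);
}
```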
let align_def_id = tcx.require_lang_item(LangItem::AlignOf, source_info.span);
let align_const =
    Const::from_unevaluated(tcx, align_def_id).instantiate(tcx, &[pointee_ty.into()]);
let alignment = Operand::Constant(Box::new(ConstOperand {
    span: source_info.span,
    user_ty: None,
    const_: align_const,
}));
ymmv: since this chunk (or quite similar) happens in a couple places now, would it be worth adding, say, `mir::Operand::size_of` + `mir::Operand::align_of`? (Looks like it probably could have both of them call a private helper taking `LangItem`, or something.)
I think that'd both be easier for the next people who need it to find, as well as more obvious to read in the places that just want the size/align in the code.
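The shape of the suggested refactor could look roughly like the toy model below. The types here (`LangItem`, `Ty`, `Operand`) are simplified stand-ins that only borrow rustc's names, and `lang_item_const` is a hypothetical helper name; the real constructors would also need `tcx` and a span, as in the snippet under review.

```rust
/// Simplified stand-in; rustc's real `LangItem` has many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LangItem {
    SizeOf,
    AlignOf,
}

/// Simplified stand-in for a type, identified by name only.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Ty(&'static str);

/// Simplified stand-in for a MIR operand.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Operand {
    /// An unevaluated constant naming a lang item instantiated at a type.
    Constant { item: LangItem, ty: Ty },
}

impl Operand {
    /// Private helper shared by both public constructors (hypothetical name).
    fn lang_item_const(item: LangItem, ty: Ty) -> Self {
        Operand::Constant { item, ty }
    }

    /// Operand holding the size of `ty`.
    pub fn size_of(ty: Ty) -> Self {
        Self::lang_item_const(LangItem::SizeOf, ty)
    }

    /// Operand holding the alignment of `ty`.
    pub fn align_of(ty: Ty) -> Self {
        Self::lang_item_const(LangItem::AlignOf, ty)
    }
}

fn main() {
    // Call sites that just want size/align read as a single call each.
    let size = Operand::size_of(Ty("S"));
    let align = Operand::align_of(Ty("S"));
    assert_ne!(size, align);
    println!("{size:?} {align:?}");
}
```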
/// Returns the size of a value of that type.
SizeOf,
/// Returns the minimum alignment of a type.
AlignOf,
FYI @celinval
I left some comments along the way, but this looks good to me. r=me with minor updates to address things if you see fit. (Or wait for oli if you think this should go through someone more knowledgeable about const stuff.)
Part of #146411
Fixes #119729
Keeps #136175 as it involves `offset_of!`, which this PR does not touch.

r? @ghost