Description
Consider the following code:
```rust
async fn foo() {
    let mut x = get_future();
    dbg!(std::mem::size_of_val(&x));
    x.await
}
```
Today, having the dbg! line roughly doubles the size of the future returned by foo. More precisely, it causes us to allocate storage for x twice (once for x itself, and once for the pinned variable that x is moved into inside the await).
This unfortunate state of affairs is caused by the fact that we cannot "peer into" size_of_val and see that the address of x never escapes the function. Without knowing this, we can't optimize away the storage of x once it has been moved.

This was first discussed here: #59123 (comment). One promising solution (suggested by @RalfJung) is to inline all occurrences of size_of_val (and possibly some other intrinsics) in a MIR pass, before we get to the generator transform.
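For reference, a self-contained sketch (not from the issue) for observing the effect from outside the async fn; `big` is a stand-in for `get_future` that returns a large value, and whether the two sizes actually differ depends on the compiler version:

```rust
// Stand-in for `get_future`: returns a large value so any size difference is visible.
async fn big() -> [u8; 1024] {
    [0u8; 1024]
}

async fn with_dbg() {
    let x = big();
    dbg!(std::mem::size_of_val(&x)); // this borrow is what pessimizes the layout
    x.await;
}

async fn without_dbg() {
    let x = big();
    x.await;
}

fn main() {
    // The outer futures are never polled; only their layouts are inspected.
    println!("with dbg!:    {}", std::mem::size_of_val(&with_dbg()));
    println!("without dbg!: {}", std::mem::size_of_val(&without_dbg()));
}
```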
Activity
Centril commented on Jul 3, 2019

It would be good to elaborate on why size_of_val deserves an optimization specifically; you said something about not wanting to have our own "uncertainty principle"?
tmandry commented on Jul 3, 2019

My thinking is that people will often want to use size_of_val to debug the size of their futures. For example, they might use it to look at the size of the sub-futures they await on, as in the example above.

In particular, since you can't name the type returned by an async fn today, you have to use size_of_val and borrow the future.

It's definitely going to confuse people that adding size_of_val to their async fn makes the future returned by the async fn bigger. It makes everything about optimizing for size seem brittle and confusing. I think this might be okay for an MVP (and you can still use size_of_val as long as you only look at one "nesting level" at a time), but it definitely isn't desirable from a "maturity" standpoint.
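To illustrate the point about unnameable types, a small hypothetical example (not from the thread): the async fn returns an opaque `impl Future`, so there is no type you could write inside `size_of::<...>()`.

```rust
async fn foo() -> u8 {
    42
}

fn main() {
    let fut = foo(); // the concrete future type has no name you can write
    // So the only way to measure it is through a borrow of the value:
    let size = std::mem::size_of_val(&fut);
    println!("future size: {} bytes", size);
}
```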
Matthias247 commented on Jul 4, 2019

I'm still not sure why size_of_val is the problem? It seems like anything that references a future before it's awaited would cause the issue? Shouldn't this somehow be provable by the borrow-checker?

I'm obviously not into compiler details and therefore don't understand the exact approach, but I'm wondering whether x could be treated as a real stack object as long as certain conditions have not been met (e.g. it isn't .awaited or pinned yet), as opposed to also living in the generator storage. x could be directly allocated in the generator storage once and then wouldn't need another copy. Maybe this doesn't work because some kind of return value optimization is missing?
withoutboats commented on Jul 4, 2019

Why can't x be pinned in the location it is at when size_of_val is called? To my naive perspective, it seems like the problem is with await doing an actual memcpy move, instead of merely conceptually consuming the value?
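As a rough, hand-written picture of that move (an illustration only, not the compiler's actual expansion of `.await`): the awaited operand is moved into a fresh binding, which is then pinned and polled, so both the original slot and the new one end up reserved in the coroutine's layout.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// Simplified sketch of the move performed by `.await` (illustration only).
fn poll_once<F: Future>(x: F, cx: &mut Context<'_>) -> Poll<F::Output> {
    let mut awaitee = x; // `x` is moved here; its old storage is now dead
    // Pinning on the stack is sound here because `awaitee` is never moved
    // again before it is dropped at the end of this function.
    let pinned = unsafe { Pin::new_unchecked(&mut awaitee) };
    pinned.poll(cx)
}
```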
tmandry commented on Jul 8, 2019

@Matthias247 #59087 tracks the problem of borrows disabling optimizations in general. In order to fix this general problem, we probably need foo(&x).

For size_of_val(&x), it turns out that we only need (2) and not (1), because once we inline size_of_val the borrow vanishes (we replace the entire expression with a static size).

However, this comes with its own challenge (3): we need to compute the size of our type before MIR inlining, or otherwise stick a "placeholder" value in MIR that gets replaced later. We'll have to tackle this challenge regardless of whether we solve the "general" problem, so this issue exists to track this particular case.

EDIT: Now I see that the example in #59087 was originally using size_of_val, but I think it's good to have one issue that's tracking the broader case.
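A sketch of the general shape of that problem, with a hypothetical `inspect` helper standing in for any call that borrows the future before it is awaited:

```rust
// Hypothetical helper: any function that takes a borrow will do.
fn inspect<T>(_value: &T) {}

async fn example() {
    let x = async { [0u8; 256] };
    inspect(&x); // borrowing `x` before the await keeps its slot in the layout
    x.await;
}

fn main() {
    // Only the layout is inspected; the future is never polled.
    dbg!(std::mem::size_of_val(&example()));
}
```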
tmandry commented on Jul 8, 2019

@withoutboats Yes, that would work, and it might be easier than the solution I'm describing. We can't do this with a desugaring today, though, which AFAIK is how we implement .await. @cramertj tried to do something like this with drop_in_place, but it requires the original binding to be mutable: #59123 (comment)
cramertj commented on Jul 8, 2019

This is one of the reasons I've been pushing for let mut to be a deny-by-default lint rather than a hard error.
tmandry commented on Jul 8, 2019

Actually, could we "inline" size_of_val(&x) to size_of::<T>(), even though T is not nameable in the surface syntax? That would be a pretty clean solution that sidesteps issues with monomorphization. cc @eddyb
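For a sized local this rewrite is just the identity `size_of_val(&x) == size_of::<T>()`; a tiny example with a nameable type:

```rust
fn main() {
    let x: [u64; 4] = [0; 4];
    // Measuring through a reference and naming the type agree for sized types,
    // which is what would make the proposed rewrite possible even when the
    // future's type has no surface-syntax name.
    assert_eq!(std::mem::size_of_val(&x), std::mem::size_of::<[u64; 4]>());
}
```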
tmandry commented on Jul 8, 2019

An alternative to inlining is to eliminate the move in await, sidestepping this issue altogether. See #59087 (comment).
, sidestepping this issue altogether. See #59087 (comment).15 remaining items