@AdrianLundell AdrianLundell commented Dec 8, 2025

Adds support for broadcasting in the special case where one input contains only channel values that are broadcast onto every spatial element of the other tensor, e.g. [1, C, 1, 1] + [N, C, H, W] for channel-last tensors.

This is needed for mobilenet_v3.
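For illustration, here is a minimal pure-Python sketch of why this case is cheap for channel-last data. It assumes both tensors are stored as flat buffers in channel-last (NHWC) memory order, so the [1, C, 1, 1] input is just a contiguous vector of C values that repeats every C elements. The function name `add_channel_broadcast_nhwc` is hypothetical and is not taken from this PR; the actual kernel may be implemented differently.

```python
def add_channel_broadcast_nhwc(full, per_channel):
    """Add a length-C vector to every spatial element of a flat NHWC buffer.

    `full` holds N*H*W*C values in channel-last order, so the channel index
    of element i is simply i % C. (Hypothetical helper, illustration only.)
    """
    c = len(per_channel)
    if c == 0 or len(full) % c != 0:
        raise ValueError("buffer length must be a non-zero multiple of C")
    return [x + per_channel[i % c] for i, x in enumerate(full)]


# Example: N=1, H=2, W=2, C=3 (12 elements), plus one value per channel.
full = list(range(12))
per_channel = [10, 20, 30]
result = add_channel_broadcast_nhwc(full, per_channel)
print(result)
# [10, 21, 32, 13, 24, 35, 16, 27, 38, 19, 30, 41]
```

Because the channel index is a plain modulo of the flat index, no stride bookkeeping is needed, which is what separates this case from general broadcasting.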

cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai

Signed-off-by: Adrian Lundell <[email protected]>
Change-Id: I44b1f276b8ef3c4a1a456b402ba8f53d8c870803
@AdrianLundell AdrianLundell requested a review from psiddh December 8, 2025 14:46
@AdrianLundell AdrianLundell added the "partner: arm", "ciflow/trunk", and "release notes: none" labels Dec 8, 2025
pytorch-bot bot commented Dec 8, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16131

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 984465d with merge base 488d761:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the "CLA Signed" label Dec 8, 2025
psiddh commented Dec 9, 2025

This PR does not implement general broadcasting semantics, but rather adds support for a very specific broadcasting case required for MobileNet v3 (MV3).
In general, should we aim to support full broadcasting semantics?
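To make the distinction concrete, the sketch below contrasts the narrow case with general broadcasting as a shape check. It assumes logical shapes are given in [N, C, H, W] order, as in the PR description. Both helper names are hypothetical and do not come from the PR.

```python
def is_channel_only_broadcast(small, large):
    """True if `small` is [1, C, 1, 1] and `large` is [N, C, H, W] with the same C."""
    return (
        len(small) == 4
        and len(large) == 4
        and small[0] == 1
        and small[2] == 1
        and small[3] == 1
        and small[1] == large[1]
    )


def is_supported_add(a, b):
    """Accept equal shapes or the channel-only broadcast, in either operand order."""
    a, b = tuple(a), tuple(b)
    return (
        a == b
        or is_channel_only_broadcast(a, b)
        or is_channel_only_broadcast(b, a)
    )


print(is_supported_add((1, 8, 1, 1), (2, 8, 4, 4)))  # True: channel-only case
print(is_supported_add((1, 1, 4, 4), (2, 8, 4, 4)))  # False: valid in general broadcasting, outside this case
```

A shape pair such as [1, 1, H, W] + [N, C, H, W] is legal under general broadcasting rules but falls outside this narrow check.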

AdrianLundell (Collaborator, Author) commented

My two cents: full broadcasting support costs engineering effort and code size, and likely comes with performance tradeoffs. On the other hand, the selection of networks that make sense to run on MCUs is fairly limited and might not use much broadcasting.

So in my view only implementing what is used makes sense here, until we see a large need for full broadcasting support (if ever).

psiddh commented Dec 9, 2025

> My two cents: full broadcasting support costs engineering effort and code size, and likely comes with performance tradeoffs. On the other hand, the selection of networks that make sense to run on MCUs is fairly limited and might not use much broadcasting.
>
> So in my view only implementing what is used makes sense here, until we see a large need for full broadcasting support (if ever).

Seems reasonable. If more use cases or models start requiring general broadcasting, we could look at supporting it more broadly.

@AdrianLundell AdrianLundell merged commit 0c54fd0 into pytorch:main Dec 10, 2025
297 of 299 checks passed