
Inference time for EfficientFormerV2 on 2080Ti #51

Open
zcbfpramtqs55675 opened this issue Feb 22, 2023 · 3 comments
@zcbfpramtqs55675

Hi, I tested EfficientFormerV2-S0 and EfficientFormerV2-S2 on a 2080Ti with input size 1x3x224x224 and got the following results:
EfficientFormerV2-S2: about 24 ms per input
EfficientFormerV2-S0: about 22 ms per input
Is this reasonable? It seems quite different from the A100 results in your paper. Any reply is appreciated, thanks.
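
(For reference, a minimal PyTorch timing sketch along these lines; the import path is an assumption about this repo's layout. The CUDA events and explicit `torch.cuda.synchronize()` calls matter: GPU work is asynchronous, so wall-clock timing without synchronization can badly misreport latency.)

```python
import torch
from efficientformer_v2 import efficientformer_v2_s0  # hypothetical import path

model = efficientformer_v2_s0().cuda().eval()
x = torch.randn(1, 3, 224, 224, device="cuda")

with torch.no_grad():
    for _ in range(50):          # warm-up: let cuDNN select kernels first
        model(x)
    torch.cuda.synchronize()

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(200):
        model(x)
    end.record()
    torch.cuda.synchronize()     # wait for queued kernels before reading the timer

print(f"{start.elapsed_time(end) / 200:.2f} ms per forward pass")
```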

@giantmonkeyTC

I tested V2-S1 on a 3090 and got about 14 ms per sample.
I don't know why either. If you figure it out, could you please reply to this thread?

@alanspike
Collaborator

Hi, we use TensorRT to benchmark the latency. Here is the docker image.
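
(Not the authors' exact pipeline, but a hedged sketch of that flow: export the model to ONNX from PyTorch, then measure with TensorRT's `trtexec` inside the container. The constructor import and file name are placeholders.)

```python
import torch
from efficientformer_v2 import efficientformer_v2_s0  # hypothetical import path

model = efficientformer_v2_s0().eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "effv2_s0.onnx", opset_version=13,
                  input_names=["input"], output_names=["output"])
# Then, inside the TensorRT container:
#   trtexec --onnx=effv2_s0.onnx --fp16
# trtexec reports mean GPU compute time after its own warm-up runs.
```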

@caixiongjiang

caixiongjiang commented Apr 26, 2023

I ran my segmentation model on a GeForce RTX 3060 with an EfficientFormerV2-S0 backbone and with a PoolFormer-S12 backbone; the results were 61 FPS and 108 FPS, respectively. I don't think this backbone's speed transfers well across hardware; it is specially designed for the iPhone. It is similar to segmentation models such as ENet: although the parameter count and the amount of computation are relatively small, the actual computation time is long, which may be because PyTorch has no acceleration for this kind of computation.
