Restore Enzyme to CI checks #2807
Open
wsmoses wants to merge 1 commit into master from wsmoses-patch-1
CUDA.jl Benchmarks
| Benchmark suite | Current: 2afe6c6 | Previous: bb0a27f | Ratio |
|---|---|---|---|
| latency/precompile | 56628328674.5 ns | 56531370914 ns | 1.00 |
| latency/ttfp | 8465367383.5 ns | 8348294090 ns | 1.01 |
| latency/import | 4530861281 ns | 4485476728 ns | 1.01 |
| integration/volumerhs | 9606804.5 ns | 9611273 ns | 1.00 |
| integration/byval/slices=1 | 146932 ns | 146807 ns | 1.00 |
| integration/byval/slices=3 | 426049 ns | 426057.5 ns | 1.00 |
| integration/byval/reference | 145114 ns | 145125 ns | 1.00 |
| integration/byval/slices=2 | 286406 ns | 286538 ns | 1.00 |
| integration/cudadevrt | 103457 ns | 103588 ns | 1.00 |
| kernel/indexing | 14186 ns | 14206 ns | 1.00 |
| kernel/indexing_checked | 15006.5 ns | 14985 ns | 1.00 |
| kernel/occupancy | 673.343949044586 ns | 669.9113924050633 ns | 1.01 |
| kernel/launch | 2168.6666666666665 ns | 2201.5555555555557 ns | 0.99 |
| kernel/rand | 17115 ns | 15084 ns | 1.13 |
| array/reverse/1d | 20177.5 ns | 20085 ns | 1.00 |
| array/reverse/2dL_inplace | 66812 ns | 66806.5 ns | 1.00 |
| array/reverse/1dL | 70343 ns | 70229 ns | 1.00 |
| array/reverse/2d | 21840 ns | 21647 ns | 1.01 |
| array/reverse/1d_inplace | 11477 ns | 9989.5 ns | 1.15 |
| array/reverse/2d_inplace | 13255 ns | 13425 ns | 0.99 |
| array/reverse/2dL | 73906 ns | 73806 ns | 1.00 |
| array/reverse/1dL_inplace | 66901 ns | 66788 ns | 1.00 |
| array/copy | 20827 ns | 21013 ns | 0.99 |
| array/iteration/findall/int | 156962 ns | 158131 ns | 0.99 |
| array/iteration/findall/bool | 139393.5 ns | 139780 ns | 1.00 |
| array/iteration/findfirst/int | 161049 ns | 161255 ns | 1.00 |
| array/iteration/findfirst/bool | 161777.5 ns | 162292 ns | 1.00 |
| array/iteration/scalar | 71086 ns | 74249 ns | 0.96 |
| array/iteration/logical | 214952 ns | 215568.5 ns | 1.00 |
| array/iteration/findmin/1d | 50399 ns | 50621 ns | 1.00 |
| array/iteration/findmin/2d | 96313 ns | 96593 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 43812 ns | 43717 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1 | 45159.5 ns | 47315 ns | 0.95 |
| array/reductions/reduce/Int64/dims=2 | 61368 ns | 61512 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1L | 89073 ns | 89114 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 87814 ns | 88127 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 36495 ns | 38052 ns | 0.96 |
| array/reductions/reduce/Float32/dims=1 | 42038 ns | 42212.5 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 59995 ns | 59909 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 52466 ns | 52382 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 72317 ns | 72498 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 43193 ns | 43848 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1 | 47375 ns | 55194.5 ns | 0.86 |
| array/reductions/mapreduce/Int64/dims=2 | 61631 ns | 61712 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1L | 89051 ns | 88933 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 88042 ns | 88319 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 37264.5 ns | 37106 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1 | 52288.5 ns | 41792 ns | 1.25 |
| array/reductions/mapreduce/Float32/dims=2 | 60482 ns | 60056 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=1L | 53010 ns | 52707 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=2L | 72741 ns | 72500 ns | 1.00 |
| array/broadcast | 20175 ns | 20185 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 11462 ns | 13288 ns | 0.86 |
| array/copyto!/cpu_to_gpu | 214887 ns | 216328 ns | 0.99 |
| array/copyto!/gpu_to_cpu | 283937 ns | 284828 ns | 1.00 |
| array/accumulate/Int64/1d | 124699 ns | 124343 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 83475 ns | 83433 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 158043.5 ns | 157726 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1708810 ns | 1710225.5 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 966321.5 ns | 966516 ns | 1.00 |
| array/accumulate/Float32/1d | 109353 ns | 109351 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 80459 ns | 80425 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 147423.5 ns | 147561 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1618073 ns | 1619125 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 697852.5 ns | 698195 ns | 1.00 |
| array/construct | 1300.7 ns | 1266.3 ns | 1.03 |
| array/random/randn/Float32 | 47962.5 ns | 45130 ns | 1.06 |
| array/random/randn!/Float32 | 25010 ns | 25086 ns | 1.00 |
| array/random/rand!/Int64 | 27304 ns | 27356 ns | 1.00 |
| array/random/rand!/Float32 | 8783 ns | 9035 ns | 0.97 |
| array/random/rand/Int64 | 29896 ns | 29946 ns | 1.00 |
| array/random/rand/Float32 | 13150 ns | 13310 ns | 0.99 |
| array/permutedims/4d | 59830 ns | 59701 ns | 1.00 |
| array/permutedims/2d | 54018 ns | 53857.5 ns | 1.00 |
| array/permutedims/3d | 54666 ns | 54922.5 ns | 1.00 |
| array/sorting/1d | 2756855 ns | 2758155 ns | 1.00 |
| array/sorting/by | 3343979.5 ns | 3344572 ns | 1.00 |
| array/sorting/2d | 1080937.5 ns | 1081253.5 ns | 1.00 |
| cuda/synchronization/stream/auto | 1021.9 ns | 1040.7 ns | 0.98 |
| cuda/synchronization/stream/nonblocking | 7116.9 ns | 7682 ns | 0.93 |
| cuda/synchronization/stream/blocking | 814.989247311828 ns | 807.7010309278351 ns | 1.01 |
| cuda/synchronization/context/auto | 1161.5 ns | 1187.8 ns | 0.98 |
| cuda/synchronization/context/nonblocking | 8780.9 ns | 8672.3 ns | 1.01 |
| cuda/synchronization/context/blocking | 895.1458333333334 ns | 915.4772727272727 ns | 0.98 |
This comment was automatically generated by workflow using github-action-benchmark.
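
For readers skimming the table, the Ratio column appears to be the current timing divided by the previous one, rounded to two decimals (for example, 17115 / 15084 ≈ 1.13 for kernel/rand). The following is a minimal Python sketch that recomputes the ratio for a few rows copied from the table and labels them. The 10% threshold and the helper names `ratio` and `classify` are assumptions made for illustration; they are not taken from github-action-benchmark or from the CUDA.jl CI configuration.

```python
# Rows copied from the benchmark table: name -> (current ns, previous ns).
rows = {
    "kernel/rand": (17115, 15084),
    "array/reverse/1d_inplace": (11477, 9989.5),
    "array/reductions/mapreduce/Int64/dims=1": (47375, 55194.5),
    "array/reductions/mapreduce/Float32/dims=1": (52288.5, 41792),
    "array/copyto!/gpu_to_gpu": (11462, 13288),
}

# Hypothetical threshold: treat a change of more than 10% as notable.
THRESHOLD = 0.10


def ratio(current_ns, previous_ns):
    """Current time divided by previous time, rounded to two decimals."""
    return round(current_ns / previous_ns, 2)


def classify(r, threshold=THRESHOLD):
    """Label a ratio as slower, faster, or unchanged relative to the threshold."""
    if r > 1 + threshold:
        return "slower"
    if r < 1 - threshold:
        return "faster"
    return "unchanged"


# Map each benchmark name to its (ratio, label) pair.
report = {
    name: (ratio(cur, prev), classify(ratio(cur, prev)))
    for name, (cur, prev) in rows.items()
}

for name, (r, label) in report.items():
    print(f"{name}: {r:.2f} ({label})")
```

Differences of this size on individual microbenchmarks can also be run-to-run noise, so a label from this sketch is a prompt to look closer rather than evidence of a regression.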
Enzyme CI fails.
@vchuravy looks like your fix missed the tape_type function?
79bb632 to 4a5ad9f
4a5ad9f to 2afe6c6
Restore Enzyme to the CI checks, now that @vchuravy has fixed the GPUCompiler compat.