Additional backend types to support Base.Threads #61
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #61      +/-   ##
==========================================
- Coverage   89.81%   88.34%   -1.47%
==========================================
  Files          16       16
  Lines         481      489       +8
==========================================
  Hits          432      432
- Misses         49       57       +8
Is the performance increase a general improvement or only in the mixed Polyester and Threads case?
@svchb I used the following Benchmark with 64 Threads:
using Peridynamics
function sphere_impact(; ns=6, np=3, path)
    Ø = 0.15
    ΔX0_sphere = Ø / ns
    pos_sphere, vol_sphere = uniform_sphere(Ø, ΔX0_sphere; center_z=Ø / 2 + ΔX0_sphere)
    sphere = Body(BBMaterial{EnergySurfaceCorrection}(), pos_sphere, vol_sphere)
    failure_permit!(sphere, false)
    material!(sphere; horizon=3.1ΔX0_sphere, E=210e9, rho=8000, Gc=2000)
    velocity_ic!(sphere, :all_points, :z, -20)
    lxy, lz = 2.0, 0.1
    ΔX0_plate = lz / np
    pos_plate, vol_plate = uniform_box(lxy, lxy, lz, ΔX0_plate; center_z=-lz / 2)
    plate = Body(BBMaterial{EnergySurfaceCorrection}(), pos_plate, vol_plate)
    material!(plate; horizon=3.1ΔX0_plate, E=27e9, rho=2700, Gc=10)
    ms = MultibodySetup(:sphere => sphere, :plate => plate)
    contact!(ms, :sphere, :plate; radius=min(ΔX0_sphere, ΔX0_plate))
    vv = VelocityVerlet(steps=2000)
    job = Job(ms, vv; path=path)
    @time submit(job)
    return nothing
end
##
path = "results/benchmarks/sphere_impact"
rm(path; recursive=true, force=true)
sphere_impact(; path) # compilation
sphere_impact(; ns=20, np=8, path) # work
Due to the Polyester-Threads mixing issue, I thought it could be a good idea to also use
## Peridynamics.jl v0.3.0 - everywhere using Polyester.@batch
# compilation:
# 89.256836 seconds (47.14 M allocations: 14.914 GiB, 0.66% gc time, 7.14% compilation time)
# work:
# 45.514108 seconds (40.16 M allocations: 128.786 GiB, 8.94% gc time)
However, I found out this was the reason my parallel performance for single-body simulations is not as good as before. When changing all
## Peridynamics.jl v0.3.1-DEV - everywhere using Threads.@threads :static
## PointNeighbors.jl v0.4.5-dev: update_grid!(...; parallelization_backend=ThreadsStaticBackend())
# compilation:
# 106.717161 seconds (54.34 M allocations: 15.806 GiB, 0.61% gc time, 5.13% compilation time)
# work:
# 39.184317 seconds (48.59 M allocations: 129.752 GiB, 9.53% gc time)
@efaulhaber, I also tried specifying
## Peridynamics.jl v0.3.1-DEV - everywhere using Threads.@threads :static
## PointNeighbors.jl v0.4.5-dev: update_grid!(...; parallelization_backend=KernelAbstractions.CPU(static=false))
# compilation:
# 122.136295 seconds (52.11 M allocations: 15.556 GiB, 0.55% gc time, 4.67% compilation time)
# work (2 work runs necessary to bench compiled version)
# 40.164468 seconds (47.48 M allocations: 129.602 GiB, 8.91% gc time, 11.92% compilation time)
# 39.528993 seconds (47.38 M allocations: 129.596 GiB, 8.36% gc time)
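The timings above separate a compilation run from a work run, and one configuration needed a second work run before the compilation share disappeared. A minimal sketch of that warm-up pattern is shown here, assuming a zero-argument workload; `timed_after_warmup` and `work_sketch` are hypothetical names and are not part of Peridynamics.jl or PointNeighbors.jl.

```julia
# Hypothetical helper: call `f` once untimed, then time a second call.
# The first call triggers compilation for the methods it reaches, so the
# second call mostly measures already-compiled code.
function timed_after_warmup(f)
    f()                  # warm-up call, result discarded
    return @elapsed f()  # elapsed wall-clock time of the second call, in seconds
end

# Hypothetical stand-in workload; the real benchmark calls `submit(job)`.
work_sketch() = sum(sqrt(i) for i in 1:10^6)

t = timed_after_warmup(work_sketch)
println("second run took ", t, " seconds")
```

If the warm-up call does not reach every code path of the real workload, as may have happened with the small compilation run above, the timed call can still include some compilation.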
Thanks a lot for this PR ;)
Thanks!
As discussed last week at JuliaCon, specifying the multithreading backend could be very beneficial for users. With the current approach of using Polyester in the @threaded macro, a user is forced to use either Polyester or a serial update. The performance impact of mixing different multithreading backends is also described in kaipartmann/Peridynamics.jl#110.
This PR adds additional types that can be specified as the parallelization backend. I tested the code with Peridynamics.jl and noticed a slight performance improvement using Threads.@threads :static with 64 threads compared to Polyester.@batch.
I did not add unit tests, as I was unsure where to include them correctly in your current testing setup. If you point me in the right direction, I will also include them.
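To illustrate the idea of selecting the threading strategy through a backend type, here is a minimal sketch that uses only `Base.Threads`. All type and function names ending in `Sketch` or `_sketch` are hypothetical and chosen for illustration; this is not the code added in this PR, and the actual types and the `@threaded` macro in PointNeighbors.jl may look different.

```julia
# Hypothetical backend types; dispatch on them selects the loop implementation.
abstract type AbstractBackendSketch end
struct SerialSketch <: AbstractBackendSketch end
struct ThreadsStaticSketch <: AbstractBackendSketch end
struct ThreadsDynamicSketch <: AbstractBackendSketch end

# Serial fallback: a plain loop without any threading.
function parallel_foreach_sketch(f, iterator, ::SerialSketch)
    for i in iterator
        f(i)
    end
    return nothing
end

# Static scheduling: iterations are split evenly and pinned to threads.
function parallel_foreach_sketch(f, iterator, ::ThreadsStaticSketch)
    Threads.@threads :static for i in iterator
        f(i)
    end
    return nothing
end

# Dynamic scheduling: iterations are distributed as tasks (requires Julia 1.8 or newer).
function parallel_foreach_sketch(f, iterator, ::ThreadsDynamicSketch)
    Threads.@threads :dynamic for i in iterator
        f(i)
    end
    return nothing
end

# Usage: every index is written by exactly one iteration, so there is no data race.
function squares_sketch(n, backend)
    out = zeros(Int, n)
    parallel_foreach_sketch(eachindex(out), backend) do i
        out[i] = i^2
    end
    return out
end

println(squares_sketch(5, ThreadsStaticSketch()))
```

With this design, a caller picks the backend once and passes it down, so a package that already uses `Threads.@threads` can avoid mixing it with another threading library in the same simulation.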