-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bit of cleanup and using views
in MultiLayerQG
module
#293
Conversation
views
in MultiLayerQG.energies
views
in MultiLayerQG.energies
verticalfluxes = sum(@views @. params.f₀^2 / params.g′ * (U₁ - U₂) * v₂ * ψ₁; dims=(1, 2)) | ||
|
||
verticalfluxes = sum(params.f₀^2 / params.g′ * (U₁ .- U₂) .* v₂ .* ψ₁) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still allocates memory I suppose, but might be harder to eliminate that (might have to use Tullio or something)
src/multilayerqg.jl
Outdated
@@ -967,13 +972,15 @@ function energies(vars, params, grid, sol) | |||
|
|||
abs²∇𝐮h = vars.uh # use vars.uh as scratch variable | |||
@. abs²∇𝐮h = grid.Krsq * abs2(vars.ψh) | |||
|
|||
LxLyH = grid.Lx * grid.Ly * sum(params.H) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LxLyH = grid.Lx * grid.Ly * sum(params.H) | |
V = grid.Lx * grid.Ly * sum(params.H) |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point!
src/multilayerqg.jl
Outdated
end | ||
|
||
for j = 1:nlayers-1 | ||
CUDA.@allowscalar PE[j] = 1 / (2 * grid.Lx * grid.Ly * sum(params.H)) * params.f₀^2 / params.g′[j] * parsevalsum(abs2.(vars.ψh[:, :, j+1] .- vars.ψh[:, :, j]), grid) | ||
view(PE, j) .= 1 / (2 * LxLyH) * params.f₀^2 ./ params.g′[j] .* parsevalsum(abs2.(view(vars.ψh, :, :, j) .- view(vars.ψh, :, :, j+1)), grid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the views can also be defined on separate lines for clarity
…jl into ncc/views
This is an attempt to resolve #273.
I benchmarked it. It's not any faster than v0.14.2. But it's cleaner.
Seems like scalar operations mentioned in #273 were not causing any bottleneck after all...
Bench script:
Using this PR
With GeophysicalFlows v0.14.2