Skip to content

Conversation

@xkykai
Copy link
Collaborator

@xkykai xkykai commented Oct 21, 2025

Closes #4875.

This PR aims to extend output writers so that we can pass in schedules which are AveragedSpecifiedTimes(times::Array{<:Number}, window=w::Array{<:Number}, ...) into output writers.

One compelling use case for this (at least for me) is to perform time averaging in Year units when running realistic coupled simulations in ClimaOcean.jl. For example, if I want to compute yearly averages over realistic years, each year would have different number of days, and the time interval between the end of each year is also not uniform. This calls for schedule = AveragedSpecifiedTimes(...), thus making this feature request a compelling one.

@xkykai xkykai added feature 🌟 Something new and shiny output 💾 labels Oct 21, 2025
@xkykai xkykai changed the title first commit for array averagedspecifiedtimes Enable AveragedSpecifiedTimes to be used in output writers Oct 21, 2025
@xkykai
Copy link
Collaborator Author

xkykai commented Oct 22, 2025

Apart from writing some tests I think this is now ready for review.

For a script like:

using Oceananigans
using Oceananigans.OutputWriters: AveragedSpecifiedTimes
using Printf
using Random

arch = CPU()
grid = RectilinearGrid(arch, size=(4, 4, 4), extent=(1, 1, 1))

model = NonhydrostaticModel(grid=grid)

simulation = Simulation(model, Δt=0.0001, stop_time=1)

times = [0.1, 0.25, 0.5, 0.6, 1]
window = [0.1, 0.05, 0.2, 0.1, 0.3]
true_u = vcat([0], times .- window ./ 2)
# window = 0.1

perm = randperm(length(times))
times .= times[perm]
window .= window[perm]

schedule = AveragedSpecifiedTimes(times; window)

wall_time = Ref(time_ns())

function progress(sim)
    msg = @sprintf("iter: %d, time: %s, Δt: %.4f",
                    iteration(sim), prettytime(time(sim)), sim.Δt)

    elapsed = 1e-9 * (time_ns() - wall_time[])

    msg *= @sprintf(", max u: %6.3e, max v: %6.3e, max w: %6.3e, wall time: %s",
                    maximum(sim.model.velocities.u),
                    maximum(sim.model.velocities.v),
                    maximum(sim.model.velocities.w),
                    prettytime(elapsed))

    @info msg
    wall_time[] = time_ns()

    return nothing
end

function set_synthetic_values!(sim)
    t = time(sim)
    set!(sim.model, u=t, v=0, w=0)
    return nothing
end

simulation.callbacks[:progress] = Callback(progress, IterationInterval(1000))
simulation.callbacks[:set_synthetic_values] = Callback(set_synthetic_values!, IterationInterval(1))

simulation.output_writers[:jld2] = JLD2Writer(model, model.velocities;
                                              filename = "averagedspecified_times",
                                              schedule,
                                              overwrite_existing = true)

run!(simulation)

u = FieldTimeSeries("averagedspecified_times.jld2", "u")

@info "Recorded times: ", u.times
@info "Recorded u: ", interior(u, 1, 1, 1, :)
@info "Expected u: ", true_u

it gives an output of

[ Info: ("Recorded times: ", [0.0, 0.1, 0.25, 0.5, 0.6, 1.0])
[ Info: ("Recorded u: ", [0.0, 0.049949999898672104, 0.22495000064373016, 0.3999499976634979, 0.5499500036239624, 0.8499500155448914])
[ Info: ("Expected u: ", [0.0, 0.05, 0.225, 0.4, 0.5499999999999999, 0.85])

Now if window is a number i.e.

window = 0.1

we get results which are

[ Info: ("Recorded times: ", [0.0, 0.1, 0.25, 0.5, 0.6, 1.0])
[ Info: ("Recorded u: ", [0.0, 0.049949999898672104, 0.19994999468326569, 0.4499500095844269, 0.5499500036239624, 0.9499499797821045])
[ Info: ("Expected u: ", [0.0, 0.05, 0.2, 0.45, 0.5499999999999999, 0.95])

which are "correct". These answers are approximate as the u field is only set to t at every iteration, so the averaged results can't be exact. I was trying to come up with exact tests but I couldn't think of one so if people have suggestions I'd happy to use them!

I will write tests like this if people think this is good enough of when we can come up with tests that return correct answers exactly.

Now in terms of edge cases. When the windows overlap e.g. (note that the 4th window overlaps with the 3rd time point

times = [0.1, 0.25, 0.5, 0.6, 1]
window = [0.1, 0.05, 0.2, 0.11, 0.3]

we get an error of

ERROR: ArgumentError: Averaging windows overlap. Ensure that for each specified time tᵢ, tᵢ - windowᵢ  tᵢ₋₁.
Stacktrace:
 [1] AveragedSpecifiedTimes(times::Vector{Float64}, window::Vector{Float64}; kw::@Kwargs{})
   @ Oceananigans.OutputWriters c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\src\OutputWriters\windowed_time_average.jl:130
 [2] AveragedSpecifiedTimes
   @ c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\src\OutputWriters\windowed_time_average.jl:120 [inlined]
 [3] #AveragedSpecifiedTimes#21
   @ c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\src\OutputWriters\windowed_time_average.jl:117 [inlined]
 [4] top-level scope
   @ c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\test_averagedspecified_times.jl:23

Similarly when window is a number e.g.

window = 0.11

we get

ERROR: ArgumentError: Averaging window 0.11 is too large and causes overlapping windows. Ensure that for each specified time tᵢ, tᵢ - window  tᵢ₋₁.
Stacktrace:
 [1] AveragedSpecifiedTimes(times::Vector{Float64}, window::Float64; kw::@Kwargs{})
   @ Oceananigans.OutputWriters c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\src\OutputWriters\windowed_time_average.jl:135
 [2] AveragedSpecifiedTimes
   @ c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\src\OutputWriters\windowed_time_average.jl:135 [inlined]
 [3] #AveragedSpecifiedTimes#21
   @ c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\src\OutputWriters\windowed_time_average.jl:117 [inlined]
 [4] top-level scope
   @ c:\Users\xinle\Dropbox\MIT\main_oceananigans\Oceananigans.jl\test_averagedspecified_times.jl:23

In order to use the type dispatch style on window which in the API is a keyword argument I currently use a wrapper

AveragedSpecifiedTimes(times; window, kw...) = AveragedSpecifiedTimes(times, window; kw...)

to reroute the function to the right dispatch with window as positional argument.

Alternatively there is also a method which is currently commented out

# function AveragedSpecifiedTimes(times; window, kw...)
# perm = sortperm(times)
# sorted_times = times[perm]
# time_diff = diff(vcat(0, sorted_times))
# if window isa Vector{Float64}
# length(window) == length(times) || throw(ArgumentError("When providing a vector of windows, its length $(length(window)) must match the number of specified times $(length(times))."))
# sorted_window = window[perm]
# @info "timediff", time_diff
# @info "sortedwindow", sorted_window
# any(time_diff .- sorted_window .< -eps(eltype(window))) && throw(ArgumentError("Averaging windows overlap. Ensure that for each specified time tᵢ, tᵢ - windowᵢ ≥ tᵢ₋₁."))
# return AveragedSpecifiedTimes(SpecifiedTimes(sorted_times); window=sorted_window, kw...)
# elseif window isa Number
# any(time_diff .- window .< -eps(typeof(window))) && throw(ArgumentError("Averaging window $window is too large and causes overlapping windows. Ensure that for each specified time tᵢ, tᵢ - window ≥ tᵢ₋₁."))
# return AveragedSpecifiedTimes(SpecifiedTimes(times); window, kw...)
# else
# throw(ArgumentError("window must be a Float64 or a Vector{Float64}, got $(typeof(window))"))
# end
# end

which checks the type of window in the function loop.

@glwagner @simone-silvestri @navidcy let me know what do you think and give suggestions on better model design!

PS: also why is window forced to be Float64 rin AveragedSpecifiedTimes right now?

mutable struct AveragedSpecifiedTimes <: AbstractSchedule
specified_times :: SpecifiedTimes
window :: Float64
stride :: Int
collecting :: Bool
end

Why was that choice made?

@xkykai xkykai marked this pull request as ready for review October 22, 2025 01:22
@xkykai xkykai requested review from glwagner, navidcy and simone-silvestri and removed request for glwagner October 22, 2025 01:22
A schedule for averaging over windows that precede SpecifiedTimes.
"""
mutable struct AveragedSpecifiedTimes <: AbstractSchedule
mutable struct AveragedSpecifiedTimes{W <: Union{Float64, Vector{Float64}}} <: AbstractSchedule
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mutable struct AveragedSpecifiedTimes{W <: Union{Float64, Vector{Float64}}} <: AbstractSchedule
mutable struct AveragedSpecifiedTimes{W} <: AbstractSchedule

collecting :: Bool
end

const VaryingWindowAveragedSpecifiedTimes = AveragedSpecifiedTimes{Vector{Float64}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const VaryingWindowAveragedSpecifiedTimes = AveragedSpecifiedTimes{Vector{Float64}}
const VaryingWindowAveragedSpecifiedTimes = AveragedSpecifiedTimes{<:Vector}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you want to limit to Float64. That won't work with dates.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we relax AveragedTimeInterval as well?

mutable struct AveragedTimeInterval <: AbstractSchedule
interval :: Float64
window :: Float64
stride :: Int
first_actuation_time :: Float64
actuations :: Int
collecting :: Bool
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! good catch. I can do that in another PR if you like

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the above, interval and window have a "time period" type, while first_actuation_time has a "time" type. These are different when using dates.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there currently a TimePeriod type? I can try to define one.

Should we also make a SpecifiedWindows like SpecifiedTimes in the case for AveragedSpecifiedTimes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want a specific type, because sometimes a "period" is Float64 (the assumption here), but with dates, it may need to be Dates.Second or Dates.Day (for example). However, a "time type" can be Float64 (the assumption here) or DateTime.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we relax it and make it parametric? or constrain it so that interval :: Union{Float64, Dates.Period}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we need to use type parameters, and there need to be two of them.

We don't use the type union approach because it is not concrete; with that approach types can no longer be inferred.

"""
mutable struct AveragedSpecifiedTimes{W <: Union{Float64, Vector{Float64}}} <: AbstractSchedule
mutable struct AveragedSpecifiedTimes{W} <: AbstractSchedule
specified_times :: SpecifiedTimes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since SpecifiedTimes is not a concrete type, this should also be a type parameter

Comment on lines 120 to 128
function determine_epsilon(eltype)
if eltype <: AbstractFloat
return eps(eltype)
elseif eltype <: Period
return Second(0)
else
return 0
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function determine_epsilon(eltype)
if eltype <: AbstractFloat
return eps(eltype)
elseif eltype <: Period
return Second(0)
else
return 0
end
end
determine_epsilon(eltype) = 0
determine_epsilon(eltype::AbstractType) = eps(eltype)
determine_epsilon(::Period) = Second(0)

the name of the function is slightly problematic

Copy link
Collaborator Author

@xkykai xkykai Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you meant

determine_epsilon(eltype) = 0
determine_epsilon(::Type{T}) where T <: AbstractFloat = eps(T)
determine_epsilon(::Period) = Second(0)

right?

I agree the name is terrible, but I haven't figured out a better alternative...

end

function AveragedSpecifiedTimes(times, window::Float64; kw...)
function AveragedSpecifiedTimes(times, window::Union{<:Number, <:Period}; kw...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function AveragedSpecifiedTimes(times, window::Union{<:Number, <:Period}; kw...)
function AveragedSpecifiedTimes(times, window; kw...)

can this be the fallback?

"""
mutable struct AveragedSpecifiedTimes{W} <: AbstractSchedule
specified_times :: SpecifiedTimes
mutable struct AveragedSpecifiedTimes{S<:SpecifiedTimes, W} <: AbstractSchedule
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mutable struct AveragedSpecifiedTimes{S<:SpecifiedTimes, W} <: AbstractSchedule
mutable struct AveragedSpecifiedTimes{S, W} <: AbstractSchedule

not that its bad, but we don't use this style right now in Oceananigans. Basically we want type restrictions to have a specific purpose

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature 🌟 Something new and shiny output 💾

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AverageSpecifiedTimes cannot be used

2 participants