
Create containers with map instead of for loops #2070

Merged — 9 commits merged into master on Oct 4, 2019
Conversation

blegat (Member) commented Sep 23, 2019

Currently, when creating a container, the JuMP macros first allocate the container and then fill it with for loops.
The disadvantage of this approach is that the element type must be determined beforehand, so the eltype is not concrete for @constraint, and @expression only works for AffExpr (see #525).
With this PR, we rely on map to create the array holding the values of the container, so the eltype no longer needs to be determined up front. This also removes the need for variable_type and constraint_type (but we should keep them to avoid making a breaking change).

The PR also lays the foundations for creating constrained variables, as this was needed to update the creation of SDP and symmetric matrices of variables with the new design.

As a proof of concept that the container logic is now completely decoupled from JuMP, the Containers submodule now has a @container macro.

  • Fix tests.
  • Update doc.

Closes #525
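The two strategies can be contrasted with a minimal sketch. This is a hypothetical illustration, not the PR's actual code; the names `fill_container` and `map_container` are made up for the example.

```julia
# Before: the eltype must be chosen before filling, so it is often abstract.
function fill_container(f, indices)
    x = Vector{Any}(undef, length(indices))  # `Any` because the value type is unknown up front
    for (k, i) in enumerate(indices)
        x[k] = f(i)
    end
    return x
end

# After: `map` infers a concrete eltype from the values themselves.
map_container(f, indices) = map(f, indices)

eltype(fill_container(i -> 2i, 1:3))  # Any
eltype(map_container(i -> 2i, 1:3))   # Int
```

With `map`, downstream code operating on the container can be compiled for the concrete element type instead of boxing every element.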

Benchmarks

For the test/perf/speed.jl benchmark, before the PR:

```julia
julia> include("speed.jl")
P-Median(100 facilities, 100 customers, 5000 locations) benchmark:
BenchmarkTools.Trial:
  memory estimate:  703.37 MiB
  allocs estimate:  11015593
  --------------
  minimum time:     1.624 s (51.79% GC)
  median time:      2.267 s (57.83% GC)
  mean time:        2.137 s (58.00% GC)
  maximum time:     2.519 s (62.15% GC)
  --------------
  samples:          3
  evals/sample:     1

Cont5(n=500) benchmark:
BenchmarkTools.Trial:
  memory estimate:  365.94 MiB
  allocs estimate:  5779814
  --------------
  minimum time:     1.061 s (50.03% GC)
  median time:      1.134 s (46.80% GC)
  mean time:        1.142 s (47.05% GC)
  maximum time:     1.206 s (46.29% GC)
  --------------
  samples:          5
  evals/sample:     1
```

After the PR:

```julia
julia> include("speed.jl")
P-Median(100 facilities, 100 customers, 5000 locations) benchmark:
BenchmarkTools.Trial:
  memory estimate:  703.39 MiB
  allocs estimate:  11016099
  --------------
  minimum time:     1.533 s (57.35% GC)
  median time:      1.706 s (61.08% GC)
  mean time:        1.693 s (60.98% GC)
  maximum time:     1.826 s (62.87% GC)
  --------------
  samples:          4
  evals/sample:     1

Cont5(n=500) benchmark:
BenchmarkTools.Trial:
  memory estimate:  365.94 MiB
  allocs estimate:  5779820
  --------------
  minimum time:     761.274 ms (40.83% GC)
  median time:      808.268 ms (41.89% GC)
  mean time:        821.060 ms (44.11% GC)
  maximum time:     883.443 ms (48.30% GC)
  --------------
  samples:          7
  evals/sample:     1
```

The speedup is between 25% and 50%, which is nice to see!

|          | Median | Mean |
|----------|--------|------|
| P-Median | 25 %   | 47 % |
| Cont5    | 29 %   | 28 % |

I suspect the speedup comes from the fact that the containers of constraints now have a concrete element type. This was the case in JuMP v0.18, IIRC, but not anymore in JuMP v0.19, so this PR may help with #1403.

For the test/perf/axis_constraints.jl benchmark, before this PR:

```julia
julia> dense_axis_constraints(1000)
  0.001575 seconds (8.01 k allocations: 272.016 KiB)
  708.939 μs (13986 allocations: 234.14 KiB)
  700.468 μs (13986 allocations: 234.14 KiB)

julia> dense_axis_constraints(1000)
  0.001671 seconds (8.01 k allocations: 272.016 KiB)
  707.259 μs (13986 allocations: 234.14 KiB)
  711.204 μs (13986 allocations: 234.14 KiB)

julia> sparse_axis_constraints(1000)
  0.001006 seconds (5.83 k allocations: 138.234 KiB)
  361.992 μs (7000 allocations: 117.19 KiB)
  389.133 μs (7193 allocations: 120.20 KiB)

julia> sparse_axis_constraints(1000)
  0.001039 seconds (5.83 k allocations: 138.234 KiB)
  361.963 μs (7000 allocations: 117.19 KiB)
  390.414 μs (7193 allocations: 120.20 KiB)
```

After this PR:

```julia
julia> dense_axis_constraints(1000)
  0.000709 seconds (8.02 k allocations: 272.281 KiB)
  669.011 μs (12987 allocations: 218.53 KiB)
  668.274 μs (12987 allocations: 218.53 KiB)

julia> dense_axis_constraints(1000)
  0.001555 seconds (8.02 k allocations: 272.281 KiB)
  666.685 μs (12987 allocations: 218.53 KiB)
  670.274 μs (12987 allocations: 218.53 KiB)

julia> dense_axis_constraints(1000)
  0.001748 seconds (8.02 k allocations: 272.281 KiB)
  681.249 μs (12987 allocations: 218.53 KiB)
  674.972 μs (12987 allocations: 218.53 KiB)

julia> sparse_axis_constraints(1000)
  0.001240 seconds (5.78 k allocations: 145.906 KiB)
  352.693 μs (6500 allocations: 109.38 KiB)
  351.343 μs (6500 allocations: 109.38 KiB)

julia> sparse_axis_constraints(1000)
  0.001090 seconds (5.78 k allocations: 145.906 KiB)
  347.866 μs (6500 allocations: 109.38 KiB)
  355.565 μs (6500 allocations: 109.38 KiB)

julia> sparse_axis_constraints(1000)
  0.001250 seconds (5.78 k allocations: 145.906 KiB)
  348.161 μs (6500 allocations: 109.38 KiB)
  359.230 μs (6500 allocations: 109.38 KiB)
```

We see a speedup when computing the sum of the duals, and fewer allocations, because the eltypes of the containers are concrete.
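The effect of a concrete eltype can be sketched in isolation. This is a generic illustration of the mechanism, not the benchmark code:

```julia
# Summing a concretely typed vector works on unboxed Float64 values; an
# abstractly typed vector forces a boxed, dynamically dispatched loop.
abstract_v = Any[float(i) for i in 1:1000]      # eltype Any
concrete_v = Float64[float(i) for i in 1:1000]  # eltype Float64

# Same result either way, but the allocation profile differs:
sum(abstract_v) == sum(concrete_v)  # true (both 500500.0)
```

Timing the two `sum` calls with `@time` or BenchmarkTools shows the abstract version allocating on every element, which matches the allocation counts reported above.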

codecov bot commented Sep 24, 2019

Codecov Report

Merging #2070 into master will decrease coverage by 0.29%.
The diff coverage is 87.04%.


```diff
@@           Coverage Diff            @@
##           master   #2070     +/-   ##
========================================
- Coverage    91.4%   91.1%   -0.3%
========================================
  Files          33      38      +5
  Lines        4246    4318     +72
========================================
+ Hits         3881    3934     +53
- Misses        365     384     +19
```
| Impacted Files | Coverage Δ |
|----------------|------------|
| src/JuMP.jl | 84.89% <ø> (ø) ⬆️ |
| src/Containers/Containers.jl | 50% <ø> (ø) ⬆️ |
| src/shapes.jl | 33.33% <0%> (-6.67%) ⬇️ |
| src/Containers/no_duplicate_dict.jl | 100% <100%> (ø) |
| src/parse_expr.jl | 88.21% <100%> (+0.11%) ⬆️ |
| src/Containers/vectorized_product_iterator.jl | 46.15% <46.15%> (ø) |
| src/Containers/container.jl | 61.53% <61.53%> (ø) |
| src/macros.jl | 93.13% <87.35%> (-0.14%) ⬇️ |
| src/Containers/nested_iterator.jl | 88% <88%> (ø) |
| src/sd.jl | 93.02% <89.65%> (-6.98%) ⬇️ |

... and 9 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@blegat blegat marked this pull request as ready for review September 24, 2019 15:17
mlubin (Member) left a comment:

Looks promising! I expect to be able to do a real review over the weekend.

````diff
@@ -251,7 +251,7 @@
 One way of adding a group of constraints compactly is the following:
 ```jldoctest constraint_arrays; setup=:(model=Model(); @variable(model, x))
 julia> @constraint(model, con[i = 1:3], i * x <= i + 1)
-3-element Array{ConstraintRef{Model,C,Shape} where Shape<:AbstractShape where C,1}:
+3-element Array{ConstraintRef{Model,MathOptInterface.ConstraintIndex{MathOptInterface.ScalarAffineFunction{Float64},MathOptInterface.LessThan{Float64}},ScalarShape},1}:
````
Member:
This is a pretty bad printing experience.

mlubin (Member) left a comment:

This is great, it addresses a design issue that goes back quite a long time. Could you run benchmarks to make sure there's no significant regression?

Review threads (all resolved):
- src/Containers/container.jl
- src/Containers/macro.jl
- src/Containers/vectorized_product_iterator.jl
- src/Containers/nested_iterator.jl
mlubin (Member) left a comment:

Looks good, I still recommend some benchmarks.

Cartesian product of the iterators `prod.iterators`. It is the same iterator as
`Base.Iterators.ProductIterator` except that it is independent of the
`IteratorSize` of the elements of `prod.iterators`.
Member:

I think this could benefit from a longer comment about why the behavior of Base.Iterators.ProductIterator wasn't sufficient and required a wrapper. (Maybe outside the docstring.) What breaks if we use Base.Iterators.ProductIterator?

blegat (Member, Author):

In fact, every test passed with Iterators.ProductIterator, but
https://github.com/JuliaOpt/JuMP.jl/blob/1a4f663504606464292caf2db2d60acba6991af3/examples/cutting_stock_column_generation.jl#L149
fails.
I have now added tests in test/Containers/ covering this feature.
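A minimal illustration of the behavioral difference (my own sketch, not the failing example): the shape of Base's product iterator depends on the `IteratorSize` of its components, so the same code can produce containers of different shapes.

```julia
# With sized components, the product has a known shape and collects to a Matrix.
p1 = Iterators.product(1:2, 1:3)
collect(p1) isa Matrix  # true

# With a SizeUnknown component (e.g. a filtered iterator), the product itself
# becomes SizeUnknown and collects to a flat Vector instead.
p2 = Iterators.product(1:2, Iterators.filter(iseven, 1:3))
Base.IteratorSize(typeof(p2)) == Base.SizeUnknown()  # true
collect(p2) isa Vector                               # true
```

A wrapper that fixes the iteration behavior regardless of component `IteratorSize` shields container creation from this shape change.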

blegat (Member, Author) commented Oct 2, 2019

I added benchmarks in the initial PR comment.

mlubin (Member) commented Oct 2, 2019

The numbers from test/perf/speed.jl look great. However, they don't appear to cover SparseAxisArray or DenseAxisArray. Could you also test those?

blegat (Member, Author) commented Oct 2, 2019

There is a small performance hit for creating SparseAxisArray. The problem is

```julia
function container(f::Function, indices,
                   ::Type{SparseAxisArray})
    mappings = map(I -> I => f(I...), indices)
    data = Dict(mappings)
    if length(mappings) != length(data)
        error("Repeated index ...")
    end
    return SparseAxisArray(data)
end
```

because we allocate `mappings`. Instead, we could do

```julia
function container(f::Function, indices,
                   ::Type{SparseAxisArray})
    mappings = Base.Generator(I -> I => f(I...), indices)
    data = Dict(mappings)
    if length(mappings) != length(data)
        error("Repeated index ...")
    end
    return SparseAxisArray(data)
end
```

but then `length(mappings)` is no longer O(1).
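The tradeoff can be sketched in isolation (a generic illustration, independent of JuMP's types):

```julia
# `map` materializes an intermediate Vector of pairs; a Generator is lazy and
# allocates nothing until consumed.
eager = map(i -> i => i^2, 1:4)             # Vector{Pair{Int, Int}}
lazy  = Base.Generator(i -> i => i^2, 1:4)  # nothing materialized yet

length(eager)  # 4, O(1)
length(lazy)   # 4, O(1) here, because the underlying 1:4 knows its length

# Over a filtered (SizeUnknown) iterator, the Generator's length is not
# defined at all, so the duplicate check cannot compare lengths cheaply.
filtered = Base.Generator(i -> i => i^2, Iterators.filter(iseven, 1:4))
Base.IteratorSize(typeof(filtered)) == Base.SizeUnknown()  # true
```

So the lazy version saves the intermediate allocation but gives up the cheap `length(mappings) != length(data)` duplicate test whenever the index iterator is `SizeUnknown`.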

Same as `Dict{K, V}` but errors if constructed from an iterator with duplicate
keys.
"""
struct NoDuplicateDict{K, V} <: AbstractDict{K, V}
mlubin (Member):
Do we need to define a new type that's only used in one function? What about a function that takes a generator, constructs a Dict that errors on duplicates, and returns the Dict?

blegat (Member, Author):

> constructs a Dict that errors on duplicates

How do you do that? That's exactly what NoDuplicateDict is trying to do.

mlubin (Member):

I didn't dig too deeply, but my question was why you can't define a function like:

```julia
function dict_no_duplicates(::Type{K}, ::Type{V}, it) where {K, V}
    dict = Dict{K, V}()
    for (k, v) in it
        if haskey(dict, k)
            error("Repeated index ...")
        else
            dict[k] = v
        end
    end
    return dict
end
```

This doesn't require defining a new type. Maybe I'm missing something related to dict_with_eltype.

blegat (Member, Author):

The issue is that we don't know the type of the keys and values. The challenge is to infer it in a way that is friendly to the Julia compiler and its inference system.
So Julia starts iterating, assumes the type is always the same, builds the container with the type of the first element, and then, if a new element does not fit, creates a new container with a wider type, copies the elements over, and continues.
This is what is done in map (which DenseAxisArray relies on) and dict_with_eltype (which SparseAxisArray relies on).
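The widening behavior described above is observable directly with `map` (a generic demonstration of Base's collection machinery, not JuMP code):

```julia
# Homogeneous values: the container keeps the concrete type of the first
# element, so no widening is needed.
same = map(i -> i => 2i, 1:3)
eltype(same)   # Pair{Int, Int} — concrete

# Heterogeneous values: the first element fixes Int, the second doesn't fit,
# so Base rebuilds the container with the widened type typejoin(Int, Float64).
mixed = map(i -> i == 1 ? 1 : 2.0, 1:3)
eltype(mixed)  # Real — abstract, the result of widening
```

Because the widening logic lives inside `map` and `dict_with_eltype`, hooking into it (rather than reimplementing it) is what motivates a dedicated `AbstractDict` subtype instead of a plain loop with a preset `Dict{K, V}`.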

blegat (Member, Author) commented Oct 2, 2019

There seems to be something fishy with NestedIterator:

```julia
julia> f() = (it = Containers.nested(() -> 1:1000, condition = iseven); @time map(i -> 1, it); return)
f (generic function with 2 methods)

julia> f()
  0.020841 seconds (79.05 k allocations: 3.991 MiB)

julia> f()
  0.000392 seconds (3.50 k allocations: 125.422 KiB)

julia> f() = (it = Iterators.filter(iseven, 1:1000); @time map(i -> 1, it); return)
f (generic function with 2 methods)

julia> f()
  0.000013 seconds (10 allocations: 8.406 KiB)

julia> f()
  0.000023 seconds (10 allocations: 8.406 KiB)
```

The fact that eltype(it) is Any in the first case does not explain the difference in allocations.

EDIT: this should be improved by 1192fd8.

blegat (Member, Author) commented Oct 2, 2019

The performance issue with SparseAxisArray is now resolved; I have updated the benchmarks.

@blegat blegat merged commit 04735d2 into master Oct 4, 2019
@mlubin mlubin deleted the bl/@container branch December 18, 2019 13:19
@blegat blegat mentioned this pull request Feb 21, 2020
Successfully merging this pull request may close these issues.

Can we make at-expression more general?