
Latest update of the code corresponding to your paper? #37

Open · 152215 opened this issue Nov 19, 2024 · 6 comments

152215 commented Nov 19, 2024

Hello, I hope this message finds you well. I was wondering if the latest update to the code corresponds to the implementation described in your paper: "A New Framework to Generate Lagrangian Cuts in Multistage Stochastic Mixed-Integer Programming"? Thank you for your time and assistance!

ChrisFuelOR (Owner) commented Nov 25, 2024

Hello. Thanks for your interest in this project. The latest update on the branch "CorePointUpdate" corresponds to the implementation mentioned in that paper. I have not merged it to the main branch with a pull request yet.

In general, before the paper is published in its final form, I will definitely spend some time tidying up the code so that it is more accessible to others and gets proper documentation. Right now it is still in a preliminary form. Maybe you have also seen that the code uses a very old Julia version. This is because I started this project, and the experiments for the paper, a long time ago and did not want to make disruptive changes during my experiments. Once the paper is published, I will update everything to the current versions of Julia, JuMP, etc.

As there is no proper documentation yet: if you want to understand how the code works, it is probably best to look at the files corresponding to the case studies from the paper (CLSP, CFLP), as well as the file typedefs.jl, which defines all user-controlled parameters. If you have any specific questions, just let me know here or via e-mail.

ChrisFuelOR (Owner) commented Nov 26, 2024

Hello. What exactly do you mean by "some parts of the script seem to be missing"? The code works as it is; I used it just two weeks ago for computational tests. Maybe you can specify what seems to be missing.

Besides that, I am not quite sure what kind of SDDP script you are looking for.

If you refer to running functions from SDDP.jl somewhere in my project: this is not done. When I started working on this project, I also dealt with non-convex cuts to approximate the value functions (this is, however, not covered in the paper you are referring to). Compared to SDDP.jl, this required some significant changes to the Bellman function implementation. Therefore, I based my code on SDDP.jl and re-used some of its code, but SDDP.jl's train function itself is never called.

If you simply refer to calling the SDDiP algorithm: there is a function named DynamicSDDiP.solve(), which is called in the starter.jl files of my examples and initiates a run of my version of SDDiP. The function itself is part of the algorithmMain.jl file.

To be more precise: for each example in the examples directory, there is a model.jl file defining the model to be solved, an algo_config.jl file specifying most of the parameter choices for the current runs (these parameters are defined and described in typedefs.jl in the src directory), and a starter.jl file that allows you to specify several runs (sharing the algo_config file) at once. Running the starter.jl file starts these runs. Each run is initiated by calling the DynamicSDDiP.solve() function from algorithmMain.jl in the src directory, which in turn calls more specific functions for the forward pass, the backward pass, solving the Lagrangian dual, etc. So there is not really one single SDDP/SDDiP script; the algorithm consists of several functions split across multiple files in the src directory. A sketch of this call chain follows below.
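To make that concrete, here is a minimal, hypothetical skeleton of a starter.jl file. Only DynamicSDDiP.solve(), model.jl, algo_config.jl, and the file layout are taken from the description above; the variable names and the exact arguments of solve() are assumptions and should be checked against algorithmMain.jl:

# Hypothetical starter.jl skeleton; argument names are assumptions.
include("model.jl")        # defines the model to be solved
include("algo_config.jl")  # sets the parameters defined in src/typedefs.jl

# The run calls DynamicSDDiP.solve() (src/algorithmMain.jl), which drives
# the forward pass, backward pass, Lagrangian dual solves etc.
DynamicSDDiP.solve(model, algo_params)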

Hope this explanation helps.

152215 (Author) commented Nov 26, 2024

Thanks for your reply, I will continue to study your code.

ChrisFuelOR (Owner) commented:

Okay. Just let me know if you have any further questions.

152215 (Author) commented Dec 12, 2024

I apologize for disturbing you again, but I am currently encountering some issues and I would appreciate your guidance:

  1. Since I intend to use the non-binarization approach for my problem, I would like to know if there are any other adjustments required in the algorithm besides the binary transformation of variables. From my understanding of your code, I did not notice any other adjustments needed, meaning the algorithm should be generally applicable. I would like to confirm whether that is the case.

  2. While using the Chen-Luedtke method to generate cuts, I noticed that the Benders cut index is updated at each backward step, but I couldn’t find where exactly the Benders cuts are generated. Could you kindly point me in the right direction? I have also reviewed the Chen-Luedtke code, and if I remember correctly, their strategy generates new Benders cuts at each iteration, so I assume your approach follows this strategy as well. I would appreciate your confirmation.

"""
Updating the list of Benders cuts for the approach by Chen & Luedtke
for current cut_generation_regime without state approximation
and for SingleCutRegime.
"""
function update_Benders_cut_list!(
    node::SDDP.Node,
    add_cut_flags::Vector{Bool},
    cut_aggregation_regime::DynamicSDDiP.SingleCutRegime,
    state_approximation_regime::DynamicSDDiP.NoStateApproximation,
    )

    if any(add_cut_flags)
        cut_index = lastindex(node.bellman_function.global_theta.cuts)
        push!(node.ext[:Benders_cuts_original], (cut_index, :all))
    end
end
  3. In Figure 17 of your paper, I observed that the CL-LN-gamma method performs very well, but in the code I could only find the CL-gamma method, which uses an L1 norm constraint, and the plain CL-LN method without the gamma restriction. It seems there is no corresponding CL-LN-gamma. Could you please clarify whether my understanding is correct? Below is the relevant part of the code.

# CL L1 gamma
function add_normalization_constraint!(
    node::SDDP.Node,
    approx_model::JuMP.Model,
    number_of_states::Int,
    normalization_coeff::Union{Nothing,NamedTuple{(:ω, :ω₀),Tuple{Vector{Float64},Float64}}},
    normalization_regime::DynamicSDDiP.ChenLuedtke
)
    @assert :span_variable in keys(JuMP.object_dictionary(approx_model))

    span_variable = approx_model[:span_variable]
    π₀ = approx_model[:π₀]

    # Auxiliary variables modeling |span_variable[i]|
    abs_span_variable = JuMP.@variable(approx_model, abs_span_variable[1:length(span_variable)])
    # ℓ₁-norm constraint over π₀ and the span variables
    JuMP.@constraint(approx_model, π₀ + sum(abs_span_variable[i] for i in 1:length(span_variable)) <= 1)
    JuMP.@constraint(approx_model, abs_span_1[i=1:length(span_variable)], -abs_span_variable[i] <= span_variable[i])
    JuMP.@constraint(approx_model, abs_span_2[i=1:length(span_variable)], abs_span_variable[i] >= span_variable[i])

    return
end
# CL LN
function add_normalization_constraint!(
    node::SDDP.Node,
    approx_model::JuMP.Model,
    number_of_states::Int,
    normalization_coeff::Union{Nothing,NamedTuple{(:ω, :ω₀),Tuple{Vector{Float64},Float64}}},
    normalization_regime::Union{DynamicSDDiP.Core_Midpoint,DynamicSDDiP.Core_Epsilon,DynamicSDDiP.Core_In_Out,DynamicSDDiP.Core_Relint,DynamicSDDiP.Core_Conv},
)
    π⁺ = approx_model[:π⁺]
    π⁻ = approx_model[:π⁻]
    π₀ = approx_model[:π₀]

    ω₀ = normalization_coeff.ω₀
    ω = normalization_coeff.ω

    # LN normalization: ω-weighted constraint on π₀ and the dual variables
    JuMP.@constraint(approx_model, ω₀ * π₀ + sum(ω[i] * (π⁺[i] - π⁻[i]) for i in 1:number_of_states) <= 1)

    return
end
  4. Lastly, I plan to implement one of your algorithms to generate cuts without variable binarization. I might need to re-implement your algorithm in Python. I would like to choose an efficient algorithm, as simple as possible. Could you provide some suggestions?

Thank you for your research, and I wish you good luck in your work.

ChrisFuelOR (Owner) commented Dec 12, 2024

I apologize for disturbing you again, but I am currently encountering some issues and I would appreciate your guidance:

It's not disturbing. Your questions are welcome, and, as I stated already, I am well aware that it is not easy to familiarize yourself with the code, given that no proper documentation exists yet.

  1. Since I intend to use the non-binarization approach for my problem, I would like to know if there are any other adjustments required in the algorithm besides the binary transformation of variables. From my understanding of your code, I did not notice any other adjustments needed, meaning the algorithm should be generally applicable. I would like to confirm whether that is the case.

I am not completely sure if I understand you correctly. To clarify, the project in its current form gives you three options:

  1. Solve a model using SDDiP without any binarization (and using all kinds of different linear cuts, like Benders cuts, strengthened Benders cuts, standard SDDiP Lagrangian cuts, deep Lagrangian cuts, LN Lagrangian cuts). This approach has no convergence guarantees. This is what is done for the CLSP instances in the paper; see examples/CLSP_larger/model_no_bin.jl and starter_no_bin.jl in the code.
  2. Solve a model using SDDiP with static binarization (again using all the above linear cuts). This is what is proposed in the original SDDiP paper. Here, the binarization is not part of the algorithm, so if you intend to do this, you have to include the binarization as part of the model itself. This is what is done for the CLSP-Bin instances in the paper; see examples/CLSP/model_bin.jl and starter_bin.jl in the code.
  3. Solve a model using SDDiP with dynamic binarization. Here, the binarization (based on some user-defined initial parameters) is applied as part of the algorithm and refined if necessary. This results in a lift-and-project cut generation scheme where the value functions are approximated using non-convex cuts. It is theoretically interesting, but quite costly, so it is probably not suited to most problems of realistic size. This approach is used in some other papers of mine.

Note that you can also use combinations of 1.) and 3.). More precisely, for each type of cut that you want to construct in the algorithm, you can use the state_approximation_regime to define whether it should be applied with or without a dynamic state binarization.

If you intend to use the non-binarization approach, no adjustments are required within the algorithm. You just have to make sure that DynamicSDDiP.NoStateApproximation() is used for the state_approximation_regime of your cuts (see the algo_config.jl files for how that is done in the examples, and the configuration sketch further below).

  2. While using the Chen-Luedtke method to generate cuts, I noticed that the Benders cut index is updated at each backward step, but I couldn’t find where exactly the Benders cuts are generated. Could you kindly point me in the right direction? I have also reviewed the Chen-Luedtke code, and if I remember correctly, their strategy generates new Benders cuts at each iteration, so I assume your approach follows this strategy as well. I would appreciate your confirmation.

My approach follows the strategy that each time a Benders cut is computed (alternatively, this is also possible for strengthened Benders cuts), it is stored in a list for the respective node, namely node.ext[:Benders_cuts_original]. If you use the Chen-Luedtke method, the parameters of these cuts are then accessed.

Note, however, that Benders cuts are not generated automatically. So if you want to use the Chen-Luedtke method, you have to make sure that (strengthened) Benders cuts are generated at all. To do so, my code allows you to define several different cut_generation_regimes, which can then be included in the algorithm (again, look at the algo_config.jl files of some examples for illustration, as long as there is no proper documentation).

For instance, you can define a cut_generation_regime_1 by using DynamicSDDiP.LinearDuality() (or DynamicSDDiP.StrengthenedDuality()) as the duality_regime parameter to generate Benders or strengthened Benders cuts. Then you can define a cut_generation_regime_2 using a different duality_regime to generate Lagrangian cuts. And then you can use [cut_generation_regime_1, cut_generation_regime_2] for the parameter cut_generation_regimes in the algo_params struct that you use for running the algorithm. A hedged sketch of this setup follows below.
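To illustrate, here is a hedged configuration sketch. Only LinearDuality, StrengthenedDuality, NoStateApproximation, duality_regime, state_approximation_regime, and cut_generation_regimes are taken from this thread; CutGenerationRegime, LagrangianDuality, and AlgoParams are placeholder names that should be checked against typedefs.jl:

# Regime 1: (strengthened) Benders cuts, without binarization.
cut_generation_regime_1 = DynamicSDDiP.CutGenerationRegime(  # placeholder name
    duality_regime = DynamicSDDiP.LinearDuality(),
    state_approximation_regime = DynamicSDDiP.NoStateApproximation(),
)

# Regime 2: Lagrangian cuts via a different duality regime (placeholder name).
cut_generation_regime_2 = DynamicSDDiP.CutGenerationRegime(
    duality_regime = DynamicSDDiP.LagrangianDuality(),
    state_approximation_regime = DynamicSDDiP.NoStateApproximation(),
)

# Both regimes are passed together to the algorithm parameters.
algo_params = DynamicSDDiP.AlgoParams(  # placeholder name
    cut_generation_regimes = [cut_generation_regime_1, cut_generation_regime_2],
)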

  3. In Figure 17 of your paper, I observed that the CL-LN-gamma method performs very well, but in the code I could only find the CL-gamma method, which uses an L1 norm constraint, and the plain CL-LN method without the gamma restriction. It seems there is no corresponding CL-LN-gamma. Could you please clarify whether my understanding is correct? Below is the relevant part of the code.

Maybe the confusion comes from my implementation of the Chen-Luedtke method being split into several parts.
As mentioned in my answer to the previous question, you first have to make sure that (strengthened) Benders cuts are generated. Second, you have to make sure that, for the cuts you want to generate, the dual_space_regime parameter is set to DynamicSDDiP.BendersSpanSpaceRestriction(K, :multi_cut), with $K$ some natural number. This introduces the dual space restriction constraint from the paper. In principle, it can be combined with any dual normalization.
For the $\gamma$-approach specifically, as you noted, an $\ell_1$-norm constraint is used. However, instead of the original dual variables $\pi_k$, it includes $\pi_0$ and the auxiliary variables associated with the dual space restriction. Therefore, it exactly models the CL-LN-gamma method.

Note that Chen and Luedtke also proposed a more sophisticated approach where an auxiliary MIP is solved. This is not included in my project.
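Putting the two ingredients together, the CL-LN-gamma setup would then look roughly as follows. BendersSpanSpaceRestriction(K, :multi_cut) is quoted above; treating ChenLuedtke as a zero-argument constructor and the choice K = 10 are assumptions:

# Restrict the dual space to the span of stored (strengthened) Benders cuts.
dual_space_regime = DynamicSDDiP.BendersSpanSpaceRestriction(10, :multi_cut)
# ℓ₁-norm normalization over π₀ and the span variables (see the first snippet
# you quoted); combined with the span restriction this models CL-LN-gamma.
normalization_regime = DynamicSDDiP.ChenLuedtke()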

  4. Lastly, I plan to implement one of your algorithms to generate cuts without variable binarization. I might need to re-implement your algorithm in Python. I would like to choose an efficient algorithm, as simple as possible. Could you provide some suggestions?

My suggestion would be to start with a simple implementation of SDDP (or use one that already exists). Then see how far strengthened Benders cuts get you. Then try to add my ideas on LN Lagrangian cuts if improving the lower bounds is important for you. A toy sketch of a strengthened Benders cut follows below.
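To make the "strengthened Benders" step concrete, here is a self-contained toy sketch in Julia with current JuMP syntax (deliberately not the DynamicSDDiP implementation; the model data is made up). The idea: take the duals of the state-fixing constraints from the LP relaxation, then evaluate the Lagrangian at those duals on the MIP to obtain a tighter cut intercept.

using JuMP, HiGHS

x̂ = [0.4, 0.7]  # incoming trial state from the forward pass (toy data)

function build_sub()
    m = Model(HiGHS.Optimizer)
    set_silent(m)
    @variable(m, 0 <= z[1:2] <= 1)   # local copies of the state variables
    @variable(m, y >= 0, Int)        # integer recourse variable
    @constraint(m, y >= 2 - z[1] - z[2])
    @objective(m, Min, 3y)
    return m, z, y
end

# Step 1: Benders duals π from the LP relaxation with states fixed at x̂.
m, z, _ = build_sub()
relax_integrality(m)
fix_con = @constraint(m, [i in 1:2], z[i] == x̂[i])
optimize!(m)
π = dual.(fix_con)

# Step 2: evaluate the Lagrangian at π on the MIP, with the states unfixed
# and their deviation from x̂ penalized in the objective.
m2, z2, y2 = build_sub()
@objective(m2, Min, 3y2 + π' * (x̂ .- z2))
optimize!(m2)
Lπ = objective_value(m2)

# Step 3: θ ≥ L(π) + π'(x - x̂) is a valid cut for any π, and L(π) is at
# least the LP optimal value, so the cut is at least as tight as Benders.
println("cut: θ ≥ ", Lπ, " + ", π, "' * (x - x̂)")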

Importantly, my code is far more sophisticated than what you need to simply replicate the part on the LN cuts. The reason is that my code is very modular: it allows the user to switch between, and even combine, all kinds of different approaches from several of my papers.

Finally, if you are looking to explore Lagrangian cuts in the future (which many people might say is not worth it, because solving the dual problems is too costly and convergence may take far too long to leverage the tightness of Lagrangian cuts), I think a few preprints on computational improvements should be published quite soon. At least, I attended some talks at this year's INFORMS Annual Meeting on some not-yet-published ideas.
