Modifying the workflow to enable the use of different methods (LHS, Sobol) allowing for scenario discovery applications #13

AgnesBelt · 2024-04-15T15:49:25Z

Hi @willu47, I have started modifying the workflow to offer a choice between different methods for generating the model runs, so that I can opt for LHS or Sobol when doing Scenario Discovery.
I am sure there are plenty of mistakes, but I have tried the following approach:

Modify the config file to allow selecting a sampling method other than Morris.
Modify the snakemake file to take the chosen method into account and adjust the folder structure to the number of model runs expected with each method.
Create new create_sample.py scripts, one for LHS and one for Sobol, to generate the input data files needed for the runs.
Modify the sample.smk file to add an if statement that redirects the create_sample rule to the script for the method chosen in the config file.
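
The second step depends on how many model runs each method produces. A minimal sketch of that mapping, assuming the usual run counts for each design: the helper `count_modelruns` and its arguments are hypothetical and not taken from the workflow, and for Sobol `replicates` is read here as the base sample size N itself.

```python
def count_modelruns(method, replicates, n_params):
    """Return how many model runs a sampling method is expected to produce.

    Hypothetical helper: `replicates` is read as the number of trajectories
    (Morris), the sample size (LHS) or the base sample size N (Sobol).
    """
    if method == "Morris":
        # each trajectory has a starting point plus one step per parameter
        return replicates * (n_params + 1)
    if method == "LHS":
        # one row per requested sample
        return replicates
    if method == "Sobol":
        # first- and total-order design: N * (D + 2) rows
        return replicates * (n_params + 2)
    raise ValueError(f"Unknown sampling method: {method!r}")


# The folder structure can then be derived from the run indices
MODELRUNS = range(count_modelruns("Sobol", replicates=8, n_params=3))
print(len(MODELRUNS))  # 40
```

Keeping the counts in one function means the folder structure and the sample file cannot drift apart when a new method is added.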

Would be great to look into it together at our next meeting.

AgnesBelt · 2024-04-17T15:20:14Z

Thanks to support from @HauHe, @willu47, and @camiloramirezgo, we seem to have fixed the bugs I was encountering, and the workflow should now be able to run with any of the sampling methods (Morris, LHS, Sobol) and produce results.

This should also allow the esom_gsa workflow to be used for scenario discovery analysis.

Still pending: after all scenarios have been computed and the results generated, the workflow proceeds to compute a sensitivity analysis, which fails when LHS or Sobol is selected as the method.
If I am not mistaken, this step could be skipped for SD applications with the LHS and Sobol methods, so I might still look into it to clean up the process.

AgnesBelt · 2024-04-17T15:40:03Z

Current issue still pending:

AgnesBelt · 2024-04-18T13:35:34Z

I have now also added the option to specify a start_year for the interpolation, so that interpolation can start from a year in the middle of the modelling period. This is of use for the GLUCOSE SD application, but might also be useful for other cases.
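
One way to picture the start_year option, as a rough sketch only: values up to start_year keep their baseline, and from there the series is interpolated linearly to a sampled end-of-horizon value. The function `interpolate_from`, its arguments, and the linear scheme are assumptions for illustration; create_modelrun.py may do this differently.

```python
def interpolate_from(years, baseline, start_year, end_value):
    """Keep baseline values up to start_year, then interpolate linearly.

    Hypothetical helper: `baseline` maps each year to its unperturbed value
    and `end_value` is the sampled value for the last model year.
    """
    # config values may arrive as str or float, so coerce to a plain int
    start_year = int(float(start_year))
    if start_year not in years:
        raise ValueError(f"start_year {start_year} is not a model year")
    end_year = years[-1]
    start_value = baseline[start_year]
    result = {}
    for year in years:
        if year <= start_year:
            # before and at the start year nothing changes
            result[year] = baseline[year]
        else:
            fraction = (year - start_year) / (end_year - start_year)
            result[year] = start_value + fraction * (end_value - start_value)
    return result


years = [2020, 2025, 2030, 2035, 2040]
baseline = {year: 10.0 for year in years}
print(interpolate_from(years, baseline, "2030", 20.0))
```

Coercing start_year up front is one way to avoid type problems when the value is read from a YAML or CSV file.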

willu47

I'd suggest combining the create_sample_*.py scripts into one, with one function per method, and then passing in the method; otherwise there is a fair amount of duplicate code and three files instead of one.
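
A minimal sketch of what such a combined script could look like, assuming a registry that maps the method name to a sampler function. All names here (`sample_lhs`, `SAMPLERS`, `create_sample`) are hypothetical, and the LHS routine is a plain standard-library stand-in rather than the code in create_sample_LHS.py.

```python
import random


def sample_lhs(n_params, replicates, seed=None):
    """Latin hypercube sample on the unit cube, one row per model run."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_params):
        # one draw inside each of `replicates` equal-width strata, then shuffle
        column = [(i + rng.random()) / replicates for i in range(replicates)]
        rng.shuffle(column)
        columns.append(column)
    # transpose so that each row holds one value per parameter
    return [list(row) for row in zip(*columns)]


# Registry of samplers. Morris and Sobol entries would be registered the same
# way, each wrapping the calls the existing create_sample_*.py scripts make.
SAMPLERS = {"LHS": sample_lhs}


def create_sample(method, n_params, replicates, seed=None):
    """Dispatch to the sampler registered for `method`."""
    try:
        sampler = SAMPLERS[method]
    except KeyError:
        raise ValueError(f"Unknown sampling method: {method!r}") from None
    return sampler(n_params, replicates, seed=seed)


rows = create_sample("LHS", n_params=3, replicates=5, seed=1)
print(len(rows), len(rows[0]))  # 5 3
```

With this layout the Snakemake rule only needs to pass the method from the config file, and the shared file reading and writing lives in one place.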

I found a few issues:

len(PARAMETERS) does not work as intended (the wrong number of model runs is created when using Sobol)
I suggest using 2**replicates rather than replicates**2, as only the former is guaranteed to be a power of 2
a non-integer start_year was causing problems in the create_modelrun.py script
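
The second point can be checked directly. The short sketch below compares the two expressions for a few values of replicates; `is_power_of_two` is a helper introduced here for illustration only.

```python
def is_power_of_two(n):
    """True if n is a positive integer power of 2 (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


# Columns: replicates, replicates**2, power of 2?, 2**replicates, power of 2?
for replicates in range(1, 7):
    squared = replicates ** 2
    doubled = 2 ** replicates
    print(replicates, squared, is_power_of_two(squared),
          doubled, is_power_of_two(doubled))
```

replicates**2 is a power of 2 only when replicates itself is one, whereas 2**replicates always is.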

willu47 · 2024-04-30T13:49:41Z

config/config.yaml

@@ -2,25 +2,32 @@

 # Populate the scenarios.csv file with a list of scenario names
 # and path (description optional) to the model csv data
-scenarios: config/scenarios.csv
+scenarios: resources/glucose/scenarios_glucose.csv


Don't modify this, i.e. revert to config/scenarios.csv

willu47 · 2024-04-30T13:50:10Z

config/config.yaml


 # Tell the workflow which model results to plot
-results: config/results.csv
+results: resources/glucose/results.csv


Revert to config/results.csv

willu47 · 2024-04-30T13:50:20Z

config/config.yaml


 # Filetype options: 'csv' or 'parquet' or 'feather'
 filetype: csv

 # Define the uncertain parameters used to define the Monte Carlo sample
-parameters: config/parameters.csv
+parameters: resources/glucose/parameters_glucose.csv


Revert to config/parameters.csv

willu47 · 2024-04-30T13:50:38Z

config/config.yaml


 # Path to the OSeMOSYS model file
-model_file: resources/osemosys_fast.txt
+model_file: resources/glucose/osemosys_fast.txt


Revert to resources/osemosys_fast.txt

willu47 · 2024-04-30T13:51:57Z

workflow/Snakefile

+elif METHOD == 'LHS':
+    MODELRUNS = range(config['replicates'])
+elif METHOD == 'Sobol':
+    MODELRUNS = range((config['replicates']**2) * (len(PARAMETERS) + 2))


len(PARAMETERS) is always 1, you need:

PARAMETERS = pd.read_csv(config['parameters'])['name'].to_numpy()

and

MODELRUNS = range((2 ** REPLICATES) * (PARAMETERS.shape[0] + 2))

AgnesBelt · 2024-05-02

Addressed partially. I did not see the reason for using PARAMETERS.shape[0], as PARAMETERS has just one dimension. I also kept the PARAMETERS definition linked to 'indexes'; is there any specific reason why I should switch to 'name'?

willu47 · 2024-05-02

I guess the fastest way would be PARAMETERS = len(pd.read_csv(config['parameters']))

willu47 · 2024-05-02

One other issue is that this ignores grouping of parameters...
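
To illustrate the grouping point, here is a hedged sketch that counts distinct groups instead of rows when sizing a Sobol design. The function `count_sobol_runs`, the `group` column name, and the example file contents are assumptions for illustration; the N * (D + 2) count follows the formula used in the Snakefile above.

```python
import csv
import io


def count_sobol_runs(parameters_csv, base_samples, group_column="group"):
    """Count Sobol model runs as N * (D + 2), where D counts groups, not rows."""
    rows = list(csv.DictReader(parameters_csv))
    if rows and group_column in rows[0]:
        # parameters sharing a group label are varied together, so count once
        n_inputs = len({row[group_column] for row in rows})
    else:
        # no grouping information: every row is its own input
        n_inputs = len(rows)
    return base_samples * (n_inputs + 2)


# Made-up parameters file: two of the three rows share the group "capital"
example = io.StringIO(
    "name,group\n"
    "CapitalCostWind,capital\n"
    "CapitalCostSolar,capital\n"
    "DiscountRate,discount\n"
)
print(count_sobol_runs(example, base_samples=2 ** 3))  # 32
```

Counting rows would give 8 * (3 + 2) = 40 here, so the number of modelrun folders would not match the sample if the sampler honours groups.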

AgnesBelt added 5 commits April 15, 2024 17:25

adding options to select different sampling method for scenario discovery applicaiton - e.g. LHS or Sobol

8e342fd

modifying snakemake file to enable the use of different sampling methods for scenario discovery applications - depending on the method selected, the folder structure will be generated according to the number of scenarios generated

c25f6dd

changes in the sample.smk file to select different create_sample.py files depending on the sampling method of interest, to allow for scenario discovery applications

a532005

adding different scripts for generating samples based on different methods - i.e. LHS, Sobol respectively. Need for testing if they are defined correctly

5f09fdc

changes in the snakemake file for calculating nr of model runs for LHS and Sobol methods

b1ba801

AgnesBelt requested a review from willu47 April 15, 2024 15:49

AgnesBelt added 14 commits April 17, 2024 10:24

fixing calculation of model runs for Sobol method, N= replicates^2

830941e

fixing the syntax for rule create_sample to allow for the use of different sample methods

8e0cbdd

modifying N definition for Sobol sample, N=replicates^2

27e121a

changing name for sample.txt file to be non-Morris specific

15b7006

started adding some documentation to the create_sample_LHS.py script

ca87810

changed from morris_sample.txt to generic sample.txt in all inputs to keep consistency with generic naming

43ace18

adding some documentation in the config file

2c995f6

testing some minor changes in the calculations of number of model runs for Sobol

d8dacfc

changing from morris_sample.txt to generic sample.txt

277039a

changing from morris_sample.txt to generic sample.txt for use with different methods than Morris

445704e

testing out some options to fix bug: current issue is that the modelruns/0 folder does not get created and the sample.txt file cannot be saved

bad0398

remove reference to morris_sample (replace with simple 'sample') for consistency with other methods options

287031a

adding fi at the end of the if/then option under create_sample rule to allow for switching between different sampling methods

1b987cb

removed test command that are not needed anymore

955c9f5

AgnesBelt added 4 commits April 18, 2024 11:23

adding option to specify the starting year of interpolation for the uncertain parameters under analysis

19e6188

modifying the script to generate the model run input data files to allow for using the user-specific choice of starting year of interpolation, as indicated in the config file under start_year

e2c169b

adjusting the osemosys.smk rule to read in the user-specific start_year of interpolation and pass it on to the create_modelrun.py script

da0e419

fixing bug on create_modelrun.py to generate the model run input data files to allow for using the user-specific choice of starting year of interpolation, as indicated in the config file under start_year this is now working also for parameters with no interpolation_index

f36fa4d

AgnesBelt added 3 commits April 18, 2024 15:51

correcting message typo under calculate_SA_objective rule

2d594a1

fixing bug: changing input type from 'int' to 'int64' to allow reading in also the MODE_OF_OPERATION and add it to typed_index

23aad32

removed the need for adding a YEAR into the parameters indexes for cases with no interpolation_index

f28f090

testing new changes also for Sobol method

1d75c6e

willu47 reviewed Apr 30, 2024

AgnesBelt added 8 commits May 2, 2024 11:05

changing the deifnition of PARAMETERS as to take in all values input in the parameters.csv file

ab0a6f7

addressing minor issues as identified by Will's comments

14ec3a3

updated instruction for Sobol sampling on the config.yaml file

f550b28

adding conda environments requisites to run the workflow

a7005bd

adding last changes to allow for definnying end year of interpolation in the config file

71561a1

adding GLUCOSE-related files

8d5de88

editing back the function apply_interploted_values to allow for uncertainty parameters for variables with no yearly values

249a1ca

latest settingd of the config file used for the runs on Nov 26, 2024

ef6a960
