First results #11

Merged: 64 commits, Jun 7, 2024

Commits
52e5e08
mini changes fixed policy scripts
Mar 21, 2024
1bdbeab
refactored hf login
Mar 21, 2024
c4f47d2
typo, expanding opt space
Mar 21, 2024
49bd134
added new hyperpar files
Mar 21, 2024
34187eb
small fixes
Mar 21, 2024
11be3cb
exploring the population dynamics, especially with a higher survival …
Apr 3, 2024
2ed9f7e
updated train script/sb3 train util
Apr 4, 2024
aa38bac
added parallel rl train script
Apr 4, 2024
daf878e
updated rppo yaml
Apr 4, 2024
0509809
updated yaml name
Apr 4, 2024
235b1be
updated relative paths
Apr 4, 2024
7bc68e0
added installation to train bash script
Apr 4, 2024
29b7e8b
wacky behavior in evaluate_policy: now use in-house eval_pol for opti…
Apr 5, 2024
9c42b98
found r_dev bug on reset, fixed it! Plus, playing around with paramet…
Apr 8, 2024
81aefba
added SystemDynamics
Apr 9, 2024
8103a8a
initialize_population a bit leaner
Apr 11, 2024
ff44f4b
added custom harv vul, observe_total, updated notebooks with more com…
Apr 24, 2024
ba789ec
cautionary rule now has two possible observations
Apr 24, 2024
b2c8629
added to pop-dyn tests, CR and Esc agents now admit biomass or mean_w…
Apr 24, 2024
e4554aa
notebooks, deleted legacy debug, attributes, varnames for CR and esc
Apr 25, 2024
e85cf9a
Added get_r_devs_v2 option
May 2, 2024
5008935
added ray simulator
May 2, 2024
8dfc897
now fixed_policy_opt script admits config files for env
May 2, 2024
ed79158
notebooks
May 2, 2024
42b2728
hyperpars
May 2, 2024
40e2cf2
added config file input to fixed_policy_opt, added optional id input
May 2, 2024
d993bf6
hyperpars
May 3, 2024
aeb91fd
hyperparams
May 3, 2024
904ef39
mntCar
May 4, 2024
659fdc4
added a ray train util
May 6, 2024
2aa8bd8
allow custom harvest and vulnerability curves
May 6, 2024
233381b
added asm CR-like
May 7, 2024
1942489
good hyperpars for ppo?
May 7, 2024
bed678d
added cr-like env, added trophy harvesting, messed with hyperpars
May 8, 2024
0b363e5
hyperpars
May 23, 2024
a4fe73d
now n_trophy_ages is an input parameter
May 23, 2024
da45ca0
training script/util handles hf properly now
May 23, 2024
1f24258
simulator not remote now
May 23, 2024
5a0d925
notebooks
May 23, 2024
29ef7a8
plot reproduction notebook
May 23, 2024
3d3fd3b
fixed policy by cases notebook
May 23, 2024
0bfbd3c
hyperpars
May 23, 2024
f1ef728
notebook
May 23, 2024
b3bb3ba
added AsmEnvEsc for escapement actions
May 23, 2024
40ff31c
hyperpars
May 23, 2024
33b0204
notebooks
May 23, 2024
f9b6d1c
AsmEnvEsc bugs
May 23, 2024
ed97060
notebook, AsmEnvEsc.get_mortality method
May 24, 2024
3b11b12
results notebook update
May 24, 2024
5ec60e8
added constant action agent
May 29, 2024
d016962
hyperpars
May 29, 2024
912b973
require scikit-learn specific version for skopt to work currently
May 29, 2024
9c8455d
fixed policy scripts update
May 29, 2024
cdf3200
mwt obs in AsmEnv
May 29, 2024
926c30a
small denominator safety in AsmEnvEsc
May 29, 2024
af28fb1
input files galore
May 29, 2024
f5fa42f
notebooks
May 29, 2024
814b164
longer tuning, do not store obj fun (unserializable)
May 30, 2024
1df123d
New train scripts
May 31, 2024
d471608
hyperpars for results
May 31, 2024
3a54646
New sb3 train util with checkpoint saving
May 31, 2024
9a6b287
added results/figures notebooks
Jun 6, 2024
b293f8d
updated results notebooks
Jun 6, 2024
664965f
updated tests
Jun 6, 2024
7 changes: 7 additions & 0 deletions hyperpars/for_results/fixed_policy_UM1.yml
@@ -0,0 +1,7 @@
config:
  upow: 1
  harvest_fn_name: "default"
  n_eval_episodes: 250
  n_calls: 70
  id: "UM1"
  repo_id: "boettiger-lab/rl4eco"
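These fixed-policy configs drive an optimization over a policy's free parameters: `n_calls` objective evaluations, each averaging reward over `n_eval_episodes` episodes (the commit log notes the `fixed_policy_opt` script relies on skopt). A minimal stdlib sketch of that loop, with the toy quadratic objective, the parameter range, and both function names as assumptions standing in for the real AsmEnv rollouts:

```python
import random
from statistics import mean

def evaluate_policy(theta, n_eval_episodes, seed=0):
    """Stand-in objective: mean episode reward for policy parameter theta.
    The true environment rollout is replaced by a noisy quadratic here."""
    rng = random.Random(seed)
    rewards = [-(theta - 0.3) ** 2 + rng.gauss(0, 0.01)
               for _ in range(n_eval_episodes)]
    return mean(rewards)

def fixed_policy_opt(n_calls, n_eval_episodes):
    """Random-search stand-in for the skopt-based optimizer:
    n_calls evaluations, keep the best parameter seen."""
    best_theta, best_val = None, float("-inf")
    rng = random.Random(42)
    for i in range(n_calls):
        theta = rng.uniform(0.0, 1.0)  # assumed parameter range
        val = evaluate_policy(theta, n_eval_episodes, seed=i)
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta, best_val

theta, val = fixed_policy_opt(n_calls=70, n_eval_episodes=250)
```

With 70 calls and 250-episode averaging, the episode noise is largely averaged out and the best parameter lands near the true optimum; a Gaussian-process optimizer like skopt's would need far fewer calls than random search on harder objectives.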
7 changes: 7 additions & 0 deletions hyperpars/for_results/fixed_policy_UM2.yml
@@ -0,0 +1,7 @@
config:
  upow: 0.6
  harvest_fn_name: "default"
  n_eval_episodes: 250
  n_calls: 70
  id: "UM2"
  repo_id: "boettiger-lab/rl4eco"
8 changes: 8 additions & 0 deletions hyperpars/for_results/fixed_policy_UM3.yml
@@ -0,0 +1,8 @@
config:
  upow: 1
  harvest_fn_name: "trophy"
  n_trophy_ages: 10
  n_eval_episodes: 250
  n_calls: 70
  id: "UM3"
  repo_id: "boettiger-lab/rl4eco"
41 changes: 41 additions & 0 deletions hyperpars/for_results/ppo_biomass_UM1.yml
@@ -0,0 +1,41 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_1o'
  n_observs: 1
  #
  harvest_fn_name: "default"
  upow: 1
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "biomass-UM1-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
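A note on the stringified `policy_kwargs`: stable-baselines3 expects an actual dict, so whatever loads these YAMLs has to evaluate the string (with `torch` importable, per `additional_imports`, for entries that reference it). A hedged sketch of that step; the helper name and the restricted-eval approach are assumptions, not the repo's actual code:

```python
def parse_algo_config(algo_config: dict) -> dict:
    """Return a copy of algo_config with stringified entries evaluated.

    Hypothetical helper: the repo's train util may handle this differently.
    """
    parsed = dict(algo_config)
    pk = parsed.get("policy_kwargs")
    if isinstance(pk, str):
        # The YAML stores a Python expression string; evaluate it in a
        # restricted namespace instead of a bare eval().
        parsed["policy_kwargs"] = eval(pk, {"__builtins__": {}, "dict": dict})
    return parsed

algo_config = {
    "tensorboard_log": "../../../logs",
    "policy": "MlpPolicy",
    "policy_kwargs": "dict(net_arch=[64, 32, 16])",
    "use_sde": True,
}
kwargs = parse_algo_config(algo_config)
# kwargs["policy_kwargs"] is now {"net_arch": [64, 32, 16]}, ready to be
# splatted into the PPO constructor alongside the vectorized env.
```

Storing `policy_kwargs` as a string keeps the YAML plain-text portable; the cost is that the loader must evaluate it, which is why the net-arch variants in these files can be swapped by editing one quoted line.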
41 changes: 41 additions & 0 deletions hyperpars/for_results/ppo_biomass_UM2.yml
@@ -0,0 +1,41 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_1o'
  n_observs: 1
  #
  harvest_fn_name: "default"
  upow: 0.6
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "biomass-UM2-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
42 changes: 42 additions & 0 deletions hyperpars/for_results/ppo_biomass_UM3.yml
@@ -0,0 +1,42 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_1o'
  n_observs: 1
  #
  harvest_fn_name: "trophy"
  n_trophy_ages: 10
  upow: 1
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "biomass-UM3-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
41 changes: 41 additions & 0 deletions hyperpars/for_results/ppo_both_UM1.yml
@@ -0,0 +1,41 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_2o'
  n_observs: 2
  #
  harvest_fn_name: "default"
  upow: 1
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "2obs-UM1-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
41 changes: 41 additions & 0 deletions hyperpars/for_results/ppo_both_UM2.yml
@@ -0,0 +1,41 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_2o'
  n_observs: 2
  #
  harvest_fn_name: "default"
  upow: 0.6
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "2obs-UM2-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
42 changes: 42 additions & 0 deletions hyperpars/for_results/ppo_both_UM3.yml
@@ -0,0 +1,42 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_2o'
  n_observs: 2
  #
  harvest_fn_name: "trophy"
  n_trophy_ages: 10
  upow: 1
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "2obs-UM3-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
41 changes: 41 additions & 0 deletions hyperpars/for_results/ppo_mwt_UM1.yml
@@ -0,0 +1,41 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_mwt'
  n_observs: 1
  #
  harvest_fn_name: "default"
  upow: 1
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "mwt-UM1-64-32-16"
# id: "short-test"
additional_imports: ["torch"]
41 changes: 41 additions & 0 deletions hyperpars/for_results/ppo_mwt_UM2.yml
@@ -0,0 +1,41 @@
# algo
algo: "PPO"
total_timesteps: 6000000
algo_config:
  tensorboard_log: "../../../logs"
  #
  policy: 'MlpPolicy'
  # learning_rate: 0.00015
  policy_kwargs: "dict(net_arch=[64, 32, 16])"
  #
  # batch_size: 512
  # gamma: 0.9999
  # learning_rate: !!float 7.77e-05
  # ent_coef: 0.00429
  # clip_range: 0.1
  # gae_lambda: 0.9
  # max_grad_norm: 5
  # vf_coef: 0.19
  # policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
  # policy_kwargs: "dict(net_arch=[256, 128])"
  use_sde: True
  # clip_range: 0.1

# env
env_id: "AsmEnv"
config:
  observation_fn_id: 'observe_mwt'
  n_observs: 1
  #
  harvest_fn_name: "default"
  upow: 0.6
  n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"

# misc
id: "mwt-UM2-64-32-16"
# id: "short-test"
additional_imports: ["torch"]