Integration of Continual Learning tasks and algorithm (WIP) #45
Draft
kalifou wants to merge 171 commits into araffin:master from GaspardQin:circular_movement_omnibot
967e4ac
Added targets files
Caselles cbffb3b
new tasks for continual learning: random target, circular and square …
kalifou 8949baa
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
Caselles 8f23671
adding args for learning the CL tasks
kalifou 6750919
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
Caselles daff917
collect CL args for replay
kalifou ad352c9
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
Caselles bc21db4
WIP on continual tasks
Caselles c68baeb
Continual tasks: added vizu and solved a few bugs
Caselles a1d4da9
Solved bug on history not getting emptied between episodes
Caselles bf9c6c9
add penality for bumping
kalifou 7de393d
coeff for circular task
kalifou edd82fa
fix reward shaping with the product operator
kalifou ed6923c
adding new task - eight shape (draft)
kalifou 1920de4
On-Policy dataset-generator
kalifou e6bdde1
add small fix
kalifou 93ea793
Merge branch 'master' of https://github.com/GaspardQin/robotics-rl-sr…
TLESORT a5b6623
Generative Replay for Dataset generation
kalifou b976aa8
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
TLESORT 71f29fa
fix to on-policy generation for srl based policies
kalifou 413c447
fix to loading args for replay
kalifou e99f6f2
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
TLESORT 6082a68
first steps towards policy distillation
TLESORT a6dbfe8
small fix (init OmniRobotManagerBase)
kalifou bc224b8
cleaning pkgs imports
kalifou bcb12db
clean up & loading srl model in distillation script
kalifou 4fb1e7b
cross-evaluation
sun-te deaf450
cross-evaluation
sun-te ca5df77
read-me update
sun-te c2d0405
plot results
sun-te 89aa184
cross evaluation and comparison plot
sun-te 7e07c29
On-policy generation: Fix to save action proba
kalifou c05bf7d
draft: Policy distillation
kalifou a612571
loss update
kalifou 94aba32
bug-fix for pipeline cross
sun-te c69e558
format
kalifou a352a77
format and update data loader in submodule (srl_zoo)
kalifou 3a7de3a
Merge branch 'dev_distillation' of https://github.com/GaspardQin/robo…
kalifou 3d133ab
fix for off-policy generation
kalifou b797657
distillation: handling ts size & saving model
kalifou e446489
fix for replay
kalifou 9de7d58
additional fix for replay
kalifou 9e8bfb8
change loss for distillation for mse, it seems to work better
TLESORT b19da9f
update Distillation loss: swith to MSE
kalifou 1bb20a5
format, fix args safety & load
kalifou 9f64156
remove useless script, fix distillation with raw pixels & dataset fus…
kalifou 39c5d22
fix for distillation using RL from raw_pixels
kalifou dd1fdbe
Merge branch 'dev_distillation' into circular_movement_omnibot
kalifou 1edda89
cross evaluation during training
sun-te 9cace21
fix for distillation from raw_pixels
kalifou dd8cba7
fix data fusioner
kalifou 6e865aa
Option for generating shorter episodes (SC) and fix
kalifou 8797884
command for distillation readme updated
TLESORT 911a7e0
fix merged conflict, i have put MSE for distillation loss
TLESORT 159947f
command for distillation readme updated
TLESORT 8ddd317
loss MSE for distillation
Caselles 570251d
Update Readme for distillation
kalifou db434a5
Added KL loss for distilaltion
Caselles be399f7
change cpu_number to 6
TLESORT 1be664a
change cpu_number to 6
TLESORT 85b588f
file added to be able to run all experiments at once
TLESORT cb160b7
Update readme
sun-te 2574888
add args for replay when loading specific env task (Omnirobot)
kalifou ca757a8
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
kalifou 487df1b
fix command for random dataset generator
TLESORT 3687a75
Added sample flag for getAction and fixed KL loss for distillation.
Caselles 44730bc
fix command for train SRL and added a force flag for dataset generator
TLESORT 2876f53
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
TLESORT a3a0784
update datafusionner to log task labels to each obs
kalifou 059a190
update option for shorter eps (CC)
kalifou 1a42fdd
update: add option in distillation loss for temperature depending on …
kalifou 62bb031
fix adaptive loss
kalifou 2a6c0db
update safety
kalifou f3b21a8
distillation readme fully tested, normally everything is written in i…
TLESORT 74fe983
default temperature changed to 0.1
TLESORT 621cbaf
Merge branch 'circular_movement_omnibot' into cross-task
kalifou cfa068f
merge: circular_... into cross_task
kalifou f6a5d72
short episode flag added into dataset_generator for cc
TLESORT 3cbd093
cross evaluation for model trained by srl
sun-te 7e39645
Update Distilation_Readme.md
sun-te 619cd6b
Update Distilation_Readme.md
sun-te eee27f4
cross eval
sun-te d3f799e
Merge branch 'cross-task' of https://github.com/GaspardQin/robotics-r…
sun-te 34fff5d
fix MLP Policy for distillation
kalifou cc6c9d4
Merge branch 'circular_movement_omnibot' into cross-task
kalifou ef81461
add automatic creation of save_path if the folder does not exist
TLESORT a90681b
add latest past possible (to use carefully)
TLESORT 4ab3c49
add latest flag
TLESORT 6fd19d1
normalisation of reward
sun-te 6b37395
update and fix readme with comments
TLESORT a776b9a
Merge branch 'cross-task' of https://github.com/GaspardQin/robotics-r…
sun-te bca7eaf
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
TLESORT 919fc9b
first version of script to run all in once
TLESORT 66b6bdf
small fix, NB : starting from a clean repos is recommended to run the…
TLESORT be7b779
tested run in once script
TLESORT 162d96a
dry run file for end to end testing
TLESORT 95e02be
name's folder have been parametrize to easely change path
TLESORT 77b1cb0
evaluation for student policy(TODO)
sun-te f586d0c
Added scripts for raw pixels
Caselles 6fdba9e
grid walking policy (draft)
kalifou adf48fb
grid walker for exploration in on-policy data generation
kalifou f53dbc5
Option for finetuning of SRL while distilling
kalifou 4cdb4f4
cross_eval and student eval
sun-te c8af536
update: student policy distillation and eval while learning a teacher…
kalifou e561b93
update of distillation eval
kalifou 35286ef
update eval distillation
kalifou 4c6bdfd
Adjust episode window for checkpoints when saving a teacher
kalifou 324aa20
fix default value
kalifou db3fb30
Merge branch 'circular_movement_omnibot' into cross-task
kalifou a09cb48
update distillation eval
kalifou 8afa433
add fix (copy merge to proper loc)
kalifou 03091b1
Plot for cross evaluation
sun-te eacc954
eval of student from single source
kalifou 1e2ec91
corss eval after training
sun-te bca6200
comment and evaluation bugs fixed
sun-te dcea7b9
cross plot
sun-te 0264c31
merge updates from master & cross-task into current branch
kalifou 3d0f79d
clean and test for distillation(draft)
kalifou fd08dd7
corss eval and dataset generator, test_eval
sun-te bac57ce
dataset generator update
sun-te 7fe104c
merge cross-task branch into current
kalifou 80c01ae
update tests for distillation
kalifou 9eec0f5
reduce distillation config files
kalifou a85942d
fix generator for cross env compatibility
kalifou bbe9c44
fix for on-policy data-gen: normalizing obs
kalifou b959e43
small fixes & cleaning: data-gen, distillation logs
kalifou 3874111
adapt generative replay for on-policy data generation
kalifou 51694e0
args.log_dir fix when latestPath is used
TLESORT 9daf762
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
TLESORT e1b72bd
more informative print
TLESORT 2b9dcef
fix merge
TLESORT 8982d7c
Merge branch 'circular_movement_omnibot' of https://github.com/Gaspar…
TLESORT 299a374
escape task
sun-te 86c7a73
update for escape evaluation
sun-te 6ec2e0a
ground_truth moification
sun-te ea564a2
modify readme.md in policy distillation
8ae7d9b
note in dataset_generator
4646f16
update supervised_rl/reade.md
9490fa8
fix distillation at checkpoints for CC (TC)
kalifou cc4e30c
Merge pull request #4 from saybunthet/circular_movement_omnibot
kalifou 208d472
cleaning
kalifou af944ee
Merge branch 'circular_movement_omnibot' into escape_dev
kalifou 486ea7e
fixed orientation for the chasing agent
sun-te 12cd0c0
bug fixed
sun-te 77f7267
target position update
sun-te 7f2ce56
fix merger in case of distillation
kalifou fe09e49
reward update
sun-te b8899f9
Merge branch 'escape_dev' of https://github.com/GaspardQin/robotics-r…
sun-te ed26f2e
reward can be float for circular task and escaping task
sun-te 83bed1e
a new dataset merger for the balanced timesteps settings during the m…
sun-te d8fdb05
Merge pull request #8 from GaspardQin/circular_movement_omnibot_data_…
sun-te 8701bb0
Revert "reward can be float for circular task and escaping task"
sun-te fbde89d
dataset manager
sun-te 52e4730
separator
sun-te 99d3039
data separator
sun-te ce2d7e4
separator
sun-te f6c0f03
sparser dataset
sun-te d45a369
resampling of data
sun-te e622e1f
float reward data merger
sun-te e823cbb
Merge pull request #9 from GaspardQin/revert-8-circular_movement_omni…
sun-te 6c29da3
separator
sun-te a1d2639
dataset_merger can preserve the original dataset for further use
sun-te bf50fc5
cleaning
sun-te 24b16be
learning
sun-te 6e2272c
preserve original data after merge
sun-te d10e9b6
resampling for the distillation
sun-te 11586d1
cleaning
sun-te 604b1ac
Merge pull request #11 from GaspardQin/escape_dev
sun-te d4c2bd9
test4esc&clearning
sun-te 9527b00
Delete delete_val.ipynb
sun-te e2f4c76
Update environment.yml
sun-te
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
|
||
OmnirobotEnv-v0: | ||
# Base path to SRL log folder | ||
# log_folder: srl_zoo/logs/Omnibot_random_simple/ | ||
log_folder: srl_zoo/logs/Omnibot_circular/ | ||
autoencoder: 19-02-04_23h27_22_custom_cnn_ST_DIM200_autoencoder_reward_inverse_forward/srl_model.pth | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
|
||
OmnirobotEnv-v0: | ||
# Base path to SRL log folder | ||
# log_folder: srl_zoo/logs/escape_agent/ | ||
log_folder: srl_zoo/logs/escape_agent/ | ||
autoencoder: 19-02-04_23h27_22_custom_cnn_ST_DIM200_autoencoder_reward_inverse_forward/srl_model.pth | ||
srl_combination: 19-06-03_18h38_59_custom_cnn_ST_DIM200_autoencoder_inverse/srl_model.pth | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
|
||
OmnirobotEnv-v0: | ||
# Base path to SRL log folder | ||
# log_folder: srl_zoo/logs/Omnibot_random_simple/ | ||
log_folder: srl_zoo/logs/merge_CC_SC/ | ||
autoencoder: 19-02-04_23h27_22_custom_cnn_ST_DIM200_autoencoder_reward_inverse_forward/srl_model.pth | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
|
||
OmnirobotEnv-v0: | ||
# Base path to SRL log folder | ||
# log_folder: srl_zoo/logs/Omnibot_random_simple/ | ||
log_folder: srl_zoo/logs/Omnibot_random_simple/ | ||
autoencoder: 19-02-04_23h27_22_custom_cnn_ST_DIM200_autoencoder_reward_inverse_forward/srl_model.pth | ||
|
||
|
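The config entries above each pair a base `log_folder` with a model path that appears to be relative to it. A minimal sketch of how such an entry might be resolved into a full model path, assuming that convention holds; the helper `resolve_srl_model_path` and the in-memory `config` dict are hypothetical and are not taken from this PR:

```python
import os

# Hypothetical in-memory copy of one config entry from the diff above.
config = {
    "OmnirobotEnv-v0": {
        "log_folder": "srl_zoo/logs/Omnibot_circular/",
        "autoencoder": (
            "19-02-04_23h27_22_custom_cnn_ST_DIM200_"
            "autoencoder_reward_inverse_forward/srl_model.pth"
        ),
    }
}


def resolve_srl_model_path(config, env_name, model_key):
    """Join the environment's base log folder with a relative model path.

    Assumption: model entries are relative to `log_folder`.
    """
    env_cfg = config[env_name]
    return os.path.join(env_cfg["log_folder"], env_cfg[model_key])


path = resolve_srl_model_path(config, "OmnirobotEnv-v0", "autoencoder")
print(path)
```

Under this reading, switching tasks (for example to `srl_zoo/logs/escape_agent/`) would only require changing `log_folder`, while the model sub-path stays the same.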
If I recall correctly, we would only need libz-dev as an Atari requirement, since cmake is already part of the stable-baselines install guidelines.