[WIP] add folder for active loop experiment #100

Draft
wants to merge 147 commits into
base: main
f39bbe8
add folder for active loop experiment
ekellbuch Jun 30, 2023
dad3996
add notes from brainstorming session.
ekellbuch Jun 30, 2023
53a714b
add notes from brainstorming session with sample folder structure.
ekellbuch Jun 30, 2023
67a85c4
add loop to test on eval files
ekellbuch Jul 5, 2023
5b23846
add active loop functions and demo files
ekellbuch Jul 5, 2023
2440bc9
update readme
ekellbuch Jul 5, 2023
219f423
update default calls to iteration 1 in active loop codebase
ekellbuch Jul 5, 2023
d245373
remove unnecessary code
ekellbuch Jul 6, 2023
b74a412
run codebase
ekellbuch Jul 6, 2023
7249ca6
add tags to wandb call
ekellbuch Jul 6, 2023
e05f8fc
minor spacing change
ekellbuch Jul 6, 2023
ae6edd6
add fast_dev_run options
ekellbuch Jul 6, 2023
c6cf6e1
update active run id
ekellbuch Jul 6, 2023
e1ec698
add manual_step to predict_dataset to make computations on heatmap
ekellbuch Jul 10, 2023
15eaa30
update readme to include random baseline
HaotianXiangsti Jul 10, 2023
40646e2
Add config for Haotian Device
HaotianXiangsti Jul 11, 2023
487d296
Remove heat_map=True in trainer.predict, as it doesn't have this arg
HaotianXiangsti Jul 11, 2023
189e6c7
manual_step for export_predictions_and_labeled_video function
HaotianXiangsti Jul 11, 2023
e4f9643
change manual_step: Optional[str] to manual_step: Optional[bool]
HaotianXiangsti Jul 11, 2023
a6ae1fd
updat readme to include project baselines
ekellbuch Jul 13, 2023
acb60a7
add uncertainty baseline
ekellbuch Jul 13, 2023
b8abb14
demo ensemble
ekellbuch Jul 13, 2023
16fc98e
demo ensemble yaml
ekellbuch Jul 13, 2023
fbc4479
bash script to train many models
ekellbuch Jul 13, 2023
6d90791
Prediction file for heatmap
HaotianXiangsti Jul 16, 2023
3fec5d8
Merge branch 'active_loop' of https://github.com/ekellbuch/lightning-…
HaotianXiangsti Jul 16, 2023
10a52e2
Add functions for three active_learning methods
HaotianXiangsti Jul 17, 2023
775b979
All function needs argments only from configuration
HaotianXiangsti Jul 17, 2023
2d319c7
Delete code to save csv in local runtime (csv only for code test)
HaotianXiangsti Jul 17, 2023
5a568c1
neat for heatmap prediction
HaotianXiangsti Jul 17, 2023
7196995
add low energy randomlization
HaotianXiangsti Jul 17, 2023
ab0c6a5
Add arg for Ensembling
HaotianXiangsti Jul 18, 2023
c4cf848
Add parent path for previous output path
HaotianXiangsti Jul 18, 2023
239925e
Add arg for random and use before assign arg
HaotianXiangsti Jul 18, 2023
6f0187c
Use break to stop the for loop before exceeding the end iteration
HaotianXiangsti Jul 18, 2023
e9850e7
Uncomment the line for output_prev_run
HaotianXiangsti Jul 19, 2023
7a3ce1f
Add line to get current iteration number from active ymal
HaotianXiangsti Jul 19, 2023
f9e0b3e
I made a mistake and delete the line to get current itration number
HaotianXiangsti Jul 19, 2023
683decd
add ibl-paw yaml for Haotian
HaotianXiangsti Jul 19, 2023
2424a89
add mirror-mouse yaml for Haotian
HaotianXiangsti Jul 19, 2023
28121d7
Add ensemble func and let model run in iteration
HaotianXiangsti Jul 19, 2023
b1cdf9b
Add code in hydra for additional ensemble
HaotianXiangsti Jul 19, 2023
e1a0307
Change config for active-learning
HaotianXiangsti Jul 19, 2023
3405134
Add lines for non-ensemble method
HaotianXiangsti Jul 19, 2023
2aebe30
Fix the error that model cannot every iteration select 10 frames
HaotianXiangsti Jul 19, 2023
89309a2
Add oneline to check wether use_ensemble is True
HaotianXiangsti Jul 19, 2023
0fd871a
Rename the filename of files in every active learning iteration.
HaotianXiangsti Jul 19, 2023
ca71c53
Add current interation into file name
HaotianXiangsti Jul 19, 2023
87f2272
correct the dir for training file
HaotianXiangsti Jul 20, 2023
efcc927
add a flag for ensemble
HaotianXiangsti Jul 20, 2023
498b98f
add a flag for ensemble
HaotianXiangsti Jul 20, 2023
dc5cf94
add a flag for ensemble
HaotianXiangsti Jul 20, 2023
cd67120
add a flag in call_active_loop
HaotianXiangsti Jul 20, 2023
ab29a30
add function for margin sampling
HaotianXiangsti Jul 20, 2023
55a1b16
Add line for randomly read wrong csv file
HaotianXiangsti Jul 20, 2023
7cfacae
Add line for typecheck for margin sampling
HaotianXiangsti Jul 20, 2023
5283b91
finished margin sampling
HaotianXiangsti Jul 20, 2023
4f4b47c
Add code for cp active_test csv for every ietration
HaotianXiangsti Jul 20, 2023
c194958
add ensemble pipeline
ekellbuch Jul 20, 2023
89e554f
add dir for test_video_dir
HaotianXiangsti Jul 21, 2023
9ec4a97
change iterations_folder to iterations_folders
HaotianXiangsti Jul 21, 2023
22da470
change iterations_folder to iterations_folders in active yaml
HaotianXiangsti Jul 21, 2023
6160504
Add lines to calculate heatmap in predictions
HaotianXiangsti Jul 24, 2023
b8221f6
Add Margin Sampling and Single PCA Loss
HaotianXiangsti Jul 25, 2023
e38d541
Change dir to fit Haotian's Device
HaotianXiangsti Jul 25, 2023
29876d8
Change dir to fit Haotian's Device on active yaml
HaotianXiangsti Jul 25, 2023
3a6c54d
Add code to suit Haotian’s Device on Train_hydra
HaotianXiangsti Jul 25, 2023
2d43211
Add plt package for hist plot
HaotianXiangsti Jul 25, 2023
3a1462e
Add lines for Ensemble Method to working properly
HaotianXiangsti Jul 25, 2023
e9938f9
Add code to plot hist for active_test dataset
HaotianXiangsti Jul 26, 2023
7c99300
Add code to plot hist for active_test dataset on active_loop
HaotianXiangsti Jul 26, 2023
84bf09e
add maual_step flag for predict dataset on active test dataset
HaotianXiangsti Jul 26, 2023
cc1e0ae
Add lines for pixle error plot for EqualVariance
HaotianXiangsti Jul 27, 2023
c749035
Add simple rotation img aug method
HaotianXiangsti Jul 27, 2023
511cd4e
Add lines for pixle error plot for EqualVariance
HaotianXiangsti Jul 27, 2023
34b92e0
Add active method for Equal Variance
HaotianXiangsti Jul 27, 2023
5660eaa
Add function to caculate cosine similarity
HaotianXiangsti Jul 27, 2023
069277c
Import that cosine calculating function in trainhydra
HaotianXiangsti Jul 27, 2023
ce95232
Package EqualVariance Calculation Method in a function in prediction.py
HaotianXiangsti Jul 27, 2023
a6bfb4a
import equalvariance function from prediction.py
HaotianXiangsti Jul 27, 2023
e09d6d6
change naming method for active_test_equalvariance.csv
HaotianXiangsti Jul 27, 2023
0d3308f
Add comments on var_total.reshape(-1,12) which may cause issue when u…
HaotianXiangsti Jul 27, 2023
fc18545
add uncertainity sampling active method
HaotianXiangsti Jul 27, 2023
553cdb8
Change lines to select 10 smallest likelyhood in the unlabeled pool
HaotianXiangsti Jul 27, 2023
8d5bace
rename folder to avoid conflicts with newer pl version
ekellbuch Jul 28, 2023
08b8b6c
add test code for full pipeline
ekellbuch Jul 28, 2023
f542931
add todo based on list discussion
ekellbuch Jul 28, 2023
5935bae
Add a if loop to suit predition as a list
HaotianXiangsti Aug 3, 2023
7d30103
Add lines to bypass active test dataset
HaotianXiangsti Aug 3, 2023
2959e48
Change active_loop to active_pipeline to suit current folder name
HaotianXiangsti Aug 5, 2023
e4c0b97
Add function to select frames from each video
HaotianXiangsti Aug 7, 2023
364f8f3
Merge branch 'active_loop' of https://github.com/ekellbuch/lightning-…
HaotianXiangsti Aug 7, 2023
58e635b
Add Weight and Bias config for mirror mouse dataset
HaotianXiangsti Aug 7, 2023
d2861de
update with comments
ekellbuch Aug 8, 2023
6d482e0
update to read list for outputs
ekellbuch Aug 8, 2023
0060f87
update wandb to log metrics
ekellbuch Aug 8, 2023
ea2fa10
log all metrics in wandb if available
ekellbuch Aug 8, 2023
6dee538
make iterations relative to test set
ekellbuch Aug 8, 2023
32a7eb6
add code to test active pipeline
ekellbuch Aug 8, 2023
59c48d9
change subset function and add active_test flag for merge function (i…
HaotianXiangsti Aug 8, 2023
da4e981
add new subset sample function
HaotianXiangsti Aug 9, 2023
9d48509
add crim13 config
HaotianXiangsti Aug 9, 2023
5ef8f73
change a minor df bug
HaotianXiangsti Aug 9, 2023
53fa77e
add updated pipeline for active leraning utlis
HaotianXiangsti Aug 10, 2023
30a2cba
add upadtes pipeline for call_active_loop
HaotianXiangsti Aug 10, 2023
f90f617
Comment some unkonw lines
HaotianXiangsti Aug 11, 2023
4071308
add correct select samples methods
HaotianXiangsti Aug 28, 2023
9a3b461
add lines to keep select frames from same 5 vids
HaotianXiangsti Sep 4, 2023
9c54a80
keep the first 100 frames the same
HaotianXiangsti Sep 4, 2023
3960e5c
change to avoid a bug in os path join
HaotianXiangsti Sep 4, 2023
83efc0b
change to avoid a bug in os path join
HaotianXiangsti Sep 4, 2023
f54debd
Add test code for sampling frames function
HaotianXiangsti Sep 6, 2023
b008143
Debug the sample. Bug is from a OR operation when trying to group all…
HaotianXiangsti Sep 6, 2023
4b5278f
update test code for verify number of vids used
HaotianXiangsti Sep 7, 2023
759ecf5
Test Code for active_loop for iteration 0
HaotianXiangsti Sep 9, 2023
f2c1163
add comment
HaotianXiangsti Sep 18, 2023
5a46f73
Add dummy CollectedData csv file.
HaotianXiangsti Sep 20, 2023
355a9ba
I change the way of how to get the same starting frames. Change the l…
HaotianXiangsti Sep 20, 2023
8e95cf4
I change the way of how to get the same starting frames. Change the l…
HaotianXiangsti Sep 20, 2023
cb98db2
Add a function to find common frames between the frames in the same s…
HaotianXiangsti Sep 20, 2023
6405da3
add parameters for vids_select and frames sample
HaotianXiangsti Sep 26, 2023
0c98bab
add lines to automatic run the pipeline everytime
HaotianXiangsti Sep 26, 2023
0362264
change all hardcoded parameters in vids_select and frames sample func…
HaotianXiangsti Sep 26, 2023
c73b327
add a dummy csv file with only 200 frames
HaotianXiangsti Sep 26, 2023
7961096
add print funct for non random function to print training set size
HaotianXiangsti Sep 26, 2023
5e93b7d
add print funct for non random function to print training set size
HaotianXiangsti Sep 26, 2023
f5ea44f
change the CollectData_new into OOD data
HaotianXiangsti Sep 26, 2023
19ccb2b
add extra flag for wandb
HaotianXiangsti Sep 27, 2023
da28e61
cheng the comment of how to calculate number of sampled frames
HaotianXiangsti Sep 27, 2023
525be39
cahnge method to heatmap variance
HaotianXiangsti Sep 27, 2023
f064aac
add testcode for sampling func in test file
HaotianXiangsti Sep 27, 2023
1305edf
add more test func to test active_pipeline in Unitest
HaotianXiangsti Sep 27, 2023
74e5e71
add more
HaotianXiangsti Sep 27, 2023
03a618a
change the heatmap var number equal to number of keypoints
HaotianXiangsti Sep 28, 2023
9a4a05f
change the number of heatmap var corresponding to the number of keypo…
HaotianXiangsti Sep 28, 2023
b53a628
Merge branch 'active_loop' of https://github.com/ekellbuch/lightning-…
HaotianXiangsti Oct 12, 2023
6ea6703
change the test csv
HaotianXiangsti Oct 12, 2023
35f4236
change uncompatible probelm between torch and torchaudio
HaotianXiangsti Nov 1, 2023
e81c26b
change test csv file path
HaotianXiangsti Nov 1, 2023
3162aaa
change active yaml file path
HaotianXiangsti Nov 1, 2023
f6ba810
all methods start from the same ckpt in the first iteration.
HaotianXiangsti Nov 7, 2023
9adc8ad
Merge branch 'active_loop' of https://github.com/ekellbuch/lightning-…
HaotianXiangsti Nov 8, 2023
efe242d
all methods start from the same ckpt in the first iteration.
HaotianXiangsti Nov 8, 2023
4157976
change active yaml file path
HaotianXiangsti Nov 8, 2023
e4898b8
change active yaml file path
HaotianXiangsti Nov 8, 2023
ed7f77c
Add description for the active learning pipeline.
HaotianXiangsti Nov 15, 2023
6ae9f80
Add description for the active learning pipeline.
HaotianXiangsti Nov 15, 2023
110 changes: 110 additions & 0 deletions active_pipeline/README.md
@@ -0,0 +1,110 @@
# Active Learning pipeline

The config file `configs/config_ibl_active.yaml` contains the configuration details for running an active learning pipeline.
It is used in addition to the experiment config file `configs/config_ibl_experiment.yaml`.

Before running an experiment, set the `LIGHTPOSE_DIR` environment variable and add the following parameters to your experiment config, as in `configs/config_ibl_experiment.yaml`:

```
export LIGHTPOSE_DIR="{path_to_lightpose}/lightning-pose"
```
- To track metrics for the active learning pipeline, we use [wandb](https://wandb.ai/).
```
# Additional flags for wandb.
wandb:
  logger: True
  params:
    project:
    entity:
    tags:
```
- If there is a second dataset that you want to use for testing the active pipeline, add:
```
# Additional flags for active learning pipeline
active_loop:
  # location of active loop labels; for the example script, this should be relative to `data_dir`
  csv_file: "../CollectedData_active_test.csv"
```


Then in your active pipeline config `active_loop/configs/config_ibl_active.yaml`, add your experiment config:
```
active_loop:
  experiment_cfg: "{full_path_to_lp}/active_loop/configs/config_ibl_active.yaml"
```

Once you have done this step, you can run the test code to make sure everything is working:
```
python tests/active_loop/test_call_active.py
```

If there are no errors, run the full pipeline using:
```
python active_loop/call_active_loop.py "{full_path_to_lp}/active_loop/configs/config_ibl_active.yaml"
```


# TODO:
- [ ] add compatibility for offline evaluation
- [ ] allow the iterations folder to be an absolute path, so as to not overwrite any config
- [ ] add compatibility with datasets without active_test sets
- [ ] update test function: it currently runs the full evaluation pipeline, but it should only use a subset of the test set.
- [ ] separate plotting code.


# Pipeline

Codebase:
- [x] Define active loop config, example in `configs/config_ibl_active.yaml`
- [x] Make `iterations` folder which is set in the config file.
- [x] Given a run with a `config.yaml` and a `data` directory, where the `data` directory has a `CollectedData.csv` file with the label information:
- [x] Step 1: train a model using the `config.yaml`, which outputs a `predictions.csv` file.
- Step 2:
  - [x] Given a `predictions_new.csv` file, select N frames at random.
  - AL methods: fix the videos that you want to label.
    - [x] random sampling: input: predictions file; output: `random_frames.csv`. Possibly multiple runs, possibly data from multiple models.
    - [ ] random sampling without low energy
    - [ ] margin sampling: input: callback, run when creating the predictions file, that computes the margins in the heatmap; output: `margin_frames.csv`
    - [ ] ensemble sampling: input: multiple predictions files (from different models); output: `ensemble_frames.csv`
    - [ ] error-loss based sampling: input: predictions file; output: `loss_frames.csv` (reprojection error, smoothing error)
  - [x] How to combine the frames from different methods?
    - [x] create a new file `iteration_active_loop/experiment0/${method}_${num_frames}` with the new frames and their labels (labels are available in debug mode or from the user).
    - [x] merge the previous run's train frames in `CollectedData.csv` into a new `${method}_${num_frames}_CollectedData.csv` file.
    - [x] update the parameters in the experiment config file (for example `configs/config_ibl_experiment.yaml`) to point to the updated `${method}_${num_frames}_CollectedData.csv` file.
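
The margin sampling item above could be sketched as follows. This is a minimal standard-library illustration, not the implementation in this PR. It assumes that the margin of a keypoint heatmap is the difference between its two largest values, and that frames with the smallest mean margin are the most ambiguous. The names `heatmap_margin` and `select_smallest_margin` are hypothetical.

```python
from typing import Dict, List, Sequence

Heatmap = Sequence[Sequence[float]]  # one 2D heatmap for one keypoint


def heatmap_margin(heatmap: Heatmap) -> float:
    """Difference between the two largest heatmap values (assumed margin definition)."""
    values = sorted((v for row in heatmap for v in row), reverse=True)
    if len(values) < 2:
        raise ValueError("heatmap needs at least two pixels")
    return values[0] - values[1]


def select_smallest_margin(
    frame_heatmaps: Dict[str, List[Heatmap]], num_frames: int
) -> List[str]:
    """Return the `num_frames` frame names with the smallest mean margin over keypoints."""
    scores = {
        name: sum(heatmap_margin(h) for h in heatmaps) / len(heatmaps)
        for name, heatmaps in frame_heatmaps.items()
    }
    # A small margin means two competing values, i.e. an ambiguous prediction.
    # Ties are broken by frame name so the result is deterministic.
    return sorted(scores, key=lambda name: (scores[name], name))[:num_frames]


# Hypothetical data: three frames, one keypoint each, 2x2 heatmaps.
demo = {
    "img001.png": [[[0.9, 0.1], [0.0, 0.0]]],    # margin 0.8 (confident)
    "img002.png": [[[0.5, 0.45], [0.0, 0.05]]],  # margin 0.05 (ambiguous)
    "img003.png": [[[0.6, 0.3], [0.1, 0.0]]],    # margin 0.3
}
selected = select_smallest_margin(demo, num_frames=2)
```

In a real heatmap the two largest values are often neighbouring pixels of the same peak, so a practical version would probably compare separate local maxima instead. That refinement is left out here to keep the sketch short.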


# Pipeline:

```
loop_iteration(method, data, loop_number)
  - if loop_number == 0:
    - make data/iterations_folder
  - if loop_number > 0:
    - run `select_frames(method)` on data.
      - [x] output `iteration_#/selected_frames_$method.csv`, which has the selected frames given a method
    - [ ] call function `select_frames('all_methods')`: picks N frames from all methods
      - output is `new_train_data.csv`
```
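
The pseudocode above might translate into Python roughly as follows. This is a sketch under stated assumptions, not the code in this PR: only the `random` method is implemented, the candidate frames are passed in as a plain list of names, and the output CSV has a single `frame` column. The signature, the `candidate_frames`, `num_frames` and `seed` parameters, and the CSV layout are all hypothetical.

```python
import csv
import random
import tempfile
from pathlib import Path
from typing import List, Optional


def loop_iteration(
    method: str,
    data_dir: Path,
    loop_number: int,
    candidate_frames: List[str],
    num_frames: int = 10,
    seed: int = 0,
) -> Optional[Path]:
    """On loop 0 create the iterations folder; on later loops write the selected frames."""
    iterations_dir = data_dir / "iterations_folder"
    if loop_number == 0:
        iterations_dir.mkdir(parents=True, exist_ok=True)
        return None
    if method != "random":
        raise NotImplementedError(f"only 'random' is sketched here, got {method!r}")
    # Seed per iteration so reruns select the same frames.
    rng = random.Random(seed + loop_number)
    count = min(num_frames, len(candidate_frames))
    selected = sorted(rng.sample(candidate_frames, count))
    out_dir = iterations_dir / f"iteration_{loop_number}"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"selected_frames_{method}.csv"
    with out_file.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame"])
        writer.writerows([name] for name in selected)
    return out_file


# Usage on a throwaway directory with hypothetical frame names.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    first = loop_iteration("random", data_dir, 0, candidate_frames=[])
    folder_created = (data_dir / "iterations_folder").is_dir()
    frames = [f"img{i:03d}.png" for i in range(20)]
    out_file = loop_iteration("random", data_dir, 1, frames, num_frames=5)
    out_name = out_file.name
    demo_rows = out_file.read_text().splitlines()
```

Returning the path of the written file keeps the caller free to decide how the selection is merged into the training labels.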


Launch experiment:
- [x] Step 0: start with a folder with videos: `data/`
  - split labeled data and unlabeled data into train/val/test + test_across_loop_iterations.
    - `Collected.csv` (labeled): `ibl1/Collected.csv`, 1 video with 1k labels (to train the model)
    - `Collected_new.csv` (unlabeled videos): ibl1_corruption_level (to eval the model, used to select frames)
    - `Collected_test_loop.csv` (test_across_loop_iterations): not in the bucket to be labeled (ibl1_gaussian_noise_5, ibl1_brightness_5) (to compare across active_loop iterations)
- [x] Step 1: Select frames to label
  - `loop_iteration(method='random', data, loop_number)`
- [x] Step 2: Train a model:
  - run `train_hydra.py`
  - this produces an `outputs/#/#/` folder with `predictions.csv`, `predictions_new.csv`, `predictions_test_loop.csv`, checkpoints, etc.
- [x] Step 3: Select frames for active loop
  - `random_CollectedData.csv` = loop_iteration_#(method='random', data=`predictions.csv`)
  - Go back to the training step (`train_frames`), where `data.csv_file` points to the new `random_CollectedData.csv` file
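
The selection step that produces `random_CollectedData.csv` could look roughly like the sketch below. It is an illustration only, under these assumptions: label files have a single header row with the columns `frame`, `x`, `y` (the real label files may use a multi-row header, which this sketch does not handle), and labels for the pool frames are already available, as in debug mode. The function names and the `CollectedData_new.csv` file name used in the demo are hypothetical.

```python
import csv
import random
import tempfile
from pathlib import Path
from typing import Dict, List

Row = Dict[str, str]
FIELDS = ["frame", "x", "y"]  # assumed single-row header


def read_rows(path: Path) -> List[Row]:
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


def write_rows(path: Path, rows: List[Row]) -> None:
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def merge_random_frames(
    train_csv: Path, pool_csv: Path, out_csv: Path, num_frames: int, seed: int = 0
) -> List[Row]:
    """Append `num_frames` randomly chosen pool rows to the training rows."""
    train_rows = read_rows(train_csv)
    already_labeled = {row["frame"] for row in train_rows}
    # Never pick a frame that is already in the training set.
    pool_rows = [r for r in read_rows(pool_csv) if r["frame"] not in already_labeled]
    rng = random.Random(seed)
    new_rows = rng.sample(pool_rows, min(num_frames, len(pool_rows)))
    merged = train_rows + new_rows
    write_rows(out_csv, merged)
    return merged


# Usage with tiny made-up label files.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    write_rows(tmp_dir / "CollectedData.csv", [{"frame": "a.png", "x": "1", "y": "2"}])
    write_rows(
        tmp_dir / "CollectedData_new.csv",
        [{"frame": f"n{i}.png", "x": str(i), "y": str(i)} for i in range(5)],
    )
    merged = merge_random_frames(
        tmp_dir / "CollectedData.csv",
        tmp_dir / "CollectedData_new.csv",
        tmp_dir / "random_CollectedData.csv",
        num_frames=3,
    )
    written = read_rows(tmp_dir / "random_CollectedData.csv")
```

The next training run would then be pointed at the merged file through `data.csv_file`, as described in Step 3.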

# Baselines:
- [x] Random sampling
- [x] Ensemble Baseline:
  - train 4 models, each with a different seed; each produces the files `predictions.csv` and `predictions_new.csv`
  - calculate the variance across the predictions of the models.
  - select the frames with the highest variance.
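
The ensemble baseline above can be sketched in a few lines. The sketch makes several assumptions that are not confirmed by this README: each model's predictions are given as a mapping from frame name to a list of `(x, y)` keypoint coordinates, the per-frame score is the population variance across models averaged over all coordinates, and ties are broken by frame name. The function names are hypothetical.

```python
from statistics import pvariance
from typing import Dict, List, Tuple

# One model's predictions: frame name -> list of (x, y) keypoint coordinates.
Predictions = Dict[str, List[Tuple[float, float]]]


def ensemble_variance(per_model: List[Predictions]) -> Dict[str, float]:
    """Per frame: variance across models of every coordinate, averaged over coordinates."""
    scores: Dict[str, float] = {}
    for frame in per_model[0]:
        # Flatten each model's (x, y) pairs into one list of coordinates.
        coords_by_model = [
            [c for xy in model[frame] for c in xy] for model in per_model
        ]
        # zip(*...) groups the same coordinate across all models.
        per_coord = [pvariance(values) for values in zip(*coords_by_model)]
        scores[frame] = sum(per_coord) / len(per_coord)
    return scores


def select_highest_variance(per_model: List[Predictions], num_frames: int) -> List[str]:
    """Return the `num_frames` frames on which the models disagree the most."""
    scores = ensemble_variance(per_model)
    return sorted(scores, key=lambda f: (-scores[f], f))[:num_frames]


# Four hypothetical models, three frames, one keypoint per frame.
models: List[Predictions] = [
    {"f1": [(10.0, 10.0)], "f2": [(10.0, 10.0)], "f3": [(10.0, 10.0)]},
    {"f1": [(10.0, 10.0)], "f2": [(12.0, 10.0)], "f3": [(20.0, 30.0)]},
    {"f1": [(10.0, 10.0)], "f2": [(10.0, 12.0)], "f3": [(0.0, 10.0)]},
    {"f1": [(10.0, 10.0)], "f2": [(12.0, 12.0)], "f3": [(10.0, -10.0)]},
]
scores = ensemble_variance(models)
selected = select_highest_variance(models, num_frames=2)
```

Here the models agree perfectly on `f1`, disagree slightly on `f2` and strongly on `f3`, so `f3` and `f2` are selected for labeling.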