Issue with checkpoint #189

Closed
AarushShintre opened this issue Jul 18, 2024 · 4 comments

Comments

@AarushShintre

Hi, I just started using lightning-pose on the main branch with the multi-data stream (non-fused) method. I repeatedly get the following error:
[Screenshot of the error: "Screenshot from 2024-07-18 16-46-45"]

I would really appreciate any help with this!

Our Hydra config file:

data parameters

image_orig_dims: {'height': 540, 'width': 720}
image_resize_dims: {'height': 384, 'width': 384}
data_dir: /home/gangliaguardian/lightning-pose/data/multi-test/
video_dir: /home/gangliaguardian/lightning-pose/data/multi-test/videos/
csv_file: ['view0.csv', 'view1.csv', 'view2.csv', 'view3.csv']
view_names: ['view0', 'view1', 'view2', 'view3']
downsample_factor: 2
dynamic_crop: False
num_max_instances: 1
num_keypoints: 5
keypoint_names: ['nose', 'tail_base', 'tail_point_1', 'tail_point_2', 'tail_end']
mirrored_column_matches: None


training parameters

imgaug: dlc
train_batch_size: 8
val_batch_size: 1
test_batch_size: 1
train_prob: 0.95
val_prob: 0.05
train_frames: 1
num_gpus: 1
log_every_n_steps: 10
check_val_every_n_epoch: 5
num_workers: 4
early_stop_patience: 3
unfreezing_epoch: 20
min_epochs: 300
max_epochs: 300
ckpt_every_n_epochs: 10
gpu_id: 0
rng_seed_data_pt: 0
rng_seed_model_pt: 0
lr_scheduler: multisteplr
lr_scheduler_params: {'multisteplr': {'milestones': [150, 200, 250], 'gamma': 0.5}}


model parameters

losses_to_use: []
backbone: resnet50_animal_ap10k
model_type: heatmap
heatmap_loss_type: mse
model_name: test
checkpoint: None
lightning_pose_version: 1.5.0


dali parameters

general: {'seed': 123456}
base: {'train': {'sequence_length': 32}, 'predict': {'sequence_length': 96}}
context: {'train': {'batch_size': 16}, 'predict': {'sequence_length': 96}}


losses parameters

pca_multiview: {'log_weight': 5.0, 'components_to_keep': 3, 'epsilon': None}
pca_singleview: {'log_weight': 5.0, 'components_to_keep': 0.99, 'epsilon': None}
temporal: {'log_weight': 5.0, 'epsilon': 20.0, 'prob_threshold': 0.05}


callbacks parameters

anneal_weight: {'attr_name': 'total_unsupervised_importance', 'init_val': 0.0, 'increase_factor': 0.01, 'final_val': 1.0, 'freeze_until_epoch': 0}

@themattinthehatt
Collaborator

@AarushShintre how many labeled frames do you have? This is something I've been meaning to fix for a while: the checkpoints don't get saved properly if you have <20ish frames.

@AarushShintre
Author

I used just 10 frames per view as a test, which might have caused the issue. I will try again with 25 frames and report back if anything changes. Also, do you plan on adding the multi-data stream (non-fused) method to the Pose-app?

@themattinthehatt
Collaborator

I think it should work with 25 frames but yes, please let me know if it doesn't (I'll still look into the checkpointing issue with a small number of training frames).
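
The thread does not say why checkpointing fails with few frames. One plausible reading is that with `train_prob: 0.95` and `val_prob: 0.05` from the config above, a small dataset can leave the validation split empty. The sketch below only illustrates that arithmetic. The helper `split_counts` is hypothetical, and the floor-based rounding is an assumption, not lightning-pose's confirmed splitting logic.

```python
import math


def split_counts(n_frames, train_prob, val_prob):
    """Hypothetical helper: count frames per split, assuming floor rounding.

    This is NOT taken from lightning-pose; it only illustrates how a small
    number of labeled frames interacts with the split probabilities.
    """
    n_train = math.floor(train_prob * n_frames)
    n_val = math.floor(val_prob * n_frames)
    # Whatever is left over goes to the test split.
    n_test = n_frames - n_train - n_val
    return n_train, n_val, n_test


# Values from the config in this issue: train_prob=0.95, val_prob=0.05.
for n in (10, 25):
    n_train, n_val, n_test = split_counts(n, train_prob=0.95, val_prob=0.05)
    print(f"{n} frames -> train={n_train}, val={n_val}, test={n_test}")
```

Under this assumption, 10 frames yield no validation frames at all, while 25 frames yield one, which would be consistent with the rough "<20ish frames" threshold mentioned above.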

We don't currently have plans to add the multi-data stream to the Pose-app; the problem is that LabelStudio doesn't natively work with multiple views, so we'd need to figure out how to make it easy for people to label their multiview datasets. How did you label your data by the way?

@YitingChang

YitingChang commented Jul 19, 2024

I'm also working on multiview model training. I've been using the JARVIS Annotation Tool to label my multiview datasets. It's pretty easy to install and use. If you calibrate your cameras, it can even project manual annotations from a subset of the cameras onto the remaining ones, which significantly reduces labeling work!
https://github.com/JARVIS-MoCap/JARVIS-AnnotationTool
