Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix device in regressor.py #3

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lucasresck
Copy link

Dear authors,

Thank you for releasing fast_l1 code together with datamodels.

This pull request intends to fix the device of some tensors in regressor.py.

While running the linear regression step of datamodels, I faced an issue regarding tensors not being in the same device.

After running

python -m datamodels.regression.compute_datamodels \
    -C regression_config.yaml \
    --data.data_path "$tmp_dir/reg_data.beton" \
    --cfg.out_dir "$tmp_dir/reg_results"

I would face something similar to

  File "/path_to_python3.9/site-packages/fast_l1-0.0.1-py3.9.egg/fast_l1/regressor.py", line 221, in train_saga
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

or

  File "/path_to_python3.9/site-packages/fast_l1-0.0.1-py3.9.egg/fast_l1/regressor.py", line 341, in train_saga
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

This happened because in lines 221 and 341 some CPU tensors are being indexed/sliced using other tensors that lie on the GPU, in this case, idx and still_opt_outer. On the other hand, they are both on the GPU because weight and train_loader in datamodels/datamodels/regression/compute_datamodels.py#L151 are on the GPU when train_saga is called.

This pull request ensures that both idx and still_opt_outer are on the CPU before indexing/slicing.

Indices need to be on CPU before being used to index/slice a tensor that is on CPU.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant