Conversation
pina/pinn.py
data_loader = self._create_dataloader()
data_loader_loop = data_loader
Does the dataset creation just return a view of the original tensors, or does it implicitly perform new allocations?
Can you be more specific?
Is it allocating the same data twice (once in input_pts and once in the dataloader), or are they just references to the same memory? Have you checked that?
From my understanding, the points are first allocated in input_pts; then _create_dataloader returns a dict of iterables, each taking a dataset as input (one per key of the dict). A dataset holds the tensors in a private variable, so it keeps a shallow copy of input_pts[key], which is needed for the __getitem__ method.
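The reference-vs-copy question can be checked with a quick sketch. The class below is a hypothetical, minimal stand-in for a PyTorch-style TensorDataset (all names are illustrative, not PINA's actual code): it only stores references to the objects it is given, so mutating the original data is visible through the dataset, which shows no deep copy of the underlying storage was made.

```python
# Hypothetical minimal dataset mimicking TensorDataset semantics:
# it stores the tensors it receives, without copying their data.
class MiniDataset:
    def __init__(self, *tensors):
        self._tensors = tensors  # references, not copies

    def __getitem__(self, index):
        return tuple(t[index] for t in self._tensors)

    def __len__(self):
        return len(self._tensors[0])


input_pts = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]  # stand-in for a tensor
ds = MiniDataset(input_pts)

# The dataset holds the very same object...
print(ds._tensors[0] is input_pts)  # True

# ...so a mutation of input_pts is visible through the dataset.
input_pts[0][0] = 9.0
print(ds[0])  # ([9.0, 0.0],)
```

With an actual torch.Tensor the same check can be done by comparing storage pointers instead of object identity.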
This is how I chose to handle the different datasets (one per key) in order to create the dataloader while keeping the same training routine. Note that allocation happens on CPU; the tensors are moved to the GPU in batches anyway. In practice input_pts always lives on CPU (even when training on GPU).
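The batching scheme described above can be sketched as follows. This is a self-contained toy, not PINA's implementation: FakeTensor is a stand-in for a torch tensor with a `.to(device)` method, and the dict keys are hypothetical. The point is that the full data stays on CPU and only each mini-batch gets a device-side handle.

```python
# Hypothetical stand-in for a tensor that can be "moved" between devices.
class FakeTensor:
    def __init__(self, data, device="cpu"):
        self.data, self.device = data, device

    def to(self, device):
        # moving returns a new handle on the target device,
        # leaving the original untouched (as torch's .to() does)
        return FakeTensor(self.data, device)


def batches(tensor, batch_size):
    """Yield consecutive mini-batches of the given tensor."""
    for i in range(0, len(tensor.data), batch_size):
        yield FakeTensor(tensor.data[i:i + batch_size], tensor.device)


# One dataset per dict key (keys here are illustrative only).
input_pts = {"gamma": FakeTensor(list(range(6))),
             "D": FakeTensor(list(range(4)))}

for key, tensor in input_pts.items():
    for batch in batches(tensor, batch_size=2):
        gpu_batch = batch.to("cuda")   # only the batch changes device
        assert tensor.device == "cpu"  # full data stays on CPU
```

This mirrors the usual PyTorch pattern of calling `batch.to(device)` inside the training loop rather than moving the whole dataset up front.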
Hmm, I'm lost, since here you say it's just a shallow copy, but a few lines below you state that there are actually two copies of the same tensor.
From my knowledge, I'm not even sure it's a shallow copy. Maybe we can merge this PR, and as soon as possible I'll retouch the span_pts method so that it directly creates the Dataset and we always use that data structure. Does that sound good to you?
OK for me; we can open it as a new issue and I can help trying to fix it.
Everything should now be in a class; the code is more readable and the names are more self-explanatory.
Solves #33