I have a neural network-based log probability function $\log p_{NN}(\theta \mid x, \vec{t})$. If I increase the size of $\vec{t}$, my code essentially creates a batch of $x$ repeated len($\vec{t}$) times. While I am able to refactor my code so that it computes the log probability in smaller batches of len($\vec{t}_{BATCH}$) and adds the terms together, I still run into memory issues while computing the gradient.
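For concreteness, here is a minimal sketch of the chunked evaluation I mean (names like `nn_log_prob`, `x`, and `t` are placeholders for my actual objects, not part of hamiltorch):

```python
import torch

def log_prob_chunked(theta, x, t, nn_log_prob, chunk_size=64):
    """Sum the log probability over t in chunks so x is never tiled to len(t) at once."""
    total = torch.zeros((), dtype=theta.dtype, device=theta.device)
    for t_chunk in torch.split(t, chunk_size):
        # Repeat x only for the current chunk rather than for all of t.
        x_batch = x.unsqueeze(0).expand(len(t_chunk), *x.shape)
        total = total + nn_log_prob(theta, x_batch, t_chunk).sum()
    return total
```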
I further refactored the log probability function code to accumulate the gradients while evaluating the log probability, so it now returns the log probability as well as its gradient with respect to the parameters $\theta$. However, the pass_grad argument seems to only accept a constant tensor or a function that returns a tensor of dimension D. The NN-based log probability is also stochastic, so I can't wrap the gradient as a separate function and pass it in separately.
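The refactored version looks roughly like the sketch below: gradients are accumulated per chunk with `torch.autograd.grad`, so only one chunk's graph is held in memory at a time (again, the names are placeholders for my own code):

```python
import torch

def log_prob_and_grad(theta, x, t, nn_log_prob, chunk_size=64):
    """Return (log prob, d log prob / d theta), both accumulated over chunks of t."""
    theta = theta.detach().requires_grad_(True)
    logp_total = torch.zeros((), dtype=theta.dtype, device=theta.device)
    grad_total = torch.zeros_like(theta)
    for t_chunk in torch.split(t, chunk_size):
        x_batch = x.unsqueeze(0).expand(len(t_chunk), *x.shape)
        logp_chunk = nn_log_prob(theta, x_batch, t_chunk).sum()
        # Backprop through this chunk only, freeing its graph immediately
        # instead of keeping everything alive for one big backward pass.
        (grad_chunk,) = torch.autograd.grad(logp_chunk, theta)
        logp_total = logp_total + logp_chunk.detach()
        grad_total = grad_total + grad_chunk
    return logp_total, grad_total
```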
I would ideally like to restructure the code so that the gradients are evaluated alongside the log probability. I was going to modify my local hamiltorch package to do this, but I first thought I'd check whether there's already a function in the package that handles this, or a better workaround, in case other users have encountered this before?