Add GrayBox predictor #96
Will the Flux model always use `Float32`?
I think it's the default. Different precisions are a bit all over the place, since if you throw an `x::Vector{Float64}` in, it automatically changes your weights to the same type. I dislike this aspect of Flux. At the very least, let's wait until someone complains before addressing this. It works for the tests.
Would it be possible to add a Hessian function?
Do people compute Hessians of NNs? Or do you just want the possibility in general? Do you have an example where this is useful?
My sense is to leave as-is for the first pass. We can always add it later.
In a paper I am about to submit, we used Hessians with a PyTorch NN in an optimal control problem and saw a significant speed-up. This was done with PyNumero's graybox interface.
Can you link me the code for getting Hessians etc. out of torch?
I'll dig it up from my former student who just graduated. In the meantime, I know that we used `torch.func`, which provides functions to evaluate the Jacobian and the Hessian directly: https://pytorch.org/docs/stable/func.api.html
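Roughly, it looks something like the sketch below (the network and dimensions are made up for illustration, not our actual code):

```python
import torch
from torch import nn
from torch.func import hessian, jacrev

# Made-up network: R^3 -> R (scalar output, so the Hessian is a 3x3 matrix).
model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))

def f(x):
    # Squeeze the (1,)-shaped output down to a scalar.
    return model(x).squeeze()

x = torch.randn(3)

J = jacrev(f)(x)    # gradient/Jacobian of f at x, shape (3,)
H = hessian(f)(x)   # Hessian of f at x, shape (3, 3)
print(J.shape, H.shape)
```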
I also found `jacobian = torch.autograd.functional.jacobian(model, x)`. But `func` seems better.
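For comparison, a self-contained sketch of that variant (hypothetical model, just to show the shapes):

```python
import torch
from torch import nn

# Same made-up network as above, purely for illustration.
model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))
x = torch.randn(3)

# Jacobian of the full model output w.r.t. x, shape (1, 3) here.
J = torch.autograd.functional.jacobian(model, x)

# hessian() needs a scalar-valued function, hence the squeeze.
H = torch.autograd.functional.hessian(lambda x: model(x).squeeze(), x)
print(J.shape, H.shape)
```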
We tried `torch.autograd.functional`, but it was quite a bit slower. Notably, we did leverage the batching abilities of `torch.func` to evaluate all the gradients of a NN over different sets of inputs, which probably gave `torch.func` an extra advantage.
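Something along these lines (illustrative model and shapes, not our actual setup):

```python
import torch
from torch import nn
from torch.func import grad, vmap

# Made-up scalar-valued network for illustration.
model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))

def f(x):
    # f takes a single (unbatched) input point and returns a scalar.
    return model(x).squeeze()

X = torch.randn(100, 3)      # 100 input points
grads = vmap(grad(f))(X)     # gradient of f at every row of X, shape (100, 3)
print(grads.shape)
```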