about hyperparameter searching #46

Open · anguoyang opened this issue Feb 8, 2023 · 1 comment

@anguoyang

Hi @ConnorBaker, thanks for sharing your code.
Regarding the hyperparameter search: do you intend to search for a lightweight architecture? How does the resulting model fare in terms of size, parameter count, and FLOPs? Thanks.

@ConnorBaker (Owner)

Hi @anguoyang,

Credit for the code goes to @Algolzw!

I'm fairly new to machine learning and the techniques available for both training and tuning. Regarding hyperparameter searching, I'm using Syne Tune (https://github.com/awslabs/syne-tune) because it supports multi-objective hyperparameter optimization. In particular, I wanted to be able to optimize along the Pareto frontier for PSNR, MS-SSIM, and LPIPS (and not just one of them).
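A minimal sketch of what that multi-objective setup can look like with Syne Tune's MOASHA scheduler (the `train.py` entry point, hyperparameters, and budgets here are placeholders, not this repo's actual configuration; the script is assumed to report `psnr`, `msssim`, and `lpips` once per epoch via `syne_tune.Reporter`):

```python
# Sketch of multi-objective HPO with Syne Tune's MOASHA scheduler.
# Assumes a hypothetical train.py that reports psnr, msssim, and lpips
# each epoch via syne_tune.Reporter; the search space is illustrative.
from syne_tune import StoppingCriterion, Tuner
from syne_tune.backend import LocalBackend
from syne_tune.config_space import loguniform, randint
from syne_tune.optimizer.baselines import MOASHA

config_space = {
    "lr": loguniform(1e-5, 1e-2),   # placeholder search space
    "batch_size": randint(4, 32),
    "epochs": 50,                   # constants pass through to the script
}

scheduler = MOASHA(
    config_space,
    metrics=["psnr", "msssim", "lpips"],
    mode=["max", "max", "min"],     # maximize PSNR/MS-SSIM, minimize LPIPS
    time_attr="epoch",
    max_t=50,
)

tuner = Tuner(
    trial_backend=LocalBackend(entry_point="train.py"),
    scheduler=scheduler,
    stop_criterion=StoppingCriterion(max_wallclock_time=4 * 3600),
    n_workers=2,
)
tuner.run()
```

MOASHA promotes trials that are non-dominated across all three metrics rather than ranking them on a single scalar, which is what lets you explore the Pareto frontier instead of picking one objective up front.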

From some of the reading I've done recently, I see that there's a distinction drawn between hyperparameter optimization and architecture optimization (called neural architecture search, I think?). I haven't done much with the latter, but it looks interesting. Toward that end, I've been looking at (but have not used):

In the immediate term, I'm looking into packaging the repo with Nix because I'm exhausted by the steps I have to take to get the build I want working locally. I don't think I've pushed it yet, but I've been using PyTorch with CUDA 12 and Triton from head, all patched to work with the 4090 I got recently. It's been a huge pain compiling everything from source over and over, so I'd very much like to be able to just use Nix (which I'm familiar with) instead of crying over Dockerfiles every day.

Beyond that, I'd like to swap out some of the components for more performant (or potentially more accurate) counterparts. For example, I think the following hold some promise:

I also factored out some code from this repo into separate repos to make it more maintainable:

I'd really like to do some more work on mfsr_utils in particular -- I feel like there's a lot of commonality in SR workflows (multi- and single-frame) that people end up re-implementing. I'd like to add some common augmentations (blur kernels, noise, etc.) and support taking patches from larger images (instead of relying on them being pre-cropped to small sizes).
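To illustrate the kind of helpers I mean (these names and signatures are hypothetical, not mfsr_utils' actual API):

```python
# Hypothetical helpers of the sort described above: random patch extraction
# from a full-size image, plus a simple Gaussian-noise augmentation.
import torch

def random_patch(img: torch.Tensor, patch_size: int) -> torch.Tensor:
    """Crop a random patch_size x patch_size patch from a CHW image."""
    _, h, w = img.shape
    top = int(torch.randint(0, h - patch_size + 1, (1,)))
    left = int(torch.randint(0, w - patch_size + 1, (1,)))
    return img[:, top : top + patch_size, left : left + patch_size]

def add_gaussian_noise(img: torch.Tensor, sigma: float = 0.01) -> torch.Tensor:
    """Add zero-mean Gaussian noise and clamp back to [0, 1]."""
    return (img + sigma * torch.randn_like(img)).clamp(0.0, 1.0)

# Usage: cut one patch from a large frame and make an 8-frame noisy "burst".
frame = torch.rand(3, 512, 512)
patch = random_patch(frame, 64)
burst = torch.stack([add_gaussian_noise(patch) for _ in range(8)])  # (8, 3, 64, 64)
```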

I've also been thinking about modifying the model so it can work with different input sizes. In other words, begin training on very small images (perhaps 16x16) and, over the course of training, increase to higher-resolution images. I'm curious whether that would help performance.
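As a rough sketch of what such a schedule could look like (the milestones and sizes below are made up for illustration):

```python
# Illustrative progressive-resizing schedule: start with tiny crops and
# step up the patch size at fixed epoch milestones (all values made up).
def patch_size_for_epoch(epoch: int) -> int:
    schedule = [(0, 16), (10, 32), (25, 64), (50, 128)]  # (start_epoch, size)
    size = schedule[0][1]
    for start, s in schedule:
        if epoch >= start:
            size = s
    return size

assert patch_size_for_epoch(0) == 16
assert patch_size_for_epoch(30) == 64
assert patch_size_for_epoch(75) == 128
```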

After that I'll probably revisit the training/optimization portion.

I hope that answers your question! If you have any references you think would be interesting, I'd love to see them.


I just realized that this doesn't exist anywhere outside of my head, so I'm going to keep this issue open so I don't lose track of it.

ConnorBaker self-assigned this Feb 10, 2023