Replies: 1 comment
-
Hi Ahmet, thank you very much for the kind words! The trouble with higher dimensions is two-fold:
This should answer (1); now for the other queries:
Cheers!
-
First of all, this work has been inspiring for me, to say the least, and I appreciate the structuring of the repos and the work on tcnn!
I have been tinkering with tcnn and multi-res hash grid encodings with higher input dimensions.
I had to enable 5/6/7 dimensions here:
https://github.com/NVlabs/tiny-cuda-nn/blob/bd29d77c589e7593680a3aa508e35a0136df904b/include/tiny-cuda-nn/encodings/grid.h#L931
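For context, that line sits in the switch that maps the runtime number of input dimensions to a compile-time template parameter. Here is a rough stand-in I wrote to illustrate the pattern (simplified and paraphrased, not the actual grid.h code; the class and function names are placeholders):

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>

// Minimal stand-in for tcnn's templated grid encoding; the real class takes
// many more template and constructor parameters.
struct GridEncodingBase {
	virtual ~GridEncodingBase() = default;
	virtual uint32_t n_dims_to_encode() const = 0;
};

template <uint32_t N_DIMS>
struct GridEncodingTemplated : GridEncodingBase {
	uint32_t n_dims_to_encode() const override { return N_DIMS; }
};

// The input dimension is a compile-time template parameter, so the runtime
// value has to be dispatched through a switch. In grid.h (as far as I can
// tell) the higher-dimensional cases are commented out by default, so
// "enabling 5/6/7 dimensions" means uncommenting them so those
// instantiations get compiled.
std::unique_ptr<GridEncodingBase> create_grid_encoding(uint32_t n_dims_to_encode) {
	switch (n_dims_to_encode) {
		case 2: return std::make_unique<GridEncodingTemplated<2>>();
		case 3: return std::make_unique<GridEncodingTemplated<3>>();
		case 4: return std::make_unique<GridEncodingTemplated<4>>();
		// case 5: return std::make_unique<GridEncodingTemplated<5>>();
		// case 6: return std::make_unique<GridEncodingTemplated<6>>();
		// case 7: return std::make_unique<GridEncodingTemplated<7>>();
		default: throw std::runtime_error{"GridEncoding: unsupported number of input dims"};
	}
}
```

So enabling the extra dimensions is, as far as I can tell, just a matter of uncommenting those cases, presumably at the cost of compile time and binary size.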
I am wondering why they are disabled by default. My experience so far has been that it may be causing more hash map collisions, but I didn't dig deeper to confirm.
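To make my collision suspicion concrete: my understanding from the Instant-NGP paper is that each grid corner is hashed by XOR-ing its integer coordinates multiplied by per-dimension constants and reducing modulo the table size. Since a level at resolution N has on the order of N^d corners, a fixed table of 2^log2_hashmap_size entries starts colliding at much coarser levels when d is 5-7 than when it is 2-3. A minimal sketch (only the first three constants are from the paper; the rest are arbitrary primes I picked for illustration, not whatever tiny-cuda-nn actually uses):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Sketch of the per-level spatial hash from the Instant-NGP paper: XOR the
// integer grid coordinates, each multiplied by a per-dimension constant, then
// reduce modulo the table size.
template <size_t N_DIMS>
uint32_t hash_grid_index(const std::array<uint32_t, N_DIMS>& pos, uint32_t log2_hashmap_size) {
	static_assert(N_DIMS <= 7, "only 7 constants listed here");
	constexpr uint32_t primes[7] = {
		1u, 2654435761u, 805459861u,                       // from the paper (pi_1 = 1 for cache coherence)
		2147483647u, 1000000007u, 998244353u, 805306457u   // illustrative placeholders for dims > 3
	};
	uint32_t result = 0;
	for (size_t i = 0; i < N_DIMS; ++i) {
		result ^= pos[i] * primes[i];
	}
	return result & ((1u << log2_hashmap_size) - 1u);      // table size is 2^log2_hashmap_size
}
```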
I am also curious to confirm my suspicion that most of the representation may be encoded in the hash grid resolutions (especially in my case, where I use this as a material encoder in UV space) rather than in the MLP that follows, where the input and the data structure lookup are concatenated. (Apparently the encoding is not concatenated with the input, as I just found out in #151 (comment), but my question may still be relevant.) Have there been experimental results where having more learnable features in the hash grid improved convergence times?
So far I have had faster convergence with a 3e-4 learning rate and bigger log2 hash map sizes, but then I usually can't fit my structures into the L1/L2 caches where the Fully Fused MLP shines... I am curious whether there may be a sweet spot where, with some hyperparameter tuning, convergence is fast enough in fewer epochs even while sacrificing the structural packing and fusing optimizations.
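For concreteness, this is roughly the kind of config I have been sweeping (key names follow tcnn's JSON config documentation as far as I understand it; the values are just examples, not recommendations):

```cpp
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Example hash grid + MLP + optimizer config in tcnn's JSON format.
const json config = {
	{"encoding", {
		{"otype", "HashGrid"},
		{"n_levels", 16},
		{"n_features_per_level", 2},
		{"log2_hashmap_size", 22},   // bigger table: fewer collisions, but no longer cache-resident
		{"base_resolution", 16},
		{"per_level_scale", 2.0}
	}},
	{"network", {
		{"otype", "FullyFusedMLP"},
		{"activation", "ReLU"},
		{"output_activation", "None"},
		{"n_neurons", 64},
		{"n_hidden_layers", 2}
	}},
	{"optimizer", {
		{"otype", "Adam"},
		{"learning_rate", 3e-4}      // the rate that has converged fastest for me so far
	}}
};
```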
A summary of my questions (which are all over the place):
1.) Why are higher-dimensional inputs for hash grid encodings disabled by default?
2.) Are there any best practices or observations regarding hash collisions?
3.) Are these representations memorized mostly in the spatial data structure rather than the MLP, and have there been any results to confirm or deny that?
4.) Can there be faster convergence at the expense of losing some of the optimizations, and have there been any experiments on that?
Thanks in advance for your time, and for the great work \o/