
gltfpack: Experimental support for floating point position quantization #462

Merged · 7 commits · Aug 18, 2022

Conversation

zeux (Owner) commented Aug 17, 2022

Currently gltfpack provides all-or-nothing quantization: either all
attributes get quantized, or none do.

When quantization is used, we use the most size- and memory-efficient
formats, which is great, but it can lead to integration issues:
since dequantization uses an extra transform, some applications need
changes to take that transform into account. Notably, these changes
mostly revolve around positions (and occasionally around texture
coordinates with respect to texture replacement, but positions are
definitely the main source of issues).

Due to the flexibility of the meshopt codecs, however, there are other
options available that don't change the coordinate space of mesh
positions, but instead just perturb the positions slightly so they
encode more efficiently.

This change adds a new mode, -vpf, which uses floating point
quantization to encode position values:

- When using regular compression (-c), we quantize each coordinate
  individually
- When using extra compression (-cc), we use an exponential filter when
  serializing the position stream

It is expected that either mode is less efficient than integer
quantization, but -cc can be closer to integer quantization and on
meshes with many attributes the extra size could be an acceptable
tradeoff for better compatibility. Of course this also results in geometry
taking a little bit more space after being loaded, since we use 12 bytes
per position instead of 8 bytes.
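As a usage sketch (input/output file names here are placeholders), the new flag combines with the existing compression switches:

```shell
# Float quantization only (no meshopt compression)
gltfpack -i scene.gltf -o scene.glb -vpf

# Per-coordinate encoding with regular compression
gltfpack -i scene.gltf -o scene.glb -vpf -c

# Exponential filter path with extra compression
gltfpack -i scene.gltf -o scene.glb -vpf -cc
```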

- Fix min/max bounds (these need to be precise to validate correctly)
- Gather size statistics for various models
- Validate quantization bits behavior for models to ensure precision is reasonably close between different modes
- Figure out how to select between float quantization vs separate vs shared exponent

zeux (Owner, Author) commented Aug 17, 2022

TODO:

- Gather size statistics for various models
- Validate quantization bits behavior for models to ensure precision is reasonably close between different modes
- Check if exponential encoding (but one coordinate at a time) is better for -c

This may be a better idea for this mode in general since it doesn't make
one component suffer for the sake of others...

For now we always use exponential encoding when using -vpf with compression, so:

- -vpf -c: one-at-a-time exponent encoding
- -vpf -cc: shared exponent encoding
- -vpf without compression: float quantization

Time will tell if this is a good idea.
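As a rough illustration of what float quantization does to a single value (a sketch, not gltfpack's actual implementation), rounding a float32 to a reduced mantissa looks like this:

```python
import struct

def quantize_float(v: float, mantissa_bits: int) -> float:
    """Round a float32 value to `mantissa_bits` of mantissa (sketch only)."""
    assert 0 < mantissa_bits < 23
    bits, = struct.unpack('<I', struct.pack('<f', v))
    drop = 23 - mantissa_bits  # low mantissa bits to discard
    half = 1 << (drop - 1)     # bias for round-to-nearest
    bits = (bits + half) & ~((1 << drop) - 1)
    return struct.unpack('<f', struct.pack('<I', bits))[0]

# The value only moves slightly, and its coordinate space is unchanged,
# so no dequantization transform is needed on the application side.
print(quantize_float(0.1234567, 8))
```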
zeux (Owner, Author) commented Aug 17, 2022

Using very aggressive 8 bits for position (default is 14) on FlightHelmet:

-vp 8 -cc (current): ~10.1 bits per position
-vp 8 -vpf (new, float quantization): ~36 bits per position
-vp 8 -vpf -c (new, separate exponent encoding): ~21.7 bits per position
-vp 8 -vpf -cc (new, shared exponent encoding): ~14.3 bits per position

It looks like shared exponent might be too aggressive to pursue, since it introduces a significant amount of error due to coupled components. This gets worse because gltfpack likes to transform meshes to world space, which doesn't matter for integer quantization but does matter for this new mode (and might be something to disable conditionally...)
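A toy demonstration of why coupled components hurt once a mesh is moved to world space (the numbers below are assumed for illustration, not taken from the models above): with one shared exponent per vertex, a large coordinate forces a coarse step onto the small ones.

```python
import math

def shared_exp_round(vals, bits):
    # One power-of-two step for all components of a vertex (sketch).
    m = max(abs(v) for v in vals)
    step = 2.0 ** (math.ceil(math.log2(m)) - (bits - 1))
    return [round(v / step) * step for v in vals]

local = shared_exp_round([0.5, 0.013, 0.25], 8)    # vertex near the origin
world = shared_exp_round([100.5, 0.013, 0.25], 8)  # same vertex, offset in world space
# The small y component loses far more precision in the world-space case,
# because the step is dictated by the large x component.
```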

zeux (Owner, Author) commented Aug 17, 2022

BrainStem.gltf - again, with 8-bit quantization (very aggressive):

-vp 8 -cc (quantized integer): 10 bits
-vp 8 -vpf (quantized float): 38 bits
-vp 8 -vpf -c (separate exponent): 25 bits
-vp 8 -vpf -cc (shared exponent): 14 bits

Here shared exponent seems to be doing alright, probably because the mesh is more local/centered, which is normal for rigged meshes.

zeux (Owner, Author) commented Aug 17, 2022

Because quantized compressed positions can be misleading with respect to the resulting size, here are just the size stats for the two models above with the default bit count (14):

SciFiHelmet:
-cc (integer quantization): 32.7 bits
-vpf (float quantization): 47.9 bits
-vpf -c (separate exponent): 44.3 bits
-vpf -cc (shared exponent): 39.6 bits

BrainStem:
-cc (integer quantization): 29.8 bits
-vpf (float quantization): 44.8 bits
-vpf -c (separate exponent): 43.4 bits
-vpf -cc (shared exponent): 32.8 bits

So that I don't get confused in the future: all numbers for float quantization do include vertex codec compression on top, even if the current version of the code doesn't allow that configuration.

Here's what the data looks like for the entire glTF buffer (sans textures) after zstd:

SciFiHelmet:
-cc (integer quantization): 562 KB
-vpf (float quantization): 670 KB
-vpf -c (separate exponent): 650 KB
-vpf -cc (shared exponent): 608 KB

BrainStem:
-cc (integer quantization): 254 KB
-vpf (float quantization): 310 KB
-vpf -c (separate exponent): 305 KB
-vpf -cc (shared exponent): 265 KB

zeux (Owner, Author) commented Aug 17, 2022

CarbonBike model from #433 (the new mode isn't sensitive to local transforms to the same extent, and as such the model is preserved well), before gzip/zstd:

no quantization: vertex 3208635 bytes, index 126219 bytes
integer quantization: vertex 848142 bytes, index 126219 bytes
float quantization: vertex 1186861 bytes, index 126219 bytes
separate exponent: vertex 1166609 bytes, index 126219 bytes
shared exponent: vertex 1049273 bytes, index 126219 bytes

Overall this feels like a great future replacement for -noq, but I'll need to figure out how to weigh shared vs separate exponent given the precision concerns versus the size tradeoff.

zeux (Owner, Author) commented Aug 17, 2022

One other alternative is to share the exponent across all vertices for each component of the position stream, so there are only three unique exponents. This is close in spirit to uniform quantization, as it essentially selects a power-of-two bucket and thus should be close to the error of integer quantization, but the exponent takes no space in the stream as it compresses perfectly.
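A minimal sketch of this per-component scheme, assuming a simple round-to-step model rather than meshopt's actual filter: each of x/y/z gets one exponent across the whole stream.

```python
import math

def encode_component_exp(positions, bits):
    """Quantize a position stream with one shared exponent per component (sketch)."""
    columns = list(zip(*positions))  # x column, y column, z column
    out = []
    for col in columns:
        m = max(abs(v) for v in col) or 1.0
        # Power-of-two step so the largest value in this column fits in `bits`.
        step = 2.0 ** (math.ceil(math.log2(m)) - (bits - 1))
        out.append([round(v / step) * step for v in col])
    return [tuple(p) for p in zip(*out)]
```

Since the per-column exponent is constant, the serialized exponent bytes are all identical within a component and compress to almost nothing.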

8 bits + integer quantization: 10.1 bits
8 bits + float quantization: 36 bits
8 bits + separate exponent: 20.4 bits
8 bits + shared exponent: 14.3 bits
8 bits + component exponent: 14.2 bits

zeux added 5 commits August 16, 2022 21:54

Here instead of sharing the exponent between components of each vertex,
we share the exponent between all vertices for each component, so we
only get 3 unique exponents (and as such the exponent byte compresses
perfectly).

This is similar in spirit to integer quantization, although just like
with all other exponent encodings / float quantization, the position of
the mesh affects the precision of the encoding, not just the scale.

Shared exponent isn't that good of an idea for positions, but parallel
exponent is, so use that for -cc; stick to individual exponent encoding
for -c, as it compresses better than quantized floats; and use quantized
floats when no compression is performed.

We need to quantize the bounds with the same algorithm we apply to
vertex data. Technically, when using meshopt compression, we also need
to apply exponent encoding to bounds. However, in practice the delta is
small and the validator can't decode meshopt data, so we'll leave this
until another day.

-cf requires that we never use filters during encoding, since the
fallback buffer needs to contain raw data. We activate filters when
using -cc and -vpf, so these need to be mutually exclusive.

It's too early for that, but it looks like we might deprecate -noq in
the future in favor of -vpf for most use cases...
@zeux zeux merged commit 29a6b8c into master Aug 18, 2022
@zeux zeux deleted the gltf-vpf branch August 18, 2022 03:17