
What exactly does x (or, say, surface) represent in the forward function of the Autoencoder? #8

zhanghm1995 opened this issue Oct 18, 2022 · 3 comments


@zhanghm1995

Hi, while reading this code I found that the Autoencoder takes two inputs, x and points. It seems x is the surface variable, so what exactly does it mean?

Since the reconstruction task only needs a point cloud as input, why do we need the "surface" as one of the inputs?

Maybe I am misunderstanding something, as I cannot download such a large dataset.

@1zb (Owner) commented Oct 19, 2022

There are two kinds of points in the reconstruction task:

  1. surface points (x)
  2. query points (points)

To reconstruct the surface, we want to be able to obtain the labels (or SDFs) for any query point in 3D space. Then we can apply an iso-surface extraction method (e.g., Marching Cubes) to get the desired 3D polygonal mesh.
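To make the distinction concrete, here is a minimal sketch of how the two point sets could enter the forward pass. Everything below (the class name, layer sizes, max-pooling) is my own assumption for illustration, not the actual code of this repository:

```python
import torch
import torch.nn as nn

class OccupancyAutoencoder(nn.Module):
    """Illustrative sketch only; architecture details are assumptions."""

    def __init__(self, latent_dim=256):
        super().__init__()
        # Per-point MLP; the latent code is a max-pool over all surface points.
        self.encoder = nn.Sequential(
            nn.Linear(3, latent_dim), nn.ReLU(), nn.Linear(latent_dim, latent_dim)
        )
        # Decoder classifies each query coordinate conditioned on the latent code.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 3, latent_dim), nn.ReLU(), nn.Linear(latent_dim, 1)
        )

    def forward(self, x, points):
        # x:      surface points (B, N, 3) -- samples ON the shape, defining what to encode.
        # points: query points   (B, M, 3) -- arbitrary 3D locations whose occupancy we predict.
        latent = self.encoder(x).max(dim=1).values             # (B, latent_dim)
        latent = latent.unsqueeze(1).expand(-1, points.shape[1], -1)
        logits = self.decoder(torch.cat([latent, points], dim=-1))
        return logits.squeeze(-1)                              # (B, M) occupancy logits
```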

@zhanghm1995 (Author)

Thanks for your reply.

So you use the surface points (x) in the encoder to get the embeddings of the centers sampled from the surface points, and when decoding you use the query points (they contain vol_points and near_point, right?), combined with the sampled center embedding features, to obtain the classification labels in 3D space.

However, you use all the query points in the decoder. Do these query points provide ground-truth information that makes learning the classification labels easier? After all, these query points already carry the occupancy information in 3D space.

Maybe I am misunderstanding something.

BTW, if I just want to learn an autoencoder model that reconstructs the input point cloud itself, how can I do that?

@1zb (Owner) commented Oct 20, 2022

Learning neural fields (a.k.a. neural implicit representations, or coordinate-based networks) means representing shapes with a function (an MLP in this case).

  1. At test time, we are able to query the occupancy of any query point.
  2. At training time, we need ground-truth occupancies (they are our main optimization target); see the sketch after this list.
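A hedged sketch of both phases, under the same toy model as above. The ground-truth occupancies appear only in the loss, never as an input, which is why the query points do not leak label information; skimage's marching_cubes stands in here for the iso-surface extraction step:

```python
import torch
import torch.nn.functional as F
from skimage import measure  # Marching Cubes implementation

def training_step(model, surface, queries, gt_occ):
    # gt_occ (B, M) holds the ground-truth inside/outside labels: used only
    # as the optimization target, never fed to the network.
    logits = model(surface, queries)  # the decoder sees coordinates only
    return F.binary_cross_entropy_with_logits(logits, gt_occ.float())

@torch.no_grad()
def extract_mesh(model, surface, resolution=64, threshold=0.5):
    # Test time: query the field on a dense grid, then run Marching Cubes.
    lin = torch.linspace(-1.0, 1.0, resolution)
    grid = torch.stack(torch.meshgrid(lin, lin, lin, indexing="ij"), dim=-1)
    occ = torch.sigmoid(model(surface, grid.reshape(1, -1, 3)))
    occ = occ.reshape(resolution, resolution, resolution).cpu().numpy()
    verts, faces, _, _ = measure.marching_cubes(occ, level=threshold)
    return verts, faces
```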

For your case (I assume it's a point cloud autoencoder), it's quite different from our task. However, you can still reuse our point cloud encoder and then build a decoder with upsampling on top of it.
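One possible sketch of such a setup (my own construction, not code from this repository): pool the encoder output into a global code, let an MLP decoder emit a fixed-size point set, and train with a Chamfer distance:

```python
import torch
import torch.nn as nn

class PointCloudDecoder(nn.Module):
    """Hypothetical decoder: global latent code -> fixed-size point cloud."""

    def __init__(self, latent_dim=256, num_points=2048):
        super().__init__()
        self.num_points = num_points
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, num_points * 3)
        )

    def forward(self, latent):                     # latent: (B, latent_dim)
        return self.mlp(latent).view(-1, self.num_points, 3)

def chamfer_distance(a, b):
    # Symmetric Chamfer distance between point sets a (B, N, 3) and b (B, M, 3).
    d = torch.cdist(a, b)                          # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()
```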

We will release a subset of the datasets. We use OccNet's repository to convert the ShapeNet models to watertight meshes first. Then we sample on the surfaces to get point cloud representations of the models, and sample labeled (inside/outside) points in the bounding volume. However, if you are only interested in point clouds, just use any polygonal mesh processing library (e.g., trimesh) to do the surface sampling.
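For example, a minimal trimesh snippet for the surface sampling (the file path is a placeholder):

```python
import trimesh

# "model.obj" is a placeholder path; sample 4096 points uniformly on the surface.
mesh = trimesh.load("model.obj")
points, face_idx = trimesh.sample.sample_surface(mesh, count=4096)  # points: (4096, 3)
```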
