
Machine Learning Support #7

Open
Marco-Congedo opened this issue Dec 25, 2019 · 20 comments

@Marco-Congedo

Hello,

following this discussion, the steps followed by PosDefManifold.jl for doing classification in the tangent space are:

  1. parallel transport of all points to the center of the manifold (identity for the positive definite matrices manifold). This involves the computation of a center of mass of the points in the manifold and a function for parallel transport.

  2. Exponential map (projection on the tangent space)

  3. vectorization (a special one, for example, for the pos def manifold a weight √2 is given to the off-diagonal elements)

I am not familiar with other manifolds. Are all these operations possible with all currently implemented manifolds?
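For concreteness, the three steps above can be sketched for the SPD case with the affine-invariant geometry; this is a Python/NumPy sketch whose helper names are made up for illustration (none of them come from PosDefManifold.jl):

```python
# Sketch of the tangent-space pipeline for SPD matrices (affine-invariant
# geometry). Helper names are hypothetical, for illustration only.
import numpy as np
from scipy.linalg import inv, logm, sqrtm

def recenter(points, G):
    """Map each SPD point P to G^{-1/2} P G^{-1/2}, so that the center
    of mass G is moved to the identity."""
    G_invsqrt = inv(sqrtm(G))
    return [G_invsqrt @ P @ G_invsqrt for P in points]

def log_map_identity(P):
    """Logarithmic map at the identity: simply the matrix logarithm."""
    return logm(P)

def vectorize(S):
    """Vectorize a symmetric tangent matrix: keep one triangle and weight
    the off-diagonal entries by sqrt(2), so the Euclidean norm of the
    feature vector equals the Frobenius norm of S."""
    i, j = np.triu_indices(S.shape[0])
    weights = np.where(i == j, 1.0, np.sqrt(2.0))
    return weights * S[i, j]
```

With these pieces, a feature vector for a point P, given the mean G of the training set, would be `vectorize(log_map_identity(recenter([P], G)[0]))`; the resulting vectors feed any standard classifier.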

@kellertuer
Member

Thanks for your idea! I would like to comment briefly:

  1. I don't understand this point. Do you mean that you compute the mean of your given data points to choose a tangent plane? Then “center” would refer to the Riemannian center of mass? That, however, would not require a parallel transport – really just the computation of the mean (see for example our current implementation here; actually we currently provide two methods, and a third is in one of the PRs: GradientDescent, GeodesicInterpolation and CyclicProximalPoint).

  2. I think you mean the inverse exponential map (usually referred to as the logarithmic map)? The exponential map maps from the tangent bundle to the manifold. More generally it might be useful to phrase this in terms of (inverse) retractions.

  3. I don't follow this one. Why do you have to do this and what does it actually do?

The first two should already be widely available – but an implementation would anyway just use the interfaces from ManifoldsBase. And a manifold that wants to use the ML methods should implement the necessary functions.

Finally, I would like to discuss whether this is in scope for the package. I don't want to say that it's not nice stuff, just that a ManifoldsML.jl might at some point also be a good idea to start. But of course we can first start here, and with enough stuff to cover (maybe even combining with Manopt.jl for the optimization, though Manopt still has to be ported to ManifoldsBase.jl), we can do a spin-off.

@Marco-Congedo
Author

Marco-Congedo commented Dec 25, 2019

Hi,

  1. So, we compute a mean as a base point for projecting onto the tangent space. This is just an educated guess, not necessarily the optimal base point, but it works well in practice.

Importantly, we do not project onto the tangent space directly at this point; rather, we first parallel transport all points to the center of the manifold (the identity). In the pos def manifold this can be done directly on the manifold (see here and the reference therein). This step is crucial to achieve good machine learning performance and also instrumental for transfer learning. For a general manifold, what we need is to

  • compute the mean
  • project onto the tangent space
  • parallel transport to the center of the manifold

  2. Yeah, sorry, I meant the logarithmic map.

  3. In the pos def manifold, tangent 'vectors' are symmetric matrices. Machine learning models take feature vectors as input; that is why tangent 'vectors' are vectorized so as to obtain feature vectors. Since those matrices are symmetric, only one triangle is vectorized, and to compensate a weight equal to √2 is given to the off-diagonal elements.

In summary, what is needed is

  • an algorithm to estimate a mean
  • the logarithmic map
  • the parallel transport
  • an appropriate definition of vectorization, which will depend on the manifold

Are all ingredients (besides the last one) available for all manifolds in Manifolds.jl?

As for the scope, I would keep it in a separate package, something like ManifoldLearn.jl.
Actually, PosDefManifoldML.jl is an appropriate skeleton (it is Python-free, differently from ScikitLearn.jl).
All that is needed, as far as I can see, is to implement an interface to Manifolds.jl for these operations (currently this is done by PosDefManifoldML.jl for the pos def manifold only).

PosDefManifoldML.jl also implements machine learning models acting directly on the manifold. For those, all that is needed is a function for computing the mean of several points and a distance function.
Thus, to be complete, all that is needed is:

  • a distance function
  • an algorithm to estimate a mean
  • the logarithmic map
  • the parallel transport
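To make the list concrete, here is a minimal sketch of what such an interface could look like, written in Python with hypothetical method names (the actual interface would be the one in ManifoldsBase.jl); flat Euclidean space serves as the trivial example:

```python
# Minimal sketch of the manifold interface the ML layer would need.
# Method names are hypothetical, not the ManifoldsBase.jl API.
from abc import ABC, abstractmethod
import numpy as np

class Manifold(ABC):
    @abstractmethod
    def distance(self, p, q): ...

    @abstractmethod
    def mean(self, points): ...

    @abstractmethod
    def log(self, p, q): ...  # logarithmic map: tangent vector at p pointing to q

    @abstractmethod
    def parallel_transport(self, p, q, X): ...  # move X from T_p M to T_q M

class Euclidean(Manifold):
    """Trivial flat example, where transport is the identity."""
    def distance(self, p, q):
        return float(np.linalg.norm(q - p))
    def mean(self, points):
        return np.mean(points, axis=0)
    def log(self, p, q):
        return q - p
    def parallel_transport(self, p, q, X):
        return X
```

An ML algorithm written against this interface would then work on any manifold that implements the four methods, including the pos def manifold.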

@kellertuer
Member

Thanks for the clarifications. I still don't follow the “center”, if it's not the mean. So your “center” is always the identity? As the neutral element of the Lie group it has nice properties for SPDs; however, it is in general not available. I am a little stuck on your terms – you can only parallel transport tangent vectors, not points. So to clarify: with m the mean and e the center, do you log into the tangent space at m and then parallel_transport to e? I am not yet sure why, but if so, a general algorithm maybe needs a replacement for e, since such a point is not necessarily available (in its prominence as the unit matrix), for example on the sphere. Still, an algorithm could have a center which indicates your favourite point (and its tangent space); that could be set to the mean (if nothing prominent is available) and would be set to center=e on SPD.

Concerning the functions (distance, mean, log, parallel transport): they are all available for nearly all manifolds that we have currently implemented (where I would still suggest keeping it more general than log and using an inverse_retraction that defaults to the log, but that's a technical detail). For the vectorisation I also have a rough idea what to do, but that would be something quite ML-specific, maybe.

Concerning the scope: I think it's a great idea to start a package like ManifoldML.jl or ManifoldLearn.jl (I would actually prefer the first name). It would implement algorithms using ManifoldsBase.jl (which provides all necessary function interfaces) and maybe Manopt for optimizations in between – and use Manifolds in its concrete examples/tutorials. I'd happily help and provide input to that!

@Marco-Congedo
Author

That's great, Ronny,

I don't have time now, but I will start working on that in January/February.

I will definitely need your input!

To clarify: yes, of course I am talking about transporting tangent vectors in the expected way, as you describe. However, one nice property of the pos def manifold is that the same transformation applies to points on the manifold, as explained here and proved in the reference therein. This is just a technical point, forget about it.

It is good enough to have a preferred point for parallel transporting the tangent vectors, which would be the identity for the pos def manifold and something else for the others. By the way, the reason why this works well is that when you have another set of points, independent from the first and hence with a different mean, parallel transporting it to the identity as well makes the model you already computed a workable model. There are several works showing this and I can point to them if someone is interested.

All the best

@kellertuer
Member

Ah, I missed that detail. Thanks for clarification.

If you start working, just give me a note and I'll happily help.

Concerning what works best – I trust you, but it would also be fine to “parallel transport the model” from one tangent space to another in order to be able to compare. We should collect such literature in the documentation, too.

@mateuszbaran
Member

Great, I'd definitely like to see support for ML on manifold-valued data 👍 . I can help with this as well 🙂 .

Regarding that vectorization, there are a few different ways this can be done. We have the vee function that does that (but most manifolds don't have this function). Another option is through an orthonormal basis of the tangent space at a point. We have a few ideas about providing this functionality (see JuliaManifolds/Manifolds.jl#66).

@Marco-Congedo
Author

Great, Mateusz, I will let you know in this thread when I find some time for this. Looking forward to hearing your and Ronny's ideas on the vectorization when the time comes. And, by the way, thanks for the CovarianceEstimation.jl package. It's very handy!

@mateuszbaran
Member

Nice to hear that you like CovarianceEstimation.jl.

With JuliaManifolds/Manifolds.jl#86 merged, we have vectorization, so Manifolds.jl is ready now 🙂 .

@kellertuer
Member

Hi @Marco-Congedo,
I hope you're doing fine these days.

For a while now we have had get_coordinates, which vectorises a tangent vector in any tangent space on any manifold with respect to a basis, so the vector even has the same length as the manifold dimension.

I haven't had the time to look closely at PosDefManifoldML.jl, but if you have time we could start a ManifoldML, I think? As far as I see we now have all tools you need available within ManifoldsBase.jl.

@Marco-Congedo
Author

Hello Ronny,
I have been thinking about this and I meant to write you my thoughts. Thanks for sparking the discussion again, then. Here is what I have been thinking: once you have vectorized tangent vectors, STANDARD machine learning tools apply. Therefore, PosDefManifoldML.jl is not the best way to empower Manifolds.jl with ML capabilities, although it can also be used, since it supports vectorized tangent vectors as input. However, many more ML models are available in general ML packages. Right now MLJ.jl is hot in Julia. My guess is that linking Manifolds.jl to MLJ.jl would be a much more valuable endeavor for the community.

@kellertuer
Member

Hello Marco,
ah, then I am not sure what PosDefManifoldML.jl does in total (but I also did not check too closely), since I hoped it would roughly do what you write. But yes, if you decide on a tangent space and coordinates it's classical ML, though only locally, of course! It heavily depends on the chosen tangent space.

But yes, for the pure linear part, coupling it to the best ML package is the approach I would have taken, too. Currently I am a little busy (and not too familiar with MLJ either).

@Marco-Congedo
Author

Hello, I am not familiar with MLJ.jl either. PosDefManifoldML.jl handles positive definite matrices directly, with (so far) one classifier on the PD manifold and two classifiers in the tangent space (handling the tangent space and vectorization steps automatically). However, this only works for the PD manifold.

@kellertuer
Member

Then it would also be interesting whether the classifier on the PD manifold can be generalized to other manifolds – does it mainly use geodesics? “Just” the Euclidean case is – I think – not a package, since it would just require the steps above (logs/transports/coordinates) and MLJ. The three steps before are all just one line each.

@Marco-Congedo
Author

The classifier acting on the manifold that I implemented is the minimum distance to mean (MDM), which only needs the concept of distance along the geodesic and the concept of barycenter. It is a special case of kNN, which only needs the concept of distance. The geodesics should be unique; I am not sure it would make sense on all the manifolds you support. Despite its simplicity, the MDM has proven to be a good classifier in the brain–computer interface field; however, passing to the tangent space and adopting more complex classifiers (in particular, support vector machines and LASSO logistic regression) usually gives better performance, sometimes much better.
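As an illustration, MDM can be sketched in a few lines with a pluggable distance and mean; this toy Python version defaults to the Euclidean choices, and on a manifold one would pass in the Riemannian distance and barycenter instead (function names are made up here, not taken from PosDefManifoldML.jl):

```python
# Toy sketch of the minimum-distance-to-mean (MDM) classifier with a
# pluggable distance and mean (Euclidean defaults shown).
import numpy as np

def mdm_fit(X, y, mean=lambda pts: np.mean(pts, axis=0)):
    """Compute one center (barycenter) per class label."""
    return {c: mean([x for x, yi in zip(X, y) if yi == c]) for c in set(y)}

def mdm_predict(centers, x, distance=lambda a, b: float(np.linalg.norm(a - b))):
    """Assign x to the class whose center is nearest."""
    return min(centers, key=lambda c: distance(centers[c], x))
```

Replacing the two defaults with a manifold's distance and mean is all that is needed to run the same classifier on manifold-valued data.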

Another popular way to treat manifold data is to obtain features by Laplacian (Diffusion) Eigenmaps, based on the inter-distance matrix. This only requires the concept of distance (by the way, I implemented a couple of these methods in PosDefManifold.jl).
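The Laplacian-eigenmaps idea from a precomputed inter-distance matrix can be sketched as follows (the Gaussian bandwidth sigma is a hypothetical tuning choice for this example, not a detail taken from PosDefManifold.jl):

```python
# Rough sketch of Laplacian eigenmaps: embed points given only their
# pairwise distance matrix D (n x n, symmetric, zero diagonal).
import numpy as np

def laplacian_eigenmaps(D, n_components=2, sigma=1.0):
    W = np.exp(-(D ** 2) / (2.0 * sigma ** 2))  # Gaussian affinities
    np.fill_diagonal(W, 0.0)                    # no self-loops
    L = np.diag(W.sum(axis=1)) - W              # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    # Drop the trivial constant eigenvector (eigenvalue 0).
    return vecs[:, 1:1 + n_components]
```

Because only D enters, the same code applies to any manifold for which a distance function is available.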

Just to make it clear, PosDefManifoldML.jl does the following: it takes PD matrices as input and allows one to fit a model, make predictions with it, and test models using cross-validation.
So far it implements the following ML models:

  • MDM (on the manifold)
  • SVM (on the tangent space)
  • LASSO Logistic Regression (on the tangent space)

Indeed, I guess making a more general package along the lines of PosDefManifoldML.jl could allow support for several manifolds.

@kellertuer
Member

Thanks for all the details.

With the uniqueness of geodesics, this can of course be (locally) done on any manifold, also barycentre or means are available on all our manifolds. So if I find time (most probably only after the semester in Germany so basically after JuliaCon) I'll take a look. I already know about SVM and Lasso.
The Laplacian is also something I would like to implement here, yes.

@Marco-Congedo
Author

OK, let me know when you are in. I will have a new student in October who may join. As for the Laplacians, I already have code available in PosDefManifold.jl. As for PosDefManifoldML.jl, it should actually suffice to add a manifold as an argument in the ML model constructors (now there is an argument 'metric', to apply any of the 10 metrics implemented in PosDefManifold.jl) and to call the appropriate functions depending on the manifold. Cheers.

@kellertuer
Member

I started a package at ManifoldML.jl, so let's discuss further details there?
It's really just a first start, nothing fancy yet, just a very first approach to k-means clustering.

It might be worth

  • including the metrics here in Manifolds – I haven't had the time to check what can be transferred and how (distance? inner? what's available for your metrics?)
  • maybe mean could also be put into base, so that ManifoldML can purely depend on base?

@mateuszbaran
Member

Great, thanks for starting this!

  • maybe mean could also be put into base, so that ManifoldML can purely depend on base?

Moving mean to base would mean adding StatsBase as a dependency. How about a ManifoldStats.jl package that gathers statistics and distributions and builds upon ManifoldsBase.jl?

@kellertuer
Member

I think ManifoldStatistics (or ManifoldStats) would be reasonable as a kind of interface (specific, more efficient means should still be here). But for me that also depends on starting more with distributions; otherwise I feel it might be “too small”?

For now, the ML package can also just use Manifolds as a dependency; it will be in development for quite some time anyway, I think.

@mateuszbaran
Member

Sure, there is no rush to make ManifoldsML.jl independent from Manifolds.jl.

@kellertuer kellertuer transferred this issue from JuliaManifolds/Manifolds.jl Aug 16, 2023