`FDataGrid.__call__` any (compatible) shape? #562

ego-thales · 2023-08-11T14:38:11Z

Hi,

Currently, FDataGrid.__call__ only takes shapes (n_samples, dim_domain) or (dim_domain,) (or () when dim_domain == 1). I think it would be natural to allow any shape + (dim_domain,), so that multi-dimensional arrays of points can be evaluated without going through the trouble of flattening before and unravelling after.

What do you think?
Élie

vnmabus · 2023-08-11T15:00:56Z

Sorry, I have trouble understanding the proposed behaviour, can you show an example?

ego-thales · 2023-08-11T15:33:31Z

For example, suppose you have fd with dim_domain = 3, dim_codomain = 4.
If I want to evaluate fd on an edge of its domain (which is a 2D surface of size, say, n*m), I would do the following:

1. Generate eval_points, the array of all points of the edge, which would have shape (n, m, 3).
2. Simply call fd(eval_points) and get res with shape (len(fd), n, m, dim_codomain).
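To make the "flatten before, unravel after" workaround concrete, here is a minimal NumPy sketch. It does not use scikit-fda: `toy_evaluate` is a hypothetical stand-in that is assumed to accept points of shape (n_points, dim_domain) and return (n_samples, n_points, dim_codomain), and `evaluate_any_shape` is an illustrative helper name, not an existing API.

```python
import numpy as np


def evaluate_any_shape(evaluate, eval_points, dim_domain):
    """Evaluate on points of shape (*any_shape, dim_domain).

    `evaluate` is assumed to map (n_points, dim_domain) to
    (n_samples, n_points, dim_codomain).
    """
    eval_points = np.asarray(eval_points)
    if eval_points.shape[-1:] != (dim_domain,):
        raise ValueError("last axis of eval_points must have length dim_domain")
    any_shape = eval_points.shape[:-1]
    flat = eval_points.reshape(-1, dim_domain)  # flatten before
    values = evaluate(flat)
    n_samples, _, dim_codomain = values.shape
    return values.reshape(n_samples, *any_shape, dim_codomain)  # unravel after


def toy_evaluate(points):
    """Hypothetical evaluator: 2 samples, dim_codomain = 4."""
    sums = points.sum(axis=1)        # (n_points,)
    scales = np.array([1.0, 2.0])    # one scale per sample
    offsets = np.arange(4.0)         # one offset per codomain coordinate
    return scales[:, None, None] * sums[None, :, None] + offsets[None, None, :]


n, m = 5, 6
eval_points = np.random.default_rng(0).random((n, m, 3))
res = evaluate_any_shape(toy_evaluate, eval_points, dim_domain=3)
print(res.shape)  # (2, 5, 6, 4)
```

The request in this issue amounts to having `__call__` do this reshaping internally.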

Is it a bit more understandable?

It is essentially a situation I faced in implementing #561.

eliegoudout · 2023-11-17T23:45:32Z

For future reference, in case we revisit this, let me state the proposal more clearly.

The proposal is that:

- __call__ accepts in.shape = (any_shape, dim_domain) and returns out.shape = (n_samples, any_shape, dim_codomain).
- Furthermore, in the special case dim_domain = 1, the corner case in.shape = () could be allowed and interpreted as in.shape = (1,) (unsure if it's a good idea though...).

We can also discuss the best position for any_shape in the input and output shape tuples.

vnmabus · 2023-11-18T09:20:35Z

So, I am toying with generalizing the functions in https://github.com/GAA-UAM/scikit-fda/tree/feature/ndfunction, and I am starting to modify the evaluation. What I want to achieve is:

- All function classes (FDataGrid, FDataBasis, etc.) represent arrays of functions (following an NDFunction protocol), no longer limited to the 1D case. Thus we have an arbitrary shape, the shape of the array of functions itself.
- All the functions in the array receive an array as input (so we have an additional property input_shape) and return another array (output_shape).
- Apart from that, you can evaluate several points at a time (and your suggestion is that they can also have arbitrary shape, let's call it points_shape).
- This is further complicated by the aligned and grid parameters (which maybe should be split into separate functions, at least the second one).
- We can complicate it further by considering broadcasting possibilities.

This is a bit difficult to reason about, so I would appreciate any suggestions given all these constraints.

vnmabus · 2023-11-28T09:44:39Z

So, for the aligned non-grid case (the easiest to reason about), we would have:

- In-shape: we need to include points_shape (arbitrary) and input_shape (determined by the functional object). I would say that the natural order here is (points_shape, input_shape), as it coincides both with NumPy broadcasting order and with some interfaces such as those in SciPy's interpolation module, e.g. LinearNDInterpolator. It can also be interpreted as passing a list of points in the normal case.
- A small problem with allowing arbitrary input and output shapes is that input_shape=() and input_shape=(1,) are no longer equivalent. In the first case, if you pass (10, 1) as in-shape, the last 1 would be interpreted as part of points_shape and would thus add an additional dimension to the returned values. In the second case, if you pass (10,) as in-shape, that would be an error. I would like to hear your opinion about this.
- Out-shape: we need to include shape, points_shape and output_shape. If we agree that in-shape must be (points_shape, input_shape), it also makes sense to return (points_shape, output_shape) in that order. That leaves two possibilities, namely placing shape at the leftmost position or at the rightmost one. We were placing it at the left, as the leading dimension was also shape in the internal representation of functions, but that can be changed if there are strong reasons. This would leave out-shape as (shape, points_shape, output_shape).
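The shape rules above for the aligned, non-grid case can be checked with plain tuples. This is only a sketch of the proposed convention, not code from the feature branch; `aligned_out_shape` is a hypothetical helper and the concrete sizes (7 functions, a 5 by 6 grid of points) are made up for illustration.

```python
def aligned_out_shape(shape, in_shape, input_shape, output_shape):
    """Out-shape for the aligned, non-grid case, using plain tuples.

    in_shape is read as (*points_shape, *input_shape) and the result is
    (*shape, *points_shape, *output_shape).
    """
    in_shape, input_shape = tuple(in_shape), tuple(input_shape)
    cut = len(in_shape) - len(input_shape)
    if cut < 0 or in_shape[cut:] != input_shape:
        raise ValueError(
            f"in-shape {in_shape} does not end with input_shape {input_shape}"
        )
    points_shape = in_shape[:cut]
    return (*tuple(shape), *points_shape, *tuple(output_shape))


# 7 functions from R^3 to R^4, evaluated on a 5 x 6 grid of points.
print(aligned_out_shape((7,), (5, 6, 3), (3,), (4,)))  # (7, 5, 6, 4)

# input_shape=() versus input_shape=(1,), with scalar outputs:
print(aligned_out_shape((7,), (10, 1), (), ()))    # (7, 10, 1)
print(aligned_out_shape((7,), (10, 1), (1,), ()))  # (7, 10)
try:
    aligned_out_shape((7,), (10,), (1,), ())
except ValueError as exc:
    print("error:", exc)
```

The second and third calls show the non-equivalence mentioned above: the same in-shape (10, 1) yields an extra trailing axis when input_shape=() but not when input_shape=(1,).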

vnmabus · 2023-11-28T09:53:49Z

Now, the unaligned case has exactly the same out-shape, but shape has to be present in in-shape too. The most natural way would be to have matching input and output, and so in-shape should be (shape, points_shape, input_shape).

It would have been great to be able to discern between aligned and unaligned from the shape of the evaluation points alone, without the need for an aligned keyword parameter, but I do not see how that could be possible given that points_shape is arbitrary.
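A small tuple-only sketch of why the shape alone cannot settle it, under the conventions proposed above. Both helper names are hypothetical and the sizes are made up; the point is that one in-shape admits two valid readings.

```python
def points_shape_aligned(in_shape, input_shape):
    """Read in_shape as (*points_shape, *input_shape)."""
    in_shape, input_shape = tuple(in_shape), tuple(input_shape)
    cut = len(in_shape) - len(input_shape)
    if cut < 0 or in_shape[cut:] != input_shape:
        raise ValueError("in-shape must end with input_shape")
    return in_shape[:cut]


def points_shape_unaligned(in_shape, shape, input_shape):
    """Read in_shape as (*shape, *points_shape, *input_shape)."""
    in_shape, shape = tuple(in_shape), tuple(shape)
    if in_shape[:len(shape)] != shape:
        raise ValueError("in-shape must start with shape")
    return points_shape_aligned(in_shape[len(shape):], input_shape)


shape, input_shape, output_shape = (7,), (3,), (4,)
in_shape = (7, 5, 3)  # valid under both readings

aligned_points = points_shape_aligned(in_shape, input_shape)
unaligned_points = points_shape_unaligned(in_shape, shape, input_shape)

out_aligned = (*shape, *aligned_points, *output_shape)
out_unaligned = (*shape, *unaligned_points, *output_shape)
print(out_aligned)    # (7, 7, 5, 4)
print(out_unaligned)  # (7, 5, 4)
```

Whenever points_shape happens to start with shape, the two interpretations give different out-shapes for the same input, so an explicit flag (or separate functions) is needed to choose between them.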

Also, in this proposed API there is no discussion about broadcasting at all. We should probably discuss if/when broadcasting should be allowed and how.

ego-thales added the enhancement label Aug 11, 2023
