Add more motivational use cases #80

Open
wants to merge 1 commit into
base: master
from

Conversation

bernhardmgruber
Contributor

@bernhardmgruber bernhardmgruber commented May 9, 2021

Here are some more motivational use cases I came up with:

  • interpolated access
  • generic subsetting
  • predicated subsetting
  • slicing

I have not tested if the document builds, because I could not get it working with TeXstudio on Windows. Also in my WSL Ubuntu I installed several packages but did not succeed in building it. I hope it runs fine, or is easy to fix for you then :)

Comment on lines +129 to +155
\subsection{Interpolated access}
Subscripting a random-access range with non-integral types can have interesting use cases,
like interpolating between the elements at the nearest smaller and larger integral index of the subscript:
\smallskip\begin{lstlisting}
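auto operator[](const std::ranges::random_access_range auto& r, double index) {
    double integ;
    const double frac = std::modf(index, &integ);
    const auto i = static_cast<std::size_t>(integ); // integral index, so this overload is not re-entered
    return std::lerp(r[i], r[i + 1], frac);
}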
std::vector v{0.0, 1.0, 4.5, 7.8};
double value = v[1.6]; // value == 3.1
\end{lstlisting}
However, adding such an overload in practice might be dangerous: if the floating-point overload is not visible, a floating-point index silently truncates and selects the range's integral subscript operator, and where the overload is visible, it changes the behavior of existing code that subscripts with a floating-point value and relies on this truncation.
This issue could be worked around by using a custom index type that cannot convert to an integral type, which would also allow specifying the kind of interpolation to use:
\smallskip\begin{lstlisting}
std::vector v{0.0, 1.0, 4.5, 7.8};
double value = v[1.6_linear]; // value == 3.1
\end{lstlisting}

With the adoption of P2128, we could even allow interpolation on higher-dimensional objects, like the proposed \texttt{std::mdspan} or \texttt{std::mdarray}:
\smallskip\begin{lstlisting}
std::mdarray<double, N, M, K> a = ...;
double value = a[7.5, 8.9, 76.4]; // trilinear interpolation
\end{lstlisting}

While we are aware that such a usage might be niche and that a separate function for interpolated access is arguably better, we would like to give library writers the necessary machinery to implement such a feature within their domains if they so desire.

Contributor

@MFHava MFHava May 10, 2021

It will be really interesting what EWG thinks about the interpolated access example (and the simd-example for that matter), as some people recently stated that subscripting should always offer reference semantics...

Contributor Author

@bernhardmgruber bernhardmgruber May 11, 2021

I love this feature so much that I am probably blind to the critique. @mattkretz also did not like it much, especially the float subscript. Being a bit of a 3D graphics fan, I am used to stuff being interpolated all the time, so this feels natural to me. But sure, if the use case is too controversial, we can also drop it.

Contributor

Is there any precedent for a float-subscript in 3D graphics? Based on my albeit limited GLSL experience, I'm inclined to believe that whilst data is often interpolated, such an access is never expressed via subscripting...

Contributor Author

Hmm, looking that up now you are probably right:

  • In OpenGL/GLSL there is the texture(texObj, coords...) function.
  • In D3D/HLSL there is the texObj.Sample(sampler, coords...) member function.
  • In Cg, there is the tex2D(sampler, coords) function.

So it seems there is no direct precedent in shading languages for interpolation using subscripting. Such functionality is indeed provided by functions.

Contributor Author

The question for me now becomes whether interpolation via subscripting is a bad idea even in niche domains, domain-specific libraries or closed internal code bases. I agree that such a feature does not belong in the standard library. But could there still be interesting applications in such limited scenarios?

Comment on lines +156 to +162
\subsection{Generic subsetting}
The idea from section \ref{sec:simd_load} to use SIMD vectors as subscripts can be further generalized to allow any kind of entity representing multiple indices as a subscript.
\smallskip\begin{lstlisting}
auto operator[](std::ranges::random_access_range auto r, std::ranges::input_range auto indices);
\end{lstlisting}
The concrete implementation and return type of such an operation are debatable and thus intentionally omitted.
Still, it could allow for a better formulation of some algorithms or more efficient access for certain ranges, if the index set used for subscripting is known in advance.

Contributor

Apart from a potentially terser notation, I'm not convinced that this would be better than something like:
vec | std::views::indexed(indices)

For this and all following examples: I'd recommend switching to more concrete examples than globally overloading based on concepts - as that wouldn't work anyways (ADL would never find them).

Contributor Author

Sure: we can always replace any operator by a function :) The question is which syntax feels more "natural". Maybe I should think more in a mathematical direction, where subscripting happens with non-integers.

The ADL part is a fair point! If I template the non-member operator[] on the first argument, it cannot be found if it is in another namespace. That makes it useless for std containers, since I cannot put the operator in namespace std. This is no obstacle for @mattkretz, because simd will live in std. Which makes me wonder where else people add non-member operators outside the namespaces of their parameter types.

Comment on lines +164 to +175
\subsection{Predicated subsetting}
Subscripting with a unary predicate would allow us to select a subset of an existing container.
Combined with an easy way to specify these predicates, e.g. Boost.Lambda2, this allows powerful and concise expressions:
\smallskip\begin{lstlisting}
template <std::ranges::random_access_range R, typename UnaryPredicate>
auto operator[](R&& r, UnaryPredicate&& p) {
    return std::forward<R>(r) | std::views::filter(std::forward<UnaryPredicate>(p));
}
std::vector v{0.0, -1.0, 4.5, -7.8};
for (auto e : v[_1 > 0])
    ...;
\end{lstlisting}

Contributor

Same as above. This looks nice - at least with a terse lambda notation it's really readable, it would be hideous with the current lambda syntax...

Contributor Author

@bernhardmgruber bernhardmgruber May 11, 2021

In @mattkretz's design for SIMD, I think he originally had this syntax instead of the where expression. With non-member operator[], such a facility could be added by users in their namespace for std::simd:

template <typename T, typename Abi>
decltype(auto) operator[](std::simd<T, Abi>& s, std::simd_mask<T, Abi> m) {
	return std::where(m, s);
}

Comment on lines +177 to +189
\subsection{Slicing}
With the adoption of P2128, a generic slicing facility could be defined:
\smallskip\begin{lstlisting}
template <std::ranges::random_access_range R>
auto operator[](R r, std::size_t from, std::size_t to, std::size_t step = 1) { ... }

std::vector v{...};
auto slice1 = v[10, 20]; // elements at indices 10 to 20 (exclusive)
auto slice2 = v[10, 20, 2]; // every second element at indices 10 to 20 (exclusive)
\end{lstlisting}
Such slicing operators are supported in other languages, e.g. Python.
There have also been C++ language extensions to add such slicing functionality, e.g. Intel's Cilk Plus Array Notations.
The adoption of P2128 and this proposal would allow for corresponding library solutions to emerge.

Contributor

Slicing à la Cilk+ would be quite nice for certain (numeric) domains. Generalizing the subscript operator enables emulating this without dedicated language support, but I'm not sure if that syntax would be desirable for such a feature...

Contributor Author

v[10, 20, 2] is indeed a bit ambiguous. It could also mean take the 10th, 20th and 2nd element. But then, should we forbid users from writing that if they want to? In Python, I can write v[10:20:2] and people usually understand what that means. Good conventions can become established in the communities.

Contributor

@MFHava MFHava May 11, 2021

> v[10, 20, 2] is indeed a bit ambiguous. It could also mean take the 10th, 20th and 2nd element.

Without knowing the type of v it could be literally anything - e.g. for a matrix it could simply mean: v[10, 20..2].

> In Python, I can write v[10:20:2] and people usually understand what that means.

Yes, but:

  1. Does Python actually have multi-dim subscripting?
  2. If yes: Does it use : for that purpose or is that a different syntax? (e.g. ,)

It's beyond the scope of the paper but IMHO a multi-dim subscript operator is ill-suited for a slicing mechanic as it conflates indices with slices.

Contributor Author

@bernhardmgruber bernhardmgruber Jun 6, 2021

From what I have seen so far (and I am far from fluent in Python and the popular NumPy), there is a distinction:

  • Multidim array access on nested lists is done with repeated subscripts: m[4][6]
  • Slicing is done using a different notation: m[4:6], meaning the two elements at index 4 and 5.
  • The notation m[4, 6] creates a tuple and passes it to the subscript operator. A built-in list rejects that with a TypeError, whereas a NumPy array accepts it as a multi-dimensional index.

Contributor Author

> a multi-dim subscript operator is ill-suited for a slicing mechanic as it conflates indices with slices.

I guess you are right here as well. Slicing using operator[] is probably confusing.
