Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ndarray iterators don't always work well with impl Iterator #1290

Closed
lePerdu opened this issue Jun 2, 2023 · 3 comments · Fixed by #1417
Closed

ndarray iterators don't always work well with impl Iterator #1290

lePerdu opened this issue Jun 2, 2023 · 3 comments · Fixed by #1417
Labels

Comments

@lePerdu
Copy link

lePerdu commented Jun 2, 2023

When writing some iterator helper functions around the base ndarray iterators, I found a scenario where the borrow checker rejects what seems to be valid code. The scenario is generating an iterator over indices matching some criteria in an Array of references.

use ndarray::{iter::IndexedIter, Array1, Ix1};

/// This doesn't compile:
///
/// > hidden type for `impl Iterator<Item = usize> + 'a` captures lifetime that does not appear in bounds
/// > hidden type ... captures the lifetime `'s` as defined here
///
/// Various alternatives I tried which have the same issue:
/// - `.iter().enumerate()` instead of `.indexed_iter()`
/// - `.filter_map()` instead of `.filter().map()`
pub fn nd_iter_non_empty_indices_not_compiling<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> impl Iterator<Item = usize> + 'a {
    array
        .indexed_iter()
        .filter(|(_index, elem)| !elem.is_empty())
        .map(|(index, _elem)| index)
}

/// Adding `'a: 's` makes the function compile, but now code using it doesn't
/// compile because the constraint `'a: 's` is impossible: the container can't
/// outlive the contained elements.
///
/// ```
/// use ndarray::array;
/// use broken_iter::*;
/// let arr = array!["", "abc", "", "123"];
/// assert_eq!(nd_iter_non_empty_indices_broken(&arr).collect::<Vec<_>>(), vec![1, 3]);
/// ```
pub fn nd_iter_non_empty_indices_broken<'s, 'a: 's>(
    array: &'a Array1<&'s str>,
) -> impl Iterator<Item = usize> + 'a {
    array
        .indexed_iter()
        .filter(|(_index, elem)| !elem.is_empty())
        .map(|(index, _elem)| index)
}

The original problem I had was with a 2D array with more complex types and filtering operation, but the error is the same.

Similar functions which work fine

I thought it might have been a limitation of the Rust compiler, but the same operation with a normal Rust slice (or Vec) works fine:

pub fn slice_iter_non_empty_indices<'s, 'a>(
    array: &'a [&'s str],
    // All of these also work:
    // array: &'a Box<[&'s str]>,
    // array: &'a Vec<&'s str>,
    // array: &'a LinkedList<&'s str>,
    // array: &'a HashSet<&'s str>,
) -> impl Iterator<Item = usize> + 'a {
    array
        .iter()
        .enumerate()
        .filter(|(_index, elem)| !elem.is_empty())
        .map(|(index, _elem)| index)
}

It also works if the iterator returns references:

/// A similar function which returns the `&str` elements themselves works fine.
/// ```
/// use ndarray::array;
/// use broken_iter::*;
/// let arr = array!["", "abc", "", "123"];
/// assert_eq!(nd_iter_non_empty_strings(&arr).collect::<Vec<_>>(), vec!["abc", "123"]);
/// ```
pub fn nd_iter_non_empty_strings<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> impl Iterator<Item = &'s str> + 'a {
    array.iter().filter(|elem| !elem.is_empty()).cloned()
}

Workaround

I am able to work around the issue by creating a custom type for the iterator, but it seems like all the extra boilerplate shouldn't be necessary, especially since it isn't required when working with Vec:

/// It works fine if I use a custom struct to do the filter/map.
///
/// ```
/// use ndarray::array;
/// use broken_iter::*;
/// assert_eq!(nd_iter_non_empty_indices_custom(&array!["", "abc", "", "123"]).collect::<Vec<_>>(), vec![1, 3]);
/// ```
pub fn nd_iter_non_empty_indices_custom<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> NonEmptyIndices<'a, 's> {
    NonEmptyIndices {
        indexed_iter: array.indexed_iter(),
    }
}

pub struct NonEmptyIndices<'a, 's> {
    indexed_iter: IndexedIter<'a, &'s str, Ix1>,
}

impl<'a, 's> Iterator for NonEmptyIndices<'a, 's> {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        // Manual loop for the filter/map operation
        loop {
            let (index, elem) = self.indexed_iter.next()?;
            if !elem.is_empty() {
                return Some(index);
            }
        }
    }
}

EDIT:

I found a much better work-around by specifying a more concrete return type. (Also changed it to filter_map to simplify the type).

pub fn nd_iter_non_empty_indices_better<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> FilterMap<
    IndexedIter<'a, &'s str, Ix1>,
    impl FnMut((usize, &'a &'s str)) -> Option<usize>,
> {
    array.indexed_iter().filter_map(
        |(index, elem)| {
            if !elem.is_empty() {
                Some(index)
            } else {
                None
            }
        },
    )
}

This isn't terrible to work with, but it's still weird that all of the types in std::collections work fine with impl Iterator<Item = usize> + 'a but ndarray types need the explicit return type.

@adamreichold
Copy link
Collaborator

There is something else going on with the slice though, i.e. if you remove the lifetime on return value type, it shows the error message

error[[E0700]](https://doc.rust-lang.org/stable/error_codes/E0700.html): hidden type for `impl Iterator<Item = usize>` captures lifetime that does not appear in bounds
 --> src/main.rs:4:5
  |
1 |   pub fn slice_iter_non_empty_indices<'s, 'a>(
  |                                       -- hidden type `Map<Filter<Enumerate<std::slice::Iter<'s, &'s str>>, [closure@src/main.rs:7:17: 7:33]>, [closure@src/main.rs:8:14: 8:30]>` captures the lifetime `'s` as defined here
2 |       array: &'a [&'s str],
3 |   ) -> impl Iterator<Item = usize> {
  |        --------------------------- opaque type defined here
4 | /     array
5 | |         .iter()
6 | |         .enumerate()
7 | |         .filter(|(_index, elem)| !elem.is_empty())
8 | |         .map(|(index, _elem)| index)
  | |____________________________________^
  |
help: to declare that `impl Iterator<Item = usize>` captures `'s`, you can add an explicit `'s` lifetime bound
  |
3 | ) -> impl Iterator<Item = usize> + 's {
  |                                  ++++

For more information about this error, try `rustc --explain E0700`.

I.e. notice how the non-opaque type already contains only references to the lifetime 's but not 'a whereas the same error message for ndarray referenfces both 's and 'a.

@lePerdu
Copy link
Author

lePerdu commented Jun 2, 2023

That doesn't seem to be restricted to slices; the compiler also gives the same sort of warnings in the original example without any lifetime annotations on the return:

error[E0700]: hidden type for `impl Iterator<Item = usize>` captures lifetime that does not appear in bounds
  --> src/lib.rs:16:5
   |
13 |   pub fn nd_iter_non_empty_indices_not_compiling<'s, 'c>(
   |                                                  -- hidden type `Map<Filter<IndexedIter<'s, &'s str, Dim<[usize; 1]>>, [closure@src/lib.rs:18:17: 18:33]>, [closure@src/lib.rs:19:14: 19:30]>` captures the lifetime `'s` as defined here
14 |       array: &'c Array1<&'s str>,
15 |   ) -> impl Iterator<Item = usize> {
   |        --------------------------- opaque type defined here
16 | /     array
17 | |         .indexed_iter()
18 | |         .filter(|(_index, elem)| !elem.is_empty())
19 | |         .map(|(index, _elem)| index)
   | |____________________________________^
   |
help: to declare that `impl Iterator<Item = usize>` captures `'s`, you can add an explicit `'s` lifetime bound
   |
15 | ) -> impl Iterator<Item = usize> + 's {
   |                                  ++++

Note IndexedIter<'s, &'s str> which doesn't reference the lifetime of the array at all.

It seems like some bad type/lifetime inference is going on without lifetime annotations on the return type. std::slice::Iter<'s, &'s str> is incorrect in my understanding: the first lifetime parameter should be 'a (lifetime of the array) not 's (lifetime of the strings).

Following the compiler's suggestions there leads to non-working code:

  • Add + 's: Compiler says "lifetime may not live long enough consider adding the following bound: 'a: 's"
  • Add 'a: 's: The function compiles but usages of it don't compile because the array can't outlive the contained references.

@bluss
Copy link
Member

bluss commented Aug 6, 2024

This is a bug/design limitation that could maybe be resolved in the next iterations of ndarray.

It's a bug/limitation because the equivalent code for Vec works.

My suspect for explanation is that it's because Array<T, D> is invariant w.r.t the T parameter. It's invariant because of the element type T is used through S::Elem in the struct:

pub struct ArrayBase<S, D>
where S: RawData
{
    /// Data buffer / ownership information. (If owned, contains the data
    /// buffer; if borrowed, contains the lifetime and mutability.)
    data: S,
    /// A non-null pointer into the buffer held by `data`; may point anywhere
    /// in its range. If `S: Data`, this pointer must be aligned.
    ptr: std::ptr::NonNull<S::Elem>,
    /// The lengths of the axes.
    dim: D,
    /// The element count stride per axis. To be parsed as `isize`.
    strides: D,
}

But it could also be something similar but for Baseiter (internal), not for ArrayBase. Maybe that makes even more sense, since that one is inside the return type..

@bluss bluss added the bug label Aug 6, 2024
bluss added a commit that referenced this issue Aug 6, 2024
The internal Baseiter type underlies most of the ndarray iterators, and
it used `*mut A` for element type A. Update it to use `NonNull<A>` which
behaves identically except it's guaranteed to be non-null and is
covariant w.r.t the parameter A.

Add compile test from the issue.

Fixes #1290
bluss added a commit that referenced this issue Aug 6, 2024
The internal Baseiter type underlies most of the ndarray iterators, and
it used `*mut A` for element type A. Update it to use `NonNull<A>` which
behaves identically except it's guaranteed to be non-null and is
covariant w.r.t the parameter A.

Add compile test from the issue.

Fixes #1290
bluss added a commit that referenced this issue Aug 6, 2024
The internal Baseiter type underlies most of the ndarray iterators, and
it used `*mut A` for element type A. Update it to use `NonNull<A>` which
behaves identically except it's guaranteed to be non-null and is
covariant w.r.t the parameter A.

Add compile test from the issue.

Fixes #1290
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants