Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoiding temporary fails with xt::concatenate #2819

Open
mnijhuis-tos opened this issue Dec 27, 2024 · 1 comment
Open

Avoiding temporary fails with xt::concatenate #2819

mnijhuis-tos opened this issue Dec 27, 2024 · 1 comment

Comments

@mnijhuis-tos
Copy link
Contributor

When using -DXTENSOR_FORCE_TEMPORARY_MEMORY_IN_ASSIGNMENTS=Off, the following simple test fails:

    TEST(xbuilder, concatenate_assign_to_argument)
    {
        xt::xarray<int> a = xt::ones<int>({ 4 });
        xt::xarray<int> b = xt::ones<int>({ 3 });

        a = xt::concatenate(xt::xtuple(a, b));
        ASSERT_EQ(a, xt::ones<int>({7}));
    }

Instead of containing 7 ones, a will contain garbage.

I have searched for the root cause of the issue, and found the following:

  • Changing overlapping_memory_checker_traits for xgenerator, so check_overlap returns true instead of false, solves the issue. The change is in line 90 of xgenerator.hpp.

  • The problem occurs because of an assumption that an xgenerator always generates fresh values using some generator function. In this case, memory overlap indeed never occurs.

  • xt::concatenate creates an xgenerator that reads values from the arrays in its input tuple, which may overlap with the output array.

The following happens in the test:

  1. xgenerator::assign_to resizes a, so it can hold seven elements. It re-allocates memory for a, without initializing the elements.
  2. The function in the xgenerator, which has type concatenate_access, still wants to concatenate a and b. It uses the new value for a.
  3. Since a now has length 7, concatenate_access::access only uses values from a. It thus copies the uninitialized values from a to a.

Although patching xgenerator works around the issue, a proper fix would be calling check_overlap on the functor inside the xgenerator, and then calling check_overlap on the tuples inside the functor, in case the functor holds a tuple.

@mnijhuis-tos
Copy link
Contributor Author

@vakokako Since you created the optimization of avoiding temporaries, you're probably interested in this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant