[Data] GroupedData.map_groups() doesn't allow partial callables #46185

Open
bdewilde opened this issue Jun 21, 2024 · 1 comment · May be fixed by #48907
Labels
bug: Something that is supposed to be working; but isn't
data: Ray Data-related issues
good first issue: Great starter issue for someone just starting to contribute to Ray
P1: Issue that should be fixed within a few weeks

Comments

@bdewilde

What happened + What you expected to happen

In this PR, a new requirement was imposed on the fn callable given as input to GroupedData.map_groups(): that it have a __name__ attribute. Unfortunately, callables partially parametrized using functools.partial() have no such attribute, so passing them into .map_groups() raises an error: AttributeError: 'functools.partial' object has no attribute '__name__'. This did not happen prior to the linked PR.

It's not a huge deal, but it did cause code to break unexpectedly, and I guess it technically conflicts with the type annotations on this method, which accept any callable.

In [1]: import functools

In [2]: callable(functools.partial(lambda x: x))
Out[2]: True

In [3]: functools.partial(lambda x: x).__name__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 1
----> 1 functools.partial(lambda x: x).__name__

AttributeError: 'functools.partial' object has no attribute '__name__'
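As a possible stopgap on the caller's side, the missing attribute can be copied onto the partial with functools.update_wrapper(). This is only a minimal sketch using the standard library: `double` is a made-up stand-in for a user function, and I have not verified this against every affected Ray version, but it should satisfy a plain `fn.__name__` lookup.

```python
import functools


def double(batch, factor):
    # Hypothetical stand-in for a user-defined group function; not part of Ray.
    return [x * factor for x in batch]


# update_wrapper copies __name__, __qualname__, __doc__, etc. from `double`
# onto the partial object and returns that same partial.
named = functools.update_wrapper(functools.partial(double, factor=2), double)

print(named.__name__)
print(named([1, 2, 3]))
```

The partial still behaves as before when called; it just carries the metadata of the function it wraps.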

Versions / Dependencies

ray >= 2.21
Python 3.10
macOS 14.4

Reproduction script

>>> import functools
>>> import ray
>>> ds = ray.data.range(10)
>>> ds.groupby("id").map_groups(lambda x: x)  # this is fine
MapBatches(<lambda>)
+- Sort
   +- Dataset(num_rows=10, schema={id: int64})
>>> ds.groupby("id").map_groups(functools.partial(lambda x: x))  # this errors
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 ds.groupby("id").map_groups(functools.partial(lambda x: x))

File ~/.pyenv/versions/3.10.13/envs/ev-detection-py310/lib/python3.10/site-packages/ray/data/grouped_data.py:253, in GroupedData.map_groups(self, fn, compute, batch_format, fn_args, fn_kwargs, fn_constructor_args, fn_constructor_kwargs, num_cpus, num_gpus, concurrency, **ray_remote_args)
    249         yield from apply_udf_to_groups(fn, batch, *args, **kwargs)
    251 # Change the name of the wrapped function so that users see the name of their
    252 # function rather than `wrapped_fn` in the progress bar.
--> 253 wrapped_fn.__name__ = fn.__name__
    255 # Note we set batch_size=None here, so it will use the entire block as a batch,
    256 # which ensures that each group will be contained within a batch in entirety.
    257 return sorted_ds._map_batches_without_batch_size_validation(
    258     wrapped_fn,
    259     batch_size=None,
   (...)
    271     **ray_remote_args,
    272 )

AttributeError: 'functools.partial' object has no attribute '__name__'
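The failing line reads `fn.__name__` unconditionally. One way the lookup could be made tolerant is sketched below. `get_callable_name` is a hypothetical helper of my own, not an existing Ray function and not necessarily what any linked PR does; it unwraps partials and falls back to the class name for other callables that lack `__name__`.

```python
import functools


def get_callable_name(fn):
    """Hypothetical helper: best-effort display name for any callable."""
    # Unwrap functools.partial objects to reach the underlying callable.
    while isinstance(fn, functools.partial):
        fn = fn.func
    # Functions, lambdas, and classes have __name__; instances of callable
    # classes do not, so fall back to the name of their type.
    return getattr(fn, "__name__", type(fn).__name__)
```

Under that assumption, the assignment in the traceback could become `wrapped_fn.__name__ = get_callable_name(fn)`, so the progress bar would show the name of the wrapped function (for example `<lambda>`) instead of raising.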

Issue Severity

Low: It annoys or frustrates me.

@bdewilde added the bug and triage labels on Jun 21, 2024
@anyscalesam added the data label on Jun 21, 2024
@scottjlee added the P1 and good first issue labels and removed the triage label on Jun 26, 2024
@LeoLiao123
Contributor

Hi, can I take on this issue?
