BUG: can't handle df.groupby('a').agg(c=('b', 'mean'), d=('b', 'mean')) #7414

Open
3 tasks done
MarcoGorelli opened this issue Dec 18, 2024 · 0 comments
Labels
bug 🦗 Something isn't working Triage 🩹 Issues that need triage

MarcoGorelli commented Dec 18, 2024

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

In [1]: import modin.pandas as mpd

In [2]: import pandas as pd

In [3]: df = pd.DataFrame({'a': [1,1,2], 'b': [4,5,6]})

In [4]: df.groupby('a').agg(c=('b', 'mean'), d=('b', 'mean'))
Out[4]: 
     c    d
a          
1  4.5  4.5
2  6.0  6.0

In [5]: mpd.DataFrame(df).groupby('a').agg(c=('b', 'mean'), d=('b', 'mean'))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:783, in GroupByReduce.build_map_reduce_functions.<locals>._reduce(df, **call_kwargs)
    782 try:
--> 783     result = wrapper(df)
    784 # This will happen with Arrow buffer read-only errors. We don't want to copy
    785 # all the time, so this will try to fast-path the code first.

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:769, in GroupByReduce.build_map_reduce_functions.<locals>._reduce.<locals>.wrapper(df)
    768 def wrapper(df: pandas.DataFrame):
--> 769     return cls.reduce(
    770         df,
    771         axis=axis,
    772         groupby_kwargs=groupby_kwargs,
    773         reduce_func=reduce_func,
    774         agg_args=agg_args,
    775         agg_kwargs=agg_kwargs,
    776         drop=drop,
    777         method=method,
    778         finalizer_fn=finalizer_fn,
    779         **call_kwargs,
    780     )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:274, in GroupByReduce.reduce(cls, df, reduce_func, axis, groupby_kwargs, agg_args, agg_kwargs, partition_idx, drop, method, finalizer_fn)
    273 apply_func = cls.get_callable(reduce_func, df)
--> 274 result = apply_func(
    275     df.groupby(axis=axis, **groupby_kwargs), *agg_args, **agg_kwargs
    276 )
    278 if not as_index:

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:666, in GroupByReduce._build_callable_for_dict.<locals>.aggregate_on_dict(grp_obj, *args, **kwargs)
    665 if preserve_aggregation_order and len(custom_aggs):
--> 666     result = result.reindex(result_columns, axis=1)
    667 return result

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/frame.py:5378, in DataFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5359 @doc(
   5360     NDFrame.reindex,
   5361     klass=_shared_doc_kwargs["klass"],
   (...)
   5376     tolerance=None,
   5377 ) -> DataFrame:
-> 5378     return super().reindex(
   5379         labels=labels,
   5380         index=index,
   5381         columns=columns,
   5382         axis=axis,
   5383         method=method,
   5384         copy=copy,
   5385         level=level,
   5386         fill_value=fill_value,
   5387         limit=limit,
   5388         tolerance=tolerance,
   5389     )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/generic.py:5610, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5609 # perform the reindex on the axes
-> 5610 return self._reindex_axes(
   5611     axes, level, limit, tolerance, method, fill_value, copy
   5612 ).__finalize__(self, method="reindex")

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/generic.py:5633, in NDFrame._reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   5632 ax = self._get_axis(a)
-> 5633 new_index, indexer = ax.reindex(
   5634     labels, level=level, limit=limit, tolerance=tolerance, method=method
   5635 )
   5637 axis = self._get_axis_number(a)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/indexes/base.py:4426, in Index.reindex(self, target, method, level, limit, tolerance)
   4425 elif self._is_multi:
-> 4426     raise ValueError("cannot handle a non-unique multi-index!")
   4427 elif not self.is_unique:
   4428     # GH#42568

ValueError: cannot handle a non-unique multi-index!

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 mpd.DataFrame(df).groupby('a').agg(c=('b', 'mean'), d=('b', 'mean'))

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/pandas/groupby.py:989, in DataFrameGroupBy.aggregate(self, func, engine, engine_kwargs, *args, **kwargs)
    986     if callable(agg_func):
    987         return agg_func(*args, **kwargs)
--> 989 result = self._wrap_aggregation(
    990     qc_method=type(self._query_compiler).groupby_agg,
    991     numeric_only=False,
    992     agg_func=func,
    993     agg_args=args,
    994     agg_kwargs=kwargs,
    995     how="axis_wise",
    996 )
    997 return do_relabel(result) if do_relabel else result

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/pandas/groupby.py:1646, in DataFrameGroupBy._wrap_aggregation(self, qc_method, numeric_only, agg_args, agg_kwargs, **kwargs)
   1642 else:
   1643     groupby_qc = self._query_compiler
   1645 return type(self._df)(
-> 1646     query_compiler=qc_method(
   1647         groupby_qc,
   1648         by=self._by,
   1649         axis=self._axis,
   1650         groupby_kwargs=self._kwargs,
   1651         agg_args=agg_args,
   1652         agg_kwargs=agg_kwargs,
   1653         drop=self._drop,
   1654         **kwargs,
   1655     )
   1656 )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/storage_formats/pandas/query_compiler_caster.py:157, in apply_argument_cast.<locals>.cast_args(*args, **kwargs)
    155     kwargs = cast_nested_args_to_current_qc_type(kwargs, current_qc)
    156     args = cast_nested_args_to_current_qc_type(args, current_qc)
--> 157 return obj(*args, **kwargs)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/storage_formats/pandas/query_compiler.py:4191, in PandasQueryCompiler.groupby_agg(self, by, agg_func, axis, groupby_kwargs, agg_args, agg_kwargs, how, drop, series_groupby)
   4188             ErrorMessage.warn(message)
   4190 if isinstance(agg_func, dict) and GroupbyReduceImpl.has_impl_for(agg_func):
-> 4191     return self._groupby_dict_reduce(
   4192         by, agg_func, axis, groupby_kwargs, agg_args, agg_kwargs, drop
   4193     )
   4195 is_transform_method = how == "transform" or (
   4196     isinstance(agg_func, str) and agg_func in transformation_kernels
   4197 )
   4199 original_agg_func = agg_func

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/storage_formats/pandas/query_compiler_caster.py:157, in apply_argument_cast.<locals>.cast_args(*args, **kwargs)
    155     kwargs = cast_nested_args_to_current_qc_type(kwargs, current_qc)
    156     args = cast_nested_args_to_current_qc_type(args, current_qc)
--> 157 return obj(*args, **kwargs)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/storage_formats/pandas/query_compiler.py:3868, in PandasQueryCompiler._groupby_dict_reduce(self, by, agg_func, axis, groupby_kwargs, agg_args, agg_kwargs, drop, **kwargs)
   3866         reduce_dict[reduced_col_name] = reduce_fn
   3867     map_dict[col] = map_fns
-> 3868 return GroupByReduce.register(map_dict, reduce_dict, **kwargs)(
   3869     query_compiler=self,
   3870     by=by,
   3871     axis=axis,
   3872     groupby_kwargs=groupby_kwargs,
   3873     agg_args=agg_args,
   3874     agg_kwargs=agg_kwargs,
   3875     drop=drop,
   3876 )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:100, in GroupByReduce.register.<locals>.<lambda>(*args, **kwargs)
     92     reduce_func = build_fn(reduce_func)
     94 assert not (
     95     isinstance(map_func, dict) ^ isinstance(reduce_func, dict)
     96 ) and not (
     97     callable(map_func) ^ callable(reduce_func)
     98 ), "Map and reduce functions must be either both dict or both callable."
--> 100 return lambda *args, **kwargs: cls.caller(
    101     *args, map_func=map_func, reduce_func=reduce_func, **kwargs, **call_kwds
    102 )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:440, in GroupByReduce.caller(cls, query_compiler, by, map_func, reduce_func, axis, groupby_kwargs, agg_args, agg_kwargs, drop, method, default_to_pandas_func, finalizer_fn)
    438 else:
    439     new_index = None
--> 440 new_modin_frame = query_compiler._modin_frame.groupby_reduce(
    441     axis,
    442     broadcastable_by,
    443     map_fn,
    444     reduce_fn,
    445     apply_indices=apply_indices,
    446     new_index=new_index,
    447 )
    449 result = query_compiler.__constructor__(new_modin_frame)
    450 return result

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/dataframe/utils.py:753, in lazy_metadata_decorator.<locals>.decorator.<locals>.run_f_on_minimally_updated_metadata(self, *args, **kwargs)
    751     elif apply_axis == "rows":
    752         obj._propagate_index_objs(axis=0)
--> 753 result = f(self, *args, **kwargs)
    754 if apply_axis is None and not transpose:
    755     result._deferred_index = self._deferred_index

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/dataframe/dataframe.py:4554, in PandasDataframe.groupby_reduce(self, axis, by, map_func, reduce_func, new_index, new_columns, apply_indices)
   4552     if by_parts.shape[axis] != self._partitions.shape[axis]:
   4553         self._filter_empties(compute_metadata=False)
-> 4554 new_partitions = self._partition_mgr_cls.groupby_reduce(
   4555     axis, self._partitions, by_parts, map_func, reduce_func, apply_indices
   4556 )
   4557 return self.__constructor__(
   4558     new_partitions,
   4559     index=new_index,
   4560     columns=new_columns,
   4561     pandas_backend=self._pandas_backend,
   4562 )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/partition_manager.py:351, in PandasDataframePartitionManager.groupby_reduce(cls, axis, partitions, by, map_func, reduce_func, apply_indices)
    348 # Assuming, that the output will not be larger than the input,
    349 # keep the current number of partitions.
    350 num_splits = min(len(partitions), NPartitions.get())
--> 351 return cls.map_axis_partitions(
    352     axis,
    353     mapped_partitions,
    354     reduce_func,
    355     enumerate_partitions=True,
    356     num_splits=num_splits,
    357 )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/partition_manager.py:869, in PandasDataframePartitionManager.map_axis_partitions(cls, axis, partitions, map_func, keep_partitioning, num_splits, lengths, enumerate_partitions, **kwargs)
    817 @classmethod
    818 def map_axis_partitions(
    819     cls,
   (...)
    827     **kwargs,
    828 ):
    829     """
    830     Apply `map_func` to every partition in `partitions` along given `axis`.
    831 
   (...)
    867     some global information about the axis.
    868     """
--> 869     return cls.broadcast_axis_partitions(
    870         axis=axis,
    871         left=partitions,
    872         apply_func=map_func,
    873         keep_partitioning=keep_partitioning,
    874         num_splits=num_splits,
    875         right=None,
    876         lengths=lengths,
    877         enumerate_partitions=enumerate_partitions,
    878         **kwargs,
    879     )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/partition_manager.py:74, in wait_computations_if_benchmark_mode.<locals>.wait(cls, *args, **kwargs)
     71 @wraps(func)
     72 def wait(cls, *args, **kwargs):
     73     """Wait for computation results."""
---> 74     result = func(cls, *args, **kwargs)
     75     if BenchmarkMode.get():
     76         if isinstance(result, tuple):

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/partition_manager.py:595, in PandasDataframePartitionManager.broadcast_axis_partitions(cls, axis, apply_func, left, right, keep_partitioning, num_splits, apply_indices, broadcast_all, enumerate_partitions, lengths, apply_func_args, **kwargs)
    590 if apply_indices is None:
    591     apply_indices = np.arange(len(left_partitions))
    593 result_blocks = np.array(
    594     [
--> 595         left_partitions[i].apply(
    596             preprocessed_map_func,
    597             *(apply_func_args if apply_func_args else []),
    598             other_axis_partition=(
    599                 right_partitions if broadcast_all else right_partitions[i]
    600             ),
    601             **kw,
    602             **({"partition_idx": idx} if enumerate_partitions else {}),
    603             **kwargs,
    604         )
    605         for idx, i in enumerate(apply_indices)
    606     ]
    607 )
    608 # If we are mapping over columns, they are returned to use the same as
    609 # rows, so we need to transpose the returned 2D NumPy array to return
    610 # the structure to the correct order.
    611 return result_blocks.T if not axis else result_blocks

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/axis_partition.py:288, in PandasDataframeAxisPartition.apply(self, func, num_splits, other_axis_partition, maintain_partitioning, lengths, manual_partition, *args, **kwargs)
    259     other_shape = np.cumsum(
    260         [0] + [len(o.list_of_blocks) for o in other_axis_partition]
    261     )
    263     return self._wrap_partitions(
    264         self.deploy_func_between_two_axis_partitions(
    265             self.axis,
   (...)
    285         )
    286     )
    287 result = self._wrap_partitions(
--> 288     self.deploy_axis_func(
    289         self.axis,
    290         func,
    291         args,
    292         kwargs,
    293         num_splits,
    294         maintain_partitioning,
    295         *self.list_of_blocks,
    296         min_block_size=(
    297             MinRowPartitionSize.get()
    298             if self.axis == 0
    299             else MinColumnPartitionSize.get()
    300         ),
    301         lengths=lengths,
    302         manual_partition=manual_partition,
    303     )
    304 )
    305 if self.full_axis:
    306     return result

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py:144, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)
    129 """
    130 Compute function with logging if Modin logging is enabled.
    131 
   (...)
    141 Any
    142 """
    143 if LogMode.get() == "disable":
--> 144     return obj(*args, **kwargs)
    146 logger = get_logger()
    147 logger.log(log_level, start_line)

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/axis_partition.py:462, in PandasDataframeAxisPartition.deploy_axis_func(***failed resolving arguments***)
    460             result = func(dataframe.copy(), *f_args, **f_kwargs)
    461         else:
--> 462             raise err
    464 # to reduce peak memory consumption
    465 del dataframe

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/axis_partition.py:457, in PandasDataframeAxisPartition.deploy_axis_func(***failed resolving arguments***)
    455 warnings.filterwarnings("ignore", category=FutureWarning)
    456 try:
--> 457     result = func(dataframe, *f_args, **f_kwargs)
    458 except ValueError as err:
    459     if "assignment destination is read-only" in str(err):

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:787, in GroupByReduce.build_map_reduce_functions.<locals>._reduce(df, **call_kwargs)
    784 # This will happen with Arrow buffer read-only errors. We don't want to copy
    785 # all the time, so this will try to fast-path the code first.
    786 except ValueError:
--> 787     result = wrapper(df.copy())
    788 return result

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:769, in GroupByReduce.build_map_reduce_functions.<locals>._reduce.<locals>.wrapper(df)
    768 def wrapper(df: pandas.DataFrame):
--> 769     return cls.reduce(
    770         df,
    771         axis=axis,
    772         groupby_kwargs=groupby_kwargs,
    773         reduce_func=reduce_func,
    774         agg_args=agg_args,
    775         agg_kwargs=agg_kwargs,
    776         drop=drop,
    777         method=method,
    778         finalizer_fn=finalizer_fn,
    779         **call_kwargs,
    780     )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:274, in GroupByReduce.reduce(cls, df, reduce_func, axis, groupby_kwargs, agg_args, agg_kwargs, partition_idx, drop, method, finalizer_fn)
    271 groupby_kwargs["level"] = list(range(len(df.index.names)))
    273 apply_func = cls.get_callable(reduce_func, df)
--> 274 result = apply_func(
    275     df.groupby(axis=axis, **groupby_kwargs), *agg_args, **agg_kwargs
    276 )
    278 if not as_index:
    279     idx = df.index

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/modin/core/dataframe/algebra/groupby.py:666, in GroupByReduce._build_callable_for_dict.<locals>.aggregate_on_dict(grp_obj, *args, **kwargs)
    664 # The order is naturally preserved if there's no custom aggregations
    665 if preserve_aggregation_order and len(custom_aggs):
--> 666     result = result.reindex(result_columns, axis=1)
    667 return result

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/frame.py:5378, in DataFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5359 @doc(
   5360     NDFrame.reindex,
   5361     klass=_shared_doc_kwargs["klass"],
   (...)
   5376     tolerance=None,
   5377 ) -> DataFrame:
-> 5378     return super().reindex(
   5379         labels=labels,
   5380         index=index,
   5381         columns=columns,
   5382         axis=axis,
   5383         method=method,
   5384         copy=copy,
   5385         level=level,
   5386         fill_value=fill_value,
   5387         limit=limit,
   5388         tolerance=tolerance,
   5389     )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/generic.py:5610, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5607     return self._reindex_multi(axes, copy, fill_value)
   5609 # perform the reindex on the axes
-> 5610 return self._reindex_axes(
   5611     axes, level, limit, tolerance, method, fill_value, copy
   5612 ).__finalize__(self, method="reindex")

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/generic.py:5633, in NDFrame._reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   5630     continue
   5632 ax = self._get_axis(a)
-> 5633 new_index, indexer = ax.reindex(
   5634     labels, level=level, limit=limit, tolerance=tolerance, method=method
   5635 )
   5637 axis = self._get_axis_number(a)
   5638 obj = obj._reindex_with_indexers(
   5639     {axis: [new_index, indexer]},
   5640     fill_value=fill_value,
   5641     copy=copy,
   5642     allow_dups=False,
   5643 )

File ~/polars-api-compat-dev/.venv/lib/python3.12/site-packages/pandas/core/indexes/base.py:4426, in Index.reindex(self, target, method, level, limit, tolerance)
   4422     indexer = self.get_indexer(
   4423         target, method=method, limit=limit, tolerance=tolerance
   4424     )
   4425 elif self._is_multi:
-> 4426     raise ValueError("cannot handle a non-unique multi-index!")
   4427 elif not self.is_unique:
   4428     # GH#42568
   4429     raise ValueError("cannot reindex on an axis with duplicate labels")

ValueError: cannot handle a non-unique multi-index!

Issue Description

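Modin raises ValueError: cannot handle a non-unique multi-index! when the same (column, function) pair, here ('b', 'mean'), is aggregated under two output names. The identical call on a plain pandas DataFrame succeeds.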
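
The last pandas frame in the traceback can be reached without Modin. The sketch below is an illustration only: the `intermediate` frame and the `target` labels are hand-built assumptions meant to mimic a result whose MultiIndex columns repeat ('b', 'mean'), not values captured from Modin's internals. It shows that `DataFrame.reindex` along the columns refuses a non-unique MultiIndex.

```python
import pandas as pd

# Hand-built stand-in (an assumption, not Modin's actual intermediate frame):
# two aggregations of ('b', 'mean') give duplicate MultiIndex column labels.
dup_cols = pd.MultiIndex.from_tuples([("b", "mean"), ("b", "mean")])
intermediate = pd.DataFrame(
    [[4.5, 4.5], [6.0, 6.0]],
    index=pd.Index([1, 2], name="a"),
    columns=dup_cols,
)

# Hypothetical reorder target that differs from the current columns,
# so pandas has to compute an indexer instead of returning early.
target = pd.MultiIndex.from_tuples([("b", "mean")])

message = None
try:
    intermediate.reindex(target, axis=1)
except ValueError as err:
    message = str(err)

print(message)
```

If this matches what happens inside `aggregate_on_dict`, the reorder step would need to cope with repeated (column, function) pairs before calling `reindex`.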

Expected Behavior

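Modin should not raise. It should return the same result as pandas: columns c and d, both holding the per-group mean of b.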

spotted this in Narwhals because we are so awesome 😎
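
Until this is fixed, one possible way to avoid passing a repeated (column, function) pair is to aggregate it once and copy the resulting column under the second name. The sketch below uses plain pandas and reproduces the expected output from the example above. That the same pattern sidesteps the error under `modin.pandas` is an assumption and has not been checked here.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [4, 5, 6]})

# Aggregate ('b', 'mean') only once, then duplicate the column as 'd'.
result = (
    df.groupby("a")
    .agg(c=("b", "mean"))
    .assign(d=lambda out: out["c"])
)

print(result)
```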

Installed Versions

INSTALLED VERSIONS

commit : 3e951a6
python : 3.12.6
python-bits : 64
OS : Linux
OS-release : 5.15.167.4-microsoft-standard-WSL2
Version : #1 SMP Tue Nov 5 00:21:55 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : C.UTF-8

Modin dependencies

modin : 0.32.0
ray : None
dask : 2024.12.1
distributed : None

pandas dependencies

pandas : 2.2.3
numpy : 2.2.0
pytz : 2024.2
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : 8.28.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.10.0
html5lib : None
hypothesis : 6.115.5
gcsfs : None
jinja2 : 3.1.4
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 18.0.0
pyreadstat : None
pytest : 8.3.4
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.14.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

@MarcoGorelli MarcoGorelli added bug 🦗 Something isn't working Triage 🩹 Issues that need triage labels Dec 18, 2024