Improve the user-facing DataFrame and Series constructors by hiding the query_compiler parameter #7366
Labels
Interfaces and abstractions
Issues with Modin's QueryCompiler, Algebra, or BaseIO objects
Today, the DataFrame constructor accepts a query_compiler parameter, which also appears in the generated documentation. However, we don't really want users to pass this parameter; it exists only for our internal construction.
In pandas, a DataFrame/Series can be constructed directly from a BlockManager, and the BlockManager is a property on the NDFrame class https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py#L257. Pandas provides an internal method _from_mgr https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py#L309 for constructing an object directly from a BlockManager. The public constructor also handles the case where data is a BlockManager, but this is not documented in the API.
In Modin, maybe we can do something similar: push the query_compiler construction down to BasePandasDataset, and provide an internal constructor _from_query_compiler that creates the class directly from a query compiler.
In that way, we will also be consistent with the pandas constructor definition.