Catalog.to_hipscat() is much slower than Catalog._ddf.to_parquet() for large jobs.
For example, for a smaller job, to_hipscat() took 100 s, while to_parquet() took only 50 s. For a larger job, to_hipscat() took 63 minutes, while to_parquet() took only 12 minutes. These were run on Bridges2, 128 cores / 256 GB, 16 Dask workers.
For the larger job, for most of the run I see no activity in the Dask Dashboard and 100% usage of a single CPU core in top. This probably means that some planning step is taking all the time, rather than actual computation and I/O.
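To make the comparison above easier to repeat, here is a minimal timing sketch. The helper names `time_call` and `slowdown` are hypothetical and not part of any library; only the timings plugged in at the end come from the numbers reported above.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def slowdown(baseline_seconds, candidate_seconds):
    """How many times slower the candidate is than the baseline."""
    return candidate_seconds / baseline_seconds


# Timings reported in this issue, converted to seconds.
print(slowdown(50, 100))          # smaller job: 2.0
print(slowdown(12 * 60, 63 * 60)) # larger job: 5.25

# Assumed usage, where `catalog` and the output paths are placeholders:
#   _, t_parquet = time_call(catalog._ddf.to_parquet, "out_parquet")
#   _, t_hipscat = time_call(catalog.to_hipscat, "out_hipscat")
#   print(slowdown(t_parquet, t_hipscat))
```

The ratio grows from 2x on the smaller job to about 5x on the larger one, which is consistent with an overhead that scales with job size rather than a fixed cost.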
I have described the situation in which the bug arose, including what code was executed, information about my environment, and any applicable data others will need to reproduce the problem.
I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.
Tried to reproduce this issue again today, and it seems more related to overall slowness of operations in the presence of a large task graph, as opposed to an issue specific to the to_hats functionality.