-
What is the data source, and is the data frame filtered?
(from mobile phone)
On Sat, 18 Dec 2021 at 21:00, yohplala ***@***.***> wrote:
Hi,
I am implementing a function that yields pandas dataframes from vaex with
*variable* chunk sizes.
Hence, I cannot rely directly on vdf.to_pandas_df(chunk_size=50_000_000).
Instead, I am using yield vdf[start:end].to_pandas_df(), with start and
end being updated in a for loop.
Do you see any bottleneck or performance issue with this approach?
(I am asking because vaex sometimes shows surprises :))
Thanks in advance for your feedback!
Best,
-
Hi,
I am implementing a function that yields pandas dataframes from vaex with variable chunk sizes.
Hence, I cannot rely directly on
vdf.to_pandas_df(chunk_size=50_000_000)
Instead, I am using
yield vdf[start:end].to_pandas_df()
with `start` and `end` being updated in a `for` loop.
Do you see any bottleneck or performance issue with this approach?
(I am asking because vaex sometimes shows surprises :))
Thanks in advance for your feedback!
Best,
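
For reference, here is a minimal sketch of the loop described above, assuming the chunk sizes are known as an iterable of row counts. The helper names `chunk_bounds` and `iter_pandas_chunks` are hypothetical and only serve the illustration. The only calls made on the vaex side are `len(vdf)`, slicing with `vdf[start:end]`, and `to_pandas_df()`. Whether this performs well probably depends on the data source and on whether the dataframe is filtered, so it is a starting point rather than a recommendation.

```python
from typing import Iterable, Iterator, Tuple


def chunk_bounds(n_rows: int, sizes: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) row ranges for consecutive chunks of the given sizes.

    The last chunk is truncated at n_rows. Rows beyond the sum of `sizes`
    are not covered (assumption: the caller supplies enough sizes).
    """
    start = 0
    for size in sizes:
        if size <= 0:
            raise ValueError("chunk sizes must be positive")
        if start >= n_rows:
            break
        end = min(start + size, n_rows)
        yield start, end
        start = end


def iter_pandas_chunks(vdf, sizes: Iterable[int]):
    """Yield one pandas DataFrame per variable-size chunk of `vdf`.

    `vdf` is expected to be a vaex DataFrame; any object that supports
    len(), slicing and to_pandas_df() works the same way.
    """
    for start, end in chunk_bounds(len(vdf), sizes):
        # Slice first, then convert only that slice to pandas.
        yield vdf[start:end].to_pandas_df()
```

Usage would look like `for pdf in iter_pandas_chunks(vdf, [10_000, 250_000, 40_000]): ...`, with the list of sizes being whatever the calling code computes.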