Can you split a large vaex dataframe into smaller hdf5 files? #2156

Answered by JovanVeljanoski
torivor asked this question in Q&A

Hey,

Maybe I don't understand, but isn't this what you want?

import vaex

df = vaex.example()

for i, df_part in enumerate(df.split([0.2, 0.2, 0.2, 0.2, 0.2])):
    print(i, df_part.shape)

Or do you want to export a single big dataframe (as HDF5 or another format) into several smaller files? In that case there is df.export_many(..)

Maybe you should explain your use case a bit better, i.e. what you want to achieve. If you want to pass data to an ML model or some kind of service / process, there are probably better ways than the one above.

Answer selected by JovanVeljanoski