Function to just read variable names/metadata #167
Comments
Would a good way to handle that be to identify all methods that have a "limit"-style parameter for grabbing a certain percentage/number of lines, and then just do a transform to return a data.frame with the column index, column name, and column type of the resulting data.frame? |
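A minimal sketch of that idea for plain-text files, using only base R. The name `peek_columns` and its `n` argument are hypothetical and not part of the package; the sketch assumes the `nrows` argument of `read.csv()` plays the role of the "limit"-style parameter.

```r
# Hypothetical helper (illustrative only): read just the first `n` rows of a
# CSV file and summarise the columns of the resulting data.frame.
peek_columns <- function(file, n = 10, ...) {
  # `nrows` is the "limit"-style parameter of read.csv()
  head_rows <- utils::read.csv(file, nrows = n, stringsAsFactors = FALSE, ...)
  data.frame(
    index = seq_along(head_rows),
    name = names(head_rows),
    # class() can return several values per column, so keep only the first
    type = unname(vapply(head_rows, function(col) class(col)[1], character(1))),
    stringsAsFactors = FALSE
  )
}

# Usage on a small temporary file
tmp <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:3, label = c("a", "b", "c"), score = c(0.5, 1.5, 2.5)),
  tmp,
  row.names = FALSE
)
meta <- peek_columns(tmp, n = 2)
print(meta)
```

Note that the types are guessed from the first `n` rows only, so they can differ from what a full read of the file would produce.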
That could work, but what formats will that work for other than plain text? |
I'll have to peek at which binary formats may or may not have some form of quick access. Unfortunately, the benefits of binary files (faster loading, smaller file size, and consistent type definitions) probably mean little performance benefit for slicing data, since it's stored in non-adjacent ways. We could have a consistent interface, though, that's simply slower on binary (but more accurate). It may make sense to make a |
We could probably also take a look at the haven codebase and see if there's a way to add some of this functionality upstream. I haven't looked, but I imagine it's possible, as a lot of these formats have metadata at the beginning of the file before any of the actual contents start. |
What I think would be nice is if among the meta-data snooping functions there was one that listed tables/sheets/etc. It would return names if available, numbers otherwise. For formats that cannot have multiple tables or sheets it would always return |
How about we make a generic like @bokov I'm not sure how useful that is, at least initially. Let's start with the simple/flat file types and then think about how much work is worthwhile to make it work for other file types. |
I agree about starting with simple/flat file types and not necessarily supporting all types. But what do you think about the idea of following what seems to be the overall philosophy of this package and writing unified front-end functions that do this for the supported formats (with some kind of message for unsupported ones), instead of exporting format-specific functions? |
Yea, that's what I meant. Sorry for not being clear - we wouldn't export the methods, just like we don't export import/export methods. |
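One way this could look, as a rough sketch rather than the package's actual design: a single front-end function that dispatches on the file extension to internal S3 methods, with a default method that emits a message for unsupported formats. All names here (`read_metadata`, `read_metadata_impl`, `describe_columns`, the `meta_*` classes) are hypothetical.

```r
# Hypothetical front end; only read_metadata() would be exported.
read_metadata <- function(file, ...) {
  ext <- tolower(tools::file_ext(file))
  # Build a small object whose class is derived from the file extension
  x <- structure(list(path = file), class = paste0("meta_", ext))
  read_metadata_impl(x, ...)
}

# Internal generic and helper; these would stay unexported.
read_metadata_impl <- function(x, ...) UseMethod("read_metadata_impl")

describe_columns <- function(df) {
  data.frame(
    index = seq_along(df),
    name = names(df),
    type = unname(vapply(df, function(col) class(col)[1], character(1))),
    stringsAsFactors = FALSE
  )
}

# Flat-file methods: read only the first `n` rows
read_metadata_impl.meta_csv <- function(x, n = 10, ...) {
  describe_columns(utils::read.csv(x$path, nrows = n, stringsAsFactors = FALSE))
}

read_metadata_impl.meta_tsv <- function(x, n = 10, ...) {
  describe_columns(utils::read.delim(x$path, nrows = n, stringsAsFactors = FALSE))
}

# Fallback for every format without a method
read_metadata_impl.default <- function(x, ...) {
  message("Reading metadata is not supported for this format: ", x$path)
  invisible(NULL)
}

# Usage: a supported flat file and an unsupported extension
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, label = c("a", "b", "c")), tmp, row.names = FALSE)
supported <- read_metadata(tmp, n = 2)
unsupported <- read_metadata("example.xyz")
print(supported)
```

Adding a format then means adding one more internal method, while the exported interface stays the same.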
It might be useful to be able to just read metadata without loading an entire file. I don't think there's a way to do this consistently across file types, though.