Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

document that table_columns parameter's array of hashes must be ordered #380

Open
nik-bladey opened this issue Apr 3, 2020 · 4 comments

Comments

@nik-bladey
Copy link

nik-bladey commented Apr 3, 2020

It would be helpful if the documentation for the table_columns parameter surfaced in API Client resources (like in post_files_csv, among others) specified that the array of hashes must be ordered. Since these are key:value pairs, I initially assumed that as with Python dictionaries the order would be irrelevant. It took me a while of puzzling over very confusing downstream errors before realizing I had to provide them in the same order as the columns of the table. Apparently arrays of hashes are always like this / this is common computer science knowledge, but I've never used hashes before and it was not intuitive.

As Manticore is working on surfacing table_columns in even more places, I suspect other people will run into this more in the future.

For reference, the current documentation looks like this:
image

@mheilman
Copy link
Contributor

mheilman commented Apr 6, 2020

Apparently arrays of hashes are always like this / this is common computer science knowledge, but I've never used hashes before and it was not intuitive.

I don't think this has anything to do with common CS knowledge. The docstring seems just unclear. I think it should be improved upstream in the platform API code rather than the API client. I'll file an internal issue, but I think we should leave this open until it gets addressed in the platform API.

@mheilman
Copy link
Contributor

mheilman commented Apr 6, 2020

@Zephyraith, would it be possible for you to provide an example of the input you were trying to provide versus what platform was expecting?

@nik-bladey
Copy link
Author

nik-bladey commented Apr 28, 2020

Some helpful rephrasing in the documentation occurred as part of #379 to substitute "dictionaries" for "hashes", but that didn't fully address my concerns.

@stephen-hoover, this relates to a Slack conversation we / you had today in #pmi-engineering and #data-engineering -- perhaps you have contributing thoughts.

1
2
3
4

@stephen-hoover
Copy link

For me, the main point of confusion was in not using table_columns. I'd assumed that when appending to an existing table using dataframe_to_civis, the DataFrame column information would be used to correctly order the columns for appending to the database table, but the DataFrame column names are ignored. A couple of @Zephyraith 's examples here are about that (for me) confusing behavior when not providing table_columns.

One non-obvious thing is that (as I understand it), table_columns is about the columns in the data that are being uploaded, and not so much about the table in the database.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants