[Feature Request] safer safenames #1921

Open
terefang opened this issue Jun 25, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@terefang

while the current safenames options are many and database column-name safe, the actual naming scheme might not be what the user wants.

my proposal would be the (s)afe mode as defined by the following steps:

  • lowercase all headers
  • replace any char not in [:alnum:] with "_"
  • replace [_]+ with "_"
  • if prefix option specified, prefix every column-name with the prefix

(s)afe mode should operate in ascii-chars-only mode, (S)afe mode should work in unicode-chars mode.
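
The steps above can be sketched roughly as follows. This is a minimal Python illustration of the proposal, not qsv's actual implementation; the function name `safer_name` and its `unicode_mode` and `prefix` parameters are hypothetical. It assumes (s)afe mode keeps only ASCII `a-z` and `0-9`, while (S)afe mode keeps any character Unicode considers alphanumeric.

```python
import re


def safer_name(header: str, unicode_mode: bool = False, prefix: str = "") -> str:
    """Hypothetical sketch of the proposed (s)afe / (S)afe header renaming."""
    # Step 1: lowercase the header.
    name = header.lower()

    # Step 2: replace every non-alphanumeric character with "_".
    if unicode_mode:
        # (S)afe mode: str.isalnum() is Unicode-aware, so "ö" or "ß" survive.
        name = "".join(ch if ch.isalnum() else "_" for ch in name)
    else:
        # (s)afe mode: only ASCII letters and digits survive.
        name = re.sub(r"[^a-z0-9]", "_", name)

    # Step 3: collapse runs of underscores into a single "_".
    name = re.sub(r"_+", "_", name)

    # Step 4: apply the optional prefix.
    return prefix + name


print(safer_name("First Name"))                   # first_name
print(safer_name("Größe (cm)"))                   # gr_e_cm_
print(safer_name("Größe (cm)", unicode_mode=True))  # größe_cm_
print(safer_name("ID", prefix="col_"))            # col_id
```

Note that the steps as written do not strip a leading or trailing underscore, so `"Größe (cm)"` ends in `_`; whether that should be trimmed is left open here.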

@jqnatividad
Owner

FYI - I originally designed safenames to help with real-world data ingestion using Datapusher+ for CKAN - especially with spreadsheet header names.

That's why its defaults are heavily informed by CKAN requirements.

I can certainly make it "safer" using your proposal, though I have to prioritize CKAN data ingestion.

Perhaps, I can just add a new command called safernames, so as not to perturb safenames which is working quite well in our pipelines.

@jqnatividad jqnatividad added the enhancement New feature or request label Jun 26, 2024
@terefang
Author

hmm... wouldn't creating a new subcommand with similar functionality be counter-intuitive?

@jqnatividad
Owner

Point taken...

I'll just have to add it to safenames then in a way that doesn't have breaking API changes...

@terefang
Author

how about --mode "s" and --mode "S" ?

@jqnatividad
Owner

Yes... that's the easy part...

The part that I'm thinking about is how the JSON output formats will work...
