xsv map to map columns through regex substitutions #228

Open
joshtriplett opened this issue Jun 28, 2020 · 0 comments

I have a large amount of CSV data, and I'd like to map it through various substitutions to canonicalize it. For instance, "take anything matching this regular expression and map it to this single value". I'd like to provide a list of such regexes (e.g. as a CSV file itself), and do the equivalent of "join" but mapping through regular expressions.

What I would imagine is an `xsv map` command, very similar in command-line syntax to `xsv join` (specify a column to map, and another file to map it through), and then xsv would take each entry in that column and attempt to apply each regex to it.

(This could potentially be heavily optimized to check and process the regular expressions in parallel, using something like hyperscan, but for the scale of data I'm working with, I'd also be fine with a linear search through the regexes.)
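
To make the intended semantics concrete, here is a minimal Python sketch of the linear-search version. It is only an illustration, not a proposed implementation: the function names (`load_rules`, `map_value`, `map_column`) and the two-column `pattern,value` rules layout are hypothetical, and it assumes that a pattern must match the whole field, that the first matching rule wins, and that unmatched values pass through unchanged.

```python
import csv
import io
import re


def load_rules(rules_csv):
    """Parse a two-column CSV (header row, then pattern,value) into compiled rules.

    The column layout is an assumption made for this sketch.
    """
    reader = csv.reader(io.StringIO(rules_csv))
    next(reader)  # skip the header row
    return [(re.compile(pattern), value) for pattern, value in reader]


def map_value(value, rules):
    """Linear search: return the value of the first rule whose pattern
    matches the entire field, or the original value if none match."""
    for pattern, replacement in rules:
        if pattern.fullmatch(value):
            return replacement
    return value


def map_column(data_csv, column, rules):
    """Rewrite one named column of a CSV document through the rules."""
    reader = csv.DictReader(io.StringIO(data_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = map_value(row[column], rules)
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    rules = load_rules("pattern,value\n(?i)usa|u\\.s\\.a?\\.?|united states,US\n")
    print(map_column("id,country\n1,U.S.A.\n2,Canada\n", "country", rules), end="")
```

Whether a pattern should match the whole field or any substring, and what happens to values that match no rule, are open design questions that an actual command would need to settle.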
