Feature request: 'merge' (like 'cat' for rows, but with column interpolation) #4

jeetsukumaran · 2015-03-05T19:20:41Z

The final output would have the union of all the field names across all input files.
Fields without data (e.g., because that column was missing from the original file) could be left blank or filled with a user-supplied value (e.g., "--nodata='NA'").

BurntSushi · 2015-03-05T19:22:19Z

Could you give a small example please? Just so I make sure I understand.

jeetsukumaran · 2015-03-05T19:28:12Z

data1.csv:

f1,f2,f3,f4
 0, 1, 2, 3
 4, 5, 6, 7

data2.csv:

f1,f4,f5,f6,f7
 a, b, c, d, e
 f, g, h, i, j

Output of xsv cat rows data1.csv data2.csv --nodata="NA":

f1,f2,f3,f4,f5,f6,f7
 0, 1, 2, 3,NA,NA,NA
 4, 5, 6, 7,NA,NA,NA
 a,NA,NA, b, c, d, e
 f,NA,NA, g, h, i, j
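For reference, the requested behavior can be sketched in a few lines of Python. This is only an illustration of the semantics in the example above, not how xsv works or would implement it; the function name `stack_rows` and its `nodata` parameter are made up for this sketch, and it assumes every row has no more fields than its file's header.

```python
import csv
import io


def stack_rows(sources, nodata=""):
    """Stack CSV inputs row-wise; the output header is the union of all headers.

    `sources` is an iterable of text file-like objects. Columns keep the order
    in which they are first seen. Cells for columns that an input lacks are
    filled with `nodata` (hypothetical parameter, mirroring --nodata above).
    """
    header = []
    rows = []
    for src in sources:
        # skipinitialspace drops the padding after commas, as in " 0, 1, 2, 3"
        reader = csv.DictReader(src, skipinitialspace=True)
        for name in reader.fieldnames or []:
            if name not in header:
                header.append(name)
        # Rows are buffered because the full header is only known at the end.
        rows.extend(reader)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=header, restval=nodata, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


data1 = "f1,f2,f3,f4\n0,1,2,3\n4,5,6,7\n"
data2 = "f1,f4,f5,f6,f7\na,b,c,d,e\nf,g,h,i,j\n"
result = stack_rows([io.StringIO(data1), io.StringIO(data2)], nodata="NA")
print(result)
```

Buffering every row keeps the sketch short. A streaming version would need a first pass over the headers of all inputs before writing anything, since the union of field names is not known until the last file has been opened.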

jeetsukumaran · 2015-03-05T19:36:15Z

Maybe "missing data" or "missing fields" would be a more precise term. The flag could be "--interpolate-missing" or something like that.

danielecook · 2019-08-23T17:34:31Z

4 years late to the party but...

@jeetsukumaran I have written a utility that does this:

https://github.com/danielecook/tut

tut stack data1.csv data2.csv

It's called stack, and I use it to merge all kinds of output files with heterogeneous columns... I also added options to output the name of each file being merged: --add-filename (full path) or --add-basename (basename only). This makes it super easy to merge together a collection of related files for analysis.
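As a rough illustration of the filename idea (not tut's actual code), a stacking routine could tag each row with the file it came from. Everything here is an assumption made for the sketch: the function name `stack_with_source`, the `basename` switch, and the output column name `source`, which would collide with an input column of the same name.

```python
import csv
import io
import os


def stack_with_source(named_sources, nodata="", basename=False):
    """Stack CSV inputs row-wise and prepend a column naming the source file.

    `named_sources` is an iterable of (path, text file-like object) pairs.
    With basename=True only the final path component is recorded.
    """
    header = ["source"]  # hypothetical column name for this sketch
    rows = []
    for path, src in named_sources:
        label = os.path.basename(path) if basename else path
        reader = csv.DictReader(src)
        for name in reader.fieldnames or []:
            if name not in header:
                header.append(name)
        for row in reader:
            row["source"] = label
            rows.append(row)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=header, restval=nodata, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


inputs = [
    ("/tmp/run/data1.csv", io.StringIO("f1,f2\n0,1\n")),
    ("/tmp/run/data2.csv", io.StringIO("f1,f5\na,c\n")),
]
tagged = stack_with_source(inputs, nodata="NA", basename=True)
print(tagged)
```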

@BurntSushi - it would be really cool if you were able to add this as I'm sure the xsv implementation would be much faster.

geekscrapy · 2019-12-19T08:09:02Z

Echoing @danielecook's last comment: It'd be amazing if this was incorporated as a subcommand 👍 Currently using csvstack - it works, but I think xsv would be faster!

data-man · 2019-12-19T17:57:36Z

Maybe tsv-utils can help (written in D).

ad-si · 2020-01-07T14:44:28Z

I was pretty surprised to find out that this is not the default behavior 😳. Addition of this would be highly appreciated 😊

alexmarco · 2020-03-30T06:18:31Z

> Maybe tsv-utils can help (written in D).

Hi, tsv-utils has no option to stack and align input data.
+1 for this option in xsv.

BurntSushi · 2020-03-30T12:01:40Z

@alexmarco Please don't post comments just to +1 feature requests.

BurntSushi added the enhancement label Mar 5, 2015
