Git scraping support, thoughts? #1103
Replies: 3 comments 1 reply
-
I could indeed see this as being great, @frosencrantz. At some point I'll put some more work into vdplus/vgit and this could be a part of that.
-
Sounds fantastic, @saulpw! I'll mention:
-
It looks like Simon Willison has started doing something along these lines. It is different from what I was thinking, so I thought it might be useful to compare the different results. He seems more interested in providing all the data in one database that spans all versions. That also seems useful.
-
Simon Willison has been talking about a scraping technique he calls git scraping,
where a person scrapes data from a website and saves the data to git.
Here is a link on the subject:
https://next.github.com/projects/flat-data
VisiData could do a great job at supporting this data model. There would
need to be built-in support for understanding that there are different versions
of the same data file, with an ordering of those versions. This might be based
on source code history, but could also be files in a directory ordered by
trailing digits or timestamps alone. The UI might need some changes
to allow picking two (or more) versions for comparison or for joining.
It would be useful if VisiData could remember column types across versions, or
even transformations. Maybe VisiData could assist if the format of the data changed,
for example with new or renamed columns?
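To make the "ordering by trailing digits" idea concrete, here is a rough sketch in plain Python. It is not VisiData code: the function name `order_versions` and the filename pattern (an integer at the end of the file stem) are assumptions for illustration only.

```python
import re
from pathlib import Path

def order_versions(paths):
    """Sort file paths by the integer at the end of their stem.

    Hypothetical helper: files whose stem does not end in digits are
    skipped, since they carry no version information under this scheme.
    """
    keyed = []
    for p in map(Path, paths):
        m = re.search(r"(\d+)$", p.stem)
        if m:
            keyed.append((int(m.group(1)), p))
    # Sort numerically so that version 10 comes after version 2.
    keyed.sort(key=lambda pair: pair[0])
    return [p for _, p in keyed]
```

Sorting numerically rather than as strings matters here, since `data-10.csv` should come after `data-2.csv`. A timestamp-based ordering could follow the same shape with a different key.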
If VisiData understood the ordering it might be possible to create a frequency sheet
that shows the difference between two or more versions.
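As a rough sketch of what a difference between two versions could look like, assuming both snapshots are CSV text and share a key column. The function name `diff_versions` and the `key` parameter are hypothetical and not part of VisiData:

```python
import csv
import io

def diff_versions(old_text, new_text, key):
    """Compare two CSV snapshots by a key column.

    Returns three sorted lists of key values: rows only in the new
    snapshot, rows only in the old one, and rows present in both whose
    contents differ.
    """
    old = {row[key]: row for row in csv.DictReader(io.StringIO(old_text))}
    new = {row[key]: row for row in csv.DictReader(io.StringIO(new_text))}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed
```

The counts of added, removed, and changed rows for each pair of consecutive versions could then feed something like a frequency sheet.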