Supporting the translation from natural language (NL) queries to visualizations (NL2VIS) can simplify the creation of data visualizations: if successful, anyone can generate visualizations from tabular data simply by describing what they want in natural language.
We present ncNet, a Transformer-based model for NL2VIS with several novel visualization-aware optimizations, including attention forcing to optimize the learning process and visualization-aware rendering to produce better visualization results.
Input:
- a tabular dataset (csv, json, or sqlite3)
- a natural language query used for NL2VIS
- an optional chart template
Output:
- Vega-Zero: a model-friendly, sequence-based grammar that simplifies Vega-Lite
Please refer to our paper at IEEE VIS 2021 for more details.
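For intuition, the sketch below shows how a flattened, Vega-Zero-style token sequence might map back to a Vega-Lite specification. The token layout, the example query, and the helper name are illustrative assumptions for this sketch, not the official grammar; please see the paper and the code for the exact keyword set.

```python
# Illustrative sketch only: Vega-Zero renders a Vega-Lite spec as a flat keyword
# sequence. The layout assumed here ("mark <type> data <table> encoding x <col>
# y aggregate <func> <col>") is a simplified assumption for demonstration.
import json


def toy_vega_zero_to_vega_lite(sequence: str) -> dict:
    """Convert a simplified Vega-Zero-like token sequence into a minimal Vega-Lite spec."""
    tokens = sequence.split()
    return {
        "mark": tokens[tokens.index("mark") + 1],
        "data": {"name": tokens[tokens.index("data") + 1]},
        "encoding": {
            "x": {"field": tokens[tokens.index("x") + 1], "type": "nominal"},
            "y": {
                "aggregate": tokens[tokens.index("aggregate") + 1],
                "field": tokens[tokens.index("aggregate") + 2],
                "type": "quantitative",
            },
        },
    }


if __name__ == "__main__":
    seq = "mark bar data cars encoding x origin y aggregate count origin"
    print(json.dumps(toy_vega_zero_to_vega_lite(seq), indent=2))
```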
- Python 3.6+
- PyTorch 1.7
- torchtext 0.8
- ipyvega
Once your Python and PyTorch environment is set up, install the remaining Python dependencies via `pip install -r requirements.txt`.
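As an optional sanity check (not part of the repository), you can confirm from Python that the versions listed above are what is installed:

```python
# Assumption: a quick, optional check of the pinned versions; not part of ncNet itself.
import sys

import torch
import torchtext

print("Python OK:", sys.version_info >= (3, 6))
print("PyTorch:", torch.__version__)        # expected: 1.7.x
print("torchtext:", torchtext.__version__)  # expected: 0.8.x
```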
- [Must] Download the Spider data here and unzip it under the `./dataset/` directory.
- [Optional] Only if you change the `train/dev/test.csv` files under the `./dataset/` folder do you need to run `process_dataset.py` under the `preprocessing` folder.
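A small sketch (assuming the split files are named `train.csv`, `dev.csv`, and `test.csv`, as in the list above) to verify the dataset files are in place after unzipping:

```python
# Illustrative check that the expected split files exist under ./dataset/ after
# downloading and unzipping the Spider data; file names follow the list above.
from pathlib import Path

import pandas as pd

dataset_dir = Path("./dataset")
for split in ("train", "dev", "test"):
    csv_path = dataset_dir / f"{split}.csv"
    if csv_path.exists():
        print(f"{csv_path}: {pd.read_csv(csv_path).shape[0]} rows")
    else:
        print(f"missing: {csv_path}")
```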
- Open `ncNet.ipynb` to try the running example.
- Run `train.py` to train ncNet.
- Run `test.py` to evaluate ncNet.
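If you prefer to drive training and evaluation from a single script rather than the command line, a minimal sketch (assuming `train.py` and `test.py` need no extra command-line flags) could look like this:

```python
# Minimal sketch: run training and then evaluation with the current interpreter.
# Assumption: both scripts can be invoked without additional arguments.
import subprocess
import sys

subprocess.run([sys.executable, "train.py"], check=True)
subprocess.run([sys.executable, "test.py"], check=True)
```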
@ARTICLE{ncnet,
author={Luo, Yuyu and Tang, Nan and Li, Guoliang and Tang, Jiawei and Chai, Chengliang and Qin, Xuedi},
journal={IEEE Transactions on Visualization and Computer Graphics},
title={Natural Language to Visualization by Neural Machine Translation},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TVCG.2021.3114848}}
The project is available under the MIT License.
If you have any questions, feel free to contact Yuyu Luo (yuyuluo [AT] hkust-gz.edu.cn).