Model and inference scripts for this paper
- Download the fine-tuned model (GPT-2) here (alternatively, fetch it with `gdown`):

  ```sh
  gdown --id 161wl_sxlghXfWmR6917CdbqTMDIGOHaq
  ```
- Prepare an input `jsonl` file. Each line of this file should be a JSON object with the input document under the key `text`:

  ```json
  {"text": "text1"}
  {"text": "text2"}
  ...
  {"text": "textN"}
  ```
- Run generation using:

  ```sh
  python src/run_generation.py --task graphgen \
      --model-path <path-to-fine-tuned-model> \
      --input-path <path-to-input-jsonl-file> \
      --output-path <path-to-output-jsonl-file>
  ```

  Where:

  - `path-to-fine-tuned-model`: path to the fine-tuned model downloaded in step 1.
  - `path-to-input-jsonl-file`: path to the input file prepared in step 2.
  - `path-to-output-jsonl-file`: path to the output file. The output file is identical to the input file, with one additional field (`generated_graph`).
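For instance, the input file from step 2 can be produced with a few lines of Python (the file name and documents below are illustrative, not part of the repository):

```python
import json

# Example documents to build temporal graphs for (illustrative).
documents = [
    "The ceremony began at noon . It ended two hours later .",
    "She left after the meeting concluded .",
]

# Write one JSON object per line, with the document under the key "text".
with open("test_input.jsonl", "w") as f:
    for doc in documents:
        f.write(json.dumps({"text": doc}) + "\n")
```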
Input (`data/graphgen/test_input.jsonl`):

```json
{"text": "That , in turn , is thought to have ties to the Party of God , a better-known guerrilla faction inspired by revolutionary Iran . The Organization of the Oppressed on Earth , asserted responsibility for the June 1985 hijacking of a Trans World Airways jetliner in which a United States Navy diver , Robert D. Stethem , was killed . It is not known where Colonel Higgins is being held ."}
```
Generation:

```sh
python src/run_generation.py --task graphgen \
    --model-path data/model-checkpoints/temporal-graph-gen/ \
    --input-path data/graphgen/test_input.jsonl \
    --output-path data/graphgen/test_output.jsonl
```
Output:

```json
{"text": "That , in turn , is thought to have ties to the Party of God , a better-known guerrilla faction inspired by revolutionary Iran . The Organization of the Oppressed on Earth , asserted responsibility for the June 1985 hijacking of a Trans World Airways jetliner in which a United States Navy diver , Robert D. Stethem , was killed . It is not known where Colonel Higgins is being held .", "generated_graph": "strict graph {\n\"The Organization asserted responsibility\" -- \"a United States Navy diver killed\" [rel=after];\n}"}
```
The temporal graphs are generated in the DOT language. Some of the generated graphs might not be valid DOT files (the output is generated by drawing samples from GPT-2 using nucleus sampling), but we expect such cases to be rare: in our tests, ~94% of the generated graphs were valid DOT files. The generated DOT graphs can be used with libraries like Graphviz or networkx for downstream applications.
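As a rough, stdlib-only illustration (not part of the repository), the edges of a generated graph can be pulled out with a small regular expression; for anything beyond this sketch, a proper DOT parser such as pydot, or networkx's `nx_pydot` helpers, is the safer choice:

```python
import re

# A generated_graph string as produced by the model (from the example above).
dot = ('strict graph {\n'
       '"The Organization asserted responsibility" -- '
       '"a United States Navy diver killed" [rel=after];\n'
       '}')

# Match edges of the form "node1" -- "node2" [rel=...];
edge_pattern = re.compile(r'"([^"]+)"\s*--\s*"([^"]+)"\s*\[rel=(\w+)\]')

edges = edge_pattern.findall(dot)
for src, dst, rel in edges:
    print(f"{src} --[{rel}]--> {dst}")
```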
We also provide a simple wrapper over these graphs in `src/temporal_graph.py`. For example, the following script parses the output graph generated above:

```sh
python src/temporal_graph.py data/graphgen/test_output.jsonl data/test_out.png
```
The generated graphs can also be visualized using Edotor.
We also introduce a related sub-task of temporal node generation (section 3, task 1 in the paper).

- Download the fine-tuned model here (alternatively, fetch it with `gdown`):

  ```sh
  gdown --id 1_uBXr2bh8UVH3tlaSMnTLr0jzenorPZx
  ```
- Input: the input `jsonl` file should be of the form:

  ```json
  {"question": "text1"}
  {"question": "text2"}
  ...
  {"question": "textN"}
  ```

  where `question` is the query string (see section 3 for details, or the example below).
- Run generation using:

  ```sh
  python src/run_generation.py --task nodegen \
      --model-path <path-to-fine-tuned-model> \
      --input-path <path-to-input-jsonl-file> \
      --output-path <path-to-output-jsonl-file>
  ```

  Where:

  - `path-to-fine-tuned-model`: path to the fine-tuned model downloaded in step 1.
  - `path-to-input-jsonl-file`: path to the input file prepared in step 2.
  - `path-to-output-jsonl-file`: path to the output file. The output file contains a generated answer (field `generated_answer`) for each query in the input.
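For illustration, a query string of the form expected in step 2 can be assembled from a context and a target event. The phrasing below simply mirrors the sample input further down; the exact query template is described in section 3 of the paper:

```python
import json

# Illustrative context and target event; the query appends a
# "which event happened <relation> <event> ?" question to the context.
context = "She finished the report . She emailed it to her manager ."
event = "She emailed it"
relation = "before"

question = f"{context} which event happened {relation} {event} ?"

# Write the query as one jsonl line (file name is illustrative).
with open("nodegen_input.jsonl", "w") as f:
    f.write(json.dumps({"question": question}) + "\n")
```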
- Sample input (`data/nodegen/test_input.jsonl`):

  ```json
  {"question": "Ms. Baeszler , 43 , relaxed by doing needlepoint . She had perfected it . She gave a needlepoint tapestry of a horse 's head to a childhood friend , who is an equestrian . She made a cocoa pot for an aunt who collects them . which event happened before She gave tapestry ?"}
  ```
- Generation script:

  ```sh
  python src/run_generation.py --task nodegen \
      --model-path data/model-checkpoints/temporal-graph-gen/ \
      --input-path data/nodegen/test_input.jsonl \
      --output-path data/nodegen/test_output.jsonl
  ```
- Sample output:

  ```json
  {"question": "Ms. Baeszler , 43 , relaxed by doing needlepoint . She had perfected it . She gave a needlepoint tapestry of a horse 's head to a childhood friend , who is an equestrian . She made a cocoa pot for an aunt who collects them . which event happened before She gave tapestry ?", "generated_answer": "She perfected it"}
  ```
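The output is again jsonl, so it can be consumed with standard tooling. A minimal reading sketch (the file is simulated here so the snippet is self-contained; the question is abbreviated from the sample above):

```python
import json

# Illustrative output records from the nodegen task; "generated_answer"
# is the field the model fills in.
records = [
    {"question": "... which event happened before She gave tapestry ?",
     "generated_answer": "She perfected it"},
]

# Simulate the output file, then read it back line by line.
with open("test_output.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

with open("test_output.jsonl") as f:
    for line in f:
        record = json.loads(line)
        print(record["generated_answer"])
```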
Citation:

```bibtex
@inproceedings{madaan-yang-2021-neural,
    title = "Neural Language Modeling for Contextualized Temporal Graph Generation",
    author = "Madaan, Aman and Yang, Yiming",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-main.67",
    pages = "864--881",
}
```