dataset

Laura Perez

Oct 25, 2022

62c9150 · Oct 25, 2022

Name	Last commit message	Last commit date
ner	Initial commit	Oct 25, 2022
README.md	Initial commit	Oct 25, 2022
SparqlResults.py	Initial commit	Oct 25, 2022
SparqlServer.py	Initial commit	Oct 25, 2022
precompute_local_subgraphs.py	Initial commit	Oct 25, 2022
precompute_local_types.py	Initial commit	Oct 25, 2022

README.md

SPICE dataset

Dataset description

Annotations on the SPICE dataset

Entity Neighborhood Sub-Graphs

This script takes a SPICE dataset as input and extracts entity neighborhood sub-graphs for the gold entities. It outputs a copy of the SPICE dataset annotated with these sub-graphs in an added JSON field, 'local_subgraph'. The local_subgraph of each turn is built from the entities in the previous question, the previous answer, and the current question.

python precompute_local_subgraphs.py \
    --partition train  \
    --read_folder ${SPICE_CONVERSATIONS} \
    --write_folder ${ANNOTATED_SPICE_CONVERSATIONS} \
    --json_kg_folder ${PATH_JSON_KG}
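The turn window described above (previous question, previous answer, current question) can be illustrated with a minimal sketch. This is not the code in precompute_local_subgraphs.py. It assumes a conversation is a list of turn dicts that alternate question and answer, and that each turn carries a hypothetical 'entities' list; the real SPICE field names may differ.

```python
from typing import Dict, List


def context_entities(turns: List[Dict], question_idx: int) -> List[str]:
    """Collect the entities that would seed the sub-graph of one question turn.

    Assumption: turns alternate question, answer, question, ... so the
    previous question sits at question_idx - 2 and the previous answer at
    question_idx - 1. The 'entities' key is a hypothetical field name.
    """
    # Slice covers previous question, previous answer, and current question;
    # max() keeps the slice valid for the first turn of a conversation.
    window = turns[max(0, question_idx - 2): question_idx + 1]
    ordered: List[str] = []
    for turn in window:
        for entity in turn.get("entities", []):
            if entity not in ordered:  # de-duplicate, keep first-seen order
                ordered.append(entity)
    return ordered


if __name__ == "__main__":
    toy_turns = [
        {"entities": ["Q1"]},        # question 1
        {"entities": ["Q2"]},        # answer 1
        {"entities": ["Q1", "Q3"]},  # question 2
    ]
    print(context_entities(toy_turns, 2))
```

The entity identifiers in the toy conversation are placeholders, not values taken from the dataset.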

Once the annotations for gold entities are done, entity neighborhood sub-graphs can also be added for NER/NEL entities (e.g., from AllenNLP). To do so, specify the --nel_entities flag together with --allennlpNER_folder, the folder that contains the conversations annotated with AllenNLP NER/NEL (see the instructions for that script below).
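As a rough sketch, the two invocation variants could be assembled from Python as shown here. The helper name and the example paths are hypothetical, and the sketch assumes that --nel_entities is a switch that takes no value while --allennlpNER_folder takes a folder path; check the script's argument parser before relying on this.

```python
import shlex
from typing import List, Optional


def build_subgraph_command(
    partition: str,
    read_folder: str,
    write_folder: str,
    json_kg_folder: str,
    allennlp_ner_folder: Optional[str] = None,
) -> List[str]:
    """Build the argv list for precompute_local_subgraphs.py (hypothetical helper)."""
    cmd = [
        "python", "precompute_local_subgraphs.py",
        "--partition", partition,
        "--read_folder", read_folder,
        "--write_folder", write_folder,
        "--json_kg_folder", json_kg_folder,
    ]
    if allennlp_ner_folder is not None:
        # Assumption: --nel_entities is a boolean switch without a value.
        cmd += ["--nel_entities", "--allennlpNER_folder", allennlp_ner_folder]
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders.
    cmd = build_subgraph_command(
        "train", "spice/conversations", "spice/annotated", "spice/json_kg",
        allennlp_ner_folder="spice/allennlp_ner",
    )
    print(shlex.join(cmd))  # pass `cmd` to subprocess.run(cmd, check=True) to execute
```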

This script also generates the global vocabulary file: it writes a file named expansion_vocab.json to the folder ANNOTATED_SPICE_CONVERSATIONS.

python precompute_local_subgraphs.py --write_folder ${SPICE_CONVERSATIONS} --task vocab

Type Sub-Graphs

This script takes a SPICE dataset as input, finds KG type candidates mentioned in the utterances, links them to types in the KG, and extracts a set of relations for each of them. It outputs a copy of the SPICE dataset annotated with type sub-graphs in an added JSON field, 'type_subgraph'.

python precompute_local_types.py \
    --partition train \
    --read_folder ${SPICE_CONVERSATIONS} \
    --write_folder ${ANNOTATED_SPICE_CONVERSATIONS} \
    --json_kg_folder ${PATH_JSON_KG}
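After running the annotation scripts, a quick sanity check is to count how many turns in an output file carry the added 'local_subgraph' and 'type_subgraph' fields. The sketch below assumes each annotated conversation file is a JSON list of turn dicts; the function name is hypothetical and the layout should be verified against the actual output.

```python
import json
from typing import Dict, Sequence


def count_annotated_turns(
    conversation_file: str,
    fields: Sequence[str] = ("local_subgraph", "type_subgraph"),
) -> Dict[str, int]:
    """Count the turns in one conversation file that contain each added field.

    Assumption: the file holds a JSON list with one dict per turn.
    """
    with open(conversation_file, encoding="utf-8") as handle:
        turns = json.load(handle)
    counts = {field: 0 for field in fields}
    for turn in turns:
        for field in fields:
            if field in turn:
                counts[field] += 1
    return counts
```

A count of zero for either field suggests that the corresponding script has not been run on that file, or that the file layout differs from the one assumed here.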

AllenNLP-based NER

AllenNLP-based NER/NEL scripts are present here.

String Match-based NER

String-based NER/NEL scripts are present here.
