
Computational Cultural Understanding Open Evaluation (OpenCCU) Validation and Scoring Toolkit

Version: 1.3.4

Date: April 1, 2024

Table of Contents

Overview

Setup

Directory Structure and File Format

Usage

Report a Bug

Authors

Licensing Statement

Overview

This package contains the tools to validate and score the ND (Norm Detection) task. Please refer to the OpenCCU Evaluation Plan for more information about the OpenCCU evaluation, including the task and file formats.

This README file describes the reference annotation validation tool, the system output validation tool, the scoring tool, the reference statistics computing tool, and the perfect submission generation tool.

  • Reference Annotation Validation Tool: confirms that a reference annotation set follows the LDC (Linguistic Data Consortium) OpenCCU annotation package directory structure.
  • System Output Validation Tool: confirms that a submission of system output follows the rules set in the OpenCCU Evaluation Plan.
  • Scoring Tool: scores a system output submission against a reference with a scoring index file.
  • Reference Statistics Computing Tool: computes basic statistics on the reference data for the ND task.
  • Perfect Submission Generation Tool: generates a perfect submission for the ND task.

The tools listed above are included in a Python package. They can be run from a shell terminal and have been confirmed to work under OS X and Ubuntu.

Setup

Install the Python package using the following commands:

git clone https://github.com/usnistgov/CCU_validation_scoring

cd ./CCU_validation_scoring

python3 -m pip install -e ./
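
If the installation completed successfully, the CCU_scoring command used in the Usage section below should be available (assuming pip installs console scripts to a directory on your PATH); a quick check:

# optional sanity check after installation
CCU_scoring -h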

Directory Structure and File Format

The OpenCCU validation and scoring toolkit expects input directories and/or files to have specific structures and formats. This section gives more information on the structures and formats that are referred to in subsequent sections.

The reference directory mentioned in the validation and scoring sections must follow the LDC annotation data package directory structure and, at a minimum, must contain the following files in the given directory structure to pass validation:

<reference_directory>/
     ./data/
          norms.tab
     ./docs/
          segments.tab
          file_info.tab
     ./index_files/
          <DATASET>.system_input.index.tab

where <DATASET> is the name of the dataset.

Please refer to the LDC OpenCCU annotation data package README for the formats of the above .tab files.

The toolkit includes several sample reference datasets for testing. See ccu_validation_scoring/test/reference/LDC_reference_sample or other sibling directories.
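
To see the required layout in practice, you can list the files of the sample reference dataset (the find invocation below is just one way to do this):

# list the files of the sample reference dataset (illustrative)
find ccu_validation_scoring/test/reference/LDC_reference_sample -type f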

The toolkit uses different index files for various purposes:

  • system input index file - tells the scorer which files are available for the system to process. This file is included in the OpenCCU source data package and is used in validation and generation of a sample submission. The format is described in the OpenCCU Evaluation Plan.
  • system output index file - tells the scorer which files were processed by the system. This file is generated by the user and is used in validation and scoring. It must be located inside the submission directory. The format is described in the OpenCCU Evaluation Plan.
  • scoring index file - tells the scorer which files to score, to facilitate subset scoring. This file is generated by the user and is used in scoring. The scoring index file has one column with the header file_id containing the file IDs to score, one per row.

An example of a system input index file can be found in the sample reference datasets:

ccu_validation_scoring/test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.system_input.index.tab

An example of a system output index file can be found in the sample submissions:

ccu_validation_scoring/test/pass_submissions/pass_submissions_LDC_reference_sample/ND/CCU_P1_TA1_ND_NIST_mini-eval1_20220531_050236/system_output.index.tab
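
A scoring index file is also included with the sample reference (it is passed to the -i argument in the scoring examples below). Its content is a single file_id header followed by one file ID per row; a minimal sketch with placeholder IDs:

file_id
<file_id_1>
<file_id_2>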

Usage

In the CCU_validation_scoring-x.x.x/ directory, run the following to get the toolkit version:

CCU_scoring version

CCU_scoring is an umbrella tool that has several subcommands, each with its own set of command line options. To get a list of subcommands, execute:

CCU_scoring -h

Use the -h flag on a subcommand to get that subcommand's help. For example:

CCU_scoring score-nd -h

Reference Validation Subcommand

Validate a reference annotation directory to make sure the reference directory has the required files.

CCU_scoring validate-ref -ref <reference_directory>

Required Arguments

  • -ref: reference directory

# an example of reference validation
CCU_scoring validate-ref -ref test/reference/LDC_reference_sample

Submission Validation Subcommands

Norm Detection (ND) has a subcommand to validate a system output file. Use the subcommand below to validate the format of an ND submission directory against a reference directory. The submission directory must include a system output index file.

CCU_scoring validate-nd -sf open -s <submission_directory> -ref <reference_directory>

Required Arguments

  • -sf open: submission format of OpenCCU

  • -s: submission directory

  • -ref: reference directory

# an example of submission validation
CCU_scoring validate-nd \
-sf open \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND_open/CCU_P1_TA1_ND_NIST_mini-eval1_20231130_164235 \
-ref test/reference/LDC_reference_sample

Submission Scoring Subcommands

Norm Detection (ND) Scoring Subcommand

Use the command below to score an ND submission directory against a reference directory with a scoring index file. The submission directory must include a system output index file.

CCU_scoring score-nd -sf open -s <norm_submission_directory> -ref <reference_directory> -i <scoring_index_file> -f

Required Arguments

  • -sf open: submission format of OpenCCU

  • -s: norm submission directory

  • -ref: reference directory

  • -i: scoring index file containing the file IDs of the files to score

  • -f: change the reference annotation to noann when the same annotations have different statuses

Optional Arguments

  • -n: file containing the norms used to filter which norms are scored

  • -t: comma-separated list of IoU thresholds

  • -o: output directory for the score and alignment files

# an example of norm scoring
CCU_scoring score-nd \
-sf open \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND_open/CCU_P1_TA1_ND_NIST_mini-eval1_20231130_164235 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-f
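
The optional arguments can be appended to the same command. The invocation below is illustrative: the IoU thresholds and the output directory name are arbitrary choices, not defaults from the toolkit.

# an example of norm scoring with optional arguments (threshold values and output directory are illustrative)
CCU_scoring score-nd \
-sf open \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND_open/CCU_P1_TA1_ND_NIST_mini-eval1_20231130_164235 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-f \
-t 0.2,0.4 \
-o scoring_output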

Reference Statistics Computing Tool

The following command should be run within the CCU_validation_scoring-x.x.x/ directory.

python3 scripts/ccu_ref_analysis.py -r <reference_directory> -t <task_string> -i <scoring_index_file> -o <output_file> -f

Required Arguments

  • -r: reference directory

  • -t norms: task

  • -i: scoring index file containing the file IDs of the files to score

  • -o: file where the statistics will be written

  • -f: change the reference annotation to noann when the same annotations have different statuses

Optional Arguments

  • -xR: character gap used when merging text reference instances

  • -aR: gap in seconds used when merging time reference instances

  • -vR: how to handle the adhere/violate labels when merging reference norm instances. "class" merges on the class label only (ignoring status); "class-status" merges on both the class and status labels

# an example of statistics computing
python3 scripts/ccu_ref_analysis.py -r test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp.tab \
-f
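
The merging options can be combined with the same command. The gap values below are purely illustrative, and the -vR setting is one of the two documented choices:

# an example of statistics computing with instance-merging options (gap values are illustrative)
python3 scripts/ccu_ref_analysis.py -r test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp.tab \
-f \
-xR 100 \
-aR 2 \
-vR class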

Perfect Submission Generation Tool

The following command should be run within the CCU_validation_scoring-x.x.x/ directory.

python3 scripts/generate_perfect_submission.py -sf open -ref <reference_directory> -t norms -i <scoring_index_file> -o <output_directory> -f

Required Arguments

  • -sf open: submission format of OpenCCU

  • -ref: reference directory

  • -t norms: task

  • -i: scoring index file containing the file IDs of the files to score

  • -o: output directory that will contain the generated perfect submission

  • -f: change the reference annotation to noann when the same annotations have different statuses

# an example of perfect submission generation
python3 scripts/generate_perfect_submission.py -sf open -ref test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp \
-f
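
A common way to sanity-check the scorer is to score the generated perfect submission against the same reference; assuming the generated directory is a complete submission (including the system output index file), the scorer should report a perfect result.

# score the generated perfect submission (illustrative sanity check; tmp is the output directory from the example above)
CCU_scoring score-nd \
-sf open \
-s tmp \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-f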

Report a Bug

Please send bug reports to [email protected]

For the bug report to be useful, please include the command line, the input files, and the text output, including the error message, in your email.

A test suite has been developed and is runnable using the following command within the CCU_validation_scoring-x.x.x/ directory:

pytest

This will run the tests against a set of submissions and reference files available under test.

Authors

Jennifer Yu <[email protected]>

Clyburn Cunningham <[email protected]>

Lukas Diduch <[email protected]>

Jonathan Fiscus <[email protected]>

Audrey Tong <[email protected]>

Licensing Statement

Full details can be found at: http://nist.gov/data/license.cfm

NIST-developed software is provided by NIST as a public service. You may use,
copy, and distribute copies of the software in any medium, provided that you
keep intact this entire notice. You may improve, modify, and create derivative
works of the software or any portion of the software, and you may copy and
distribute such modifications or works. Modified works should carry a notice
stating that you changed the software and should note the date and nature of
any such change. Please explicitly acknowledge the National Institute of
Standards and Technology as the source of the software. 

NIST-developed software is expressly provided "AS IS." NIST MAKES NO WARRANTY
OF ANY KIND, EXPRESS, IMPLIED, IN FACT, OR ARISING BY OPERATION OF LAW,
INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, AND DATA ACCURACY. NIST NEITHER
REPRESENTS NOR WARRANTS THAT THE OPERATION OF THE SOFTWARE WILL BE
UNINTERRUPTED OR ERROR-FREE, OR THAT ANY DEFECTS WILL BE CORRECTED. NIST DOES
NOT WARRANT OR MAKE ANY REPRESENTATIONS REGARDING THE USE OF THE SOFTWARE OR
THE RESULTS THEREOF, INCLUDING BUT NOT LIMITED TO THE CORRECTNESS, ACCURACY,
RELIABILITY, OR USEFULNESS OF THE SOFTWARE.

You are solely responsible for determining the appropriateness of using and
distributing the software and you assume all risks associated with its use,
including but not limited to the risks and costs of program errors, compliance
with applicable laws, damage to or loss of data, programs or equipment, and the
unavailability or interruption of operation. This software is not intended to
be used in any situation where a failure could cause risk of injury or damage
to property. The software developed by NIST employees is not subject to
copyright protection within the United States.