Version: 1.3.4
Date: April 1, 2024
This package contains the tools to validate and score the ND (norm detection) task. Please refer to the OpenCCU Evaluation Plan for more information about the OpenCCU evaluation, including the task and file formats.
This README describes five tools: a reference annotation validation tool, a system output validation tool, a scoring tool, a reference statistics computing tool, and a perfect submission generation tool.
- Reference Annotation Validation Tool: confirms that a reference annotation set follows the LDC (Linguistic Data Consortium) OpenCCU annotation package directory structure.
- System Output Validation Tool: confirms that a submission of system output follows the rules set in the OpenCCU Evaluation Plan.
- Scoring Tool: scores a system output submission against a reference with a scoring index file.
- Reference Statistics Computing Tool: computes basic statistics on the reference data for the ND task.
- Perfect Submission Generation Tool: generates a perfect submission for the ND task.
The tools mentioned above are included as a Python package. They can be run from a shell terminal and have been confirmed to work under OS X and Ubuntu. The following dependencies are required:
- Python >= 3.8.6
- Pandas >= 2.0.3
- Pathlib >= 1.0.1
- Numpy >= 1.22.3
- Pytest >= 7.1.3
- matplotlib >= 3.5.2
Install the Python package using the following commands:
git clone https://github.com/usnistgov/ccu_validation_scoring
cd ./CCU_validation_scoring
python3 -m pip install -e ./
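To confirm that the installation succeeded and that the CCU_scoring command-line tool is on your PATH, you can print its version (the version subcommand is described further below):
CCU_scoring version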
Directory Structure and File Format
The OpenCCU validation and scoring toolkit expects input directories and/or files to have specific structures and formats. This section gives more information on the structures and formats referred to in subsequent sections.
The reference directory mentioned in the validation and scoring sections must follow the LDC annotation data package directory structure and, at a minimum, must contain the following files to pass validation:
<reference_directory>/
./data/
norms.tab
./docs/
segments.tab
file_info.tab
./index_files/
<DATASET>.system_input.index.tab
where <DATASET> is the name of the dataset. Please refer to the LDC OpenCCU annotation data package README for the formats of the above .tab files.
The toolkit includes several sample reference datasets for testing. See ccu_validation_scoring/test/reference/LDC_reference_sample or other sibling directories.
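As a quick sanity check (a sketch, not part of the toolkit), the commands below list the contents of the sample reference directory so they can be compared against the required structure above; they assume you run them from the directory that contains test/, as in the examples later in this README:
# listing the files in the sample reference directory
ls test/reference/LDC_reference_sample/data
ls test/reference/LDC_reference_sample/docs
ls test/reference/LDC_reference_sample/index_files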
The toolkit uses different index files for various purposes:
- system input index file: tells the scorer which files are available for the system to process. This file is included in the OpenCCU source data package and is used in validation and in the generation of a sample submission. The format is described in the OpenCCU Evaluation Plan.
- system output index file: tells the scorer which files were processed by the system. This file is generated by the user and is used in validation and scoring. It must be located inside the submission directory. The format is described in the OpenCCU Evaluation Plan.
- scoring index file: tells the scorer which files to score, to facilitate subset scoring. This file is generated by the user and is used in scoring. The scoring index file has one column with the header file_id containing the file IDs to score, one per row.
An example of a system input index file can be found in the sample reference datasets:
ccu_validation_scoring/test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.system_input.index.tab
An example of a system output index file can be found in the sample submissions:
ccu_validation_scoring/test/pass_submissions/pass_submissions_LDC_reference_sample/ND/CCU_P1_TA1_ND_NIST_mini-eval1_20220531_050236/system_output.index.tab
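No sample scoring index file is referenced here, so the lines below are only an illustrative sketch of the format described above; the angle-bracketed file IDs are placeholders and must be replaced with IDs from the dataset being scored:
file_id
<file_id_1>
<file_id_2>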
In the CCU_validation_scoring-x.x.x/ directory, run the following to get the version and usage:
CCU_scoring version
CCU_scoring is an umbrella tool that has several subcommands, each with its own set of command line options. To get a list of subcommands, execute:
CCU_scoring -h
Use the -h flag on a subcommand to get that subcommand's help manual. For example:
CCU_scoring score-nd -h
Validate a reference annotation directory to make sure it contains the required files.
CCU_scoring validate-ref -ref <reference_directory>
Required Arguments
-ref: reference directory
# an example of reference validation
CCU_scoring validate-ref -ref test/reference/LDC_reference_sample
Norm Detection (ND) has a subcommand to validate a system output file. Use the subcommand below to validate the format of an ND submission directory against a reference directory. The submission directory must include a system output index file.
CCU_scoring validate-nd -sf open -s <submission_directory> -ref <reference_directory>
Required Arguments
-sf open: submission format of OpenCCU
-s: submission directory
-ref: reference directory
# an example of submission validation
CCU_scoring validate-nd \
-sf open \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND_open/CCU_P1_TA1_ND_NIST_mini-eval1_20231130_164235 \
-ref test/reference/LDC_reference_sample
Norm Detection (ND) Scoring Subcommand
Use the command below to score an ND submission directory against a reference directory with a scoring index file. The submission directory must include a system output index file.
CCU_scoring score-nd -sf open -s <norm_submission_directory> -ref <reference_directory> -i <scoring_index_file> -f
Required Arguments
-sf open: submission format of OpenCCU
-s: norm submission directory
-ref: reference directory
-i: file containing the file IDs of the scoring datasets
-f: change reference annotation to noann when there are the same annotations but different status
Optional Arguments
-n: file containing the norms used to filter norms from scoring
-t: comma-separated list of IoU thresholds
-o: output directory containing the score and alignment file
# an example of norm scoring
CCU_scoring score-nd \
-sf open \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND_open/CCU_P1_TA1_ND_NIST_mini-eval1_20231130_164235 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-f
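The optional arguments can be combined with the same required arguments for subset scoring. The sketch below is illustrative only: my_subset.index.tab and scoring_output are hypothetical names for a user-created scoring index file and an output directory, and the IoU thresholds are arbitrary example values.
# a hypothetical example of subset scoring with optional arguments
CCU_scoring score-nd \
-sf open \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND_open/CCU_P1_TA1_ND_NIST_mini-eval1_20231130_164235 \
-ref test/reference/LDC_reference_sample \
-i my_subset.index.tab \
-t 0.2,0.4,0.6 \
-o scoring_output \
-f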
Reference Statistics Computing Tool
The following command, which computes basic statistics on the reference data, should be run within the CCU_validation_scoring-x.x.x/ directory.
python3 scripts/ccu_ref_analysis.py -r <reference_directory> -t <task_string> -i <scoring_index_file> -o <output_file> -f
Required Arguments
-r: reference directory
-t norms: task
-i: file containing the file IDs of the scoring datasets
-o: file where the statistics will be output
-f: change reference annotation to noann when there are the same annotations but different status
Optional Arguments
-xR: gap in characters for merging text reference instances
-aR: gap in seconds for merging time reference instances
-vR: defines how to handle the adhere/violate labels when merging reference norm instances: "class" merges using the class label only (ignoring status) and "class-status" merges using both the class and status labels
# an example of statistics computing
python3 scripts/ccu_ref_analysis.py -r test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp.tab \
-f
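The optional merging arguments can be added to the same command. The values below are arbitrary placeholders chosen only to illustrate the flags (a 100-character gap, a 2-second gap, and class-only merging); they are not recommended settings.
# a hypothetical example of statistics computing with the optional merging arguments
python3 scripts/ccu_ref_analysis.py -r test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp.tab \
-xR 100 \
-aR 2 \
-vR class \
-f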
Perfect Submission Generation Tool
The following command, which generates a perfect submission for the ND task, should be run within the CCU_validation_scoring-x.x.x/ directory.
python3 scripts/generate_perfect_submission.py -sf open -ref <reference_directory> -t norms -i <scoring_index_file> -o <output_directory> -f
Required Arguments
-sf open: submission format of OpenCCU
-ref: reference directory
-t norms: task
-i: file containing the file IDs of the scoring datasets
-o: output directory containing a perfect submission
-f: change reference annotation to noann when there are the same annotations but different status
# an example of perfect submission generation
python3 scripts/generate_perfect_submission.py -sf open -ref test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp \
-f
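A generated perfect submission can be used as an end-to-end check of the scoring pipeline. The sketch below assumes the example above wrote the submission, including its system output index file, to the tmp directory; scoring it against the same reference and scoring index should then produce perfect scores.
# a hypothetical example: scoring the generated perfect submission
CCU_scoring score-nd \
-sf open \
-s tmp \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-f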
Please send bug reports to [email protected]. For the bug report to be useful, please include the command line, the files, and the text output, including the error message, in your email.
A test suite has been developed and is runnable using the following command within the CCU_validation_scoring-x.x.x/ directory:
pytest
This will run the tests against a set of submissions and reference files available under the test directory.
Jennifer Yu <[email protected]>
Clyburn Cunningham <[email protected]>
Lukas Diduch <[email protected]>
Jonathan Fiscus <[email protected]>
Audrey Tong <[email protected]>
Full details can be found at: http://nist.gov/data/license.cfm
NIST-developed software is provided by NIST as a public service. You may use,
copy, and distribute copies of the software in any medium, provided that you
keep intact this entire notice. You may improve, modify, and create derivative
works of the software or any portion of the software, and you may copy and
distribute such modifications or works. Modified works should carry a notice
stating that you changed the software and should note the date and nature of
any such change. Please explicitly acknowledge the National Institute of
Standards and Technology as the source of the software.
NIST-developed software is expressly provided "AS IS." NIST MAKES NO WARRANTY
OF ANY KIND, EXPRESS, IMPLIED, IN FACT, OR ARISING BY OPERATION OF LAW,
INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, AND DATA ACCURACY. NIST NEITHER
REPRESENTS NOR WARRANTS THAT THE OPERATION OF THE SOFTWARE WILL BE
UNINTERRUPTED OR ERROR-FREE, OR THAT ANY DEFECTS WILL BE CORRECTED. NIST DOES
NOT WARRANT OR MAKE ANY REPRESENTATIONS REGARDING THE USE OF THE SOFTWARE OR
THE RESULTS THEREOF, INCLUDING BUT NOT LIMITED TO THE CORRECTNESS, ACCURACY,
RELIABILITY, OR USEFULNESS OF THE SOFTWARE.
You are solely responsible for determining the appropriateness of using and
distributing the software and you assume all risks associated with its use,
including but not limited to the risks and costs of program errors, compliance
with applicable laws, damage to or loss of data, programs or equipment, and the
unavailability or interruption of operation. This software is not intended to
be used in any situation where a failure could cause risk of injury or damage
to property. The software developed by NIST employees is not subject to
copyright protection within the United States.