Updated docs on how the documents and annotation are now exported
twinkarma authored and ianroberts committed Feb 26, 2024
1 parent 3350bd8 commit 901bbf7
Showing 1 changed file with 21 additions and 19 deletions.
40 changes: 21 additions & 19 deletions docs/docs/manageradminguide/documents_annotations_management.md
@@ -187,12 +187,11 @@ possible to determine which documents were annotated by _the same person_, just
You can choose how documents are exported:

* `.json` & `.jsonl` - JSON or JSON Lines files can be generated in the format of:
-* `raw` - Exports unmodified JSON. If you've originally uploaded in GATE format then choose this option.
-
-An additional field named `annotation_sets` is added for storing annotations. The annotations are laid out in the
-same way as GATE JSON format. For example if a document has been annotated by `user1` with labels and values
-`text`:`Annotation text`, `radio`:`val3`, and `checkbox`:`["val2", "val4"]`, the non-anonymous export might look
-like this:
+* `raw` - Exports the original `JSON` combined with an additional field named `annotation_sets` for storing
+annotations. The annotations are laid out in the same way as GATE
+[bdocjs](https://gatenlp.github.io/gateplugin-Format_Bdoc/bdoc_document.html) format. For example if a document
+has been annotated by `user1` with labels and values `text`:`Annotation text`, `radio`:`val3`, and
+`checkbox`:`["val2", "val4"]`, the non-anonymous export might look like this:

```json
{
@@ -210,14 +209,12 @@ You can choose how documents are exported:
"end":10,
"id":0,
"features":{
-"label":{
-"text":"Annotation text",
-"radio":"val3",
-"checkbox":[
-"val2",
-"val4"
-]
-}
+"text":"Annotation text",
+"radio":"val3",
+"checkbox":[
+"val2",
+"val4"
+]
}
}
],
@@ -232,16 +229,18 @@ You can choose how documents are exported:
}
```

-In anonymous mode the name `user1` would instead be the user's opaque numeric identifier (e.g. `105`).
+In anonymous mode the name `user1` would instead be derived from the user's opaque numeric identifier (e.g.
+`annotator105`).

The field `teamware_status` gives the usernames or anonymous IDs (depending on the "anonymize" setting) of those annotators
who rejected the document, "timed out" because they did not complete their annotation in the time allowed by the
project, or "aborted" for some other reason (e.g. they were removed from the project).

-* `gate` - Convert documents to GATE JSON format and export. A `name` field is added that takes the ID value from the
-ID field specified in the project configuration. Fields apart from `text` and the ID field specified in the project
-config are placed in the `features` field, as is the `teamware_status` information. An `annotation_sets` field is
-added for storing annotations.
+* `gate` - Convert documents to GATE [bdocjs](https://gatenlp.github.io/gateplugin-Format_Bdoc/bdoc_document.html)
+format and export. A `name` field is added that takes the `ID` value from the `ID field` specified in the
+**project configuration**. Any top-level fields apart from `text`, `features`, `offset_type`, `annotation_sets`,
+and the ID field specified in the project config are placed in the `features` field, as is the `teamware_status`
+information. An `annotation_sets` field is added for storing annotations if it doesn't already exist.

For example in the case of this uploaded JSON document:
```json
@@ -271,6 +270,9 @@ You can choose how documents are exported:
columns with the header of `annotations.username.label` and the status information is in columns named
`teamware_status.rejected_by`, `teamware_status.timed_out` and `teamware_status.aborted`.
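
The `gate` export option described above amounts to a small restructuring of each document. The following Python sketch illustrates that description only; it is not Teamware's implementation. The function name `to_gate_layout`, the `id_field` parameter, the decision to keep any `features` already present, and the sample document are all assumptions made for the example.

```python
# Top-level fields that the description says stay in place rather than
# being moved into `features`.
RESERVED_FIELDS = {"text", "features", "offset_type", "annotation_sets"}


def to_gate_layout(doc, id_field="id"):
    """Rearrange an uploaded JSON document along the lines of the `gate` option.

    Hypothetical helper: `id_field` stands in for the ID field chosen in the
    project configuration.
    """
    # Assumption: features already present in the document are preserved.
    features = dict(doc.get("features", {}))

    # Every other top-level field (including `teamware_status`, if present)
    # is moved into `features`.
    for key, value in doc.items():
        if key not in RESERVED_FIELDS and key != id_field:
            features[key] = value

    gate_doc = {
        "name": doc.get(id_field),
        "text": doc.get("text", ""),
        "features": features,
        # `annotation_sets` is only created when the document lacks one.
        "annotation_sets": doc.get("annotation_sets", {}),
    }
    # `offset_type` is carried over only if the upload already had it.
    if "offset_type" in doc:
        gate_doc["offset_type"] = doc["offset_type"]
    return gate_doc


# Hypothetical uploaded document with one extra top-level field.
uploaded = {
    "id": 32,
    "text": "Document text",
    "source": "example.txt",
    "teamware_status": {"rejected_by": [], "timed_out": [], "aborted": []},
}
converted = to_gate_layout(uploaded)
print(sorted(converted["features"]))  # ['source', 'teamware_status']
```

The ID value ends up in `name` and is not copied into `features`, which matches the wording of the `gate` bullet.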

+**Note: Documents that contain existing annotations (i.e. the `annotation_sets` field for `JSON` or `annotations` for `CSV`) are merged with the new sets of annotations. Be aware that if the document has a new annotation from an annotator with the same
+username, the previous annotation will be overwritten. Existing annotations are also not anonymized when exporting the document.**
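
The merge behaviour in the note can be pictured as a dictionary update keyed by annotation set name. This is a minimal Python sketch under that assumption, not Teamware's code; `merge_annotation_sets`, `make_set`, the `Document` annotation type, and the sample set names (`imported`, `user1`) are hypothetical.

```python
def merge_annotation_sets(existing, new):
    """Return existing sets overlaid with new ones; same-named sets are replaced."""
    merged = dict(existing)  # copy so the uploaded document is left untouched
    merged.update(new)       # a new set with the same name replaces the old one
    return merged


def make_set(name, radio_value):
    """Build a one-annotation set in a trimmed-down bdocjs-style layout."""
    return {
        "name": name,
        "annotations": [
            {
                "type": "Document",  # assumed type name, for illustration only
                "start": 0,
                "end": 10,
                "id": 0,
                "features": {"radio": radio_value},
            }
        ],
        "next_annid": 1,
    }


# Annotation sets that were already in the uploaded document.
existing_sets = {
    "imported": make_set("imported", "val1"),
    "user1": make_set("user1", "val2"),
}
# Annotation sets produced during the project.
new_sets = {"user1": make_set("user1", "val3")}

merged = merge_annotation_sets(existing_sets, new_sets)
print(sorted(merged))  # ['imported', 'user1']
print(merged["user1"]["annotations"][0]["features"]["radio"])  # val3
```

Because the update is keyed by name, the earlier `user1` set is lost while `imported` survives unchanged, which is the overwrite the note warns about.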

## Deleting documents and annotations

It is possible to click on the top left corner of documents and annotations to select them, then click on the