Updated docs on how the documents and annotation are now exported
twinkarma committed May 26, 2023
1 parent 0eff4f6 commit 72f848d
Showing 1 changed file with 15 additions and 15 deletions.
30 changes: 15 additions & 15 deletions docs/docs/manageradminguide/documents_annotations_management.md
@@ -181,10 +181,8 @@ Documents and annotations can be exported using the **Export** button. A zip fil
documents each. You can choose how documents are exported:

* `.json` & `.jsonl` - JSON or JSON Lines files can be generated in the format of:
- * `raw` - Exports unmodified JSON. If you've originally uploaded in GATE format then choose this option.
-
- An additional field named `annotation_sets` is added for storing annotations. The annotations are laid out in the
- same way as GATE JSON format. For example if a document has been annotated by `user1` with labels and values
+ * `raw` - Exports the original `JSON` combined with an additional field named `annotation_sets` for storing annotations. The annotations are laid out in the
+ same way as GATE [bdocjs](https://gatenlp.github.io/gateplugin-Format_Bdoc/bdoc_document.html) format. For example if a document has been annotated by `user1` with labels and values
`text`:`Annotation text`, `radio`:`val3`, and `checkbox`:`["val2", "val4"]`:

```json
@@ -203,14 +201,12 @@ documents each. You can choose how documents are exported:
"end":10,
"id":0,
"features":{
- "label":{
- "text":"Annotation text",
- "radio":"val3",
- "checkbox":[
- "val2",
- "val4"
- ]
- }
+ "text":"Annotation text",
+ "radio":"val3",
+ "checkbox":[
+ "val2",
+ "val4"
+ ]
}
}
],
Expand All @@ -220,9 +216,10 @@ documents each. You can choose how documents are exported:
}
```
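
As a rough illustration of reading a `raw` export in the updated layout, the sketch below collects each annotator's labels from `annotation_sets`. The sample document is hypothetical: the `user1` key, the `annotations` and `next_annid` names, and the `type`, `start`, and `id` values are assumptions modelled on the example above and on the bdocjs layout, not verified output from the exporter. The helper `labels_by_annotator` is likewise an illustrative name.

```python
import json

# Hypothetical export shaped after the example above. Key names and values
# that the example does not show are assumptions.
exported = json.loads("""
{
  "id": 32,
  "text": "Document text",
  "annotation_sets": {
    "user1": {
      "name": "user1",
      "annotations": [
        {
          "type": "Document",
          "start": 0,
          "end": 10,
          "id": 0,
          "features": {
            "text": "Annotation text",
            "radio": "val3",
            "checkbox": ["val2", "val4"]
          }
        }
      ],
      "next_annid": 1
    }
  }
}
""")


def labels_by_annotator(doc):
    """Map each annotation set name to the list of its annotations' features."""
    result = {}
    for set_name, ann_set in doc.get("annotation_sets", {}).items():
        result[set_name] = [
            ann.get("features", {}) for ann in ann_set.get("annotations", [])
        ]
    return result


labels = labels_by_annotator(exported)
print(labels["user1"][0]["radio"])  # prints "val3"
```

Note that the labels are read directly from `features`; under the older layout shown in the removed lines they were nested one level deeper under `label`.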

- * `gate` - Convert documents to GATE JSON format and export. A `name` field is added that takes the ID value from the
- ID field specified in the project configuration. Fields apart from `text` and the ID field specified in the project
- config are placed in the `features` field. An `annotation_sets` field is added for storing annotations.
+ * `gate` - Convert documents to GATE [bdocjs](https://gatenlp.github.io/gateplugin-Format_Bdoc/bdoc_document.html) format and export.
+ A `name` field is added that takes the `ID` value from the
+ `ID field` specified in the **project configuration**. Any top-level fields apart from `text`, `features`, `offset_type`, `annotation_sets`, and the ID field specified in the project
+ config are placed in the `features` field. An `annotation_sets` field is added for storing annotations if it doesn't already exist.
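
The field placement rule in the updated `gate` description can be sketched in Python as follows. This is a minimal illustration written from the wording above, not the project's actual export code: the function name `to_gate_document`, the `id_field="id"` default, and the empty-string fallback for a missing `text` are all assumptions.

```python
# Top-level keys that the description says stay in place rather than moving.
RESERVED_KEYS = {"text", "features", "offset_type", "annotation_sets"}


def to_gate_document(doc, id_field="id"):
    """Illustrative sketch of the described conversion, not the exporter itself.

    `id_field` stands in for the ID field set in the project configuration.
    """
    converted = {
        "text": doc.get("text", ""),  # empty-string default is an assumption
        "features": dict(doc.get("features", {})),
        # Reuse an existing `annotation_sets`, otherwise add an empty one.
        "annotation_sets": doc.get("annotation_sets", {}),
    }
    if "offset_type" in doc:
        converted["offset_type"] = doc["offset_type"]
    if id_field in doc:
        converted["name"] = doc[id_field]
    # Every other top-level field moves into `features`.
    for key, value in doc.items():
        if key not in RESERVED_KEYS and key != id_field:
            converted["features"][key] = value
    return converted


# Hypothetical uploaded document used only to exercise the sketch.
uploaded = {"id": 32, "text": "Document text", "feature1": "Feature text"}
converted = to_gate_document(uploaded)
print(converted)
```

Copying `features` before adding to it keeps the uploaded document untouched, which makes the sketch safe to run repeatedly on the same input.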

For example in the case of this uploaded JSON document:
```json
@@ -249,6 +246,9 @@ documents each. You can choose how documents are exported:
* `.csv` - The JSON documents will be flattened into CSV's column-based format. Annotations are added as additional
columns with the header of `annotations.username.label`.

+ **Note: Documents that contain existing annotations (i.e. the `annotation_sets` field for `JSON` or `annotations` for `CSV`) are merged with the new sets of annotations. Be aware that if the document has a new annotation from an annotator with the same
+ username, the previous annotation will be overwritten. Existing annotations are also not anonymized when exporting the document.**

## Deleting documents and annotations

It is possible to click on the top left corner of documents and annotations to select them, then click on the