[folia2salt] Question: are List annotations supported yet? #45

Open
pirolen opened this issue Oct 7, 2021 · 4 comments

Comments


pirolen commented Oct 7, 2021

In LaMachine I tried out folia2salt, but I got:

Exception: Unable to init layer for element <ListItem at 140407838899560 id=FA-MBK-4-3_035245008_0020_abpproc_partransf.text.1.div.1.p.1.list.1.item.1 set=None class=None>

I wonder if List annotations are supported.

Owner

proycon commented Oct 7, 2021

The folia2salt implementation is more of a first proof of concept at this stage, so expect some things not to be implemented yet. I don't think anybody has seriously used it yet. You can follow the status in this issue: proycon/folia#85. I'm not working on it at the moment, though, as I don't think anybody is currently interested in it anymore.

Author

pirolen commented Oct 7, 2021

In that case I'd rather try to find a workaround using another FoLiA converter.

Owner

proycon commented Oct 7, 2021

What's the aim you're trying to achieve?

Author

pirolen commented Oct 8, 2021

I am investigating options for a future workflow, and checking compatibility between FoLiA and INCEpTION (https://inception-project.github.io/documentation/) that imports/exports using formats like Weblicht TCF and UIMA CAS. The use case at hand seems to require very meticulous and repetitive manual annotations, basically on each token, just like POS tagging, but with domain-specific entities.

I am looking at ways to set this up in LaMachine/FLAT and/or INCEpTION/cassis (https://github.com/dkpro/dkpro-cassis).
The INCEpTION tool would be handy for active learning of entities, suggesting new entity annotations on the fly. Another handy feature is entity linking to knowledge bases, which this project would need at some point.
I suspect it is more sustainable for me to stay in LaMachine/FLAT/foliapy and see if I can provide the users with similar functionality, even if not on the fly, e.g. by regularly generating gazetteers and NER taggers and pre-processing new documents with them. What do you think?
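
To make the gazetteer pre-processing idea concrete, here is a minimal sketch in plain Python. It assumes the documents are already tokenized and that the gazetteer is a simple mapping from lowercased token sequences to entity labels. The function name `tag_with_gazetteer`, the sample entries, and the labels are all hypothetical. It does not use FoLiA, FLAT, or INCEpTION APIs; writing the resulting tags back as entity annotations is a separate step not shown here.

```python
from typing import Dict, List, Tuple

# Hypothetical gazetteer type: lowercased token sequences mapped to labels.
Gazetteer = Dict[Tuple[str, ...], str]


def tag_with_gazetteer(tokens: List[str], gazetteer: Gazetteer) -> List[str]:
    """Return one BIO tag per token, using greedy longest-match lookup."""
    max_len = max((len(key) for key in gazetteer), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-token entries win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            label = gazetteer.get(span)
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Illustrative data only; a real gazetteer would be generated from
# previously annotated documents.
gazetteer = {("new", "york"): "LOC", ("york",): "LOC", ("acme", "corp"): "ORG"}
tokens = ["She", "left", "New", "York", "for", "Acme", "Corp", "."]
tags = tag_with_gazetteer(tokens, gazetteer)
print(list(zip(tokens, tags)))
```

Longest-match lookup is chosen here so that a multi-token entry such as "New York" takes precedence over the single-token entry "York". The suggestions would still need manual review in the annotation tool.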

From a brief test, anyway, it seems that the folia2html output would likely encode sufficient information to feed into INCEpTION. I'll write separately about converting its export to FoLiA. I have tried TEI and UIMA CAS so far...

Any suggestions are most welcome, also in separate threads. Thank you very much!
