German output has no chunker information #30

Open
fotisj opened this issue Jun 2, 2017 · 1 comment
fotisj commented Jun 2, 2017

At the moment the German pipeline does not support chunking, probably because OpenNLP has no chunking model for German. However, TreeTagger supports chunking for German, and the DKPro-Wrapper supports TreeTagger, so it should be possible to integrate chunking. It would be great to have this included in the next update :-)

fotisj changed the title from "German out has no chunker information" to "German output has no chunker information" on Jun 2, 2017
thvitt self-assigned this on Jun 6, 2017

thvitt commented Jun 6, 2017

Although the TreeTagger chunker is already part of the wrapper, running it fails with this exception:

2017-06-06T12:17:37,729 ERROR   org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl       Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:401) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:308) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269) [ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:150) [ddw-0.4.7-SNAPSHOT.jar:?]
        at de.tudarmstadt.ukp.dariah.pipeline.RunPipeline.main(RunPipeline.java:645) [ddw-0.4.7-SNAPSHOT.jar:?]
Caused by: java.lang.NullPointerException
        at org.annolab.tt4j.TreeTaggerWrapper.removeProblematicTokens(TreeTaggerWrapper.java:707) ~[ddw-0.4.7-SNAPSHOT.jar:?]
        at org.annolab.tt4j.TreeTaggerWrapper.process(TreeTaggerWrapper.java:579) ~[ddw-0.4.7-SNAPSHOT.jar:?]
        at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerChunker.process(TreeTaggerChunker.java:293) ~[ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48) ~[ddw-0.4.7-SNAPSHOT.jar:?]
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385) ~[ddw-0.4.7-SNAPSHOT.jar:?]
        ... 8 more

for the following config:

useChunker = true
chunker = de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerChunker
chunkerArguments = executablePath,string,/opt/tree-tagger/bin/tree-tagger,\
        modelLocation,string,/opt/tree-tagger/lib/german-chunker.par,\
        modelEncoding,string,utf-8
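
For anyone trying to reproduce this, it can help to see which parameters the `chunkerArguments` value actually carries once the line continuations are joined. The following is only a rough sketch, not the wrapper's own parser: it assumes the value is a flat list of `name,type,value` triples (as the config above suggests) and that a trailing backslash continues the line the way Java `.properties` files do. The function name `parse_chunker_arguments` is made up for illustration.

```python
import re

def parse_chunker_arguments(raw):
    """Split a 'name,type,value,...' string into {name: (type, value)}.

    Assumption: the value is a flat sequence of triples, as in the config
    shown in this issue. This is not the wrapper's actual parsing code.
    """
    # Join continuation lines: backslash + newline + leading whitespace
    # of the next line are dropped, as in Java .properties files.
    joined = re.sub(r"\\\r?\n[ \t]*", "", raw)
    fields = [field.strip() for field in joined.split(",")]
    if len(fields) % 3 != 0:
        raise ValueError("expected name,type,value triples, got %d fields" % len(fields))
    arguments = {}
    for i in range(0, len(fields), 3):
        name, kind, value = fields[i:i + 3]
        arguments[name] = (kind, value)
    return arguments

# The value from the config above, copied verbatim (raw string keeps the backslashes).
CONFIG_VALUE = r"""executablePath,string,/opt/tree-tagger/bin/tree-tagger,\
        modelLocation,string,/opt/tree-tagger/lib/german-chunker.par,\
        modelEncoding,string,utf-8"""

if __name__ == "__main__":
    for name, (kind, value) in parse_chunker_arguments(CONFIG_VALUE).items():
        print(name, kind, value)
```

Under these assumptions the config yields three string parameters: `executablePath`, `modelLocation` and `modelEncoding`. A value that contains a comma would break this naive split, so treat it as a debugging aid only.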
