
FREME-NER datasets for training different classifier implementations #179

Open
reckart opened this issue Feb 21, 2017 · 5 comments


reckart commented Feb 21, 2017

Are the FREME-NER datasets available to train alternative classifier implementations, e.g. Apache OpenNLP NER?


m1ci commented Feb 21, 2017

Hi, yes, we trained on the DBpedia abstracts dataset, see: http://wiki.dbpedia.org/nif-abstract-datasets
The data is in the NIF format, so you'll need to write a small script which reads NIF and creates the training input. That is how we did it: we wrote a script that converts NIF into Stanford NER input.
It would be great to have a generic NIF-to-anything training-input converter.
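
Not the actual FREME script, but a minimal Jena-based sketch of what such a NIF-reading step could look like: it extracts every nif:anchorOf mention together with its begin offset and linked resource. The file name is a placeholder, and producing Stanford NER training data would additionally require tokenizing the nif:isString of the reference context and projecting the mention offsets onto tokens.

```java
import org.apache.jena.rdf.model.*;
import org.apache.jena.riot.RDFDataMgr;

public class NifMentionExtractor {
    static final String NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#";
    static final String ITSRDF = "http://www.w3.org/2005/11/its/rdf#";

    public static void main(String[] args) {
        // Placeholder file name; any NIF Turtle file from the abstracts dump.
        Model model = RDFDataMgr.loadModel("abstracts.ttl");

        Property anchorOf = model.createProperty(NIF, "anchorOf");
        Property beginIndex = model.createProperty(NIF, "beginIndex");
        Property taIdentRef = model.createProperty(ITSRDF, "taIdentRef");

        // Every resource with nif:anchorOf is a mention; print surface form, offset, link target.
        StmtIterator it = model.listStatements(null, anchorOf, (RDFNode) null);
        while (it.hasNext()) {
            Statement s = it.next();
            Resource mention = s.getSubject();
            Statement begin = mention.getProperty(beginIndex);
            Statement link = mention.getProperty(taIdentRef);
            System.out.println(s.getString()
                    + "\t" + (begin != null ? begin.getLiteral().getLexicalForm() : "?")
                    + "\t" + (link != null ? link.getResource().getURI() : "-"));
        }
    }
}
```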


reckart commented Feb 21, 2017

DKPro Core might help you out here :)

  • We have a NIF reader - I have tested it on some NIF samples I found on the net, but not on the DBpedia datasets. Also, it is presently only available in SNAPSHOT builds.
  • We have writers for all kinds of formats.
  • It is pretty straightforward to create a script to convert from one format to another (see the sketch after this list).
  • We have even started adding some training components, e.g. for Stanford NER and OpenNLP NER.
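
A rough sketch of what such a conversion could look like with DKPro Core and uimaFIT: read NIF, add tokens and sentences with a segmenter, and write CoNLL 2002 (a common NER training format). Class and parameter names follow the usual DKPro Core conventions but should be treated as assumptions, especially since the NIF reader was only available in SNAPSHOT builds at the time; paths are placeholders.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReaderDescription;

import org.apache.uima.fit.pipeline.SimplePipeline;

import de.tudarmstadt.ukp.dkpro.core.io.conll.Conll2002Writer;
import de.tudarmstadt.ukp.dkpro.core.io.nif.NifReader;
import de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpSegmenter;

public class Nif2Conll {
    public static void main(String[] args) throws Exception {
        SimplePipeline.runPipeline(
                // Read NIF Turtle files from a placeholder directory.
                createReaderDescription(NifReader.class,
                        NifReader.PARAM_SOURCE_LOCATION, "abstracts/",
                        NifReader.PARAM_PATTERNS, "*.ttl",
                        NifReader.PARAM_LANGUAGE, "en"),
                // NIF files may not carry token/sentence annotations, so segment first.
                createEngineDescription(OpenNlpSegmenter.class),
                // Write token-per-line CoNLL 2002 with BIO named entity labels.
                createEngineDescription(Conll2002Writer.class,
                        Conll2002Writer.PARAM_TARGET_LOCATION, "conll-out/"));
    }
}
```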


m1ci commented Feb 21, 2017

Glad to hear that! Will definitely look into it.


m1ci commented Feb 22, 2017

@reckart we have just released the latest version of the DBpedia abstracts for several languages, see http://downloads.dbpedia.org/2016-10/core-i18n/. They are a nice source for training NER.

Let us know if you have any questions.

Best,
Milan


reckart commented Feb 24, 2017

@m1ci puh, these files are huge! I was kind of hoping for a ZIP with one .ttl file per article. How do you work with such large files? Would you recommend some RDF store?
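
(For reference, one way to work with such large Turtle dumps without loading them into memory or an RDF store is Jena's streaming parser; a minimal sketch, where the file name and the filtered property are placeholders.)

```java
import org.apache.jena.graph.Triple;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.system.StreamRDFBase;

public class StreamNifDump {
    public static void main(String[] args) {
        // Parse the dump triple by triple instead of building an in-memory model.
        RDFDataMgr.parse(new StreamRDFBase() {
            @Override
            public void triple(Triple t) {
                // Handle one triple at a time, e.g. pick out nif:anchorOf statements
                // (whose objects are string literals).
                if (t.getPredicate().getURI().endsWith("#anchorOf")) {
                    System.out.println(t.getSubject() + "\t" + t.getObject().getLiteralLexicalForm());
                }
            }
        }, "dbpedia-abstracts_en.ttl");
    }
}
```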
