Small Java library for written text linguistics. Text linguistics is a branch of linguistics concerned with the description and analysis of extended texts (either spoken or written) in communicative contexts. This library aims to be independent of the natural language used.

handspy/linguini

Linguini - Java Library for Linguistic Analysis

Linguini is a small Java library for performing linguistic analysis on written texts. It aims to be independent of the natural language used, but focuses first on European Portuguese (Portugal).

The Portuguese replacements and toponyms were mostly copied from NLPPORT.

Features

By default, the project provides the following features:

  • a simple text analysis providing an overall summary of statistics about the text
  • a lexical diversity analysis, which can use either MTLD or HD-D
  • an emotional analysis based on an annotated dictionary
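
To make the lexical diversity feature more concrete, here is a minimal sketch of MTLD (Measure of Textual Lexical Diversity) as it is commonly described. It is not Linguini's own implementation or API: the class name `MtldSketch`, its methods, the 0.72 threshold default, and the choice to close a factor when the type-token ratio is at or below the threshold are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of the Linguini library.
public class MtldSketch {

    // Commonly used type-token ratio (TTR) threshold; assumed here as the default.
    static final double DEFAULT_THRESHOLD = 0.72;

    // Runs one directional pass over the tokens and returns tokens / factors.
    static double mtldPass(List<String> tokens, double threshold) {
        if (tokens.isEmpty()) {
            return 0.0;
        }
        double factors = 0.0;
        Set<String> types = new HashSet<>();
        int count = 0;
        for (String token : tokens) {
            types.add(token);
            count++;
            double ttr = (double) types.size() / count;
            // Assumption: a factor is complete once the TTR is at or below the threshold.
            if (ttr <= threshold) {
                factors += 1.0;
                types.clear();
                count = 0;
            }
        }
        // Credit the leftover segment with a partial factor proportional to
        // how far its TTR has fallen from 1.0 towards the threshold.
        if (count > 0) {
            double ttr = (double) types.size() / count;
            factors += (1.0 - ttr) / (1.0 - threshold);
        }
        // No repetition at all: the measure is unbounded for this text.
        if (factors == 0.0) {
            return Double.POSITIVE_INFINITY;
        }
        return tokens.size() / factors;
    }

    // Averages a forward and a backward pass over the same tokens.
    static double mtld(List<String> tokens) {
        List<String> reversed = new ArrayList<>(tokens);
        Collections.reverse(reversed);
        double forward = mtldPass(tokens, DEFAULT_THRESHOLD);
        double backward = mtldPass(reversed, DEFAULT_THRESHOLD);
        return (forward + backward) / 2.0;
    }
}
```

Callers would tokenize and normalize the text first (for example lowercasing and splitting on whitespace) and pass the resulting list to `MtldSketch.mtld`. Higher values indicate that the text sustains a high type-token ratio over longer stretches.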

Documentation

Documentation is generated for the latest release, kept in the 'master' branch, and for the latest snapshot, taken from the 'develop' branch.

Building the docs

The documentation site is a Maven site, and its sources are included in the project. If required, it can be generated with the following Maven command:

$ mvn verify site

The verify phase is required; otherwise some of the reports won't be generated.

Usage

The project is written in Java and uses Maven to manage the build.

It is a Java library, meant to be included as a dependency in any project that wants to make use of it.

Prerequisites

The project has been tested on the following Java versions:

  • JDK 8

All other dependencies are handled through Maven, and noted in the included POM file.

Installing

The recommended way to install the project is to set it up as a dependency. To get the configuration information for this, check the Bintray repository or the Maven Central Repository.

Alternatively, it can be installed into the local Maven repository with the usual Maven command:

$ mvn install

Collaborate

Any kind of help with the project is welcome. There are two main ways to contribute:

  • reporting errors and asking for extensions through the issue tracker
  • forking the repository and extending the project

Issues management

Issues are managed in the GitHub project issue tracker, where any GitHub user may report bugs or ask for new features.

Getting the code

If you wish to fork or modify the code, visit the GitHub project page, where the latest versions are always kept. Check the 'master' branch for the latest release, and the 'develop' branch for the current, stable development version.

License

The project has been released under the MIT License.
