NLP Service

This service provides the Natural Language Processing (NLP) functions used in our system.

APIs

GET /docsim: Calculate the similarity between two pieces of text.

Request parameters:

a: first piece of text
b: second piece of text
model: the language model to use; currently only 'news' is available

Response (JSON):

A scalar floating point value representing the similarity, ranging from 0.0 (completely unrelated) to 1.0 (identical).
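
As a rough illustration, a client call could look like the sketch below. It assumes that a, b and model are passed as query-string parameters and that the response body is a bare JSON number. BASE_URL and the helper names docsim_url and docsim are hypothetical and not part of this repository.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder address, replace with wherever the service is deployed.
BASE_URL = "http://localhost:8000"


def docsim_url(a, b, model="news", base_url=BASE_URL):
    """Build the GET /docsim URL with a, b and model as query parameters."""
    query = urlencode({"a": a, "b": b, "model": model})
    return f"{base_url}/docsim?{query}"


def docsim(a, b, model="news", base_url=BASE_URL):
    """Call GET /docsim and return the similarity as a float.

    Assumes the response body is a bare JSON number such as 0.83.
    """
    with urlopen(docsim_url(a, b, model, base_url)) as resp:
        return float(json.loads(resp.read().decode("utf-8")))
```

Building the URL in a separate function keeps the encoding step easy to inspect without a running service; urlencode takes care of escaping spaces and non-ASCII text.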

--

POST /docsim_1ton: Compute the similarity between one piece of text and many others.

The request body should be JSON, with the following fields:

one (string): the pivot text
many (list of strings): the candidate texts to compare against the pivot
model (string): same as for the GET /docsim API

Response (JSON):

A list of the same length as the input array many; each element is a floating point value representing the similarity between the pivot and the corresponding text in many.
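
A minimal sketch of building this request and pairing the returned scores with their candidates follows. BASE_URL and the helper names are hypothetical, and the ranking helper is a client-side convenience, not something the service does.

```python
import json
from urllib.request import Request, urlopen

# Assumption: placeholder address, replace with wherever the service is deployed.
BASE_URL = "http://localhost:8000"


def build_docsim_1ton_request(one, many, model="news", base_url=BASE_URL):
    """Build (but do not send) the POST /docsim_1ton request with a JSON body."""
    body = json.dumps({"one": one, "many": list(many), "model": model})
    return Request(
        f"{base_url}/docsim_1ton",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def docsim_1ton(one, many, model="news", base_url=BASE_URL):
    """Send the request and return the list of similarity scores."""
    req = build_docsim_1ton_request(one, many, model, base_url)
    with urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def rank_candidates(many, scores):
    """Pair each candidate with its score, most similar first."""
    if len(many) != len(scores):
        raise ValueError("expected one score per candidate")
    return sorted(zip(many, scores), key=lambda pair: pair[1], reverse=True)
```

Because the scores come back in the same order as many, zipping the two lists is enough to recover which score belongs to which candidate.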

--

POST /ner: Extract an entity name (usually a company) from a sentence (usually a news title).

Request parameters:

q: the sentence
threshold (float in [0,1], optional, default 0.5): the minimum probability a single character must have to be considered part of the entity name. A higher value lowers the false positive rate, while a lower value lowers the false negative rate.
return_raw (true or false, optional): whether to return the list of per-character probabilities of being part of the entity name

Response (JSON):

entity (string or null): the extracted entity
threshold: the same value that was passed in
positive_confidence (float in [0,1], or null): how confident the algorithm is that the entity found is correct; null if no entity is found
overall_confidence (float in [0,1]): how confident the algorithm is that all characters have been classified correctly
raw (list of pairs): only present when return_raw is requested; each pair contains a character and the probability that the character is part of the entity name
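
The sketch below shows one way a client might build this request and inspect the raw output. It assumes the parameters are sent as a form-encoded POST body, which the description above does not confirm. BASE_URL and the helper names are hypothetical. chars_above is only a client-side aid for seeing which characters clear a given threshold (using >= as an assumed comparison); it is not the service's extraction algorithm.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder address, replace with wherever the service is deployed.
BASE_URL = "http://localhost:8000"


def build_ner_request(q, threshold=0.5, return_raw=False, base_url=BASE_URL):
    """Build (but do not send) the POST /ner request.

    Assumes q, threshold and return_raw travel as a form-encoded body.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be within [0, 1]")
    form = {
        "q": q,
        "threshold": threshold,
        "return_raw": "true" if return_raw else "false",
    }
    return Request(f"{base_url}/ner", data=urlencode(form).encode("utf-8"), method="POST")


def chars_above(raw, threshold):
    """Join the characters from a raw list whose probability is at least threshold."""
    return "".join(ch for ch, prob in raw if prob >= threshold)
```

Running chars_above over the same raw list with several thresholds gives a quick feel for the trade-off described for the threshold parameter: raising it keeps fewer characters, lowering it keeps more.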
