Stanza NLP Service

Dockerized Stanza as an API, with easy configuration and cached models.

Features

  • On-Demand Model Downloads: Automatically downloads necessary Stanza models for different languages as required.
  • Pipeline Caching: Fast responses after the first request for a given language, because initialized pipelines are cached in memory.

Usage

Run the Docker Container

docker run -p 5000:5000 -v stanza_resources:/root/stanza_resources vivalence/dockerized-stanza-nlp

Run with Docker Compose

version: '3'
services:
  stanza-nlp:
    image: vivalence/dockerized-stanza-nlp
    ports:
      - "5000:5000"
    volumes:
      - stanza_resources:/root/stanza_resources
volumes:
  stanza_resources: # model cache dir

Data Structure / API Interface

  • Send a POST request to the /nlp endpoint with the following JSON body:
    {
      "language": "en",
      "text": "Hello World",
      "processors": "tokenize,mwt,pos,lemma,depparse"
    }
        
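A minimal client sketch in Python, using only the standard library. The field names (`language`, `text`, `processors`) come from the request body above; the helper names `build_nlp_request` and `post_nlp` are chosen here for illustration and are not part of the service itself.

```python
import json
import urllib.request

def build_nlp_request(text, language="en",
                      processors="tokenize,mwt,pos,lemma,depparse"):
    """Build the JSON body expected by the /nlp endpoint (fields as documented above)."""
    return json.dumps({
        "language": language,
        "text": text,
        "processors": processors,
    })

def post_nlp(text, url="http://localhost:5000/nlp", **kwargs):
    """POST a text to a running container and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=build_nlp_request(text, **kwargs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the container running as shown above, `post_nlp("Hello World")` should return the response structure described in the next section.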

Response

  • The service returns a JSON response containing processed NLP data:
    {
      "sentences": [
        {
          "text": "Hello World",
          "tokens": [
            {"text": "Hello", "lemma": "hello", "pos": "INTJ"},
            {"text": "World", "lemma": "world", "pos": "NOUN"}
          ],
          "dependencies": [
            {"dep": "root", "governor": 0, "dependent": 2},
            {"dep": "discourse", "governor": 2, "dependent": 1}
          ]
        }
      ],
      "entities": []
    }
        
    • Includes tokenized sentences, POS tags, lemmas, and dependency parse information.
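As a quick sketch of consuming this structure in Python, the snippet below parses the example response above (copied verbatim) and collects (surface form, lemma, POS tag) triples; the variable names are illustrative only.

```python
import json

# The example response from the section above, verbatim.
sample = '''{
  "sentences": [
    {
      "text": "Hello World",
      "tokens": [
        {"text": "Hello", "lemma": "hello", "pos": "INTJ"},
        {"text": "World", "lemma": "world", "pos": "NOUN"}
      ],
      "dependencies": [
        {"dep": "root", "governor": 0, "dependent": 2},
        {"dep": "discourse", "governor": 2, "dependent": 1}
      ]
    }
  ],
  "entities": []
}'''

doc = json.loads(sample)
# Collect (text, lemma, POS) triples across all sentences.
triples = [(t["text"], t["lemma"], t["pos"])
           for sent in doc["sentences"]
           for t in sent["tokens"]]
```

Dependency entries reference tokens by 1-based position within the sentence, with governor 0 denoting the root.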