Dockerfile cuzzo/stanford-pos-tagger #3
Comments
Hey @blindsteal, That sounds like an awesome idea. I'm working on Dockerizing CoreNLP right now, actually. That being said, do you have any idea how to set the language? I've only ever used English, but I'd love to add support for others. Thanks!
AFAIK you only need to change the
When you say CoreNLP, do you mean the dedicated CoreNLP server? I would love to see this. If you've got a repo somewhere I would be glad to help.
Unfortunately, I think the Dockerfile is lost. I thought it was part of this repo, but since I no longer have the computer I wrote it on, I doubt I'll find it. Supposedly, there's a way to generate a Dockerfile from an image, but I don't think it'll be very helpful here.
But as far as CoreNLP goes, there's a Python server that makes most of it pretty easy. It uses this awful command line parsing logic to get the results rather than using RPC, probably because StanfordNLP doesn't support that by default.
The cool thing about this is that it used a plugin to get the results via RPC, so it was a lot more efficient. The downside is that it only works with POS tagging.
This Scala RPC service seems mighty promising. It's already Dockerized. Dunno if it's configurable, but if it isn't, I'd love to make it so and document it!
Do you have a Gmail? Would be nice to chat about this. Seems like you know more about CoreNLP than me [=
Cheers,
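For reference, one common way to recover an image's build steps is `docker history`, which lists the command recorded for each layer. The following is a minimal sketch, assuming a Docker client that supports `--format`; the `history_to_dockerfile` helper is hypothetical, and the output is only an approximation of the original Dockerfile (the base image name and the sources of `ADD` steps cannot be recovered this way).

```shell
#!/bin/sh
# Sketch only: history_to_dockerfile is a hypothetical helper, not part of Docker.
# It reads "CreatedBy" lines (newest layer first) on stdin and prints
# Dockerfile-like lines, oldest layer first.
history_to_dockerfile() {
    # tac (GNU coreutils) reverses the line order; sed strips the shell
    # wrapper that Docker records in front of each step.
    tac | sed -e 's|^/bin/sh -c #(nop) *||' -e 's|^/bin/sh -c |RUN |'
}
```

It could be fed like this: `docker history --no-trunc --format '{{.CreatedBy}}' cuzzo/stanford-pos-tagger | history_to_dockerfile`.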
Sorry for the delay, work is keeping me busy... I probably know less about CoreNLP than you; I actually just found it a couple of days ago, but I know Docker pretty well. After a quick look I noticed they have their own dedicated server exposing a RESTful interface (which should be easy enough to dockerize). Is there a reason why you prefer RPC?
Concerning the Dockerfile for your image: my first comment contains a link to one of the sites mentioned in the SO thread, and after looking at it again I think it should actually be pretty easy to reconstruct it from there (you can see all commands, including installed packages etc.). Quick C&P:
# correct base image ?
FROM ubuntu:latest
RUN sed 's/main$/main universe/' -i /etc/apt/sources.list
RUN apt-get update && apt-get install -y software-properties-common python-software-properties
RUN add-apt-repository ppa:webupd8team/java -y
RUN sudo apt-get update
RUN echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections
RUN apt-get install -y oracle-java7-installer
RUN mkdir -p /home/cuzzo/stanford
ADD . /home/cuzzo/stanford
EXPOSE 9000 9000
CMD --port 9000
ENTRYPOINT /bin/sh -c /home/cuzzo/stanford/run-server.sh /home/cuzzo/stanford/models/left3words-distsim-wsj-0-18.tagger 9000
I do have Gmail too:
Hey @blindsteal, Awesome find. This didn't exist when I started using CoreNLP, but now that it does, I'm definitely going to take advantage of it. The default server even lets you specify the annotators on the fly, which is really cool. I haven't seen any other servers that let you do that. I'm working on a new (much smaller) Docker image that just uses the standard CoreNLP server. Thanks!
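To illustrate the "annotators on the fly" part: the CoreNLP server takes a `properties` JSON object in the query string of each request, so the annotator list can change per call. A minimal sketch, assuming a server is already listening on localhost:9000; the `corenlp_url` helper and the `CORENLP_HOST` / `CORENLP_PORT` variables are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: corenlp_url and the CORENLP_HOST / CORENLP_PORT variables
# are hypothetical; they are not part of CoreNLP.
corenlp_url() {
    # $1: comma-separated annotators, e.g. "tokenize,ssplit,pos"
    printf 'http://%s:%s/?properties={"annotators":"%s","outputFormat":"json"}\n' \
        "${CORENLP_HOST:-localhost}" "${CORENLP_PORT:-9000}" "$1"
}
```

A request would then look something like `wget --post-data 'The quick brown fox jumped over the lazy dog.' "$(corenlp_url tokenize,ssplit,pos)" -O -`, and asking for a different annotator list only means changing the argument.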
Hi, I was wondering if there is any chance you could make the Dockerfile for your image available on the GitHub/Docker repo? From what I see here it is mainly installing deps and running your server script. I would like to make the dictionary configurable for other languages (an env variable probably works best for that). Currently this only works by using your image as a base and setting a new entrypoint.
Greetings and nice work dockerizing this 👍
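The env variable idea could look roughly like the wrapper below. This is a sketch under stated assumptions, not the image's actual script: `TAGGER_MODEL` and `TAGGER_PORT` are hypothetical variable names, and the paths are the ones visible in the image's entrypoint.

```shell
#!/bin/sh
# Sketch only: TAGGER_MODEL and TAGGER_PORT are hypothetical variable names.
# The default model is the English one the image's entrypoint already uses.
DEFAULT_MODEL=/home/cuzzo/stanford/models/left3words-distsim-wsj-0-18.tagger

# Resolve the model path and port, falling back to the defaults when the
# environment does not set them.
tagger_model() { printf '%s\n' "${TAGGER_MODEL:-$DEFAULT_MODEL}"; }
tagger_port()  { printf '%s\n' "${TAGGER_PORT:-9000}"; }

# Start the server only when called with "start", so the file can also be
# sourced without launching anything.
if [ "${1:-}" = "start" ]; then
    exec /home/cuzzo/stanford/run-server.sh "$(tagger_model)" "$(tagger_port)"
fi
```

With the wrapper called with `start` as the entrypoint, another language would only need something like `docker run -e TAGGER_MODEL=<path to model> -p 9000:9000 <image>` instead of a derived image with a new entrypoint.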