Python-3.x support #27
Comments
I have ported it to Python 3, but the METEOR metric doesn't work. You can take a look: coco-caption |
I have implemented Python 3 support for the evaluation metrics. I am using my version of the eval tools together with the pycocotools from here: https://github.com/cocodataset/cocoapi |
I have created a fork that is both Python 3 compatible and that uses the new Word Mover's Distance metric. It would be nice to merge with this repository. |
I just modified the code to support Python 3, with support for Chinese. |
What's the status on this? :) |
@rubencart They said "We are currently focusing on more of the object detection / segmentation challenges, and have decided to leave the captioning leaderboard open but not make additional updates to it." |
Another pure Python 3.x fork (no Python 2 support), with some small bugs fixed as well: https://github.com/ozancaglayan/coco-caption |
Thanks for your contribution.
Based on @mtanti's implementation, I modified two places to support METEOR evaluation for both py2 and py3.
1. It seems that the code
score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str))
self.meteor_p.stdin.write(score_line+'\n')
cannot support py2, so I changed it to
if sys.version_info[0] == 2:  # python2
    score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
    self.meteor_p.stdin.write(str(score_line+b'\n'))
else:  # assume python3+
    score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
    self.meteor_p.stdin.write(score_line+'\n')
2. Add a check in compute_score:
# The prediction may consist entirely of punctuation
# (see definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py);
# it then becomes [''] after tokenization,
# which means res[i][0] == '' and self._stat will fail with this input
if len(res[i][0]) == 0:
    res[i][0] = 'a'
The complete code of meteor.py is as follows:
#!/usr/bin/env python
# Python wrapper for METEOR implementation, by Xinlei Chen
# Acknowledge Michael Denkowski for the generous discussion and help
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys
import subprocess
import threading

# Assumes meteor-1.5.jar is in the same directory as meteor.py. Change as needed.
METEOR_JAR = 'meteor-1.5.jar'
# print METEOR_JAR


class Meteor:

    def __init__(self):
        self.env = os.environ
        self.env['LC_ALL'] = 'en_US.UTF_8'
        self.meteor_cmd = ['java', '-jar', '-Xmx2G', METEOR_JAR,
                           '-', '-', '-stdio', '-l', 'en', '-norm']
        self.meteor_p = subprocess.Popen(self.meteor_cmd,
                                         cwd=os.path.dirname(os.path.abspath(__file__)),
                                         stdin=subprocess.PIPE,
                                         stdout=subprocess.PIPE,
                                         stderr=subprocess.PIPE,
                                         env=self.env, universal_newlines=True, bufsize=1)
        # Used to guarantee thread safety
        self.lock = threading.Lock()

    def compute_score(self, gts, res):
        assert(gts.keys() == res.keys())
        imgIds = sorted(list(gts.keys()))
        scores = []

        eval_line = 'EVAL'
        self.lock.acquire()
        for i in imgIds:
            assert(len(res[i]) == 1)
            # The prediction may consist entirely of punctuation
            # (see definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py);
            # it then becomes [''] after tokenization,
            # which means res[i][0] == '' and self._stat will fail with this input
            if len(res[i][0]) == 0:
                res[i][0] = 'a'
            stat = self._stat(res[i][0], gts[i])
            eval_line += ' ||| {}'.format(stat)

        # Send to METEOR
        self.meteor_p.stdin.write(eval_line + '\n')

        # Collect segment scores
        for i in range(len(imgIds)):
            score = float(self.meteor_p.stdout.readline().strip())
            scores.append(score)

        # Final score
        final_score = float(self.meteor_p.stdout.readline().strip())
        self.lock.release()

        return final_score, scores

    def method(self):
        return "METEOR"

    def _stat(self, hypothesis_str, reference_list):
        # SCORE ||| reference 1 words ||| reference n words ||| hypothesis words
        hypothesis_str = hypothesis_str.replace('|||', '').replace('  ', ' ')
        if sys.version_info[0] == 2:  # python2
            score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
            self.meteor_p.stdin.write(str(score_line+b'\n'))
        else:  # assume python3+
            score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
            self.meteor_p.stdin.write(score_line+'\n')
        return self.meteor_p.stdout.readline().strip()

    def __del__(self):
        self.lock.acquire()
        self.meteor_p.stdin.close()
        self.meteor_p.kill()
        self.meteor_p.wait()
        self.lock.release()
|
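The two changes described in the comment above (the SCORE line format and the empty-prediction guard) can be tried without Java or the METEOR jar. The sketch below is only an illustration: `build_score_line` is a hypothetical helper, not part of coco-caption, and it applies the guard after the `|||` separator has been stripped, which is a small deviation from the posted code, where the guard sits in `compute_score`.

```python
def build_score_line(hypothesis, references):
    """Format one SCORE request line (hypothetical helper, not in coco-caption)."""
    # Remove the field separator from the hypothesis and collapse double
    # spaces, mirroring what _stat does in the posted meteor.py.
    hypothesis = hypothesis.replace('|||', '').replace('  ', ' ')
    # An all-punctuation caption becomes '' after tokenization; substitute a
    # placeholder token so the hypothesis field is never empty.
    if len(hypothesis) == 0:
        hypothesis = 'a'
    return ' ||| '.join(('SCORE', ' ||| '.join(references), hypothesis)).strip()


print(build_score_line('a dog runs', ['a dog is running', 'the dog runs']))
print(build_score_line('', ['a dog is running']))
```

The placeholder `'a'` keeps the pipeline running but still scores the caption against the references, so an empty prediction is not guaranteed to score exactly zero.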
Your code assumes that there will only ever be versions 2 and 3 of
Python. Don't assume that if the version is not 3 then it is 2. Instead,
check whether it is 2, and if not, assume that the code for version 3 will
keep working in future versions as well. So switch your if/else around to
'if sys.version_info[0] == 2: ... else: ...'.
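A minimal sketch of that ordering, assuming a hypothetical helper named `encode_for_stdin` (it does not exist in coco-caption): Python 2 is the special case that gets tested for, and every other major version falls through to the Python 3 path.

```python
import sys


def encode_for_stdin(line):
    """Return `line` plus a newline in the type the pipe expects (hypothetical helper)."""
    if sys.version_info[0] == 2:
        # Python 2 only: turn the unicode string into a UTF-8 byte string.
        return line.encode('utf-8') + b'\n'
    # Python 3 and any later major version: a text-mode pipe takes str directly.
    return line + '\n'


print(repr(encode_for_stdin('SCORE ||| a dog ||| a dog')))
```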
…On Tue, 8 Oct 2019, 09:42 Yupan Huang, ***@***.***> wrote:
Thanks for your contribution.
Based on @mtanti's implementation, I modified two places to support METEOR evaluation for both py2 and py3.
1. It seems that the code
score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str))
self.meteor_p.stdin.write(score_line+'\n')
cannot support py2, so I changed it to
if sys.version_info[0] == 3:  # python3
    score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
    self.meteor_p.stdin.write(score_line+'\n')
else:  # python2
    score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
    self.meteor_p.stdin.write(str(score_line+b'\n'))
2. Add a check in compute_score:
# The prediction may consist entirely of punctuation
# (see definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py);
# it then becomes [''] after tokenization,
# which means res[i][0] == '' and self._stat will fail with this input
if len(res[i][0]) == 0:
    res[i][0] = 'a'
|
Python 2 will be end-of-life next year. Why do you bother supporting it still? |
Thanks @mtanti for pointing it out! I've modified the code. |
Thanks, your solution helped me solve the problem of proc.stdout.readline() hanging! |
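The comment above does not say which change cured the hang. One common cause of `readline()` blocking on a subprocess pipe is that the request never leaves the parent's buffer, or that bytes and text are mixed on the pipe. The sketch below shows the text-mode, line-buffered setup that the posted meteor.py uses, with a small Python child process standing in for the METEOR jar (the child and the `ask` helper are assumptions for illustration only).

```python
import subprocess
import sys

# Stand-in child process (an assumption for illustration): it echoes each
# input line in upper case and flushes after every line.
CHILD = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def ask(proc, line):
    """Send one request line to the child and read one reply line back."""
    proc.stdin.write(line + '\n')
    proc.stdin.flush()  # redundant with bufsize=1, but makes the intent explicit
    return proc.stdout.readline().strip()


proc = subprocess.Popen(
    [sys.executable, '-c', CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,  # text mode: write and read str, not bytes
    bufsize=1,                # line-buffered on the parent side
)
try:
    reply = ask(proc, 'score ||| ref ||| hyp')
    print(reply)
finally:
    proc.stdin.close()
    proc.wait()
```

If the child itself buffers its output and never flushes, `readline()` can still block no matter what the parent does, so this only covers the parent side of the pipe.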
I just stumbled across this; our project https://github.com/Maluuba/nlg-eval supports Python 3. |
It has been 3 years since I first commented and a lot has changed in the meantime. So, I'm now working with a much more elegant toolkit, facebookresearch/vizseq, which supports visualization with extension to multiple modalities (video, audio) and more recent embedding-based metrics. |
Hi all,
I'd like to know whether you have plans to port the codebase to Python 3. Since most people have switched to Python 3, it would be nice to have Python 3 support so that other projects that depend on coco-caption (e.g. ImageCaptioning PyTorch) can also be implemented in Python 3. Thanks!