So many UNK in captions #6

Open
okisy opened this issue Jun 28, 2018 · 3 comments

@okisy

okisy commented Jun 28, 2018

Thanks to your help, I managed to run your code.
I have one more question.
I ran

python evaluate.py -useGPU \
    -startFrom checkpoints/abot_rl_ep20.vd \
    -qstartFrom checkpoints/qbot_rl_ep20.vd \
    -evalMode dialog \
    -cocoDir /my/path/to/coco/images/ \
    -cocoInfo /my/path/to/coco.json \
    -beamSize 5

and then ran

cd dialog_output/
python -m http.server 8000

However, the visualized captions were quite different from those in your picture.
There were many "UNK" tokens in my result. Is that expected?
Under what conditions could I get results similar to yours?

[Image: 0622-1_rl_ep_20]

[Image: 2018-06-28 23 24 26]

@nirbhayjm nirbhayjm added the in progress Assignees are working on the issue label Jun 28, 2018
@nirbhayjm
Member

Your dialog visualization does not match the figure in the README because the command was missing a line that passes the generated caption file as input instead of the ground-truth (GT) one. 7f3e7e2 fixes this; the updated command should now give a similar dialog visualization.

As for the UNKs in the ground-truth captions: some of the UNKs at the start of a caption seem to stand for the word "a", which is odd because the same word is not an UNK elsewhere. This might be a preprocessing issue; I will look into it.

@okisy
Author

okisy commented Jun 30, 2018

Thanks for your quick response.
I have one more minor question about the settings.
When you generated the figure in the README, which "inference" mode (greedy or sample) did you choose?

beamSize=beamSize, inference='greedy')
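
For readers unfamiliar with the two modes named in the question, here is a minimal sketch of how greedy and sampled token selection usually differ. It is not the repository's decoder: `pick_token`, the toy vocabulary and the probabilities are all hypothetical and only illustrate the general idea.

```python
import random


def pick_token(probs, inference="greedy", rng=None):
    """Pick a token index from a list of probabilities.

    'greedy' returns the most probable index; 'sample' draws an index
    at random in proportion to the probabilities.
    """
    if inference == "greedy":
        return max(range(len(probs)), key=lambda i: probs[i])
    if inference == "sample":
        rng = rng or random.Random()
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    raise ValueError("inference must be 'greedy' or 'sample'")


# Hypothetical toy vocabulary and next-token distribution.
vocab = ["a", "man", "UNK", "dog"]
probs = [0.1, 0.6, 0.05, 0.25]

greedy_word = vocab[pick_token(probs, "greedy")]
sampled_word = vocab[pick_token(probs, "sample", rng=random.Random(0))]
print(greedy_word, sampled_word)
```

Greedy selection always returns the same word for the same distribution, while sampling can vary from run to run, so the two modes can produce visibly different dialogs.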

@fristonio

@nirbhayjm I think the UNKs are produced whenever the caption or question contains a capital letter, not specifically for the letter "a".
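
A small sketch of how this could happen, assuming the vocabulary holds only lowercase words and the lookup is case-sensitive. The `encode` helper and the `word2ind` dictionary are hypothetical and are not taken from the repository's preprocessing code.

```python
def encode(tokens, word2ind, unk="UNK"):
    """Replace every token that is missing from the vocabulary with UNK."""
    return [tok if tok in word2ind else unk for tok in tokens]


# Hypothetical vocabulary containing only lowercase entries.
word2ind = {"a": 0, "man": 1, "riding": 2, "horse": 3}

caption = "A man riding a horse".split()

# Case-sensitive lookup: the capitalized "A" is not in the vocabulary.
raw = encode(caption, word2ind)
# Lowercasing before the lookup keeps every word.
lowered = encode([tok.lower() for tok in caption], word2ind)

print(raw)      # ['UNK', 'man', 'riding', 'a', 'horse']
print(lowered)  # ['a', 'man', 'riding', 'a', 'horse']
```

Under this assumption, only the sentence-initial "A" becomes UNK while the later "a" survives, which would match the pattern described above. If that is the cause, lowercasing the text before the vocabulary lookup would be one possible fix.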

