Sentiment Analysis - Error on composed words #3

alex-lairan · 2019-10-07T12:57:50Z

Hi,

I'm using sentiment analysis for testing purposes, and I found a problem with composed words (contractions such as "don't").

I have this code:

require "cadmium"

sentiment = Cadmium.sentiment
pp sentiment.analyze "I realy don't like mosquitoes"
pp "I realy don't like mosquitoes".is_negative?

The result is:

{score: 2,
 comparative: 0,
 tokens: ["I", "realy", "do", "n't", "like", "mosquitoes"],
 words: ["like"],
 positive: ["like"],
 negative: []}
false

Here, the negation "don't" is not taken into account: the sentence is scored as positive.
I know it's bad English, but it's something you can find on Twitter.

I don't know if I'm using it the wrong way.

watzon · 2019-10-08T01:18:47Z

Seems like a problem with the tokenizer. I'll look into it.

hugoabonizio · 2019-10-18T19:36:19Z

Using the pragmatic tokenizer the token don't is recognized, but I think there's a problem with the negation identification which I addressed in cadmiumcr/cadmium#27.

sentiment.tokenizer = Cadmium.pragmatic_tokenizer.new

{score: 2,
 comparative: 0.4,
 tokens: ["i", "realy", "don't", "like", "mosquitoes"],
 words: ["like"],
 positive: ["like"],
 negative: []}
false
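
To illustrate what negation identification could look like, here is a minimal Crystal sketch that flips the sign of the next scored word after a negator. It is not Cadmium's implementation and not the change in cadmiumcr/cadmium#27; `LEXICON`, `NEGATORS` and `negation_aware_score` are hypothetical names, and the tiny lexicon is made up for the example.

```crystal
# Hypothetical mini-lexicon (assumption); the real Cadmium lexicon is not reproduced here.
LEXICON = {"like" => 2, "love" => 3, "hate" => -3}

# Tokens treated as negators. Both the joined form ("don't") and the split
# form ("n't") are listed so the result does not depend on the tokenizer.
NEGATORS = Set{"not", "no", "never", "n't", "don't"}

# Sums lexicon scores, inverting the score of the first scored word
# that follows a negator.
def negation_aware_score(tokens : Array(String)) : Int32
  score = 0
  negate = false
  tokens.each do |token|
    word = token.downcase
    if NEGATORS.includes?(word)
      negate = true
      next
    end
    if value = LEXICON[word]?
      score += negate ? -value : value
      negate = false # the negation is consumed by the first scored word
    end
  end
  score
end

pp negation_aware_score(["I", "realy", "do", "n't", "like", "mosquitoes"]) # => -2
pp negation_aware_score(["i", "realy", "don't", "like", "mosquitoes"])     # => -2
```

With this kind of rule both token lists from this thread would get a negative score, whichever way the tokenizer splits "don't". A real implementation would probably need to limit how far a negation reaches.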

watzon · 2019-10-19T19:52:54Z

The problem with the Pragmatic Tokenizer is that it's much, much slower than the other ones. I do not recommend using it internally for anything.

hugoabonizio · 2019-10-21T16:22:47Z

@watzon it also works with aggressive_tokenizer, but the behavior varies a lot depending on the tokenizer.

watzon · 2019-10-22T16:45:45Z

Yeah, the aggressive_tokenizer would probably be the one to use.

rmarronnier · 2019-11-03T21:45:10Z

@watzon: Can we move this issue to the cadmiumcr/sentiment repo? It makes more sense there :-)

watzon · 2019-11-03T22:18:30Z

Yes, it should definitely be moved

watzon transferred this issue from cadmiumcr/cadmium Nov 3, 2019
