While converting tokens to a vector for a complete sentence in the `preprocess_and_vectorize` method, I got the error `'Word2VecKeyedVectors' object has no attribute 'get_mean_vector'`.
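For context: `get_mean_vector()` was only added to gensim's `KeyedVectors` in the 4.x line; the 3.x `Word2VecKeyedVectors` class never had it, so upgrading gensim is one fix. A version-tolerant sketch, assuming `wv` is a loaded gensim `KeyedVectors`:

```python
import numpy as np

def mean_vector(wv, tokens):
    # Newer gensim ships get_mean_vector(); by default it skips missing
    # keys and unit-normalizes vectors before averaging, so the result
    # can differ slightly from a raw mean.
    if hasattr(wv, "get_mean_vector"):
        return wv.get_mean_vector(tokens)
    # Fallback for older gensim: average only the in-vocabulary tokens.
    vectors = [wv[t] for t in tokens if t in wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(wv.vector_size)
```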
shiv425 commented on Nov 3, 2022
I tried converting each token to a vector and then taking the mean with `np.mean`, but while converting `df['Text']` to vector form I get errors like `Key 'u.s.-based' not present`, `Key ' ' not present`, `Key '2018' not present`, etc. Please help.
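Those `Key ... not present` errors come from looking up out-of-vocabulary tokens. One sketch of a guard, with `wv` again assumed to be a gensim `KeyedVectors`:

```python
# Guard against out-of-vocabulary keys such as 'u.s.-based' or '2018':
# look up only tokens the model actually contains, instead of letting
# wv[token] raise KeyError.
filtered_tokens = ["u.s.-based", "company", "report", "2018"]  # example input
vectors = [wv[t] for t in filtered_tokens if t in wv]
```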
elandil2 commented on Nov 3, 2022
I think he used an old version of the gensim library; a lot of attributes changed from 3.8 to 4.0. I'm facing the same issues and have tried a couple of things, but nothing helped. A poorly documented library, to be honest; I've been searching for hours and couldn't find anything useful.
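For anyone hitting those 3.x → 4.x changes, a few of the renames that commonly break older tutorial code (a runnable sketch with a tiny throwaway model):

```python
from gensim.models import Word2Vec

# Tiny throwaway model just to demonstrate the renamed attributes.
model = Word2Vec(sentences=[["king", "queen", "royal"]], min_count=1)

words = model.wv.index_to_key   # gensim 3.x: model.wv.index2word
vocab = model.wv.key_to_index   # gensim 3.x: model.wv.vocab
vector = model.wv["king"]       # bracket lookup is unchanged
```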
shiv425 commented on Nov 4, 2022
```python
import numpy as np

# Assumes `nlp` (a loaded spaCy pipeline) and `wv` (gensim word
# vectors) are already defined elsewhere.
def preprocess_and_vectorize(text):
    # Remove stop words and punctuation, and lemmatize the text.
    doc = nlp(text)
    filtered_tokens = []
    arr = []
    for token in doc:
        if token.is_stop or token.is_punct:
            continue
        filtered_tokens.append(token.lemma_)
    for token in filtered_tokens:
        # Skip tokens that have no vector in the vocabulary.
        try:
            arr.append(wv[token])
        except KeyError:
            continue
    return np.mean(arr, axis=0)
```

I used this code, with try/except because many words have no vector in `wv`.
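One caveat with the try/except version: if every token in a document is out of vocabulary, `arr` stays empty and `np.mean(arr, axis=0)` returns `nan` with a `RuntimeWarning`. A possible guard, assuming `wv` exposes `vector_size` as gensim's `KeyedVectors` does:

```python
# At the end of preprocess_and_vectorize, before averaging:
if not arr:
    # No token had a vector; fall back to a zero vector of matching size.
    return np.zeros(wv.vector_size)
return np.mean(arr, axis=0)
```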
meet5398 commented on May 5, 2023
Solution to the problem
This is the alternative I found for this problem, and it works:
```python
import spacy
import numpy as np

nlp = spacy.load("en_core_web_lg")
# `wv` is assumed to be gensim word vectors loaded elsewhere, e.g.:
# import gensim.downloader as api; wv = api.load("word2vec-google-news-300")

def preprocess_and_vectorize(text):
    # Drop punctuation and stop words; keep lemmas.
    doc = nlp(text)
    filtered_tokens = []
    for token in doc:
        if token.is_punct or token.is_stop:
            continue
        filtered_tokens.append(token.lemma_)
    # Indexing KeyedVectors with a list returns one vector per token;
    # average across tokens (axis=0) rather than over the flat array.
    return np.mean(wv[filtered_tokens], axis=0)
```
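Applied to the DataFrame column mentioned earlier in the thread (assuming `df` has the `Text` column; note this version still raises `KeyError` if any token is out of vocabulary, which the membership filter shown earlier avoids):

```python
# Vectorize every document in the 'Text' column.
df["vector"] = df["Text"].apply(preprocess_and_vectorize)
```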