'Word2VecKeyedVectors' object has no attribute 'get_mean_vector' #19

Open
@shiv425

Description

While converting the tokens of a complete sentence to a vector in the preprocess_and_vectorize method, I got the error "'Word2VecKeyedVectors' object has no attribute 'get_mean_vector'".

Activity

shiv425 commented on Nov 3, 2022

I tried converting each token to a vector and then taking the mean with np.mean, but while converting df['Text'] to vector form I get errors like "Key 'u.s.-based' not present", "Key ' ' not present", "Key '2018' not present", etc. Please help.
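The "Key ... not present" errors come from looking up tokens that are not in the vocabulary. A minimal sketch of one way around that is to keep only the tokens the model knows before averaging. The helper name `mean_of_known_vectors`, the `dim` parameter, and the toy dictionary standing in for `wv` are all illustrative assumptions, not part of the tutorial; with gensim, the same `in` check should work on the loaded keyed-vectors object.

```python
import numpy as np


def mean_of_known_vectors(tokens, wv, dim):
    """Average the vectors of the tokens found in `wv`.

    `wv` is anything that supports `token in wv` and `wv[token]`.
    Returns a zero vector of length `dim` when no token is known,
    so the caller never gets NaN from averaging an empty list.
    """
    known = [wv[t] for t in tokens if t in wv]
    if not known:
        return np.zeros(dim, dtype=np.float32)
    return np.mean(known, axis=0)


# Toy stand-in for the real word vectors (assumption, for illustration only).
toy_wv = {
    "apple": np.array([1.0, 0.0, 2.0], dtype=np.float32),
    "banana": np.array([3.0, 2.0, 0.0], dtype=np.float32),
}

# Unknown tokens such as "u.s.-based" or " " are skipped instead of raising.
vec = mean_of_known_vectors(["apple", "u.s.-based", " ", "banana"], toy_wv, dim=3)
print(vec)  # [2. 1. 1.]
```

Returning zeros for a text with no known tokens is a design choice; dropping such rows from the DataFrame is another reasonable option.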

elandil2 commented on Nov 3, 2022

I think he used an old version of the gensim library; a lot of attributes changed from 3.8 to 4.0. I am facing the same issue and tried a couple of things, but they didn't help at all. The library is poorly documented, to be honest. I have been searching for hours and couldn't find anything useful.
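Since the attribute depends on the installed gensim version, one hedged option is a helper that uses `get_mean_vector` when the object has it and otherwise falls back to a plain NumPy mean. This is only a sketch: the name `sentence_vector` and the two toy classes are hypothetical stand-ins so the example runs without downloading a model. As far as I know, gensim's own `get_mean_vector` normalizes each vector before averaging by default, so the two branches may not return identical numbers on a real model.

```python
import numpy as np


def sentence_vector(wv, tokens):
    """Return one vector for `tokens`, whatever the keyed-vectors version."""
    # Newer objects expose get_mean_vector; use it when present.
    if hasattr(wv, "get_mean_vector"):
        return wv.get_mean_vector(tokens)
    # Fallback for older objects: average the vectors of known tokens.
    known = [wv[t] for t in tokens if t in wv]
    if not known:
        raise ValueError("none of the tokens has a vector")
    return np.mean(known, axis=0)


class OldStyleVectors:
    """Toy stand-in for an object without get_mean_vector (assumption)."""

    def __init__(self, table):
        self._table = table

    def __contains__(self, key):
        return key in self._table

    def __getitem__(self, key):
        return self._table[key]


class NewStyleVectors(OldStyleVectors):
    """Toy stand-in for an object that does provide get_mean_vector."""

    def get_mean_vector(self, keys):
        return np.mean([self._table[k] for k in keys if k in self._table], axis=0)


table = {"good": np.array([2.0, 4.0]), "news": np.array([4.0, 0.0])}
old_result = sentence_vector(OldStyleVectors(table), ["good", "news", "2018"])
new_result = sentence_vector(NewStyleVectors(table), ["good", "news", "2018"])
print(old_result, new_result)
```

Checking the version directly with `gensim.__version__` and upgrading the package is the other obvious route.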

shiv425 commented on Nov 4, 2022

```
def preprocess_and_vectorize(text):
    # Remove stop words and punctuation, then lemmatize the text.
    doc = nlp(text)
    filtered_tokens = []
    arr = []
    for token in doc:
        if token.is_stop or token.is_punct:
            continue
        filtered_tokens.append(token.lemma_)
    for token in filtered_tokens:
        try:
            arr.append(wv[token])
        except KeyError:
            # Skip tokens that have no vector in wv.
            continue

    # Average over tokens to get a single vector for the text.
    return np.mean(arr, axis=0)
```

I used this code. The try/except is there because many words have no vector in wv.

meet5398 commented on May 5, 2023

Solution to the problem

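This is the alternative I have found for this problem, and it's working:

import spacy
import numpy as np

nlp = spacy.load("en_core_web_lg")
# wv is assumed to be the gensim word-vector object already loaded earlier.

def preprocess_and_vectorize(text):
    # Remove stop words and punctuation, then lemmatize the text.
    doc = nlp(text)
    filtered_tokens = []
    for token in doc:
        if token.is_punct or token.is_stop:
            continue
        filtered_tokens.append(token.lemma_)
    # axis=0 averages over tokens and returns one vector per text;
    # without it np.mean collapses everything to a single number.
    # Note: wv[...] raises KeyError if any token has no vector.
    return np.mean(wv[filtered_tokens], axis=0)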

'Word2VecKeyedVectors' object has no attribute 'get_mean_vector' · Issue #19 · codebasics/nlp-tutorials