This repository has been archived by the owner on Sep 27, 2024. It is now read-only.

Save average document length after fitting #1

Open
dunefox opened this issue Feb 10, 2022 · 1 comment
Comments

dunefox commented Feb 10, 2022

I think avgdl (the average document length) should be saved as an attribute during fitting, so that it is not estimated again when transform is called on a single document instead of the 'training' corpus.

So

fit(X).transform(X)

makes sense because all documents in X use the same avgdl but

fit(X).transform(X)
transform(other_document)

then estimates avgdl for this document alone again.
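To make the proposal concrete, here is a minimal sketch of the idea, not the repository's actual code. The class name `TinyBM25`, the attribute name `avgdl_`, and the dense list-of-lists input are all assumptions made for illustration, and the IDF factor is left out so that only the length normalisation is visible. The average document length is computed once in `fit` and reused in `transform`:

```python
class TinyBM25:
    """Hypothetical, simplified BM25 term weighting over dense count rows."""

    def __init__(self, k1=2.0, b=0.75):
        self.k1 = k1
        self.b = b

    def fit(self, X):
        # X is a list of rows; each row holds the term counts of one document.
        lengths = [sum(row) for row in X]
        # Store the corpus-level average document length once, at fit time.
        self.avgdl_ = sum(lengths) / len(lengths)
        return self

    def transform(self, X):
        weighted = []
        for row in X:
            dl = sum(row)
            # Normalise by the stored corpus average, not by the mean of X.
            norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl_)
            weighted.append(
                [tf * (self.k1 + 1) / (tf + norm) if tf else 0.0 for tf in row]
            )
        return weighted


corpus = [[1, 0, 2], [3, 1, 0], [0, 0, 1]]
model = TinyBM25().fit(corpus)

full = model.transform(corpus)          # whole 'training' corpus
single = model.transform([corpus[0]])   # one document on its own

# The same document gets the same weights in both calls.
print(single[0] == full[0])
```

With avgdl recomputed inside `transform`, the single-document call would normalise the document against its own length, so its weights would differ from those it received as part of the corpus.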

Owner

arosh commented Feb 10, 2022

You are completely right. It is terrible that this issue has been overlooked for several years...

@arosh arosh added the bug label Feb 10, 2022