Taken from the MLJText.jl requirements for transformers:
Generate a vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element would be one of the following:
A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.
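To make the three element types concrete, here is a minimal sketch in plain Julia (Base only). The helpers `tokenize`, `ngrams`, and `bag` are hypothetical names chosen for illustration and are not part of MLJText.jl. The regex tokenizer is a deliberately crude assumption, and the sketch treats an ngram as a run of consecutive tokens.

```julia
# Hypothetical helpers for illustration only; not the MLJText.jl API.

# Split a document into tokens: runs of word characters, or single
# punctuation marks. A crude stand-in for a real tokenizer.
tokenize(doc::AbstractString) =
    [String(m.match) for m in eachmatch(r"\w+|[^\w\s]", doc)]

# All ngrams of length n over consecutive tokens, each as a tuple of strings.
ngrams(tokens, n::Integer) =
    [Tuple(tokens[i:i+n-1]) for i in 1:length(tokens)-n+1]

# Count occurrences of each item, giving a dictionary of counts.
function bag(items)
    counts = Dict{eltype(items),Int}()
    for item in items
        counts[item] = get(counts, item, 0) + 1
    end
    return counts
end

doc = "I like Sam. Sam is nice."

# 1. Tokenized document: a vector of strings.
tokens = tokenize(doc)

# 2. Bag of words: counts indexed on strings.
word_bag = bag(tokens)

# 3. Bag of plain ngrams: counts indexed on tuples of strings (bigrams here).
bigram_bag = bag(ngrams(tokens, 2))
```

A transformer meeting the requirement would return a vector with one such element per input document, for example `[tokenize(d) for d in docs]` or `[bag(tokenize(d)) for d in docs]`.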