Apply function to words in documents
newDocuments = docfun(func,documents) calls the function specified by the function handle func and passes elements of documents as a string vector of words.
If func accepts exactly one input argument, then the words of newDocuments(i) are the output of func(string(documents(i))).
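The one-input case can be sketched as follows. This is a minimal example that assumes Text Analytics Toolbox is installed; the sample sentences and the choice of reverse as func are illustrative only.

```matlab
% Requires Text Analytics Toolbox.
str = [
    "an example of a short sentence"
    "a second short sentence"];
documents = tokenizedDocument(str);

% reverse takes one input, so docfun passes only the words of each document.
newDocuments = docfun(@reverse, documents);

% View the words of the second document as a string vector.
words = string(newDocuments(2));
disp(words)
```

Each word is reversed independently, and the number of words per document is unchanged.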
If func accepts two input arguments, then the words of newDocuments(i) are the output of func(string(documents(i)),details), where details contains the corresponding token details output by tokenDetails.
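A sketch of the two-input case, assuming Text Analytics Toolbox is installed and that details is a table with a Type variable, as in the output of tokenDetails. The anonymous function and the "/" separator are illustrative choices, not part of the docfun interface.

```matlab
% Requires Text Analytics Toolbox.
documents = tokenizedDocument("the price is 42");

% The function takes two inputs, so docfun also passes the token details.
% reshape aligns the Type column with the orientation of the words vector.
func = @(words, details) ...
    words + "/" + reshape(string(details.Type), size(words));

newDocuments = docfun(func, documents);

% Each word now carries its token type as a suffix.
tagged = string(newDocuments(1));
disp(tagged)
```

The reshape call guards against a row/column mismatch between the words and the table column, which would otherwise expand the result through implicit expansion.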
If func changes the number of words in the document, then docfun removes the token details from that document.
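As an illustration of a function that changes the word count, the sketch below drops words of two characters or fewer. It assumes Text Analytics Toolbox is installed; the length threshold and the name keepLong are hypothetical.

```matlab
% Requires Text Analytics Toolbox.
documents = tokenizedDocument("an example of a short sentence");

% keepLong (illustrative name) returns fewer words than it receives.
keepLong = @(words) words(strlength(words) > 2);

newDocuments = docfun(keepLong, documents);

% The document now holds only the longer words.
kept = string(newDocuments(1));
disp(kept)
```

Because the output has fewer words than the input, any details attached to the original tokens (for example, part-of-speech tags) should not be relied on afterwards.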
docfun does not perform the calls to func in a specific order.
newDocuments = docfun(func,documents1,...,documentsN) calls the function specified by the function handle func and passes elements of documents1,…,documentsN as string vectors of words, where N is the number of inputs to the function func. The words of newDocuments(i) are the output of func(string(documents1(i)),...,string(documentsN(i))).
Each of documents1,…,documentsN must be the same size.
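The multi-input syntax can be sketched as below, assuming Text Analytics Toolbox is installed. The second array holds made-up tag strings purely for illustration, and the "_" separator is an arbitrary choice.

```matlab
% Requires Text Analytics Toolbox.
% Two tokenizedDocument arrays of the same size (2-by-1).
documents1 = tokenizedDocument([
    "a short sentence"
    "another example"]);
documents2 = tokenizedDocument([
    "DT JJ NN"
    "DT NN"]);

% func receives the words of documents1(i) and documents2(i) together.
func = @(words, tags) words + "_" + tags;

newDocuments = docfun(func, documents1, documents2);

% Inspect the combined words of the second document.
combined = string(newDocuments(2));
disp(combined)
```

In this sketch each pair of documents also has the same number of words, since the element-wise concatenation inside func would otherwise fail.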
See also: addPartOfSpeechDetails | addSentenceDetails | bagOfNgrams | bagOfWords | decodeHTMLEntities | lower | plus | regexprep | replace | tokenDetails | tokenizedDocument | upper