newBag = removeDocument(bag,idx)
removes the documents with indices specified by idx from the
bag-of-words or bag-of-n-grams model bag. If the removed
documents contain words or n-grams that do not appear in the remaining documents,
then the function also removes these words or n-grams from
the returned model, newBag.
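The vocabulary pruning described above can be checked with a minimal sketch. It assumes Text Analytics Toolbox is installed. The sample text and the variable name droppedWords are illustrative and not part of the original example.

```matlab
% Illustrative data (an assumption, not taken from the documentation example).
documents = tokenizedDocument([
    "apple banana"
    "banana cherry"
    "cherry date"]);
bag = bagOfWords(documents);

% Remove the first document. The word "apple" occurs only in that document.
newBag = removeDocument(bag,1);

% List the words that are in the original vocabulary but not in the new one.
droppedWords = setdiff(bag.Vocabulary,newBag.Vocabulary)

% Compare the vocabulary sizes and document counts before and after removal.
[bag.NumWords newBag.NumWords]
[bag.NumDocuments newBag.NumDocuments]
```

Comparing the Vocabulary properties with setdiff is one simple way to see which words were dropped along with the removed documents.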
Remove selected documents from a bag-of-words model.
documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"
    "a third example"
    "a final sentence"]);
bag = bagOfWords(documents)
bag = 
  bagOfWords with properties:

          Counts: [4x9 double]
      Vocabulary: [1x9 string]
        NumWords: 9
    NumDocuments: 4