newBag = join(bag)
combines the elements in the array bag by merging the frequency counts. The function combines the elements along the first dimension whose size does not equal 1.
newBag = join(bag,dim)
combines the elements in the array bag along the dimension dim.
Create an array of two bags-of-words models from tokenized documents.
str = [ ...
    "an example of a short sentence"
    "a second short sentence"];
documents = tokenizedDocument(str);
bag(1) = bagOfWords(documents(1));
bag(2) = bagOfWords(documents(2))
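The array of models can then be merged with join. The following is a minimal sketch that repeats the setup so that it runs on its own; it assumes that Text Analytics Toolbox is installed.

```matlab
% Build one bag-of-words model per document (same setup as above).
str = [ ...
    "an example of a short sentence"
    "a second short sentence"];
documents = tokenizedDocument(str);
bag(1) = bagOfWords(documents(1));
bag(2) = bagOfWords(documents(2));

% Merge the array of models into a single model.
newBag = join(bag)
```

The joined model holds the documents of both input models, so newBag.NumDocuments should equal the total number of documents across the elements of bag.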
If your text data is contained in multiple files in a folder, then you can import the text data and create a bag-of-words model in parallel using parfor. If you have Parallel Computing Toolbox™ installed, then the parfor loop runs in parallel; otherwise, it runs in serial. Use join to combine an array of bag-of-words models into one model.
Create a bag-of-words model from a collection of files. The example sonnets have file names "exampleSonnetN.txt", where N is the number of the sonnet. Get a list of the files and their locations using dir.
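The workflow described here can be sketched as follows. This is an illustration under stated assumptions, not the original example code: textFolder and the two sample files it contains are hypothetical stand-ins written to a temporary folder so that the sketch runs on its own. Point textFolder at the folder that holds your own files instead.

```matlab
% Hypothetical setup: write two small text files to a temporary folder.
textFolder = tempname;
mkdir(textFolder);
sampleText = ["an example of a short sentence" "a second short sentence"];
for k = 1:numel(sampleText)
    fid = fopen(fullfile(textFolder,sprintf('exampleSonnet%d.txt',k)),'w');
    fprintf(fid,'%s',sampleText(k));
    fclose(fid);
end

% Get a list of the files and their locations using dir.
fileInfo = dir(fullfile(textFolder,'exampleSonnet*.txt'));

% Create one bag-of-words model per file. parfor runs in parallel when
% Parallel Computing Toolbox is available and in serial otherwise.
bag = bagOfWords;
numFiles = numel(fileInfo);
parfor i = 1:numFiles
    f = fileInfo(i);
    filename = fullfile(f.folder,f.name);
    textData = extractFileText(filename);
    document = tokenizedDocument(textData);
    bag(i) = bagOfWords(document);
end

% Combine the array of models into one model.
bag = join(bag)
```

Initializing bag with an empty bagOfWords model before the loop gives parfor a variable of the right class to fill element by element.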
bag — Array of bag-of-words or bag-of-n-grams models
bagOfWords array | bagOfNgrams array
Array of bag-of-words or bag-of-n-grams models, specified as a bagOfWords array or a bagOfNgrams array. If bag is a bagOfNgrams array, then each element to be joined must have the same value for the NgramLengths property.
dim — Dimension along which to join models
positive integer
Dimension along which to join models, specified as a positive integer. If dim is not specified, then the default is the first dimension with a size that does not equal 1.
newBag — Output model
bagOfWords array | bagOfNgrams array
Output model, returned as a bagOfWords object or a bagOfNgrams object. newBag has the same type as bag and has a size of 1 along the dimension being joined.
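As an illustration of the dim argument and the output size, the following sketch joins a 1-by-2 array of models along its second dimension. The sample sentences are chosen for illustration only, and the sketch assumes that Text Analytics Toolbox is installed.

```matlab
% Build a 1-by-2 array of bag-of-words models.
documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"]);
bag(1) = bagOfWords(documents(1));
bag(2) = bagOfWords(documents(2));

% The models lie along the second dimension, so join along dim = 2.
newBag = join(bag,2);

% Inspect the size and class of the result.
size(newBag)
class(newBag)
```

Because the array has size 1 along its first dimension, join(bag) selects the second dimension by default, so specifying dim explicitly here mainly documents the intent.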