Remove stop words from documents
Words like "a", "and", "to", and "the" (known as stop words) can add noise to data. Use this function to remove stop words before analysis.
The function supports English, Japanese, German, and Korean text. To learn how to use removeStopWords for other languages, see Language Considerations.
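For text in a language without a built-in stop word list, one possible approach is to supply your own list to removeWords. The sketch below assumes French input; the sample sentence and the short stop word list are hypothetical, chosen only for illustration.

```matlab
% Tokenize a short French sentence (illustrative input).
documents = tokenizedDocument("le chat dort sur le tapis");

% Hypothetical, hand-written stop word list for this example only.
customStopWords = ["le" "la" "les" "sur" "de"];

% removeWords deletes every token that matches an entry in the list.
newDocuments = removeWords(documents, customStopWords);
```

A real application would need a much fuller list than the few words shown here.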
newDocuments = removeStopWords(documents) removes the stop words from the tokenizedDocument array documents.
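As a minimal sketch of this syntax, the following creates a small tokenizedDocument array and removes its stop words. The sample sentences are illustrative, and the example assumes Text Analytics Toolbox is installed.

```matlab
% Build a tokenizedDocument array from two short English sentences.
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);

% Remove stop words such as "an", "of", and "a" from every document.
newDocuments = removeStopWords(documents);

% Display the remaining tokens.
disp(newDocuments)
```

The output array has the same size as the input; only the tokens inside each document change.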
Use removeStopWords before using the normalizeWords function, because removeStopWords uses information that normalizeWords removes.
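The recommended order might look like the sketch below in a preprocessing script. The sample sentence is illustrative, and normalizeWords is called with its default options.

```matlab
% Tokenize the raw text first.
documents = tokenizedDocument("the runners were running in the park");

% Step 1: remove stop words while the token details are still intact.
documents = removeStopWords(documents);

% Step 2: normalize the remaining words (for example, by stemming).
documents = normalizeWords(documents);
```

Swapping the two steps would run removeStopWords on documents that normalizeWords has already stripped of the information it relies on.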
bagOfWords | normalizeWords | removeLongWords | removeShortWords | removeWords | stopWords | tokenizedDocument