Convert documents to uppercase
newDocuments = upper(documents) converts each lowercase character in the input
documents to the corresponding uppercase character, and leaves all other
characters unchanged.
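
The following is a minimal sketch of how the call might be used, assuming Text Analytics Toolbox is installed. The sample sentences and the use of joinWords to inspect the result are illustrative choices, not part of the original page.

```matlab
% Build a small tokenizedDocument array from made-up sample text.
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);

% Convert every lowercase character in the documents to uppercase.
newDocuments = upper(documents);

% Join the tokens of each document back into a string for inspection.
converted = joinWords(newDocuments)
```

Characters without an uppercase counterpart, such as digits, should pass through unchanged, as the description above states.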
decodeHTMLEntities | erasePunctuation | eraseTags | eraseURLs | lower | tokenizedDocument