Language Support

Information on language support in Text Analytics Toolbox™.

Text Analytics Toolbox supports English, Japanese, German, and Korean. Most Text Analytics Toolbox functions also work with text in other languages. For more information, see Language Considerations.

Functions

tokenizedDocument         Array of tokenized documents for text analysis
removeStopWords           Remove stop words from documents
normalizeWords            Stem or lemmatize words
stopWords                 List of stop words
mecabOptions              Options for MeCab tokenization
tokenDetails              Details of tokens in tokenized document array
addSentenceDetails        Add sentence numbers to documents
addPartOfSpeechDetails    Add part-of-speech tags to documents
addEntityDetails          Add entity tags to documents
addLemmaDetails           Add lemma forms of tokens to documents
addLanguageDetails        Add language identifiers to documents
corpusLanguage            Detect language of text

Topics

English Language

Text Data Preparation

Import text data into MATLAB® and preprocess it for analysis

Modeling and Prediction

Develop predictive models using topic models and word embeddings

Display and Presentation

Visualize text data and models using word clouds and text scatter plots

Japanese Language

Japanese Language Support

Information on Japanese support in Text Analytics Toolbox.

Analyze Japanese Text Data

This example shows how to import, prepare, and analyze Japanese text data using a topic model.

German Language

German Language Support

Information on German support in Text Analytics Toolbox.

Analyze German Text Data

This example shows how to import, prepare, and analyze German text data using a topic model.

Korean Language

Korean Language Support

Information on Korean support in Text Analytics Toolbox.

Other Languages

Language Considerations

Information on using Text Analytics Toolbox features for other languages.

Language-Independent Features

Text Analytics Toolbox features that do not depend on language details.

Featured Examples