To find clusters and extract features from high-dimensional text datasets, you can use machine learning techniques and models such as latent semantic analysis (LSA), latent Dirichlet allocation (LDA), and word embeddings. You can combine features created with Text Analytics Toolbox™ with features from other data sources. With these features, you can build machine learning models that take advantage of textual, numeric, and other types of data.
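As a rough, language-neutral illustration of combining text-derived features with features from another source, the following Python sketch builds word-count features and appends a numeric column to each row. It does not use the Text Analytics Toolbox API. The function names `bag_of_words` and `combine_features`, the sample documents, and the sensor readings are all hypothetical and exist only for this illustration.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, counts) using simple lowercase whitespace tokenization."""
    tokenized = [doc.lower().split() for doc in documents]
    # Sorted vocabulary gives every document the same, stable column order.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = []
    for tokens in tokenized:
        freq = Counter(tokens)
        counts.append([freq[word] for word in vocabulary])
    return vocabulary, counts


def combine_features(text_counts, numeric_features):
    """Append each document's numeric features to its word-count row."""
    return [row + list(extra) for row, extra in zip(text_counts, numeric_features)]


# Hypothetical data: short text reports plus one numeric reading per report.
documents = ["pump leaking oil", "oil pressure low", "pump making loud noise"]
sensor_readings = [[3.2], [1.1], [4.8]]

vocabulary, counts = bag_of_words(documents)
features = combine_features(counts, sensor_readings)

for row in features:
    print(row)
```

Each row of `features` holds the word counts followed by the numeric reading, so a single model can be trained on both kinds of data. A real workflow would typically add tokenization rules, stop-word removal, and feature scaling, which this sketch leaves out.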
Create Simple Text Model for Classification
This example shows how to train a simple text classifier on word frequency counts using a bag-of-words model.
Classify Text Data Using Deep Learning
This example shows how to classify text data using a deep learning long short-term memory (LSTM) network.
Classify Text Data Using Convolutional Neural Network
This example shows how to classify text data using a convolutional neural network.
Classify Out-of-Memory Text Data Using Deep Learning
This example shows how to classify out-of-memory text data with a deep learning network using a transformed datastore.
Analyze Text Data Using Multiword Phrases
This example shows how to analyze text using n-gram frequency counts.
Analyze Text Data Using Topic Models
This example shows how to use the latent Dirichlet allocation (LDA) topic model to analyze text data.
Choose Number of Topics for LDA Model
This example shows how to decide on a suitable number of topics for a latent Dirichlet allocation (LDA) model.
This example shows how to compare latent Dirichlet allocation (LDA) solvers by comparing the goodness of fit and the time taken to fit the model.
Create Simple Preprocessing Function
This example shows how to create a function that cleans and preprocesses text data for analysis.
This example shows how to train a classifier for sentiment analysis using an annotated list of positive and negative sentiment words and a pretrained word embedding.
Sequence-to-Sequence Translation Using Attention
This example shows how to convert decimal strings to Roman numerals using a recurrent sequence-to-sequence encoder-decoder model with attention.
Generate Text Using Deep Learning (Deep Learning Toolbox)
This example shows how to train a deep learning long short-term memory (LSTM) network to generate text.
Pride and Prejudice and MATLAB
This example shows how to train a deep learning LSTM network to generate text using character embeddings.
Word-By-Word Text Generation Using Deep Learning
This example shows how to train a deep learning LSTM network to generate text word-by-word.
Information on using Text Analytics Toolbox features for other languages.
Information on Japanese support in Text Analytics Toolbox.
This example shows how to import, prepare, and analyze Japanese text data using a topic model.
Information on German support in Text Analytics Toolbox.
This example shows how to import, prepare, and analyze German text data using a topic model.