strsplit

Split string or character vector at specified delimiter

Description

C = strsplit(str) splits str at whitespace and returns the parts in C. Whitespace is any character in the set {' ','\f','\n','\r','\t','\v'}.

If str has consecutive whitespace characters, then strsplit treats them as one whitespace.
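
As a small sketch of the collapsing behavior described above, the following builds a character vector with mixed runs of spaces, tabs, and a newline. The variable names and sample text are illustrative only.

```matlab
% Build text with runs of spaces, tabs, and a newline between words
str = sprintf('alpha  beta\t\tgamma \n delta');

% Default syntax: split at whitespace
C = strsplit(str);

% Each run of whitespace acts as one delimiter, so C is expected to hold
% the four words with no empty elements between them
disp(C)
```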

C = strsplit(str,delimiter) splits str at the delimiters specified by delimiter.

If str has consecutive delimiters, with no other characters between them, then strsplit treats them as one delimiter. For example, both strsplit('Hello,world',',') and strsplit('Hello,,,world',',') return the same output.
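
The equivalence stated above can be checked directly. This is a minimal sketch; A, B, and tf are illustrative names.

```matlab
% One comma versus three consecutive commas
A = strsplit('Hello,world', ',');
B = strsplit('Hello,,,world', ',');

% Consecutive delimiters collapse by default, so both results
% should contain the same two character vectors
tf = isequal(A, B);
disp(tf)
```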

C = strsplit(str,delimiter,Name,Value) specifies additional delimiter options using one or more name-value pair arguments. For example, to treat consecutive delimiters as separate delimiters, you can specify 'CollapseDelimiters',false.
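
A short sketch contrasting the two settings of 'CollapseDelimiters' on the same input. The input text is made up for illustration.

```matlab
str = 'a,,b';

% Default behavior: the two commas count as one delimiter
C1 = strsplit(str, ',');

% Treat each comma as its own delimiter; an empty element
% is expected between the two commas
C2 = strsplit(str, ',', 'CollapseDelimiters', false);

fprintf('collapsed: %d elements, separate: %d elements\n', ...
    numel(C1), numel(C2));
```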

[C,matches] = strsplit(___) additionally returns the array, matches. The matches output argument contains all occurrences of delimiters upon which strsplit splits str. You can use this syntax with any of the input arguments of the previous syntaxes.
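
One use of the matches output is to rebuild the original text. The sketch below assumes that strjoin accepts a cell array of delimiters with one fewer element than C; the sample text is illustrative.

```matlab
str = 'red, green;blue';

% Split on two different delimiters and keep the matched delimiters
[C, matches] = strsplit(str, {', ', ';'});

% Interleave the parts with the delimiters that separated them
rebuilt = strjoin(C, matches);

% rebuilt is expected to equal the original text
disp(isequal(rebuilt, str))
```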

Examples

Split a character vector on whitespace.

str = 'The rain in Spain.';
C = strsplit(str)
C = 1x4 cell
    {'The'}    {'rain'}    {'in'}    {'Spain.'}

C is a cell array containing four character vectors.

Split a character vector that contains comma-separated values.

data = '1.21, 1.985, 1.955, 2.015, 1.885';
C = strsplit(data,', ')
C = 1x5 cell
    {'1.21'}    {'1.985'}    {'1.955'}    {'2.015'}    {'1.885'}

Split a character vector, data, which contains the units m/s with an arbitrary amount of whitespace on either side of the text. The regular expression, \s*, matches zero or more whitespace characters.

data = '1.21m/s1.985m/s 1.955 m/s2.015 m/s 1.885m/s';
[C,matches] = strsplit(data,'\s*m/s\s*',...
    'DelimiterType','RegularExpression')
C = 1x6 cell
    {'1.21'}    {'1.985'}    {'1.955'}    {'2.015'}    {'1.885'}    {0x0 char}

matches = 1x5 cell
    {'m/s'}    {'m/s '}    {' m/s'}    {' m/s '}    {'m/s'}

In this case, the last character vector in C is empty. This empty character vector follows the last matched delimiter.
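
If the trailing empty element is unwanted, one possible follow-up is to filter it out before converting the parts to numbers. This is a sketch, not part of the example above; the use of cellfun and str2double here is one choice among several.

```matlab
data = '1.21m/s1.985m/s 1.955 m/s2.015 m/s 1.885m/s';
C = strsplit(data, '\s*m/s\s*', 'DelimiterType', 'RegularExpression');

% Drop empty elements, such as the one after the final delimiter
C = C(~cellfun(@isempty, C));

% Convert the remaining character vectors to a numeric row vector
values = str2double(C);
disp(values)
```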

Split a character vector that contains a Windows path on the backslash character.

myPath = 'C:\work\matlab';
C = strsplit(myPath,'\')
C = 1x3 cell
    {'C:'}    {'work'}    {'matlab'}

Split a character vector on ' ' and 'ain', treating multiple delimiters as one. Specify multiple delimiters in a cell array of character vectors.

str = 'The rain in Spain stays mainly in the plain.';
[C,matches] = strsplit(str,{' ','ain'},'CollapseDelimiters',true)
C = 1x11 cell
  Columns 1 through 7

    {'The'}    {'r'}    {'in'}    {'Sp'}    {'stays'}    {'m'}    {'ly'}

  Columns 8 through 11

    {'in'}    {'the'}    {'pl'}    {'.'}

matches = 1x10 cell
  Columns 1 through 7

    {' '}    {'ain '}    {' '}    {'ain '}    {' '}    {'ain'}    {' '}

  Columns 8 through 10

    {' '}    {' '}    {'ain'}

Split the same character vector on whitespace and on 'ain', using regular expressions and treating multiple delimiters separately.

[C,matches] = strsplit(str,{'\s','ain'},'CollapseDelimiters',...
    false, 'DelimiterType','RegularExpression')
C = 1x13 cell
  Columns 1 through 6

    {'The'}    {'r'}    {0x0 char}    {'in'}    {'Sp'}    {0x0 char}

  Columns 7 through 13

    {'stays'}    {'m'}    {'ly'}    {'in'}    {'the'}    {'pl'}    {'.'}

matches = 1x12 cell
  Columns 1 through 8

    {' '}    {'ain'}    {' '}    {' '}    {'ain'}    {' '}    {' '}    {'ain'}

  Columns 9 through 12

    {' '}    {' '}    {' '}    {'ain'}

In this case, strsplit treats the two delimiters separately, so empty character vectors appear in output C between the consecutively matched delimiters.

Split text on the character vectors ', ' and ', and '.

str = 'bacon, lettuce, and tomato';
[C,matches] = strsplit(str,{', ',', and '})
C = 1x3 cell
    {'bacon'}    {'lettuce'}    {'and tomato'}

matches = 1x2 cell
    {', '}    {', '}

Because the command lists ', ' first and ', and ' contains ', ', the strsplit function splits str on the first delimiter and never proceeds to the second delimiter.

If you reverse the order of delimiters, ', and ' takes priority.

str = 'bacon, lettuce, and tomato';
[C,matches] = strsplit(str,{', and ',', '})
C = 1x3 cell
    {'bacon'}    {'lettuce'}    {'tomato'}

matches = 1x2 cell
    {', '}    {', and '}

Input Arguments

Input text, str, specified as a character vector or a string scalar.

Data Types: char | string

Delimiting characters, delimiter, specified as a character vector, a 1-by-n cell array of character vectors, or a 1-by-n string array. Text specified in delimiter does not appear in the output C.

Specify multiple delimiters in a cell array or a string array. The strsplit function splits str on the elements of delimiter. The order in which delimiters appear in delimiter does not matter unless multiple delimiters begin a match at the same character in str. In that case strsplit splits on the first matching delimiter in delimiter.
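
The ordering rule above can be sketched with two small cases. The inputs are made up for illustration, and the expected results in the comments follow from the rule as stated.

```matlab
% Case 1: the delimiters never start a match at the same character,
% so their order in the cell array should not change the result
P = strsplit('a-b_c', {'-', '_'});
Q = strsplit('a-b_c', {'_', '-'});
sameResult = isequal(P, Q);

% Case 2: 'a' and 'ab' both start a match at the same character,
% so the delimiter listed first is the one that is used
R = strsplit('xaby', {'a', 'ab'});   % expected to split on 'a'
S = strsplit('xaby', {'ab', 'a'});   % expected to split on 'ab'

disp(sameResult), disp(R), disp(S)
```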

delimiter can include the following escape sequences:

Escape sequence    Character
\\                 Backslash
\0                 Null
\a                 Alarm
\b                 Backspace
\f                 Form feed
\n                 New line
\r                 Carriage return
\t                 Horizontal tab
\v                 Vertical tab
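
As a brief sketch of an escape sequence in use, the following splits tab-separated text by passing '\t' as the delimiter. The sample row is illustrative.

```matlab
% sprintf converts \t into actual tab characters
row = sprintf('id\tname\tscore');

% strsplit interprets the escape sequence '\t' as a tab
fields = strsplit(row, '\t');
disp(fields)
```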

Example: ','

Example: {'-',','}

Data Types: char | cell | string

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'DelimiterType','RegularExpression' instructs strsplit to treat delimiter as a regular expression.

Multiple delimiter handling, specified as the comma-separated pair consisting of 'CollapseDelimiters' and either true (the default) or false. If true, then consecutive delimiters in str are treated as one. If false, then consecutive delimiters are treated as separate delimiters, resulting in empty character vector '' elements between matched delimiters.

Example: 'CollapseDelimiters',true

Delimiter type, specified as the comma-separated pair consisting of 'DelimiterType' and one of the following character vectors.

'Simple'               Except for escape sequences, strsplit treats delimiter as literal text. This is the default.
'RegularExpression'    strsplit treats delimiter as a regular expression.

In both cases, delimiter can include escape sequences.
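
To illustrate the difference between the two delimiter types, this sketch passes the same delimiter text under each setting. The input and pattern are illustrative.

```matlab
str = 'a1b22c';

% 'Simple' (default): '[0-9]+' is literal text, which does not occur
% in str, so no split is expected
literalParts = strsplit(str, '[0-9]+');

% 'RegularExpression': '[0-9]+' matches each run of digits
regexParts = strsplit(str, '[0-9]+', 'DelimiterType', 'RegularExpression');

disp(literalParts), disp(regexParts)
```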

Output Arguments

Parts of the original character vector, C, returned as a cell array of character vectors or as a string array. C always contains one more element than matches contains. Therefore, if str begins with a delimiter, then the first element of C contains no characters. If str ends with a delimiter, then the last element of C contains no characters.

Identified delimiters, matches, returned as a cell array of character vectors or as a string array. matches always contains one fewer element than output C contains. If str is a character vector, then matches is a cell array of character vectors. If str is a string scalar, then matches is a string array.
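
The relationship between the sizes of C and matches can be seen with text that begins and ends with a delimiter. This is a minimal sketch with made-up input.

```matlab
% The text starts and ends with the delimiter
[C, matches] = strsplit(',a,b,', ',');

% C is expected to have one more element than matches, with empty
% first and last elements
fprintf('numel(C) = %d, numel(matches) = %d\n', numel(C), numel(matches));
disp(isempty(C{1}) && isempty(C{end}))
```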

Tips

  • Starting in R2016b, the split function is recommended to split elements of a string array.
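
A short sketch of the split function for comparison, assuming R2017a or later for the double-quoted string syntax. The names and sample data are illustrative. Unlike strsplit, split operates on every element of a string array, provided each element splits into the same number of parts.

```matlab
% Split every element of a string array at whitespace
names = ["Mary Jones"; "John Smith"];
parts = split(names);          % expected to be a 2-by-2 string array

% Split a string scalar at a specified delimiter
words = split("a,b,c", ",");   % expected to be a 3-by-1 string array

disp(parts), disp(words)
```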

Introduced in R2013a