Search and Replace Text

You can search for text in character arrays and string arrays, and replace substrings with new text. String arrays, along with new functions to search for and replace text, were introduced in R2016b. Search for substrings with functions such as contains. Similarly, replace text with the replace function, or extract text with functions such as extractBetween. You can use any of these functions with either character vectors or string arrays. For compatibility, you can also use older functions such as strfind and strrep with both character vectors and string arrays.

Search for Text

Identify text in string arrays, character vectors, or cell arrays of character vectors with the contains, startsWith, and endsWith functions.

Create a string. Starting in R2017a, you can create strings using double quotes.

str = "Rosemary Jones"
str = 
"Rosemary Jones"

Determine whether str contains the substring mary. The contains function returns a logical 1 if it finds the substring anywhere within the string.

TF = contains(str,"mary")
TF = logical
   1

You can also use the strfind function to find matching text. strfind returns the index of the start of each match. In this case, strfind returns 5 because the m in mary is the fifth character of str.

idx = strfind(str,"mary")
idx = 5

Find multiple matches with strfind. When there are multiple matches, strfind returns the indices as an array.

idx = strfind(str,"s")
idx = 1x2

     3    14
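
When the input is a string array rather than a single string, strfind returns a cell array with one vector of indices per element. The following is a minimal sketch of that behavior. The variable names and the two sample names are illustrative and are not part of the examples above.

```matlab
% Find every occurrence of "n" in each element of a string array.
names = ["Rosemary Jones","Ann Young"];
idx = strfind(names,"n");          % cell array, one index vector per element

% Count the matches in each element.
numMatches = cellfun(@numel,idx);
```

Here idx{1} holds the single index for the first name and idx{2} holds the indices for the second, so the index vectors can have different lengths.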

Create a string array that contains many names. Determine which names contain the substring Ann. The contains function returns a logical array that has a 1 wherever str has an element that contains Ann. To create a new string array that includes only the matches, index into str with TF.

str = ["Rosemary Ann Jones","Peter Michael Smith","Ann Marie Young"]
str = 1x3 string
    "Rosemary Ann Jones"    "Peter Michael Smith"    "Ann Marie Young"

TF = contains(str,"Ann")
TF = 1x3 logical array

   1   0   1

matches = str(TF)
matches = 1x2 string
    "Rosemary Ann Jones"    "Ann Marie Young"

Find the strings that begin with Ann.

TF = startsWith(str,"Ann");
matches = str(TF)
matches = 
"Ann Marie Young"

Similarly, the endsWith function finds strings that end with a specified piece of text.
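
As a short sketch of endsWith, the following code applies it to the same array of names used above. The choice of "Young" as the text to match is only for illustration.

```matlab
% Keep only the names that end with "Young".
str = ["Rosemary Ann Jones","Peter Michael Smith","Ann Marie Young"];
TF = endsWith(str,"Young");   % logical array, true only for the third name
matches = str(TF);            % string array that holds the matching names
```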

You can also use the contains, startsWith, and endsWith functions to determine whether character vectors contain text.

chr = 'John Paul Jones'
chr = 
'John Paul Jones'
TF = contains(chr,'Paul')
TF = logical
   1

TF = endsWith(chr,'Paul')
TF = logical
   0

Use the contains function to find text in rows of a string array. census1905 contains a few rows of simulated census data for the year 1905. Each row contains a name, year of birth, and number of times that name was given in that year.

census1905 = ["Ann Mary","1905","230";
              "John","1905","5400";
              "Mary","1905","4600";
              "Maryjane","1905","304";
              "Paul","1905","1206"];

Find the rows where the name is equal to Mary.

TF = (census1905(:,1) == "Mary");
census1905(TF,:)
ans = 1x3 string
    "Mary"    "1905"    "4600"

Use the contains function to find the rows where the name is a variation of Mary.

TF = contains(census1905(:,1),"Mary");
census1905(TF,:)
ans = 3x3 string
    "Ann Mary"    "1905"    "230" 
    "Mary"        "1905"    "4600"
    "Maryjane"    "1905"    "304" 

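The contains function is case sensitive by default. As a sketch of how to relax that, the following code searches the same simulated census data for the lowercase pattern mary using the 'IgnoreCase' option. The lowercase pattern and the variable names TFignore and maryRows are assumptions made for illustration.

```matlab
census1905 = ["Ann Mary","1905","230";
              "John","1905","5400";
              "Mary","1905","4600";
              "Maryjane","1905","304";
              "Paul","1905","1206"];

% Case-sensitive search: no name contains lowercase "mary".
TF = contains(census1905(:,1),"mary");

% Case-insensitive search: matches "Ann Mary", "Mary", and "Maryjane".
TFignore = contains(census1905(:,1),"mary",'IgnoreCase',true);
maryRows = census1905(TFignore,:);
```
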
Replace Text

You can replace text in string arrays, character vectors, or cell arrays of character vectors with the replace function.

Create a string. Replace the substring mary with anne.

str = "Rosemary Jones"
str = 
"Rosemary Jones"
newStr = replace(str,"mary","anne")
newStr = 
"Roseanne Jones"

You can also replace text using the strrep function. However, the replace function is recommended.

newStr = strrep(str,"Jones","Day")
newStr = 
"Rosemary Day"

Create a string array that contains many names.

str = ["Rosemary Ann Jones","Peter Michael Smith","Ann Marie Young"]
str = 1x3 string
    "Rosemary Ann Jones"    "Peter Michael Smith"    "Ann Marie Young"

Specify multiple names to replace.

oldText = ["Ann","Michael"];
newText = ["Beth","John"]; 
newStr = replace(str,oldText,newText)
newStr = 1x3 string
    "Rosemary Beth Jones"    "Peter John Smith"    "Beth Marie Young"

Replace text in a character vector. You can use replace and replaceBetween with character vectors, as well as with strings.

chr = 'Mercury, Gemini, Apollo'
chr = 
'Mercury, Gemini, Apollo'
replace(chr,'Gemini','Mars')
ans = 
'Mercury, Mars, Apollo'
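
The replaceBetween function replaces whatever lies between two boundary substrings, leaving the boundaries themselves in place by default. As a minimal sketch, the following code produces the same result as the replace call above without naming Gemini directly. The boundary substrings chosen here are illustrative.

```matlab
% Replace the text between 'Mercury, ' and ', Apollo'.
chr = 'Mercury, Gemini, Apollo';
newChr = replaceBetween(chr,'Mercury, ',', Apollo','Mars');
```

Because the input is a character vector, newChr is also a character vector.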

Replace text in a string array of file names. The file names contain spaces, but spaces cannot be part of web addresses. Replace the space character, " ", with %20, which is the standard encoding of a space in web addresses.

str = ["Financial Report.docx";
       "Quarterly 2015 Details.docx";
       "Slides.pptx"]
str = 3x1 string
    "Financial Report.docx"
    "Quarterly 2015 Details.docx"
    "Slides.pptx"

newStr = replace(str," ","%20")
newStr = 3x1 string
    "Financial%20Report.docx"
    "Quarterly%202015%20Details.docx"
    "Slides.pptx"

Append the file names to the address of a website.

filenames = "http://example.com/Documents/" + newStr
filenames = 3x1 string
    "http://example.com/Documents/Financial%20Report.docx"
    "http://example.com/Documents/Quarterly%202015%20Details.docx"
    "http://example.com/Documents/Slides.pptx"

Extract Text

Extract a substring from string arrays or character vectors with the extractAfter, extractBefore, and extractBetween functions. Use these functions to extract different substrings that precede, follow, or occur between specified pieces of text.

Create a string array that contains file names. Extract the portions of the names after C:\Temp\ with the extractAfter function.

str = ["C:\Temp\MyReport.docx";
       "C:\Temp\Data\Sample1.csv";
       "C:\Temp\Slides.pptx"]
str = 3x1 string
    "C:\Temp\MyReport.docx"
    "C:\Temp\Data\Sample1.csv"
    "C:\Temp\Slides.pptx"

filenames = extractAfter(str,"C:\Temp\")
filenames = 3x1 string
    "MyReport.docx"
    "Data\Sample1.csv"
    "Slides.pptx"

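The extractBefore function works the same way, but returns the text that precedes the specified substring. As a sketch, the following code strips the extensions from the file names obtained above by extracting the text before the first period. The variable name basenames is an assumption made for illustration.

```matlab
filenames = ["MyReport.docx";
             "Data\Sample1.csv";
             "Slides.pptx"];

% Extract the text that precedes the first "." in each element.
basenames = extractBefore(filenames,".");
```

Because extractBefore stops at the first occurrence of the substring, a file name that contains more than one period would be cut at the first one.
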
Extract customer names from a string array that encodes the names within XML tags.

str = ["<CustomerName>Elizabeth Day</CustomerName>";
       "<CustomerName>George Adams</CustomerName>";
       "<CustomerName>Sarah Young</CustomerName>"]
str = 3x1 string
    "<CustomerName>Elizabeth Day</CustomerName>"
    "<CustomerName>George Adams</CustomerName>"
    "<CustomerName>Sarah Young</CustomerName>"

names = extractBetween(str,"<CustomerName>","</CustomerName>")
names = 3x1 string
    "Elizabeth Day"
    "George Adams"
    "Sarah Young"
