aa2nt

Convert amino acid sequence to nucleotide sequence

Syntax

SeqNT = aa2nt(SeqAA)
SeqNT = aa2nt(SeqAA, ...'GeneticCode', GeneticCodeValue, ...)
SeqNT = aa2nt(SeqAA, ...'Alphabet', AlphabetValue, ...)

Input Arguments

SeqAA

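One of the following:

  • Character vector or string of single-letter codes specifying an amino acid sequence.

  • Row vector of integers specifying an amino acid sequence.

Examples: 'ARN' or [1 2 3]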
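
The two input forms in the example above can be compared with a short sketch. It assumes the Bioinformatics Toolbox functions aa2int and nt2aa are available, and it sets the nt2aa option 'AlternativeStartCodons' to false so that start-codon handling does not affect the back-translation. The variable names are illustrative only.

```matlab
% The same peptide as letter codes and as integer codes.
seqLetters = 'ARN';
seqInts    = aa2int(seqLetters);   % integer form of the same sequence

ntFromLetters = aa2nt(seqLetters);
ntFromInts    = aa2nt(seqInts);

% Codons are chosen randomly, so the two nucleotide sequences can differ,
% but both should translate back to the original peptide.
backLetters = nt2aa(ntFromLetters, 'AlternativeStartCodons', false);
backInts    = nt2aa(ntFromInts,    'AlternativeStartCodons', false);
```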

GeneticCodeValue

Integer, character vector, or string specifying a genetic code number or code name from the table Genetic Code. Default is 1 or 'Standard'.

Tip

If you use a code name, you can truncate it to its first two letters.

AlphabetValue

Character vector or string specifying a nucleotide alphabet. Choices are:

  • 'DNA' (default) — Uses the symbols A, C, G, and T.

  • 'RNA' — Uses the symbols A, C, G, and U.

Output Arguments

SeqNT

Nucleotide sequence specified by a character vector of letter codes.

Description

SeqNT = aa2nt(SeqAA) converts an amino acid sequence, specified by SeqAA, to a nucleotide sequence, returned in SeqNT, using the standard genetic code.

In general, the mapping from an amino acid to a nucleotide codon is not a one-to-one mapping. For amino acids with multiple possible nucleotide codons, this function randomly selects a codon corresponding to that particular amino acid. For the ambiguous characters B and Z, one of the amino acids corresponding to the letter is selected randomly, and then a codon sequence is selected randomly. For the ambiguous character X, a codon sequence is selected randomly from all possibilities.
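
A minimal sketch of this behavior follows. It assumes nt2aa from the same toolbox is available for translating back, with 'AlternativeStartCodons' set to false so that a leading TTG or CTG codon is not read as a start codon. The test sequence and variable names are made up for illustration.

```matlab
% L, S, and R each have six codons in the standard genetic code,
% so repeated conversions of this peptide are likely to differ.
seq = 'LSRLSR';
nt1 = aa2nt(seq);
nt2 = aa2nt(seq);
sameCodons = strcmp(nt1, nt2);   % may be true or false from run to run

% Whatever codons were picked, both results encode the same peptide.
aa1 = nt2aa(nt1, 'AlternativeStartCodons', false);
aa2 = nt2aa(nt2, 'AlternativeStartCodons', false);

% The ambiguous code B resolves to either D or N before a codon is chosen.
aaFromB = nt2aa(aa2nt('B'), 'AlternativeStartCodons', false);
```

Because the selection is random, compare translated sequences rather than the nucleotide sequences themselves when checking results.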

SeqNT = aa2nt(SeqAA, ...'PropertyName', PropertyValue, ...) calls aa2nt with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

SeqNT = aa2nt(SeqAA, ...'GeneticCode', GeneticCodeValue, ...) specifies a genetic code to use when converting an amino acid sequence to a nucleotide sequence. GeneticCodeValue can be an integer, character vector, or string specifying a code number or code name from the table Genetic Code. Default is 1 or 'Standard'. The amino acid to nucleotide codon mapping for the Standard genetic code is shown in the table Standard Genetic Code.

Tip

If you use a code name, you can truncate it to its first two letters.
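
As a sketch of the three ways to name a genetic code, the following calls select the Vertebrate Mitochondrial code by number, by full name, and by a two-letter prefix. It assumes nt2aa accepts the same 'GeneticCode' property for the back-translation. The peptide 'MW' is chosen because, under code 2, methionine and tryptophan each gain a second codon (ATA and TGA).

```matlab
seq = 'MW';

ntByNumber = aa2nt(seq, 'GeneticCode', 2);
ntByName   = aa2nt(seq, 'GeneticCode', 'Vertebrate Mitochondrial');
ntByPrefix = aa2nt(seq, 'GeneticCode', 've');   % name truncated to two letters

% Translate back under the same code; the codons picked may differ
% between calls, but each result should decode to the original peptide.
aaByNumber = nt2aa(ntByNumber, 'GeneticCode', 2, 'AlternativeStartCodons', false);
aaByName   = nt2aa(ntByName,   'GeneticCode', 2, 'AlternativeStartCodons', false);
aaByPrefix = nt2aa(ntByPrefix, 'GeneticCode', 2, 'AlternativeStartCodons', false);
```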

SeqNT = aa2nt(SeqAA, ...'Alphabet', AlphabetValue, ...) specifies a nucleotide alphabet. AlphabetValue can be 'DNA', which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. Default is 'DNA'.
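
The effect of the alphabet choice can be sketched as follows. The peptide 'FYC' is an arbitrary choice whose standard codons all contain T, which makes the difference easy to see. The sketch assumes the toolbox functions rna2dna and nt2aa are available for the back-translation.

```matlab
seq = 'FYC';   % every standard codon for F, Y, and C contains a T

dna = aa2nt(seq);                      % default 'DNA' alphabet
rna = aa2nt(seq, 'Alphabet', 'RNA');   % U replaces T

rnaHasT = any(rna == 'T');
rnaHasU = any(rna == 'U');
dnaHasU = any(dna == 'U');

% Convert back to DNA letters before translating, then compare peptides.
aaBack = nt2aa(rna2dna(rna), 'AlternativeStartCodons', false);
```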

Genetic Code

Code Number  Code Name
1            Standard
2            Vertebrate Mitochondrial
3            Yeast Mitochondrial
4            Mold, Protozoan, Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma
5            Invertebrate Mitochondrial
6            Ciliate, Dasycladacean, and Hexamita Nuclear
9            Echinoderm Mitochondrial
10           Euplotid Nuclear
11           Bacterial and Plant Plastid
12           Alternative Yeast Nuclear
13           Ascidian Mitochondrial
14           Flatworm Mitochondrial
15           Blepharisma Nuclear
16           Chlorophycean Mitochondrial
21           Trematode Mitochondrial
22           Scenedesmus Obliquus Mitochondrial
23           Thraustochytrium Mitochondrial

Standard Genetic Code

Amino Acid Name                                           Amino Acid Code  Nucleotide Codon
Alanine                                                   A                GCT GCC GCA GCG
Arginine                                                  R                CGT CGC CGA CGG AGA AGG
Asparagine                                                N                AAT AAC
Aspartic acid (Aspartate)                                 D                GAT GAC
Cysteine                                                  C                TGT TGC
Glutamine                                                 Q                CAA CAG
Glutamic acid (Glutamate)                                 E                GAA GAG
Glycine                                                   G                GGT GGC GGA GGG
Histidine                                                 H                CAT CAC
Isoleucine                                                I                ATT ATC ATA
Leucine                                                   L                TTA TTG CTT CTC CTA CTG
Lysine                                                    K                AAA AAG
Methionine                                                M                ATG
Phenylalanine                                             F                TTT TTC
Proline                                                   P                CCT CCC CCA CCG
Serine                                                    S                TCT TCC TCA TCG AGT AGC
Threonine                                                 T                ACT ACC ACA ACG
Tryptophan                                                W                TGG
Tyrosine                                                  Y                TAT TAC
Valine                                                    V                GTT GTC GTA GTG
Asparagine or Aspartic acid (Aspartate)                   B                Random codon from D and N
Glutamine or Glutamic acid (Glutamate)                    Z                Random codon from E and Q
Unknown amino acid (any amino acid)                       X                Random codon
Translation stop                                          *                TAA TAG TGA
Gap of indeterminate length                               -                ---
Unknown character (any character or symbol not in table)  ?                ???
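
A few rows of the table can be checked directly, since they leave no room for random choice. The sketch below uses only entries listed above: M and W have a single codon each, a gap maps to ---, and a stop maps to one of three codons. Variable names are illustrative.

```matlab
% M and W each have exactly one standard codon, so this result is fixed.
fixed = aa2nt('MW');        % 'ATGTGG'

% A gap character maps to a three-character gap.
gap = aa2nt('-');           % '---'

% A translation stop maps to one of the three stop codons, chosen randomly.
stopCodon = aa2nt('*');
isStop = any(strcmp(stopCodon, {'TAA', 'TAG', 'TGA'}));
```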

Examples


Create an amino acid sequence.

seq = randseq(20,'alphabet','amino')
seq = 
'TYNYMRQLVVDVVITNHYSV'

Convert it to a nucleotide sequence using the standard genetic code. Because codons are selected randomly, your output can differ from the one shown.

aa2nt(seq)
ans = 
'ACATATAACTACATGAGACAGCTTGTAGTTGACGTTGTCATTACTAACCACTATAGCGTT'

Convert it using the Vertebrate Mitochondrial genetic code.

aa2nt(seq,'GeneticCode',2)
ans = 
'ACCTATAACTACATACGCCAACTCGTAGTGGATGTAGTAATTACTAATCACTATTCGGTT'

Convert using the Echinoderm Mitochondrial genetic code and the RNA alphabet.

aa2nt(seq,'GeneticCode','ec','Alphabet','RNA')
ans = 
'ACGUAUAACUACAUGCGGCAGUUAGUUGUCGACGUCGUGAUUACGAACCAUUAUAGUGUC'

Introduced before R2006a