getembl

Retrieve sequence information from EMBL database

Syntax

EMBLData = getembl(AccessionNumber)
EMBLData = getembl(..., 'ToFile', ToFileValue, ...)
EMBLSeq = getembl(..., 'SequenceOnly', SequenceOnlyValue, ...)

Input Arguments

AccessionNumber Unique identifier for a sequence record. Enter a unique combination of letters and numbers.
ToFileValue Character vector specifying a file name or a path and file name to which to save the data. If you specify only a file name, the file is stored in the current folder.
SequenceOnlyValue Specifies whether to retrieve only the sequence, without the metadata. Choices are true or false (default).

Output Arguments

EMBLData MATLAB® structure with fields corresponding to EMBL data.
EMBLSeq MATLAB character vector representing the sequence.

Description

getembl retrieves information from the European Molecular Biology Laboratory (EMBL) database for nucleotide sequences. This database is maintained by the European Bioinformatics Institute (EBI). For more details about the EMBL database, see the EBI website (https://www.ebi.ac.uk/).

EMBLData = getembl(AccessionNumber) searches for the accession number in the EMBL database (https://www.ebi.ac.uk/) and returns EMBLData, a MATLAB structure with fields corresponding to the EMBL two-character line type codes. Each line type code is stored as a separate field in the structure.

EMBLData contains the following fields.

Field
Identification
Accession
SequenceVersion
DateCreated
DateUpdated
Description
Keyword
OrganismSpecies
OrganismClassification
Organelle
Reference
DatabaseCrossReference
Comments
Assembly
Feature
BaseCount
Sequence
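
The fields listed above can be inspected like those of any MATLAB structure. The following is a minimal sketch, assuming Internet access and that the record X00558 (the accession number used in the Examples section) is still available. The variable name emblData is illustrative only, and the exact contents of each field depend on the record.

```matlab
% Retrieve the record (requires Internet access and Bioinformatics Toolbox).
emblData = getembl('X00558');

% List the field names actually present in the returned structure.
disp(fieldnames(emblData))

% Access individual fields by name.
disp(emblData.Accession)
disp(emblData.Description)

% The Sequence field holds the nucleotide sequence as a character vector.
fprintf('Sequence length: %d\n', numel(emblData.Sequence));
```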

EMBLData = getembl(..., 'PropertyName', PropertyValue, ...) calls getembl with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

EMBLData = getembl(..., 'ToFile', ToFileValue, ...) saves the information to an EMBL-formatted file. ToFileValue is a character vector specifying a file name or a path and file name to which to save the data. If you specify only a file name, the file is stored in the current folder.

Tip

Read an EMBL-formatted file back into the MATLAB software using the emblread function.
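
As a rough illustration of the tip above, the following sketch saves a record with 'ToFile' and reads it back with emblread. The file name and the use of tempdir are assumptions made for this example, and deleting an existing file first is only a precaution, since the behavior of 'ToFile' when the file already exists is not described here.

```matlab
% Hypothetical output file in the system temporary folder.
outFile = fullfile(tempdir, 'rat_protein.txt');

% Precaution: remove a leftover copy from a previous run.
if exist(outFile, 'file') == 2
    delete(outFile);
end

% Retrieve the record and save an EMBL-formatted copy to disk.
emblData = getembl('X00558', 'ToFile', outFile);

% Read the saved file back into a MATLAB structure.
emblFromFile = emblread(outFile);

% Check whether the sequence survived the round trip unchanged.
sameSequence = isequal(emblData.Sequence, emblFromFile.Sequence)
```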

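EMBLSeq = getembl(..., 'SequenceOnly', SequenceOnlyValue, ...) specifies whether to retrieve only the sequence, without the metadata. Choices are true or false (default). When SequenceOnlyValue is true, getembl returns EMBLSeq, a character vector representing the sequence, instead of a structure.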
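
A short sketch of working with the sequence-only output follows. It assumes Internet access and uses only base MATLAB functions to tally the characters in the returned sequence. The variable names are illustrative.

```matlab
% Retrieve only the sequence as a character vector.
seq = getembl('X00558', 'SequenceOnly', true);

% Confirm the output type and report the sequence length.
fprintf('Is character vector: %d\n', ischar(seq));
fprintf('Sequence length: %d\n', numel(seq));

% Count how often each distinct character occurs in the sequence.
[bases, ~, idx] = unique(seq);
counts = accumarray(idx(:), 1);
for k = 1:numel(bases)
    fprintf('%s: %d\n', bases(k), counts(k));
end
```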

Examples

Retrieve data for the rat liver apolipoprotein A-I.

emblout = getembl('X00558')

Retrieve data for the rat liver apolipoprotein A-I and save it to the file rat_protein.txt in the folder c:\project. If you specify a file name without a path, the file is stored in the current folder instead.

emblout = getembl('X00558','ToFile','c:\project\rat_protein.txt')

Retrieve only the sequence for the rat liver apolipoprotein A-I.

Seq = getembl('X00558','SequenceOnly',true)

Introduced before R2006a