blastread

Read data from NCBI BLAST report file

Syntax

blastdata = blastread(blastreport)

Description

blastdata = blastread(blastreport) reads the NCBI BLAST report data from an XML-formatted file, blastreport, and returns blastdata, a structure containing the corresponding BLAST data.

Examples

Perform BLAST search

Perform a BLAST search on a protein sequence and save the results to an XML file.

Get a sequence from the Protein Data Bank and create a MATLAB structure.

S = getpdb('1CIV');

Use the structure as input for the BLAST search with a significance threshold of 1e-10. The first output is the request ID, and the second output is the estimated time (in minutes) until the search is completed.

[RID1,ROTE] = blastncbi(S,'blastp','expect',1e-10);

Get the search results from the report. You can save the XML-formatted report to a file for offline access. Use ROTE as the wait time to retrieve the results.

report1 = getblast(RID1,'WaitTime',ROTE,'ToFile','1CIV_report.xml')

Blast results are not available yet. Please wait ...

report1 = 

  struct with fields:

                RID: 'R49TJMCF014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]

Use blastread to read BLAST data from the XML-formatted BLAST report file.

blastdata = blastread('1CIV_report.xml')

blastdata = 

  struct with fields:

                RID: ''
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]

Alternatively, run the BLAST search with an NCBI accession number.

RID2 = blastncbi('AAA59174','blastp','expect',1e-10)

RID2 =

    'R49WAPMH014'

Get the search results from the report.

report2 = getblast(RID2)

Blast results are not available yet. Please wait ...

report2 = 

  struct with fields:

                RID: 'R49WAPMH014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'AAA59174.1'
    QueryDefinition: 'insulin receptor precursor [Homo sapiens]'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]

Input Arguments

`blastreport` — Name of BLAST report file
character vector | string

Name of an XML-formatted BLAST report file, specified as a character vector or string.

Example: 'blastreport.xml'
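
As a minimal sketch, you might check that the report file exists before reading it. The file name below is only a placeholder, and the warning identifier is made up for illustration. blastread is called only when the file is found.

```matlab
% Placeholder file name; replace with the path to your own XML report.
blastreport = 'blastreport.xml';

% exist returns 2 when the name refers to a file.
fileFound = exist(blastreport, 'file') == 2;

if fileFound
    blastdata = blastread(blastreport);
else
    % Hypothetical warning identifier, used here for illustration only.
    warning('example:blastread:missingFile', ...
        'Report file %s was not found.', blastreport);
end
```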

Output Arguments

`blastdata` — BLAST report data
structure

BLAST report data, returned as a structure that contains the following fields:

Field	Description
`RID`	Request ID for retrieving results from a specific NCBI BLAST search
`Algorithm`	NCBI algorithm used to perform the BLAST search
`Database`	All databases searched
`QueryID`	Identifier of the query sequence
`QueryDefinition`	Definition of the query sequence
`Hits`	Structure containing information on the hit sequences, such as IDs, accession numbers, lengths, and HSPs (high-scoring segment pairs)
`Parameters`	Structure containing information on the input parameters used to perform the search
`Statistics`	Summary of statistical details about the performed search, such as lambda, kappa, and entropy values
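
The following sketch shows one way to summarize these top-level fields. To stay self-contained, it builds a small placeholder structure with the documented field names. All values are hypothetical, not real BLAST output. With a real report, blastdata would come from blastread instead.

```matlab
% Placeholder structure that mimics the documented top-level fields.
% Every value here is made up; with a real report use:
%   blastdata = blastread('1CIV_report.xml');
blastdata = struct( ...
    'RID', '', ...
    'Algorithm', 'BLASTP 2.6.1+', ...
    'Database', 'nr', ...
    'QueryID', 'Query_1', ...
    'QueryDefinition', 'example query', ...
    'Hits', struct('ID', {'hit1', 'hit2'}), ...
    'Parameters', struct(), ...
    'Statistics', struct());

% Hits is a structure array with one element per hit sequence.
numHits = numel(blastdata.Hits);

% Build a one-line summary of the search.
summary = sprintf('%s search of %s for %s returned %d hits', ...
    blastdata.Algorithm, blastdata.Database, blastdata.QueryID, numHits);
disp(summary)
```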

More About

Hits

This table lists each field of blastdata.Hits.

Field	Description
`ID`	ID of the subject sequence that matched the query sequence
`Definition`	Description of the subject sequence
`Accession`	Accession of the subject sequence
`Length`	Length of the subject sequence
`Hsps`	Structure containing information on the high-scoring segment pairs (HSPs)
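
Because Hits is a structure array, its fields can be gathered across all hits with comma-separated list syntax. The sketch below uses a hypothetical two-element Hits array with made-up values. With real data, hits would be blastdata.Hits.

```matlab
% Hypothetical Hits array using the documented field names.
% With a real report use: hits = blastdata.Hits;
hsp = struct('Score', 100, 'Expect', 1e-20);
hits = struct( ...
    'ID', {'id|A', 'id|B'}, ...
    'Definition', {'protein A', 'protein B'}, ...
    'Accession', {'ACC_A', 'ACC_B'}, ...
    'Length', {120, 95}, ...
    'Hsps', {[hsp hsp], hsp});

% Collect one value per hit.
accessions = {hits.Accession};                      % cell array of char vectors
lengths = [hits.Length];                            % numeric row vector
numHsps = arrayfun(@(h) numel(h.Hsps), hits);       % HSP count for each hit

% Print a short per-hit listing.
for k = 1:numel(hits)
    fprintf('%s (%d residues): %d HSP(s)\n', ...
        accessions{k}, lengths(k), numHsps(k));
end
```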

Hits.Hsps

This table summarizes the fields of Hits.Hsps.

Field	Description
`Score`	Pairwise alignment score for a high-scoring segment pair between the query sequence and a subject sequence.
`BitScore`	Bit score for a high-scoring segment pair.
`Expect`	Expectation value for a high-scoring segment pair.
`Identities`	Number of identical residues for a high-scoring segment pair between the query sequence and a subject sequence.
`Positives`	Number of identical or similar residues for a high-scoring segment pair between the query sequence and a subject amino acid sequence. This field applies only to translated nucleotide or amino acid query sequences and databases.
`Gaps`	Number of nonaligned residues (gaps) for a high-scoring segment pair.
`AlignmentLength`	Length of the alignment for a high-scoring segment pair.
`QueryIndices`	Indices of the query sequence residue positions for a high-scoring segment pair.
`SubjectIndices`	Indices of the subject sequence residue positions for a high-scoring segment pair.
`Frame`	Reading frame of the translated nucleotide sequence for a high-scoring segment pair.
`Alignment`	3-by-N character array showing the alignment for a high-scoring segment pair between the query sequence and a subject sequence. The first row is the query sequence, the second row is the alignment, and the third row is the subject sequence.
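
As a rough illustration of working with these fields, the sketch below keeps only the HSPs whose expectation value is at or below a threshold and computes a percent identity for each. The HSP values and the threshold are hypothetical. Dividing Identities by AlignmentLength is one common convention for percent identity, assumed here rather than prescribed by blastread. With real data, hsps would be blastdata.Hits(k).Hsps for some hit k.

```matlab
% Hypothetical HSP array with a subset of the documented fields.
% With a real report use, for example: hsps = blastdata.Hits(1).Hsps;
hsps = struct( ...
    'Expect', {1e-30, 1e-5, 2e-12}, ...
    'Identities', {90, 30, 60}, ...
    'AlignmentLength', {100, 80, 75});

% Keep HSPs at or below an illustrative expectation threshold.
threshold = 1e-10;
keep = [hsps.Expect] <= threshold;
significant = hsps(keep);

% Percent identity, assumed here to be identities over alignment length.
pctIdentity = 100 * [significant.Identities] ./ [significant.AlignmentLength];
disp(pctIdentity)
```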
