Read data from NCBI BLAST report file
blastdata = blastread(blastreport)
blastdata = blastread(blastreport) reads the NCBI BLAST report data from an XML-formatted file, blastreport, and returns blastdata, a structure containing the corresponding BLAST data.
Perform a BLAST search on a protein sequence and save the results to an XML file.
Get a sequence from the Protein Data Bank and create a MATLAB structure.
S = getpdb('1CIV');
Use the structure as input for the BLAST search with a significance threshold of 1e-10. The first output is the request ID, and the second output is the estimated time (in minutes) until the search is completed.
[RID1,ROTE] = blastncbi(S,'blastp','expect',1e-10);
Get the search results from the report. You can save the XML-formatted report to a file for offline access. Use ROTE as the wait time to retrieve the results.
report1 = getblast(RID1,'WaitTime',ROTE,'ToFile','1CIV_report.xml')
Blast results are not available yet. Please wait ...
report1 = struct with fields:
                RID: 'R49TJMCF014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
Use blastread to read BLAST data from the XML-formatted BLAST report file.
blastdata = blastread('1CIV_report.xml')
blastdata = struct with fields:
                RID: ''
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
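
If the report file might not exist yet (for example, because the getblast call above has not finished or was run without 'ToFile'), it can help to check for the file before reading it. The following is a minimal sketch, not part of the original example: it assumes the file name used above and a MATLAB release that provides isfile (R2017b or later), and the summary message format is purely illustrative.

```matlab
% Sketch: read the saved report only if the file is present.
reportFile = '1CIV_report.xml';   % file name taken from the example above

if isfile(reportFile)
    blastdata = blastread(reportFile);
    % Print a one-line summary built from the documented top-level fields.
    fprintf('%s search of %s returned %d hits.\n', ...
        blastdata.Algorithm, blastdata.Database, numel(blastdata.Hits));
else
    % Illustrative message; adjust to your own workflow.
    warning('Report file %s not found. Run getblast with ''ToFile'' first.', ...
        reportFile);
end
```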
Alternatively, run the BLAST search with an NCBI accession number.
RID2 = blastncbi('AAA59174','blastp','expect',1e-10)
RID2 = 'R49WAPMH014'
Get the search results from the report.
report2 = getblast(RID2)
Blast results are not available yet. Please wait ...
report2 = struct with fields:
                RID: 'R49WAPMH014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'AAA59174.1'
    QueryDefinition: 'insulin receptor precursor [Homo sapiens]'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
blastreport: Name of BLAST report file
Name of an XML-formatted BLAST report file, specified as a character vector or string.
Example: 'blastreport.xml'
blastdata: BLAST report data
BLAST report data, returned as a structure that contains the following fields:
Field | Description |
---|---|
RID | Request ID for retrieving results from a specific NCBI BLAST search |
Algorithm | NCBI algorithm used to perform the BLAST search |
Database | All databases searched |
QueryID | Identifier of the query sequence |
QueryDefinition | Definition of the query sequence |
Hits | Structure containing information on the hit sequences, such as IDs, accession numbers, lengths, and HSPs (high-scoring segment pairs) |
Parameters | Structure containing information on the input parameters used to perform the search |
Statistics | Summary of statistical details about the performed search, such as lambda, kappa, and entropy values |
This table lists each field of blastdata.Hits.
Field | Description |
---|---|
ID | ID of the subject sequence that matched the query sequence |
Definition | Description of the subject sequence |
Accession | Accession of the subject sequence |
Length | Length of the subject sequence |
Hsps | Structure containing information on the high-scoring segment pairs (HSPs) |
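
Because Hits is a structure array, its fields can be gathered across all hits with comma-separated list syntax. The sketch below illustrates the pattern on a small hypothetical stand-in for blastdata.Hits. The IDs, accessions, and lengths are made up for illustration and do not come from a real report. With a real report, replace hits with blastdata.Hits.

```matlab
% Hypothetical stand-in for blastdata.Hits (values are invented).
hits = struct( ...
    'ID',         {'id_1', 'id_2'}, ...
    'Definition', {'example protein A', 'example protein B'}, ...
    'Accession',  {'ACC0001', 'ACC0002'}, ...
    'Length',     {120, 340}, ...
    'Hsps',       {struct('Expect', 1e-50), struct('Expect', 1e-20)});

% Collect one field from every hit.
accessions = {hits.Accession};   % cell array of character vectors
lengths    = [hits.Length];      % numeric row vector

% Combine the collected fields into a table for display.
summary = table(accessions', lengths', ...
    'VariableNames', {'Accession', 'Length'});
disp(summary)
```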
This table summarizes the fields of Hits.Hsps.
Field | Description |
---|---|
Score | Pairwise alignment score for a high-scoring segment pair between the query sequence and a subject sequence. |
BitScore | Bit score for a high-scoring segment pair. |
Expect | Expectation value for a high-scoring segment pair. |
Identities | Number of identical residues for a high-scoring segment pair between the query sequence and a subject sequence. |
Positives | Number of identical or similar residues for a high-scoring segment pair between the query sequence and a subject amino acid sequence. This field applies only to translated nucleotide or amino acid query sequences and databases. |
Gaps | Number of gaps (nonaligned residues) for a high-scoring segment pair. |
AlignmentLength | Length of the alignment for a high-scoring segment pair. |
QueryIndices | Indices of the query sequence residue positions for a high-scoring segment pair. |
SubjectIndices | Indices of the subject sequence residue positions for a high-scoring segment pair. |
Frame | Reading frame of the translated nucleotide sequence for a high-scoring segment pair. |
Alignment | 3-by-N character array showing the alignment for a high-scoring segment pair between the query sequence and a subject sequence. The first row is the query sequence, the second row is the alignment, and the third row is the subject sequence. |
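
The HSP fields can be combined to filter or rank alignments. The following sketch filters HSPs by expectation value and derives a percent identity as Identities divided by AlignmentLength. It uses a hypothetical structure array shaped like blastdata.Hits(k).Hsps with only three of the documented fields. The numeric values and the 1e-10 cutoff are assumptions chosen for illustration.

```matlab
% Hypothetical HSP data (values are invented); a real array would come
% from blastdata.Hits(k).Hsps and contain all documented fields.
hsps = struct( ...
    'Expect',          {1e-40, 1e-3}, ...
    'Identities',      {90, 20}, ...
    'AlignmentLength', {100, 80});

threshold = 1e-10;                        % illustrative E-value cutoff
keep = [hsps.Expect] <= threshold;        % logical mask of significant HSPs
significant = hsps(keep);                 % HSPs that pass the cutoff

% Percent identity for every HSP, computed element-wise.
pctIdentity = 100 * [hsps.Identities] ./ [hsps.AlignmentLength];

fprintf('%d of %d HSPs have Expect <= %g.\n', ...
    nnz(keep), numel(hsps), threshold);
```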