

You can view Nucleotide BLAST (blastn) search results as pairwise alignments. Further, you can opt to display the CDS feature on the alignment. In this article we describe pairwise alignments with CDS feature display. See the article on blastn and CDS feature setup. A pairwise alignment can help you determine properties or problems of a sequence.

Alignments on blastn search results pages (see an example in Figure 1) consist of two rows of nucleotide sequence. The top row represents your search sequence (Query). The bottom row represents a database sequence, called Subject (Sbjct). Lines connect the matching bases between Query and Subject. Dashes (-) indicate gaps in the alignment. Gaps represent parts where Query or Subject have no counterpart.

If you select the CDS feature option, you will see two more rows of sequence. BLAST translates the CDS annotated on the Subject into a protein. You will see the protein sequence below the Subject nucleotide sequence as a row of letters. These letters are single-letter amino acid (AA) codes. Each code sits in the middle of its nucleotide codon (coding triplet). BLAST finds the aligned region between the Subject CDS and Query. It translates the Query based on the CDS alignment. It shows the protein translation above the Query (top-most row of letters). BLAST uses the genetic code (translation table) from the Subject record, but it applies this code only to the Subject translation. For any Query translation it uses only the standard genetic code, which may not be appropriate for the query.

Note the numbering for each sequence row. You can use the numbers to count the base or amino acid positions. Query and Subject can represent the same strand of the double-stranded DNA. In that case, you see the Plus/Plus Strand statement above the alignment. If the strands align in opposite directions, BLAST makes the Query sequence the plus strand. You then see the Plus/Minus Strand statement.

Figure 1: A pairwise alignment with the CDS feature display of a 483 bp-long Query against the KT780704.1 (Subject) sequence with its total length of 64127 bp. Query alignment starts with its first base (T), which matches Subject location 22081; the end of Query alignment matches Subject location 22559 (yellow rectangles). CDS starts at Query location 69 (blue oval); the GTG (GUG) codon is an alternative protein initiation codon (M) for the Bacterial, Archaeal and Plant Plastid genetic code in Subject. BLAST applied the standard genetic code for Query, translating GTG into valine (V). An asterisk (*) marks the stop codon (TAA) at Query location 472 (red oval). Query and Subject have two bases mismatched (orange ovals). Subject and Query translate into a protein of 118 amino acid residues (blue rectangles mark protein start and end locations). Query has four extra bases that introduce a gap in Subject (purple oval). In total, six Query bases have no identity with Subject. The rest (477 out of 483) are identical (see the Identities and Gaps statements above the alignment). The Plus/Plus Strand statement indicates that Query and Subject represent the same strand of the double-stranded DNA.
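To make the Plus/Plus and Plus/Minus Strand statements concrete, here is a minimal Python sketch of the underlying idea: two sequences lie on opposite strands when one equals the reverse complement of the other. The helper names `reverse_complement` and `strand_statement` and the toy sequence are assumptions made for illustration. This is not how BLAST itself decides strandedness, since BLAST aligns sequences rather than testing exact equality.

```python
# Complement of each DNA base; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of `seq`, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_statement(query: str, subject_region: str) -> str:
    """Hypothetical helper: exact comparison only, no real alignment."""
    if query == subject_region:
        return "Plus/Plus"
    if reverse_complement(query) == subject_region:
        return "Plus/Minus"
    return "no exact match"


query = "ATGGTGCA"  # toy sequence, not taken from Figure 1
print(strand_statement(query, "ATGGTGCA"))                 # same strand
print(strand_statement(query, reverse_complement(query)))  # opposite strand
```

The first call reports Plus/Plus and the second Plus/Minus, matching the two situations described in the text.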
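The difference between the Subject and Query translations of the GTG start codon can be sketched in a few lines of Python. The codon table below is the standard genetic code. The `alt_starts` parameter is a hypothetical simplification that stands in for the initiation codons of the Subject's translation table (only GTG is listed here, because it is the one the article mentions). The toy CDS is invented and is not the sequence from Figure 1.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}


def translate(cds: str, alt_starts: frozenset = frozenset()) -> str:
    """Translate codon by codon and stop after the first stop codon (*).

    If the first codon is listed in `alt_starts`, it is reported as M,
    mimicking a translation table that allows alternative initiation codons.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TABLE.get(codon, "X")  # X for codons with unknown bases
        if i == 0 and codon in alt_starts:
            aa = "M"
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


toy_cds = "GTGAAATAA"  # invented example: GTG start, one lysine codon, TAA stop
print(translate(toy_cds))                                 # Query-style: VK*
print(translate(toy_cds, alt_starts=frozenset({"GTG"})))  # Subject-style: MK*
```

The same nucleotides give V under the standard code and M when GTG is treated as an initiation codon, which is the discrepancy visible at the CDS start in Figure 1.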
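The Identities and Gaps figures quoted for Figure 1 can be reproduced with simple counting. The sketch below assumes an alignment given as two equal-length rows with dashes for gaps. The function name `alignment_stats` and the toy rows are illustrative assumptions. The Figure 1 arithmetic at the end uses only numbers stated in the article and assumes the aligned length equals the 483 Query bases, as the "477 out of 483" statement implies.

```python
def alignment_stats(query_row: str, sbjct_row: str) -> dict:
    """Count identities, mismatches and gap columns in a pairwise alignment."""
    if len(query_row) != len(sbjct_row):
        raise ValueError("aligned rows must have the same length")
    identities = mismatches = gaps = 0
    for q, s in zip(query_row, sbjct_row):
        if q == "-" or s == "-":
            gaps += 1          # one row has no counterpart in this column
        elif q.upper() == s.upper():
            identities += 1
        else:
            mismatches += 1
    return {
        "length": len(query_row),
        "identities": identities,
        "mismatches": mismatches,
        "gaps": gaps,
    }


# Toy alignment (not Figure 1): two gap columns and one mismatch.
print(alignment_stats("ACGTTGCA", "ACG--GTA"))

# Figure 1 arithmetic, using only the numbers given in the article.
aligned_length = 483           # Query bases in the alignment
mismatched = 2                 # orange ovals
gap_columns = 4                # extra Query bases, gap in Subject
identical = aligned_length - mismatched - gap_columns
subject_span = 22559 - 22081 + 1
print(identical, round(100 * identical / aligned_length, 1))  # 477 98.8
print(subject_span == aligned_length - gap_columns)           # True
```

The Subject span of 479 bases plus the four gap columns accounts for all 483 Query bases, so the coordinates and the Identities and Gaps statements agree with each other.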
