Description:
This module performs a local alignment of each input primer against the target sequences and computes a weighted score from the number of mismatches and gaps. For each input fasta file, it generates a summary graph of mismatches and weighted scores, along with a hits file recording the primer mismatches, the hit position in each sequence, and other details about primer binding.
This module takes an input primer file and one or more fasta files. Each primer is tested against every sequence to find the best local alignment. Mismatches and gaps are counted for the primer, and a weighted score is calculated that gives larger penalties to gaps and mismatches at the 3’ end of the primer.
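The idea of testing a primer against a sequence can be illustrated with a much simpler stand-in for local alignment: an ungapped sliding-window search that counts mismatches at every position and keeps the best one. This is only a sketch, not the aligner the module uses. It ignores gaps entirely, and the function names are hypothetical. The table is the standard IUPAC nucleotide ambiguity code set, which covers degenerate primer bases such as R, Y, and N.

```python
# Illustrative sketch only: an ungapped sliding-window search standing in for
# true local alignment. Function names are hypothetical, not the module's API.

# Standard IUPAC nucleotide ambiguity codes (primer base -> bases it can pair as).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def bases_match(primer_base, target_base):
    """Return True if the (possibly degenerate) primer base covers the target base.

    Assumes the target base is an unambiguous A, C, G, or T; any other target
    character is treated as a mismatch.
    """
    return target_base in IUPAC[primer_base]


def best_ungapped_hit(primer, target):
    """Return (start_index, mismatches) for the window with the fewest mismatches.

    Returns None when the target is shorter than the primer. Ties are resolved
    in favor of the leftmost window.
    """
    best = None
    for start in range(len(target) - len(primer) + 1):
        window = target[start:start + len(primer)]
        mismatches = sum(not bases_match(p, t) for p, t in zip(primer, window))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best
```

For example, best_ungapped_hit("ACGR", "TTACGATT") finds the window "ACGA" at index 2 with no mismatches, because R covers both A and G.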
An output hits file is generated for each primer, recording information about the primer hit site, mismatches, and overall weighted score (a perfect score starts at zero and increases as penalties are added). A graph is also generated, showing mismatches/gaps and overall score information for the primer and the target sequences.
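The weighting scheme can be sketched as a function over an already aligned primer/target pair. This is a minimal illustration under stated assumptions, not the module's implementation. The 3’ region length of 5 and the 3’ mismatch penalty of 1.0 are the defaults stated in the usage examples of this document. The remaining penalty values, the function name, and the parameter names are placeholders chosen for illustration. The sketch also compares bases literally (no degenerate-base handling) and defines the 3’ region by alignment columns.

```python
# Illustrative sketch only. The 3' region length (5) and 3' mismatch penalty
# (1.0) follow the defaults mentioned in this document; every other value and
# all names here are placeholder assumptions.

def weighted_score(primer_aln, target_aln, three_prime_len=5,
                   three_prime_mismatch=1.0, other_mismatch=0.4,
                   three_prime_gap=3.0, other_gap=1.0):
    """Score an aligned primer/target pair; 0.0 is perfect, higher is worse.

    Both strings must have the same length and use '-' for gap positions.
    The primer is written 5' to 3', so its last columns form the 3' region.
    """
    if len(primer_aln) != len(target_aln):
        raise ValueError("aligned strings must have the same length")

    score = 0.0
    three_prime_start = len(primer_aln) - three_prime_len
    for i, (p, t) in enumerate(zip(primer_aln, target_aln)):
        in_three_prime = i >= three_prime_start
        if p == "-" or t == "-":
            # Gaps are penalized more heavily inside the 3' region.
            score += three_prime_gap if in_three_prime else other_gap
        elif p != t:
            # Literal comparison: degenerate bases are not expanded here.
            score += three_prime_mismatch if in_three_prime else other_mismatch
    return score
```

With these placeholder values, a single mismatch at the 5’ end of a 10-base primer adds 0.4, while the same mismatch at the last base adds 1.0, which reflects the heavier weighting of the 3’ end.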
The primers input file should use the following format. Comments are preceded by a pound (“#”) symbol. Primer data are tab-delimited: the primer name comes first (for example, “349_v2r”), followed by the nucleotide sequence written in the 5’ to 3’ direction (for example, “AATCGRACGNTYA”), and finally an optional comment or citation. Forward primer names should end with “f” and reverse primer names with “r”. A complete example line is listed below.
815_v34f GTGGCCNATRRCYAGAACGC Darrow,Scopes,Bryan et al. 1926
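A primers file in this format can be read with a few lines of Python. The sketch below is not the module's own parser. The function name and the choice to raise ValueError on malformed lines are assumptions made for illustration.

```python
# Illustrative sketch only: a hypothetical reader for the primers file format
# described above (tab-delimited, '#' comments, names ending in 'f' or 'r').

def parse_primers(lines):
    """Return a list of (name, sequence, comment) tuples from primer file lines."""
    primers = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError("expected tab-separated name and sequence: %r" % line)
        name = fields[0].strip()
        sequence = fields[1].strip().upper()
        # The third column (comment or citation) is optional.
        comment = fields[2].strip() if len(fields) > 2 else ""
        if not name.endswith(("f", "r")):
            raise ValueError("primer name must end with 'f' or 'r': %r" % name)
        primers.append((name, sequence, comment))
    return primers
```

In practice the lines would come from an open file handle, for example parse_primers(open("primers.txt")).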
The input sequences should be in fasta format. If more than one file is supplied, they should be separated by a colon. Each fasta file passed will have its sequence coverage displayed in separate output graphics and hits files.
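The colon-separated file list and the fasta format itself are both simple to handle. The following sketch shows one way to split the argument and read records. The function names are hypothetical and this is not the module's own code. Splitting on colons also means that paths containing a colon (such as Windows drive letters) would be broken apart.

```python
# Illustrative sketch only: hypothetical helpers for the colon-separated fasta
# argument and for reading fasta records.

def split_fasta_paths(arg):
    """Split a colon-separated list of fasta filepaths, dropping empty entries."""
    return [path for path in arg.split(":") if path]


def read_fasta(lines):
    """Return a list of (label, sequence) tuples from fasta-formatted lines.

    Sequence lines that wrap across several lines are joined together.
    """
    records = []
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if label is not None:
                records.append((label, "".join(chunks)))
            label, chunks = line[1:], []
        elif label is not None:
            chunks.append(line)
    if label is not None:
        records.append((label, "".join(chunks)))
    return records
```

Each path returned by split_fasta_paths would then be opened and read separately, matching the per-file graphics and hits files described above.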
Usage: analyze_primers.py [options]
Input Arguments:
Note
[REQUIRED]
[OPTIONAL]
Output:
For each primer tested, an output _hits.txt file containing information about the primer hit to each sequence is generated, as well as a .ps file showing overviews of the mismatches and weighted scores for the primer and target sequences. Both output files are named after the primer and the fasta file tested.
Example:
Standard Example:
analyze_primers.py [options] {-P input_primers_filepath [required] -f input_fasta_filepath [required]}
Manually specify a primer name and sequence:
Note - primer name must end with ‘f’ or ‘r’:
analyze_primers.py -p "primer_name_f" -s "ACCTGACRGGTAATC" -f input_fasta_filepath
Use multiple target files, change scoring parameters:
Pass a primers file, two target fasta files, change the size of the 3’ region from the default 5 bases to 7 bases, and lower the 3’ mismatch penalty from the default 1.0 to 0.6:
analyze_primers.py -P primers.txt -f bacterial_seqs.fasta:eukaryotic_seqs.fasta -e 7 -t 0.6