generate_primers_denovo.py – Find conserved and/or specific DNA sequences for use as the 3’ end of a primer¶

Description:

The generate_primers_denovo module is designed to take an input sequence length (default 5) and aligned fasta file(s) to search for all Xmers of the given sequence length that are conserved in the target fasta files. Optionally, any Xmers that are found to exist above a certain threshold (1% is the default) in the excluded fasta sequences are discarded. The remaining Xmers, along with their upstream and downstream sequences are written to an output file.

Usage: generate_primers_denovo.py [options]

Input Arguments:

Note

[REQUIRED]

-i, --target_seqs: Target aligned fasta sequence files to find conserved sites for primer design. Separate multiple files with a colon.
-o, --output_filepath: Name of output filepath to write details about conserved sequence sites.

[OPTIONAL]

-e, --exclude_fasta: Excluded aligned fasta file(s). To pass multiple files, separate each file with a colon. Example: -e test1.fasta:test2.fasta. If not specified, will skip exclusion step [default: None]
-p, --percent_match: Percentage of sequence matches to primer that must match in order to retain prospective sequence in dictionary. [default: 0.6]
-s, --full_primer_length: Overall primer length to retrieve from sequences. [default: 20]
-x, --xmer_length: Xmer length to search for in target fasta sequence(s). [default: 5]
-S, --specificity_threshold: Sets specificity threshold for excluded fasta sequences. [default: 0.01]
-l, --log_file: Log filepath. If not specified, no log file will be written. [default: None]
-a, --standard_index_file: Aligned sequence file with which to assign prospective primer indices to. The alignment where a conserved sequence is found will be used to determine the unaligned index in the supplied file (for instance an E. coli sequence) and will be recorded in the output file for the purpose of giving a meaningful name to prospective primers. Only the first sequence in the file will be used for determining an index [default: None]
-r, --search_range: Range of nucleotides in the supplied aligned target sequences to search for primers. Supply the starting index and end index separated by a colon. Example -r 1500:2700 Enable this option to generate primers that target certain regions. [default: None]

Output:

The output file from generate_primers_denovo is a text file containing information about the target Xmers that met the sensitivity (and specificity if given) threshold(s). For each Xmer, the calculated sensitivity and specificity values are given, as well as the upstream and downstream sequences from the Xmer for all of the sequences with perfect matches.

Standard Example usage:

generate_primers_denovo.py [options] {-i include_fasta_filepath(s) -o output_primers_filepath}

Look for common 5mers in a given aligned fasta file, bact_sample.fasta, that are conserved at least 60% (default setting) of the time. Output to primers.txt:

generate_primers_denovo.py -i bact_sample.fasta -o primers.txt

Look for common 6mers in two input aligned fasta files (bact_sample.fasta, arch_sample.fasta), that are not found in the aligned fasta file euk_sample.fasta:

generate_primers_denovo.py -i bact_sample:arch_sample.fasta -x 6 -e euk_sample.fasta -o primers.txt

Navigation

generate_primers_denovo.py – Find conserved and/or specific DNA sequences for use as the 3’ end of a primer¶

Site index

Quick search

Navigation