c53b1c7626e433d0511d2d430ef6f72e4eb715af
max
Thu Jun 26 03:07:43 2025 -0700
adding demo version of exon primers track, refs #35830
diff --git src/hg/makeDb/trackDb/human/exonprimer.html src/hg/makeDb/trackDb/human/exonprimer.html
new file mode 100644
index 00000000000..4f91c483651
--- /dev/null
+++ src/hg/makeDb/trackDb/human/exonprimer.html
@@ -0,0 +1,265 @@
+
Description
+
+The Exon PCR Primers track displays computationally designed primer pairs for PCR amplification of individual exons across all protein-coding genes. Each track item represents a complete PCR reaction designed to amplify a single exon, with primer locations highlighted as blocks within the amplicon span. This track is designed to facilitate exon-specific PCR experiments, including mutation screening, expression analysis, and targeted sequencing applications.
+
+Display Conventions
+
+
+ - Track Items: Each item represents one PCR amplicon spanning from the forward primer to the reverse primer
+ - Color: All PCR products are displayed in green (RGB: 0,128,0)
+ - Blocks: Two thick blocks within each item indicate the exact locations of the forward and reverse primers
+ - Strand: No strand indicators are shown (strand = ".")
+ - Score: Represents the average melting temperature (Tm) of both primers × 10
+
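The score convention above can be illustrated with a short sketch. The function name, the rounding step, and the clamp to 0-1000 are assumptions made for illustration; the track description only states that the score is the average Tm of both primers multiplied by 10.

```python
def bed_score(forward_tm: float, reverse_tm: float) -> int:
    """Average the two primer Tm values (in °C) and scale by 10."""
    mean_tm = (forward_tm + reverse_tm) / 2
    score = round(mean_tm * 10)  # rounding to an integer is an assumption
    # BED scores must lie in 0-1000; clamping here is an assumption too.
    return max(0, min(1000, score))


# Tm values taken from the metadata examples on this page.
print(bed_score(59.8, 60.2))  # prints 600
```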
+
+PCR products are named using the format: {transcript_id}_exon{number}_PCR
+Example: NM_001001130_exon3_PCR
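Because transcript identifiers such as NM_001001130 contain underscores themselves, item names are easier to split from the right-hand side. A minimal sketch, with hypothetical helper names, assuming every name follows the format above exactly:

```python
import re

# Anchor on the "_exon<number>_PCR" suffix so underscores in the
# transcript identifier are kept intact.
NAME_RE = re.compile(r"^(?P<transcript>.+)_exon(?P<exon>\d+)_PCR$")


def make_name(transcript_id: str, exon_number: int) -> str:
    """Build an item name in the {transcript_id}_exon{number}_PCR format."""
    return f"{transcript_id}_exon{exon_number}_PCR"


def parse_name(name: str):
    """Return (transcript_id, exon_number) or raise ValueError."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an exon primer item name: {name!r}")
    return match.group("transcript"), int(match.group("exon"))


print(parse_name("NM_001001130_exon3_PCR"))  # ('NM_001001130', 3)
```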
+
+Clicking a track item displays the following metadata for that PCR product:
+
+
+
+
+ Field | Description | Example
+ forwardSeq | DNA sequence of the forward primer (5' → 3') | ATGCGATCGTAGCATGC
+ forwardTm | Melting temperature of forward primer (°C) | 59.8
+ forwardGc | GC content of forward primer (%) | 52.4
+ reverseSeq | DNA sequence of the reverse primer (5' → 3') | GCATGCTACGATCGCAT
+ reverseTm | Melting temperature of reverse primer (°C) | 60.2
+ reverseGc | GC content of reverse primer (%) | 58.8
+ transcript | RefSeq transcript identifier | NM_001001130
+ geneSymbol | HGNC gene symbol | GAPDH
+ exonNum | Exon number within the transcript | 3
+ productSize | Expected PCR amplicon size (bp) | 185
+
+
+
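When the metadata fields are read from a text export, they can be converted into a typed record. The sketch below assumes the ten extra columns appear in the order listed in the table; that order and the class and function names are illustrative assumptions, not taken from the track's schema.

```python
from dataclasses import dataclass, fields


@dataclass
class PrimerMetadata:
    # Field names follow the table above; the column order is an assumption.
    forwardSeq: str
    forwardTm: float
    forwardGc: float
    reverseSeq: str
    reverseTm: float
    reverseGc: float
    transcript: str
    geneSymbol: str
    exonNum: int
    productSize: int


def parse_extra_columns(columns):
    """Convert ten string columns into a typed PrimerMetadata record."""
    specs = fields(PrimerMetadata)
    if len(columns) != len(specs):
        raise ValueError(f"expected {len(specs)} columns, got {len(columns)}")
    # Each dataclass field carries its declared type (str, float or int).
    return PrimerMetadata(*(spec.type(value) for spec, value in zip(specs, columns)))


# Example values copied from the table above.
row = ["ATGCGATCGTAGCATGC", "59.8", "52.4", "GCATGCTACGATCGCAT",
       "60.2", "58.8", "NM_001001130", "GAPDH", "3", "185"]
record = parse_extra_columns(row)
print(record.geneSymbol, record.exonNum, record.productSize)
```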
+
+ Applications
+
+ Research Applications
+
+ - Mutation Screening: PCR amplification of exons for Sanger sequencing or variant detection
+ - Expression Analysis: RT-PCR analysis of exon inclusion/exclusion patterns
+ - Targeted Sequencing: Primer design for custom amplicon sequencing panels
+ - Cloning: Exon-specific amplification for molecular cloning applications
+ - Functional Studies: Generation of exon-specific constructs for functional analysis
+
+
+ Clinical Applications
+
+ - Diagnostic PCR: Disease gene mutation screening in clinical samples
+ - Pharmacogenomics: Analysis of drug metabolism gene variants
+ - Genetic Testing: Targeted analysis of known pathogenic variants
+
+
+
+
Important Considerations
+
+ - Validation Required: All primers should be experimentally validated before use
+ - Specificity: Check primer specificity using BLAST or similar tools
+ - Polymorphisms: Consider known SNPs/variants that may affect primer binding
+ - Splice Variants: Primers may not amplify all transcript isoforms
+ - Pseudogenes: Be aware of potential cross-amplification with pseudogenes
+
+
+
+ Data Access
+
+ Table Browser
+ The complete dataset can be accessed through the UCSC Table Browser:
+
+ - Navigate to the Table Browser
+ - Select "Exon PCR Primers" from the track dropdown
+ - Choose desired output format (BED, GTF, or custom)
+ - Apply region or gene-based filters as needed
+
+
+ API Access
+ Programmatic access is available through the UCSC REST API:
+ https://api.genome.ucsc.edu/getData/track?genome=hg38;track=exonPrimers;chrom=chr1;start=1000000;end=2000000
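The request above can also be assembled and fetched from a script. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, the track name exonPrimers is taken from the example URL, and the shape of the JSON response is not described on this page, so the result is returned unparsed beyond JSON decoding.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"


def track_url(genome: str, track: str, chrom: str, start: int, end: int) -> str:
    """Build a getData/track URL with semicolon-separated parameters, as in the example above."""
    params = [("genome", genome), ("track", track), ("chrom", chrom),
              ("start", start), ("end", end)]
    query = ";".join(f"{key}={value}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


def fetch_track(genome: str, track: str, chrom: str, start: int, end: int, timeout: float = 30):
    """Download one region and return the decoded JSON document."""
    with urlopen(track_url(genome, track, chrom, start, end), timeout=timeout) as response:
        return json.load(response)


print(track_url("hg38", "exonPrimers", "chr1", 1000000, 2000000))
```

Calling `fetch_track(...)` requires network access and a server that actually hosts the track.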
+
+ Methods
+
+ Primers are designed using Primer3 with the following default parameters:
+
+
+
+
+ Parameter | Default Value | Description
+ Primer Length | 18-25 bp (optimal: 20 bp) | Length range for primer oligonucleotides
+ Melting Temperature | 57-63°C (optimal: 60°C) | Target Tm for primer annealing
+ Product Size | 100-300 bp | Expected PCR amplicon length
+ Flanking Distance | 500 bp | Sequence context around each exon
+ Max Self-Complementarity | 8 bp | Maximum self-annealing allowed
+ Max 3' Self-Complementarity | 3 bp | Maximum 3' end self-annealing
+ Max Pair Complementarity | 8 bp | Maximum primer-dimer formation
+
+
+
+
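The defaults in the table map onto Primer3 input tags roughly as sketched below. The mapping of each table row to a specific tag is an assumption made for illustration, since the page does not list the exact tags passed to Primer3. The flanking distance is not a Primer3 setting; it only determines how much template sequence is supplied.

```python
# Assumed mapping of the table defaults to Primer3 Boulder-IO tags.
PRIMER3_DEFAULTS = {
    "PRIMER_MIN_SIZE": 18,
    "PRIMER_OPT_SIZE": 20,
    "PRIMER_MAX_SIZE": 25,
    "PRIMER_MIN_TM": 57.0,
    "PRIMER_OPT_TM": 60.0,
    "PRIMER_MAX_TM": 63.0,
    "PRIMER_PRODUCT_SIZE_RANGE": "100-300",
    "PRIMER_MAX_SELF_ANY": 8,
    "PRIMER_MAX_SELF_END": 3,
    "PRIMER_PAIR_MAX_COMPL_ANY": 8,
}


def boulder_record(sequence_id: str, template: str, settings=PRIMER3_DEFAULTS) -> str:
    """Format one Boulder-IO record: TAG=VALUE lines ending with a lone '='."""
    lines = [f"SEQUENCE_ID={sequence_id}", f"SEQUENCE_TEMPLATE={template}"]
    lines += [f"{tag}={value}" for tag, value in settings.items()]
    lines.append("=")  # record terminator
    return "\n".join(lines) + "\n"


# The template here is a placeholder, far too short for real primer design.
print(boulder_record("NM_001001130_exon3", "ACGTACGTACGT"))
```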
+
+
Computational Pipeline
+
The primer design pipeline consists of the following steps:
+
+
1. Input Processing
+
+ - UCSC genePred format gene annotations are parsed to extract exon coordinates
+ - Transcript information, including gene symbols, is retained
+ - Only protein-coding transcripts are processed
+
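The input step can be sketched as follows for a single genePred line. The function name is hypothetical, and several details are assumptions rather than documented behavior of the pipeline: the line has no leading bin column, the gene symbol is read from the name2 column of the extended format, exons are numbered in transcript order, and a transcript is treated as protein-coding when cdsStart is smaller than cdsEnd.

```python
def parse_genepred_line(line: str) -> dict:
    """Parse one tab-separated genePred or genePredExt line (no bin column)."""
    cols = line.rstrip("\n").split("\t")
    strand = cols[2]
    cds_start, cds_end = int(cols[5]), int(cols[6])
    exon_count = int(cols[7])
    # exonStarts and exonEnds are comma-separated lists with a trailing comma.
    starts = [int(x) for x in cols[8].rstrip(",").split(",")]
    ends = [int(x) for x in cols[9].rstrip(",").split(",")]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError("exonCount does not match exonStarts/exonEnds")
    exons = list(zip(starts, ends))
    if strand == "-":
        exons.reverse()  # assumption: exon 1 is the 5'-most exon of the transcript
    return {
        "transcript": cols[0],
        "chrom": cols[1],
        "strand": strand,
        "gene_symbol": cols[11] if len(cols) > 11 else None,
        "is_coding": cds_start < cds_end,  # assumed protein-coding heuristic
        "exons": [(i + 1, s, e) for i, (s, e) in enumerate(exons)],
    }


# Made-up example line, not a real annotation.
example = "NM_TEST\tchr1\t-\t100\t500\t150\t450\t2\t100,300,\t200,500,\t0\tGENE1\tcmpl\tcmpl\t0,0,"
print(parse_genepred_line(example)["exons"])  # [(1, 300, 500), (2, 100, 200)]
```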
+
+
2. Sequence Extraction
+
+ - Genomic sequences are retrieved from 2bit genome files
+ - Flanking sequences (default 500 bp) are added to each exon
+ - Repetitive sequences are masked using standard genome masks
+
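Adding flanking sequence amounts to widening each exon interval and clamping it to the chromosome. A minimal sketch with a hypothetical function name, assuming 0-based, half-open coordinates:

```python
def flank_window(exon_start: int, exon_end: int, chrom_size: int, flank: int = 500):
    """Return the (start, end) of the exon plus flanks, clamped to the chromosome."""
    start = max(0, exon_start - flank)
    end = min(chrom_size, exon_end + flank)
    return start, end


# With twobitreader (listed under the software used), the sequence slice would
# look roughly like this; it is left as a comment because it needs a local 2bit file:
#   seq = twobitreader.TwoBitFile("hg38.2bit")["chr1"][start:end]

print(flank_window(1000, 1200, 10_000))  # (500, 1700)
```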
+
+
3. Primer Design
+
+ - Primer3 is executed in batch mode for computational efficiency
+ - Target regions are set to exon boundaries within the flanking sequence
+ - Primer pairs are optimized for uniform melting temperatures and minimal secondary structure
+
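Setting the target to the exon boundaries means expressing the exon in coordinates relative to the extracted template. The sketch below assumes Primer3's SEQUENCE_TARGET tag with 0-based positions (the primer3_core default); the function name is hypothetical and the pipeline's actual tag usage is not documented on this page.

```python
def sequence_target(exon_start: int, exon_end: int, window_start: int) -> str:
    """Format the exon as a Primer3 SEQUENCE_TARGET=<start>,<length> line."""
    offset = exon_start - window_start  # exon start relative to the template
    length = exon_end - exon_start      # half-open interval, so no +1
    return f"SEQUENCE_TARGET={offset},{length}"


# Exon chr:1000-1200 inside a template that starts at genomic position 500.
print(sequence_target(1000, 1200, 500))  # SEQUENCE_TARGET=500,200
```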
+
+
4. Quality Control
+
+ - Primers failing quality criteria are excluded
+ - Genomic coordinates are calculated and validated
+ - Primer specificity is assessed computationally
+
+
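Calculating and validating genomic coordinates can be sketched as below. Primer3 reports each primer as a position and a length. For the right primer, the position is its 5' base, which is the rightmost base on the template. The function name and the specific checks are illustrative assumptions.

```python
def primer_intervals(window_start: int, left, right):
    """Map Primer3 (position, length) pairs to 0-based, half-open genomic intervals.

    `left` and `right` correspond to PRIMER_LEFT_0 and PRIMER_RIGHT_0,
    with 0-based positions relative to the template.
    """
    left_pos, left_len = left
    right_pos, right_len = right
    fwd = (window_start + left_pos, window_start + left_pos + left_len)
    # The right primer position is its rightmost base, so the interval
    # extends leftwards from there.
    rev_end = window_start + right_pos + 1
    rev = (rev_end - right_len, rev_end)
    if not (fwd[0] < fwd[1] <= rev[0] < rev[1]):
        raise ValueError("primers overlap or are out of order")
    return fwd, rev


fwd, rev = primer_intervals(500, (450, 20), (734, 20))
print(fwd, rev, "product size:", rev[1] - fwd[0])
```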
+
5. Output Generation
+
+ - Results are formatted as BED12 files with metadata
+ - BigBed files are generated for genome browser display
+ - The ten metadata fields are stored in the extra columns of the bigBed 12+10 file
+
+
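One amplicon with its two primer blocks could be written as a BED12 line as sketched below. The function name is hypothetical, and setting thickStart and thickEnd to the full item span (so that both blocks draw thick) is an assumption based on the display conventions above.

```python
def bed12_line(chrom: str, name: str, fwd, rev, score: int, extra=()) -> str:
    """Format an amplicon as BED12 with the primers as its two blocks.

    `fwd` and `rev` are 0-based, half-open genomic intervals of the primers.
    """
    chrom_start, chrom_end = fwd[0], rev[1]
    block_sizes = [fwd[1] - fwd[0], rev[1] - rev[0]]
    block_starts = [0, rev[0] - chrom_start]  # relative to chromStart
    columns = [
        chrom, chrom_start, chrom_end, name, score, ".",
        chrom_start, chrom_end,  # thickStart, thickEnd (assumed: whole item)
        "0,128,0",               # itemRgb, green
        2,                       # blockCount
        ",".join(map(str, block_sizes)) + ",",
        ",".join(map(str, block_starts)) + ",",
    ]
    columns.extend(extra)  # the ten metadata columns would follow here
    return "\t".join(str(col) for col in columns)


line = bed12_line("chr1", "NM_001001130_exon3_PCR", (950, 970), (1215, 1235), 600)
print(line)
```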
+
+ Limitations
+
+
+ - Computational Design: Primers are designed computationally and may require experimental optimization
+ - Single Isoform: Primers target the canonical transcript and may not amplify all splice variants
+ - Repetitive Regions: Exons in highly repetitive regions may lack suitable primers
+ - Polymorphisms: Common genetic variants may affect primer efficiency
+ - Species Specificity: Primers are designed for human sequences only
+
+
+ Data Sources and Updates
+
+
+ - Gene Annotations: UCSC Gencode V48
+ - Genome Sequence: Reference genome assemblies (hg19, hg38)
+ - Update Frequency: Updated with each major Gencode annotation release
+ - Quality Assurance: Automated validation against current genome builds
+
+
+ Technical Details
+
+ File Formats
+
+ - Track Format: BigBed 12+10
+ - Coordinate System: 0-based, half-open intervals
+ - Strand Convention: No strand displayed (strand = ".")
+ - Color Encoding: RGB values embedded in itemRgb field
+
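Because the track stores 0-based, half-open intervals, coordinates need a small adjustment before they are pasted into the browser position box, which uses 1-based, fully closed positions. A minimal sketch with hypothetical helper names:

```python
def to_browser_position(chrom: str, start: int, end: int) -> str:
    """Convert a 0-based, half-open interval to a 1-based position string."""
    return f"{chrom}:{start + 1}-{end}"


def interval_length(start: int, end: int) -> int:
    """Length of a half-open interval; no +1 is needed."""
    return end - start


print(to_browser_position("chr1", 950, 1235))  # chr1:951-1235
print(interval_length(950, 1235))              # 285
```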
+
+ Software Used
+
+ - Primer3: Primer design and optimization
+ - UCSC Tools: Genome sequence access and file format conversion
+ - Python Libraries: twobitreader for genome access
+
+
+
+
References
+
Primer3 Software:
+ Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG (2012)
+ Primer3--new capabilities and interfaces. Nucleic Acids Research 40(15):e115.
+ doi: 10.1093/nar/gks596
+
+
ExonPrimer:
+ The ExonPrimer website, written in the early 2000s by Tim Strom, inspired this track. The
+ tool was formerly available at http://ihg.gsf.de/ihg/ExonPrimer.html, but the server appears to be offline.
+
+