83fd535ffd08aac398121175c71d9b9554c0eee1 max Thu Dec 4 10:51:30 2025 -0800
adding a few example motifs to short match track, no redmine

diff --git src/hg/htdocs/goldenPath/help/oligoMatch.html src/hg/htdocs/goldenPath/help/oligoMatch.html
index a65d71f6802..be585b2a5c6 100644
--- src/hg/htdocs/goldenPath/help/oligoMatch.html
+++ src/hg/htdocs/goldenPath/help/oligoMatch.html
@@ -1,24 +1,147 @@
 <h2>Description</h2>
 <p>
 This track shows all occurrences of a selected short motif within the displayed position range
 of the assembly sequence. It is useful for finding oligonucleotides, restriction sites, or other
 recurring short sequences within the assembly. In full display mode, each motif occurrence is
 labeled by the strand on which the match is located, followed by the starting coordinate of the
 match. In cases where the input motif sequence is identical to its reverse complement, only the
 match on the "+" strand is shown.</p>
 <p>
 The track may be configured to search for any short sequence of 2 - 30 bases in length.
 Sequences may include
 <a href="/goldenPath/help/iupac.html" target="_blank">IUPAC ambiguity codes</a>.
 To change the motif, open the track's description page (by clicking the track control label or
 the mini-button to the left of the track), then type a new sequence into the text box.</p>
 <p>
 To see how to create a bed file of the short match data see this mailing list question
 <a href="https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/zlvaV8Wzeaw/xPwdY7mQ0f8J"
 target="_blank">here</a>.</p>
+<h2>Example motifs</h2>
+<table class="stdTbl">
+  <tr>
+    <th>Category</th>
+    <th>Motif (Consensus)</th>
+    <th>Name / Function</th>
+    <th>Notes</th>
+  </tr>
+
+  <tr>
+    <td>Transcription initiation</td>
+    <td>TATAWAAR</td>
+    <td>TATA box</td>
+    <td>Classic Pol II promoter element (~30 bp upstream of TSS).</td>
+  </tr>
+
+  <tr>
+    <td>Transcription initiation</td>
+    <td>CCATNTT</td>
+    <td>YY1 binding motif</td>
+    <td>Common promoter-associated TF motif.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription initiation</td>
+    <td>YYANWYY</td>
+    <td>Initiator (Inr)</td>
+    <td>Anchors transcription start when no TATA box is present.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription termination</td>
+    <td>AATAAA</td>
+    <td>Polyadenylation signal (PAS)</td>
+    <td>Main poly(A) signal; variants include ATTAAA, TATAAA.</td>
+  </tr>
+
+  <tr>
+    <td>RNA modification</td>
+    <td>DRACH</td>
+    <td>m⁶A methylation motif</td>
+    <td>Core motif for METTL3/METTL14 deposition.</td>
+  </tr>
+
+  <tr>
+    <td>Splice donor</td>
+    <td>MAGGTRAGT</td>
+    <td>5′ splice site (donor)</td>
+    <td>Exon–intron boundary is after MAG.</td>
+  </tr>
+
+  <tr>
+    <td>Splice acceptor</td>
+    <td>YYYYYYYYYNCAGG</td>
+    <td>3′ splice site (acceptor)</td>
+    <td>Long pyrimidine tract + invariant AG; the splice site is before the last G.</td>
+  </tr>
+
+  <tr>
+    <td>Branch point</td>
+    <td>YNYURAY</td>
+    <td>Splicing branch point</td>
+    <td>Located upstream of the 3′ splice site; weak but conserved.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription factor</td>
+    <td>GATA</td>
+    <td>GATA family motif</td>
+    <td>Recognized by GATA1/2/3/4.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription factor</td>
+    <td>CACGTG</td>
+    <td>E-box</td>
+    <td>Recognized by MYC/MAX and USF families.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription factor</td>
+    <td>TGASTCA</td>
+    <td>AP-1 motif (Jun/Fos)</td>
+    <td>Key stress-response motif; S = G/C.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription factor</td>
+    <td>CCGCCC</td>
+    <td>SP1 motif</td>
+    <td>Classic GC-rich promoter element.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription factor</td>
+    <td>GCGTG</td>
+    <td>HIF1A/HRE variant</td>
+    <td>Hypoxia response element; canonical form is RCGTG.</td>
+  </tr>
+
+  <tr>
+    <td>Transcription factor</td>
+    <td>GATTA</td>
+    <td>Homeobox (HOX) core</td>
+    <td>Generic homeodomain preference; flanking bases refine specificity.</td>
+  </tr>
+
+  <tr>
+    <td>RNA editing</td>
+    <td>WAR (local context)</td>
+    <td>ADAR A-to-I editing preference</td>
+    <td>Less strict motif; enriched in dsRNA structures.</td>
+  </tr>
+
+  <tr>
+    <td>Replication origin</td>
+    <td>WAWTTDDWW</td>
+    <td>ORC-associated origin motif</td>
+    <td>Weak consensus; human origins have low sequence specificity.</td>
+  </tr>
+</table>
+
+
 
 <h2>Credits</h2>
 <p>
 This track was generated by
 <a href="mailto:kent@soe.ucsc.edu">Jim Kent</a> of the UCSC Genome
 Bioinformatics Group.</p>
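
For reviewers who want to sanity-check the example motifs outside the browser, the matching behaviour described in the help text (IUPAC ambiguity codes, hits on both strands, and only the "+" hit reported when a motif equals its reverse complement) can be approximated with a short Python sketch. This is an illustration only, not the track's actual implementation: the function names are hypothetical, coordinates are 0-based offsets into the supplied string rather than genomic positions, and the 2 - 30 base length limit is not enforced.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement of each IUPAC code (R<->Y, K<->M, B<->V, D<->H; S, W, N map to themselves).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Reverse-complement a motif that may contain IUPAC codes (hypothetical helper)."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a regex; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_matches(sequence, motif):
    """Return (strand, 0-based start) tuples sorted by start position."""
    sequence = sequence.upper()
    hits = [("+", m.start()) for m in motif_to_regex(motif).finditer(sequence)]
    rc = reverse_complement(motif)
    # Assumption mirroring the help text: a motif identical to its reverse
    # complement is reported on the "+" strand only.
    if rc != motif.upper():
        hits += [("-", m.start()) for m in motif_to_regex(rc).finditer(sequence)]
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    print(find_matches("AACACGTGAA", "CACGTG"))     # palindromic E-box
    print(find_matches("TTTATTTAATAAA", "AATAAA"))  # poly(A) signal, both strands
```

A "-" hit here means the reverse complement of the motif occurs at that offset on the forward sequence. The motif `YNYURAY` from the table would raise a `KeyError` in this sketch because `U` is not in the lookup table.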