c15a59c6ddc7428025519ec671af0a7d4649d7be gperez2 Thu Oct 30 16:50:26 2025 -0700
Releasing the new GENCODE Known Gene tracks V49, v49lift37, and VM38, refs #36169 #36167 #36165

diff --git src/hg/htdocs/goldenPath/newsarch.html src/hg/htdocs/goldenPath/newsarch.html
index 7fc030f2870..031ac37915a 100755
--- src/hg/htdocs/goldenPath/newsarch.html
+++ src/hg/htdocs/goldenPath/newsarch.html
@@ -52,30 +52,94 @@
 <p>You can sign-up to get these announcements via our
 <a target=_blank href="https://groups.google.com/a/soe.ucsc.edu/g/genome-announce?hl=en">Genome-announce</a>
 email list. We send around one short announcement email every two weeks.</p>
 <p>Smaller software changes are not announced here. A summary of the three-weekly release changes can be found
 <a target=_blank href="https://genecats.gi.ucsc.edu/builds/versions.html">here</a>.
 For the full list of our daily code changes head to our
 <a href="https://github.com/ucscGenomeBrowser/kent/commits/master" target=_blank>GitHub page</a>.
 Lastly, see our <a href="credits.html" target="_blank">
 credits page</a> for acknowledgments of the data we host.</p>
 <!-- ============= 2025 archived news ============= -->
 <a name="2025"></a>
+<a name="103125"></a>
+<h2>Oct. 31, 2025 New GENCODE "knownGene" V49 for human (hg38/hg19) and VM38
+for mouse (mm39)</h2>
+
+<p>
+We are happy to announce the new GENCODE gene annotation tracks, corresponding to
+<a href="https://www.ensembl.info/2025/09/02/ensembl-115-has-been-released/"
+target="_blank">Ensembl release 115</a>, along with GENCODE knownGene V49 for human
+(<a href="/cgi-bin/hgTrackUi?db=hg38&position=default&g=knownGene">hg38/GRCh38</a>
+and
+<a href="/cgi-bin/hgTrackUi?db=hg19&position=default&g=knownGene">hg19/GRCh37</a>)
+and GENCODE knownGene VM38 for mouse
+(<a href="/cgi-bin/hgTrackUi?db=mm39&position=default&g=knownGene">mm39/GRCm39</a>).
+The GENCODE "knownGene" V49 and VM38 tracks were built using the UCSC knownGene pipeline and the
+GENCODE comprehensive gene set to generate high-quality manual annotations merged with
+evidence-based automated annotations. The GENCODE "knownGene" tracks are our default
+gene tracks, which have extensive associations to external sources. This allows for additional
+metadata on every item as well as external links. The track description pages contain options for
+configuring the display, such as showing non-coding genes, splice variants, and pseudogenes.</p>
+<p>
+Below is a summary of the contents found in each release. For more details, visit the
+<a target="_blank" href="https://www.gencodegenes.org/">GENCODE site</a>.</p>
+<p>
+<table class="stdTbl">
+<tr><th COLSPAN=4>GENCODE v49 Release Stats</th></tr>
+<tr align=left><th>Genes</th><th>Observed</th><th>Transcripts</th><th>Observed</th></tr>
+<tr align=left><td>Protein-coding genes</td><td>19,433</td>
+ <td>Protein-coding transcripts</td><td>211,446</td></tr>
+<tr align=left><td>Long non-coding RNA genes</td><td>35,899</td>
+ <td><font size="-1">- full length protein-coding</font></td><td>186,646</td></tr>
+<tr align=left><td>Small non-coding RNA genes</td><td>7,563</td>
+ <td><font size="-1">- partial length protein-coding</font></td><td>24,800</td></tr>
+<tr align=left><td>Pseudogenes</td><td>14,701</td>
+ <td>Nonsense mediated decay transcripts</td><td>21,949</td></tr>
+<tr align=left><td>Immunoglobulin/T-cell receptor gene segments</td>
+ <td>649</td><td>Long non-coding RNA loci transcripts</td><td>191,079</td></tr>
+<tr align=left><td>Total No of distinct translations</td><td>129,801</td>
+ <td>Genes that have more than one distinct translations</td><td>15,498</td></tr>
+</table><BR>
+</p>
+<p>
+<table class="stdTbl">
+<tr><th COLSPAN=4>GENCODE VM38 Release Stats</th></tr>
+<tr align=left><th>Genes</th><th>Observed</th><th>Transcripts</th><th>Observed</th></tr>
+<tr align=left><td>Protein-coding genes</td><td>21,530</td>
+ <td>Protein-coding transcripts</td><td>58,647</td></tr>
+<tr align=left><td>Long non-coding RNA genes</td><td>36,108</td>
+ <td><font size="-1">- full length protein-coding</font></td><td>45,050</td></tr>
+<tr align=left><td>Small non-coding RNA genes</td><td>6,105</td>
+ <td><font size="-1">- partial length protein-coding</font></td><td>13,597</td></tr>
+<tr align=left><td>Pseudogenes</td><td>13,809</td>
+ <td>Nonsense mediated decay transcripts</td><td>7,250</td></tr>
+<tr align=left><td>Immunoglobulin/T-cell receptor gene segments</td><td>701</td>
+ <td>Long non-coding RNA loci transcripts</td><td>155,914</td></tr>
+<tr align=left><td>Total No of distinct translations</td><td>44,974</td>
+ <td>Genes that have more than one distinct translations</td><td>10,853</td></tr>
+</table><BR>
+</p>
+<p>
+We would like to thank the <a target="_blank"
+href="https://www.gencodegenes.org/pages/gencode.html">GENCODE project</a> for providing these
+annotations. We would also like to thank Jonathan Casper, Mark Diekhans, and Gerardo Perez for the
+development and release of these tracks.</p>
+
 <a name="101625"></a>
 <h2>Oct. 16, 2025 SpliceAI Wildtype tracks for hg38</h2>
 <p>
 We are pleased to announce the release of the
 <a href="/cgi-bin/hgTrackUi?db=hg38&position=default&g=spliceAIWt" target="_blank">SpliceAI Wildtype tracks</a>
 for hg38, available in the
 <a href="/cgi-bin/hgTrackUi?db=hg38&position=default&g=spliceImpactSuper" target="_blank">Splicing Impact superTrack</a>.
 These tracks show the scores for the genome sequence itself, without variants, from predicted
 splice donor (5' intron boundaries) and splice acceptor (3' intron boundaries) sites.
 Predictions are strand-specific, with separate subtracks for the plus and minus strands.
 <ul>
 <li><b>SpliceAI Acceptor Plus</b> – Splice acceptor sites, plus strand
 <li><b>SpliceAI Acceptor Minus</b> – Splice acceptor sites, minus strand
 <li><b>SpliceAI Donor Plus</b> – Splice donor sites, plus strand