c15a59c6ddc7428025519ec671af0a7d4649d7be
gperez2
Thu Oct 30 16:50:26 2025 -0700
Releasing the new GENCODE Known Gene tracks V49, v49lift37, and VM38, refs #36169 #36167 #36165

diff --git src/hg/htdocs/goldenPath/newsarch.html src/hg/htdocs/goldenPath/newsarch.html
index 7fc030f2870..031ac37915a 100755
--- src/hg/htdocs/goldenPath/newsarch.html
+++ src/hg/htdocs/goldenPath/newsarch.html
@@ -52,30 +52,94 @@
You can sign up to get these announcements via our Genome-announce email list. We send around one short announcement email every two weeks.
Smaller software changes are not announced here. A summary of the three-weekly release changes can be found here. For the full list of our daily code changes head to our GitHub page. Lastly, see our credits page for acknowledgments of the data we host.
+We are happy to announce the new GENCODE gene annotation tracks, corresponding to Ensembl release 115, along with GENCODE knownGene V49 for human (hg38/GRCh38 and hg19/GRCh37) and GENCODE knownGene VM38 for mouse (mm39/GRCm39). The GENCODE "knownGene" V49 and VM38 tracks were built using the UCSC knownGene pipeline and the GENCODE comprehensive gene set to generate high-quality manual annotations merged with evidence-based automated annotations. The GENCODE "knownGene" tracks are our default gene tracks, which have extensive associations to external sources. This allows for additional metadata on every item as well as external links. The track description pages contain options for configuring the display, such as showing non-coding genes, splice variants, and pseudogenes.
+Below is a summary of the contents found in each release. For more details, visit the GENCODE site.

GENCODE V49 Release Stats

| Genes | Observed | Transcripts | Observed |
|---|---|---|---|
| Protein-coding genes | 19,433 | Protein-coding transcripts | 211,446 |
| Long non-coding RNA genes | 35,899 | - full length protein-coding | 186,646 |
| Small non-coding RNA genes | 7,563 | - partial length protein-coding | 24,800 |
| Pseudogenes | 14,701 | Nonsense mediated decay transcripts | 21,949 |
| Immunoglobulin/T-cell receptor gene segments | 649 | Long non-coding RNA loci transcripts | 191,079 |
| Total No of distinct translations | 129,801 | Genes that have more than one distinct translations | 15,498 |

GENCODE VM38 Release Stats

| Genes | Observed | Transcripts | Observed |
|---|---|---|---|
| Protein-coding genes | 21,530 | Protein-coding transcripts | 58,647 |
| Long non-coding RNA genes | 36,108 | - full length protein-coding | 45,050 |
| Small non-coding RNA genes | 6,105 | - partial length protein-coding | 13,597 |
| Pseudogenes | 13,809 | Nonsense mediated decay transcripts | 7,250 |
| Immunoglobulin/T-cell receptor gene segments | 701 | Long non-coding RNA loci transcripts | 155,914 |
| Total No of distinct translations | 44,974 | Genes that have more than one distinct translations | 10,853 |

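
As a quick sanity check on the two tables above, the "full length" and "partial length" rows should add up to the "Protein-coding transcripts" total in each release. The sketch below is only an illustration: the counts are copied from the tables, while the dictionary layout and the helper names `subtotal_gap` and `partial_fraction` are hypothetical and not part of any UCSC or GENCODE tooling.

```python
# Protein-coding transcript counts copied from the V49 and VM38 tables.
# The structure and key names here are illustrative assumptions.
RELEASES = {
    "V49 (human)": {
        "total": 211_446,
        "full_length": 186_646,
        "partial_length": 24_800,
    },
    "VM38 (mouse)": {
        "total": 58_647,
        "full_length": 45_050,
        "partial_length": 13_597,
    },
}


def subtotal_gap(stats):
    """Return total minus (full + partial); 0 means the rows are consistent."""
    return stats["total"] - (stats["full_length"] + stats["partial_length"])


def partial_fraction(stats):
    """Return the share of protein-coding transcripts that are partial length."""
    return stats["partial_length"] / stats["total"]


for name, stats in RELEASES.items():
    gap = subtotal_gap(stats)
    share = partial_fraction(stats)
    print(f"{name}: gap={gap}, partial-length share={share:.1%}")
```

A gap of zero for both releases indicates that the two sub-rows fully partition the protein-coding transcript total as listed.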
+We would like to thank the GENCODE project for providing these annotations. We would also like to thank Jonathan Casper, Mark Diekhans, and Gerardo Perez for the development and release of these tracks.
+We are pleased to announce the release of the SpliceAI Wildtype tracks for hg38, available in the Splicing Impact superTrack. These tracks show the scores for the genome sequence itself, without variants, from predicted splice donor (5' intron boundaries) and splice acceptor (3' intron boundaries) sites. Predictions are strand-specific, with separate subtracks for the plus and minus strands.