f75be6cdc179d7ce11539664b23406306976052d
gperez2
Tue May 21 12:27:38 2024 -0700
Staging and releasing the new GENCODE Known Gene tracks V46 and VM35, refs #33080 #33083

diff --git src/hg/htdocs/goldenPath/newsarch.html src/hg/htdocs/goldenPath/newsarch.html
index b825546..87093f0 100755
--- src/hg/htdocs/goldenPath/newsarch.html
+++ src/hg/htdocs/goldenPath/newsarch.html
@@ -51,30 +51,80 @@
You can sign up to get these announcements via our Genome-announce email list. We send around one short announcement email every two weeks.
Smaller software changes are not announced here. A summary of the three-weekly release changes can be found here. For the full list of our daily code changes, head to our GitHub page.
+
+We are pleased to announce the release of the GENCODE V46 (hg38) and the GENCODE VM35 (mm39)
+gene tracks. The GENCODE "KnownGene" V46 and VM35 gene tracks were built using a UCSC pipeline
+(KnownGene) and the GENCODE comprehensive gene set to generate high-quality manual annotations
+merged with evidence-based automated annotations. The GENCODE "KnownGene" tracks are our default
+gene tracks and have extensive associations to external sources, which allows for additional
+metadata on every item as well as external links. The track description pages contain options
+for configuring the display, such as showing non-coding genes, splice variants, and pseudogenes.
+
+Below is a summary of the contents found in each release. For more details, visit the GENCODE site.

GENCODE V46 Release Stats

Genes | Observed | Transcripts | Observed |
---|---|---|---|
Protein-coding genes | 19,411 | Protein-coding transcripts | 89,581 |
Long non-coding RNA genes | 20,310 | - full length protein-coding | 64,695 |
Small non-coding RNA genes | 7,565 | - partial length protein-coding | 24,886 |
Pseudogenes | 14,716 | Nonsense mediated decay transcripts | 21,774 |
Immunoglobulin/T-cell receptor gene segments | 648 | Long non-coding RNA loci transcripts | 59,927 |
Total No of distinct translations | 65,650 | Genes that have more than one distinct translations | 13,620 |

GENCODE VM35 Release Stats

Genes | Observed | Transcripts | Observed |
---|---|---|---|
Protein-coding genes | 21,423 | Protein-coding transcripts | 58,457 |
Long non-coding RNA genes | 15,126 | - full length protein-coding | 44,851 |
Small non-coding RNA genes | 6,105 | - partial length protein-coding | 13,606 |
Pseudogenes | 13,756 | Nonsense mediated decay transcripts | 7,243 |
Immunoglobulin/T-cell receptor gene segments | 701 | Long non-coding RNA loci transcripts | 27,096 |
Total No of distinct translations | 44,819 | Genes that have more than one distinct translations | 10,833 |
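
As a quick sanity check on the two tables above, the "full length" and "partial length" protein-coding rows should add up to the "Protein-coding transcripts" total in each release. The following is a minimal sketch in Python with the counts copied by hand from the tables. The dictionary layout and the `subtotal_gap` helper are illustrative assumptions, not part of any UCSC or GENCODE tool.

```python
# Counts copied by hand from the V46 and VM35 release tables above.
# The key names are illustrative, not GENCODE field names.
RELEASES = {
    "V46": {"protein_coding": 89_581, "full_length": 64_695, "partial_length": 24_886},
    "VM35": {"protein_coding": 58_457, "full_length": 44_851, "partial_length": 13_606},
}


def subtotal_gap(stats):
    """Return the protein-coding total minus the sum of its two sub-rows."""
    return stats["protein_coding"] - (stats["full_length"] + stats["partial_length"])


for name, stats in RELEASES.items():
    # A gap of 0 means the sub-rows account for every protein-coding transcript.
    print(name, subtotal_gap(stats))
```

For both releases the gap is 0, so the two sub-rows fully partition the protein-coding transcript count.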
+
+We would like to thank the GENCODE project for providing these annotations. We would also like
+to thank Jonathan Casper and Gerardo Perez for the development and release of these tracks.
+
We are excited to announce that the AbSplice scores track, previously available only on GRCh38/hg38, is now also available on the human GRCh37/hg19 genome assembly. AbSplice is a method that predicts aberrant splicing across human tissues, as described in Wagner, Çelik et al., 2023. This track consists of an aberrant splicing benchmark dataset that spans over 8.8 million rare variants in 49 human tissues from the Genotype-Tissue Expression (GTEx) dataset and displays precomputed AbSplice scores for all possible single-nucleotide variants genome-wide. The AbSplice score is a probability estimate of how likely aberrant splicing of some sort takes place in a given tissue. Aberrant splicing predictions for