198c9b8daecc44fbda6a6494c566c723920f030a lrnassar Wed Mar 11 18:25:21 2026 -0700

Fixing a few hundred clear typos with the help of Claude. Some are less important in code comments, but the majority of them are in user-facing places. I manually approved 60%+ of the changes and didn't see any that were an incorrect suggestion; at worst a change was potentially unnecessary, like a code comment having cant instead of can't. No RM.

diff --git src/hg/htdocs/goldenPath/newsarch.html src/hg/htdocs/goldenPath/newsarch.html
index 33d8b77a519..f86cda14dbd 100755
--- src/hg/htdocs/goldenPath/newsarch.html
+++ src/hg/htdocs/goldenPath/newsarch.html
@@ -138,31 +138,31 @@

The JASPAR database is a joint effort among several labs (please see the latest JASPAR paper). Binding site predictions and UCSC tracks were computed by the CBGR team at NCMBM using code developed at the Wasserman Lab. We would like to thank Luis Nassar and Gerardo Perez for their efforts on this release.

Feb. 09, 2026 Phased variants track for human (hg38 and hg19)

We are pleased to announce the release of the Phased Variants container track for hg38/GRCh38 -and hg19/GRCh19. +and hg19/GRCh37. This new track brings together phased individual-level genotype data from four projects: Human Genome Diversity Project, Simons Genome Diversity Project, gnomAD's HGDP+1000 Genomes callset, and the Mexico Biobank.

The Phased Variants track includes the following subtracks:

Mexico Biobank (MXB) 6k Array – Phased alleles from array genotyping of 6,011 individuals sampled across all 32 states of @@ -585,31 +585,31 @@
We would like to thank the European Variation Archive for making these data publicly available. We would also like to thank Gerardo Perez, Luis Nassar, and Angie Hinrichs for the creation and release of these tracks.

Dec. 15, 2025    Ancient Hominids track for hg38

We are happy to announce the release of the Ancient Hominids track featuring data from Archaic Sequence Hub (ArcSeqHub). This track shows variants identified by ArcSeqHub's remapping of high-quality Altai Neanderthal and Denisovan genomes onto the hg38/GRCh38 genome. Variants are divided into two subtracks, -one for Denisovian variants and another for Neanderthal variants. +one for Denisovan variants and another for Neanderthal variants. UCSC has removed those positions from the VCF without an alternate allele to show only variants that are present in the ancient genomes.
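
The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline UCSC actually ran: the function names are hypothetical, and it assumes plain-text VCF lines in which a missing alternate allele is written as `.` in the ALT column, as the VCF format specifies.

```python
from typing import Iterable, Iterator


def has_alt_allele(vcf_line: str) -> bool:
    """Return True when a VCF data line carries a real alternate allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    alt = fields[4]  # column 5 of a VCF record is ALT
    return alt not in (".", "")


def drop_reference_only(lines: Iterable[str]) -> Iterator[str]:
    """Keep header lines and records that have an alternate allele.

    Hypothetical helper: records whose ALT is '.' are skipped, so only
    positions that differ from the reference remain.
    """
    for line in lines:
        if line.startswith("#") or has_alt_allele(line):
            yield line
```

Passing an open VCF file handle to `drop_reference_only` would yield the header plus only the variant records, which matches the behavior the announcement describes.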

We would like to thank the ArcSeqHub authors for making the data available. We would also like to thank Maximilian Haeussler and Matthew Speir for the creation and release of this track.

Dec. 03, 2025    New gnomAD Missense Deleteriousness Prediction by Constraint (MPC) track for hg19

We are happy to announce the release of the @@ -778,31 +778,31 @@ alt="Panmask Easy 151b Regions track for the BRCA1 exon 19" width='75%'>

The pm151 regions are used to filter spurious variant calls in centromeres, long repeats, and other genomic regions where short-read mapping is often problematic. They cover 88.2% of hg38, 92.2% of coding regions, and 96.3% of ClinVar pathogenic variants. The track can be used to filter variant calls for clinical or research human samples. It shows regions that are easy to sequence, rather than those that are problematic. The data was derived from the HPRC assemblies, and this track presents the 151b-easy panmask set.

We would like to thank Heng Li's group at Harvard Medical School for making this data available. We would also like to thank Max Haeussler and Gerardo Perez for their efforts on this release.
-Sep. 24, 2025    CoLoRSdb small and structure variants for hg38 and hs1
+Sep. 24, 2025    CoLoRSdb small and structural variants for hg38 and hs1

We are excited to announce the release of the CoLoRSdb Small and Structural Variant tracks for the human assemblies GRCh38/hg38 and CHM13/hs1. These tracks provide a comprehensive catalog of genetic variation discovered through long-read whole genome sequencing, contributed by the international Consortium of Long Read Sequencing (CoLoRS). The small variant tracks (DeepVariant + GLnexus) contain single nucleotide polymorphisms (SNPs) and short indels, while the structural variant tracks (pbsv + Jasmine) display larger events including insertions, deletions, and inversions. Long-read sequencing technology improves sensitivity in repetitive regions and provides more precise breakpoint resolution than short-read approaches, enabling accurate visualization of complex loci in the Genome Browser.

Each track includes allele frequency and sample count annotations, with additional filtering options @@ -971,33 +971,33 @@ Australia PanelApp for providing guidance. We would also like to thank Beagan Nguy, Lou Nassar, and Gerardo Perez of the Genome Browser team for the development and release of this track.

Aug. 01, 2025    PubTator Variants track for human, hg38 and hg19

We are excited to announce the release of the PubTator Variants track for human assemblies, hg38 and hg19. These tracks were created using PubTator3 data and are freely accessible to the research community. PubTator3 is a web-based system that offers a comprehensive set of features and tools that allow researchers to explore biomedical literature for knowledge discovery. It uses text mining and AI techniques to annotate and unify bio-entities and their corresponding relations for semantic and relation searches.

-We would like to thank the PubTator 3.0 authors for generating and making the data publically +We would like to thank the PubTator 3.0 authors for generating and making the data publicly available. We would also like to thank Max Haeussler and Johannes Birgmeier for creating the tracks, -and Jairo Navarro the release of the tracks. +and Jairo Navarro for the release of the tracks.

July 31, 2025    New bedMethyl and bigMethyl track type

We are excited to announce support for a new track format for visualizing DNA methylation data: bedMethyl. This format, and its binary-indexed counterpart, bigMethyl, is designed to represent methylation calls from bisulfite sequencing or similar methods at single-base resolution across the genome.

The bedMethyl format extends the standard BED 9 format to include additional fields @@ -1071,31 +1071,31 @@ for more information and interpretation guidelines.

We would like to thank the authors of MutScore and M-CAP for creating and providing these data. We would also like to thank Max Haeussler and Lou Nassar for the development and release of these tracks.

July 15, 2025    ENCODE4 Long-read RNA-seq Transcripts

We are pleased to announce the release of the ENCODE4 long-read RNA-seq transcripts track for hg38 and mm10. -This track annotates trancripts using numerical triplets representing the +This track annotates transcripts using numerical triplets representing the identity of the start site, exon junction chain, and transcript end site of each transcript. This is presented alongside sample enrichment information to show how promoter selection, splice pattern, and 3’ processing are deployed across human tissues.

Transcripts are labeled with triplets, e.g. [1,1,1] or [1,1,3] or [2,1,3]. If transcripts share a number in any of the positions, that means they share that feature, e.g. sharing an 8 in the second position but different numbers in the others means those two transcripts share the same set of exons, but different @@ -2592,56 +2592,56 @@

These variants are classified by EVA into one of the following sequence ontology terms:

substitution — A single nucleotide in the reference is replaced by another, alternate allele
deletion — One or more nucleotides are deleted. The representation in the database is to display one additional nucleotide in both the Reference field (Ref) and the Alternate Allele field (Alt). E.g. a variant that is a deletion of an A - maybe be represented as Ref = GA and Alt = G. + may be represented as Ref = GA and Alt = G.
insertion — One or more nucleotides are inserted. The representation in the database is to display one additional nucleotide in both the Reference field (Ref) and the Alternate Allele field (Alt). E.g. a variant that is an insertion of a T may be represented as Ref = G and Alt = GT
delins — Similar to a tandem repeat, in that the runs of Ref and Alt Alleles are of different length, except that there is more than one type of nucleotide, e.g., Ref = CCAAAAACAAAAACA, Alt = ACAAAAAC.
multipleNucleotideVariant — More than one nucleotide is substituted by an equal number of different nucleotides, e.g., Ref = AA, Alt = GC.
sequence alteration — A parent term meant to signify a deviation from another sequence. Can be assigned to variants that have not been characterized yet.
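
As a rough illustration of how the Ref/Alt conventions in this list fit together, the sketch below guesses a class from the two allele strings alone. It is an assumption-laden heuristic, not EVA's classification logic: the function name is hypothetical, and tandem repeats and the generic "sequence alteration" term are not distinguished because they cannot be recognized from Ref and Alt without more context.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Guess a variant class from its Ref and Alt alleles.

    Hypothetical heuristic based only on the padding convention described
    in the list above (one shared leading nucleotide for indels).
    """
    if len(ref) == len(alt):
        # Same length: a single base change or a multi-base substitution.
        return "substitution" if len(ref) == 1 else "multipleNucleotideVariant"
    if ref.startswith(alt):
        # Alt is the padded prefix of Ref, e.g. Ref = GA, Alt = G.
        return "deletion"
    if alt.startswith(ref):
        # Ref is the padded prefix of Alt, e.g. Ref = G, Alt = GT.
        return "insertion"
    # Different lengths with no simple prefix relationship.
    return "delins"
```

With the examples from the list, `classify_variant("GA", "G")` gives `"deletion"` and `classify_variant("AA", "GC")` gives `"multipleNucleotideVariant"`. A real tandem repeat would be reported as an insertion or deletion by this sketch.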

The variants have also been annotated with our Variant Annotation Integrator tool with functional classes such as synonymous variant, missense variant, stop gained, etc. For additional details on the track colors, as well as the filters and metadata on each variant, see the track description page.

We would like to thank the European Variation -Archive for making these data publically available. We would also like to thank Luis Nassar, Chris Lee, +Archive for making these data publicly available. We would also like to thank Luis Nassar, Chris Lee, and Angie Hinrichs for the creation and release of these tracks.

Jul. 12, 2024 First update to hg19's UCSC Genes track since 2013

The UCSC Genome Browser is getting ready to update hg19's UCSC Genes dataset for the first time since 2013. In this update, the UCSC Genes track will now use GENCODE v45 gene models lifted to hg19 and replace the old UCSC transcript IDs with the official GENCODE IDs.

The anticipated release date for this update is July 31, 2024.

As an example of what to expect, here are some GENCODE IDs that will replace the UCSC IDs in the @@ -3233,62 +3233,62 @@ target="_blank">hg38, hg19

mm39, mm10, mm9

The VISTA Enhancers track contains potential enhancers whose activity was experimentally validated in transgenic mice. Most of these non-coding elements were selected for testing based on their extreme conservation in other vertebrates or epigenomic evidence (ChIP-Seq) of putative enhancer marks. The goal of VISTA Enhancers project is to identify distant-acting transcriptional enhancers -in the human and mouse genomes. More information about can be found on the +in the human and mouse genomes. More information can be found on the VISTA Enhancer Browser website.

We would like to thank the Lawrence Berkeley National Laboratory and the VISTA Enhancer team for providing this data. We would also like to thank Gerardo Perez and Jairo Navarro for the creation and release of these tracks.

Nov. 30, 2023 Support for previous RefSeq transcripts while searching on hg38

Have you ever found a variant in a paper and searched for it on the Genome Browser only to receive an error that the sequence cannot be found? Or perhaps looked up a familiar NM_ identifier and suddenly found no results?

We are pleased to share that we now have support for searching previous RefSeq transcript versions on hg38. This support works for both NM_ accessions and HGVS searching as demonstrated below:

Searching for the latest transcript which always worked:

Sequence search: NM_198056.3
HGVS search: NM_198056.3:c.1A>C

-Searching for a previous version that now works:: +Searching for a previous version that now works:

Sequence search: NM_198056.2
HGVS search: NM_198056.2:c.1A>C

We thank NCBI and Terence Murphy for creating the archive of deprecated transcripts that allows this feature to work. We would also like to thank the users who wrote requesting the feature allowing us to prioritize it effectively. Finally, we would like to thank Chris Lee, Max Haeussler, Gerardo Perez, and Lou Nassar for developing and testing this feature.

@@ -3439,50 +3439,50 @@ target="_blank">(danRer11)

Variants are classified by EVA into one of the following sequence ontology terms:

substitution — A single nucleotide in the reference is replaced by another, alternate allele
deletion — One or more nucleotides is deleted. The representation in the database is to display one additional nucleotide in both the Reference field (Ref) and the Alternate Allele field (Alt). E.g. a variant that is a deletion of an A - maybe be represented as Ref = GA and Alt = G. + may be represented as Ref = GA and Alt = G.
insertion — One or more nucleotides is inserted. The representation in the database is to display one additional nucleotide in both the Reference field (Ref) and the Alternate Allele field (Alt). E.g. a variant that is an insertion of a T may be represented as Ref = G and Alt = GT
delins — Similar to tandemRepeat, in that the runs of Ref and Alt Alleles are of different length, except that there is more than one type of nucleotide, e.g., Ref = CCAAAAACAAAAACA, Alt = ACAAAAAC.
multipleNucleotideVariant — More than one nucleotide is substituted by an equal number of different nucleotides, e.g., Ref = AA, Alt = GC.
sequence alteration — A parent term meant to signify a deviation from another sequence. Can be assigned to variants that have not been characterized yet.

We would like to thank the European Variation -Archive making this data publically available. We would also like to thank Luis Nassar and Jairo +Archive making this data publicly available. We would also like to thank Luis Nassar and Jairo Navarro for the creation and release of these tracks.

Sep. 15, 2023 New COSMIC Track for hg38

We are pleased to announce the release of the new COSMIC track for hg38. The Catalogue Of Somatic Mutations In Cancer (COSMIC) is an online database of expert manually curated somatic mutation information relating to human cancers. This new track displays data from the COSMIC v98 release, which consists of 410,000 new genomic variants, 585,000 new coding mutations, 290,000 non-coding @@ -3784,50 +3784,50 @@ target="_blank">(danRer11)

Variants are classified by EVA into one of the following sequence ontology terms:

substitution — A single nucleotide in the reference is replaced by another, alternate allele
deletion — One or more nucleotides is deleted. The representation in the database is to display one additional nucleotide in both the Reference field (Ref) and the Alternate Allele field (Alt). E.g. a variant that is a deletion of an A - maybe be represented as Ref = GA and Alt = G. + may be represented as Ref = GA and Alt = G.
insertion — One or more nucleotides is inserted. The representation in the database is to display one additional nucleotide in both the Reference field (Ref) and the Alternate Allele field (Alt). E.g. a variant that is an insertion of a T may be represented as Ref = G and Alt = GT
delins — Similar to tandemRepeat, in that the runs of Ref and Alt Alleles are of different length, except that there is more than one type of nucleotide, e.g., Ref = CCAAAAACAAAAACA, Alt = ACAAAAAC.
multipleNucleotideVariant — More than one nucleotide is substituted by an equal number of different nucleotides, e.g., Ref = AA, Alt = GC.
sequence alteration — A parent term meant to signify a deviation from another sequence. Can be assigned to variants that have not been characterized yet.

Apr. 24, 2023 New DGV Gold Standard track for hg38

We are pleased to announce the addition of the new DGV Gold Standard track for hg38. The track displays curated variants from a selected number of studies in the Database of Genomic Variants (DGV) with a criterion that requires a variant to be found in at least two different studies and found in at least two different samples. More information on this track can be found on the track description page.
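
The inclusion criterion above can be expressed as a small check. This is only a sketch of the stated rule, not DGV's curation code: the function name and the `(study, sample)` pair representation are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def passes_gold_standard(observations: Iterable[Tuple[str, str]]) -> bool:
    """Apply the two-studies, two-samples rule to one variant.

    `observations` is a hypothetical list of (study_id, sample_id) pairs,
    one per report of the variant.
    """
    pairs = list(observations)
    studies = {study for study, _ in pairs}
    samples = {sample for _, sample in pairs}
    return len(studies) >= 2 and len(samples) >= 2
```

A variant seen in one sample across two studies, or in two samples within a single study, would fail this check.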

@@ -5980,31 +5980,31 @@ SARS-CoV-2 browser, the updated Variants of Concern (VOC) track. This data track includes amino acid and nucleotide annotations for 10 different COVID variants, including the Delta and Mu variants, mapped along the SARS-CoV-2 reference genome. These variants are classified by the WHO into several categories: Variants of Concern (VOC), Variants of Interest (VOI), and Variants under Monitoring (VUM). These tracks help provide a clearer understanding of the mutations that comprise each named variant. This track's items also include links to Outbreak.info, providing geographic distributions for each variant.

-The underlying data is publically accessible and compatible with many analysis tools, including +The underlying data is publicly accessible and compatible with many analysis tools, including our Table Browser, Data Integrator, and JSON API. More information on this track can be found on the Variants of Concern (VOC) track description page.

Oct. 18, 2021 Addition of GRCh38 patch 13 sequences to hg38

We are pleased to announce the addition of GRCh38 patch release 13 to the hg38 assembly. hg38 has been updated with patches since its release in 2013. The GRC patch releases do not change any previously existing sequences; they simply add new sequences for fix patches or alternate haplotypes that correspond to specific regions of the main chromosome sequences. For most users, the patches are unlikely to make a difference and may complicate the analysis as they introduce more duplication.
