dfe490296adf7fc6025c431b5f08a90706ccb2be
max
Tue May 13 03:19:05 2025 -0700
linsight track for hg19, refs #35730
diff --git src/hg/makeDb/trackDb/human/constraintSuper.html src/hg/makeDb/trackDb/human/constraintSuper.html
index 52ba63f67dd..19a058b1431 100644
--- src/hg/makeDb/trackDb/human/constraintSuper.html
+++ src/hg/makeDb/trackDb/human/constraintSuper.html
@@ -48,34 +48,46 @@
MetaDome - Tolerance Landscape Score (hg19 only):
MetaDome Tolerance Landscape scores are computed as a missense over synonymous
variant count ratio, which is calculated in a sliding window (with a size of 21
codons/residues) to provide
a per-position indication of regional tolerance to missense variation. The
variant database was gnomAD and the score corrected for codon composition. Scores
<0.7 are considered intolerant. This score covers only coding regions.
MTR - Missense Tolerance Ratio (hg19 only):
Missense Tolerance Ratio (MTR) scores aim to quantify the amount of purifying
selection acting specifically on missense variants in a given window of
protein-coding sequence. It is estimated across sliding windows of 31 codons
(default) and uses observed standing variation data from the WES component of
- gnomAD / the Exome Aggregation Consortium Database (ExAC), version 2.0. Scores
+ gnomAD version 2.0. Scores
were computed using Ensembl v95 release. The number of gnomAD 2 exomes used here
is higher than the number of gnomAD 3 samples (125k exomes versus 76k full genomes),
- but this score only covers coding regions.
+ and this score only covers coding regions, so gnomAD 2 was more appropriate.
+
+
+ LINSIGHT (hg19 only):
+ LINSIGHT is a statistical model for estimating negative selection on
+ noncoding sequences in the human genome. The LINSIGHT score measures the
+ probability of negative selection on noncoding sites which can be used to
+ prioritize SNVs associated with genetic diseases or quantify evolutionary
+ constraint on regulatory sequences, e.g., enhancers or promoters. More
+ specifically, if a noncoding site is under negative selection, it will be
+ less likely to have a substitution or SNV in the human lineage. In
+ addition, even if we see an SNV at the site, it will tend to segregate at
+ low frequency because of selection. See Huang et al., Nat Genet 2017.
UK Biobank depletion rank score (hg38 only):
Halldorsson et al. tabulated the number of UK Biobank variants in each
500bp window of the genome and compared this number to an expected number
given the heptamer nucleotide composition of the window and the fraction of
heptamers with a sequence variant across the genome and their mutational
classes. A variant depletion score was computed for every overlapping set
of 500-bp windows in the genome with a 50-bp step size. They then assigned
a rank (depletion rank (DR)) from 0 (most depletion) to 100 (least
depletion) for each 500-bp window. Since the windows are overlapping, we
plot the value only in the central 50bp of the 500bp window, following
advice from the author of the score,
Hakon Jonsson, deCODE Genetics. He suggested that the value of the central
window, rather than the worst possible score of all overlapping windows, is
@@ -310,15 +322,26 @@
Nucleic Acids Res. 2019 Jul 2;47(W1):W121-W126.
PMID: 31170280; PMC: PMC6602522
Halldorsson BV, Eggertsson HP, Moore KHS, Hauswedell H, Eiriksson O, Ulfarsson MO, Palsson G,
Hardarson MT, Oddsson A, Jensson BO et al.
The sequences of 150,119 genomes in the UK Biobank.
Nature. 2022 Jul;607(7920):732-740.
PMID: 35859178; PMC: PMC9329122
+
+
+Huang YF, Gulko B, Siepel A.
+
+Fast, scalable prediction of deleterious noncoding variants from functional and population genomic
+data.
+Nat Genet. 2017 Apr;49(4):618-624.
+PMID: 28288115; PMC: PMC5395419
+
+