f3ac6bdaf95d5e9a70554ffbff762b1d5229098f
jnavarr5
Wed Jun 18 16:41:01 2025 -0700
Updating the revel track from feedback in MLQ #35933

diff --git src/hg/makeDb/trackDb/human/revel.html src/hg/makeDb/trackDb/human/revel.html
index 7d6d994807d..f919a61de89 100644
--- src/hg/makeDb/trackDb/human/revel.html
+++ src/hg/makeDb/trackDb/human/revel.html
@@ -1,166 +1,174 @@
This track collection shows Rare Exome Variant Ensemble Learner (REVEL) scores that can be
-used as evidence for pathogenicity annotations.
+used as evidence for pathogenicity classifications.
REVEL is an ensemble method for predicting a score for missense variants based on a combination of scores from 13 individual tools: MutPred, FATHMM v2.3, VEST 3.0, PolyPhen-2, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP++, SiPhy, phyloP, and phastCons. REVEL was trained using recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. The REVEL score for an individual missense variant can range from 0 to 1, with higher scores reflecting greater likelihood that the variant is
-disease-causing.
+damaging.
Most authors of deleteriousness scores argue against using fixed cutoffs in diagnostics. But to give an idea of the meaning of the score value, the REVEL authors note: "For example, 75.4% of disease mutations but only 10.9% of neutral variants (and 12.4% of all ESVs) have a REVEL score above 0.5, corresponding to a sensitivity of 0.754 and specificity of 0.891. Selecting a more stringent REVEL score threshold of 0.75 would result in higher specificity but lower sensitivity, with 52.1% of disease mutations, 3.3% of neutral variants, and 4.1% of all ESVs being classified as pathogenic". (Figure S1 of the reference below)
There are five subtracks for this track:
Four lettered subtracks, one for every nucleotide, showing
-scores for mutation from the reference to that
+scores for the variant from the reference to that
nucleotide. All subtracks show the REVEL ensemble score on mouseover. Across the exome, there are three values per position, one for every possible
-nucleotide mutation. The fourth value, "no mutation", representing
+nucleotide variant. The fourth value, "no variant", representing
the reference allele, e.g. A to A, is always set to zero, "0.0". REVEL only
-takes into account amino acid changes, so a nucleotide change that results in no
+takes into account amino acid changes, so a nucleotide variant that predicts no
amino acid change (synonymous) also receives the score "0.0".
In rare cases, two scores are output for the same variant at a genome position. This happens when there are two transcripts with
-different splicing patterns and since some input scores for REVEL take into account
-the sequence context, the same mutation can get two different scores. In these cases,
+distinct splicing patterns and since some input scores for REVEL take into account
+the sequence context, the same variant can get two different scores. In these cases,
only the maximum score is shown in the four per-nucleotide subtracks. The complete set of scores are shown in the Overlaps track.
One subtrack, Overlaps, shows alternate REVEL scores when applicable. In rare cases (0.05% of genome positions), multiple scores exist with a single variant, due to multiple, overlapping transcripts. For example, if there are two transcripts and one covers only half of an exon, then the amino acids
-that overlap both transcripts will get two different REVEL scores, since some of the underlying
+that overlap both transcripts will get two distinct REVEL scores, since some of the underlying
scores (polyPhen for example) take into account the amino acid sequence context and this context is different depending on the transcript. For these cases, this subtrack contains at least two graphical features, for each affected genome position. Each feature is labeled
-with the mutation (A, C, T or G). The transcript IDs and resulting score is
+with the reference or variant (A, C, T or G). The transcript IDs and resulting score is
shown when hovering over the feature or clicking it. For the large majority of the genome, this subtrack has no features. This is because REVEL usually outputs only a single score per nucleotide and most transcript-derived amino acid sequence contexts are identical.
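The relationship between the per-nucleotide subtracks and the Overlaps subtrack described above can be sketched in Python. This is only an illustration under stated assumptions: the record layout `(chrom, pos, alt, transcript, score)`, the function name `summarize_scores`, and the transcript IDs and scores are all made up for the example, and this is not the code UCSC used to build the track.

```python
from collections import defaultdict


def summarize_scores(records):
    """Group per-transcript scores by variant (hypothetical record layout).

    Returns (max_scores, overlaps):
      max_scores maps each (chrom, pos, alt) to its highest score, which is
      what the per-nucleotide subtracks are described as showing;
      overlaps keeps every (transcript, score) pair for variants that
      received more than one distinct score.
    """
    by_variant = defaultdict(list)
    for chrom, pos, alt, transcript, score in records:
        by_variant[(chrom, pos, alt)].append((transcript, score))

    max_scores = {}
    overlaps = {}
    for variant, scored in by_variant.items():
        max_scores[variant] = max(score for _, score in scored)
        # Only variants with at least two different scores go to "Overlaps".
        if len({score for _, score in scored}) > 1:
            overlaps[variant] = scored
    return max_scores, overlaps


# Made-up example: one variant scored on two transcripts, one on a single transcript.
records = [
    ("chr1", 1000, "A", "tx1", 0.12),
    ("chr1", 1000, "A", "tx2", 0.48),
    ("chr1", 1000, "C", "tx1", 0.30),
]
max_scores, overlaps = summarize_scores(records)
```

In this made-up input, only the `A` variant would appear in the Overlaps subtrack, while both variants get a single value in their lettered subtrack.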
-Note that in most diagnostic assays, variants are called using WGS
+Note that in most diagnostic testing scenarios, variants are called using WGS
pipelines, not RNA-seq. As a result, variants are originally located on the genome, not on transcripts, and the choice of transcript is made by a variant calling software using a heuristic. In addition, clinically, in the field, some transcripts have been agreed-on as more relevant for a disease, e.g. because only certain transcripts may be expressed in the relevant tissue. So the choice of the most relevant transcript, and as such the REVEL score, may be a question of manual curation standards rather than a result of the variant itself.
+
+Note further that these thresholds represent the recommended score
+cutoffs for genes with no Variant Curation Expert Panel (VCEP) rules.
+For genes with published VCEP rules, the VCEP might
+select different thresholds which are adjusted for the frequency of the
+relevant disorders. These are available in the ClinGen Criteria
+Specification.
When using this track, zoom in until you can see every basepair at the top of the display. Otherwise, there are several nucleotides per pixel under your mouse cursor and no score will be shown on the mouseover tooltip.
Track colors
-This track is colored according to Table 2 in Pejaver et al. The colors represent the recommended ACMG/AMP score cutoffs.
+This track is colored according to Table 2 in Pejaver et al. The colors represent the recommended
+ClinGen score cutoffs.
Range | Classification |
---|---|
≥ 0.644 | Pathogenic_supporting |
0.643 - 0.291 | Neutral |
≤ 0.290 | Benign_supporting |
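As a rough illustration of how the ranges in the table above could be applied to a single score, here is a minimal Python sketch. The function name `classify_revel` is hypothetical, and the cutoffs are copied from the table as general-purpose thresholds only; genes with published VCEP rules may use different ones.

```python
def classify_revel(score):
    """Map a REVEL score to the color classes listed in the table above.

    Hypothetical helper for illustration; cutoffs are taken from the table.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("REVEL scores range from 0 to 1")
    if score >= 0.644:
        return "Pathogenic_supporting"
    if score <= 0.290:
        return "Benign_supporting"
    # Everything strictly between the two cutoffs.
    return "Neutral"


labels = [classify_revel(s) for s in (0.644, 0.5, 0.290)]
```

Because the track stores "0.0" for the reference allele and for synonymous changes, a caller may want to handle a score of exactly 0.0 separately rather than reading it as benign evidence.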
More details on these scoring ranges can be found in Bergquist et al. Genet Med 2025, Table 2:
-For hg38, note that the data was converted from the hg19 data using the UCSC
+For hg38, note that the data were converted from the hg19 data using the UCSC
liftOver program, by the REVEL authors. This can lead to missing values or
duplicated values. When a hg38 position is annotated with two scores due to the
lifting, the authors removed all the scores for this position. They did the same when
-the reference allele has changed from hg19 to hg38. Also, on hg38, the track has
+the reference nucleotide has changed from hg19 to hg38. Also, on hg38, the track has
the "lifted" icon to indicate
this. You can double-check if a nucleotide
position is possibly affected by the lifting procedure by activating the track
"Hg19 Mapping" under "Mapping and Sequencing".
REVEL scores are available at the REVEL website. The site provides precomputed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants among the large number of rare variants discovered in sequencing studies.
The REVEL data on the UCSC Genome Browser can be explored interactively with the
Table Browser or the
Data Integrator. The previous overlap bigBed version file is
available in the
archives of our downloads server.
For automated download and analysis, the genome annotation is stored at UCSC in bigWig
files that can be downloaded from
our download server.
The files for this track are called a.bw, c.bw, g.bw, t.bw. Individual
-regions or the whole genome annotation can be obtained using our tool bigWigToWig
+regions or the genome annotation can be obtained using our tool bigWigToWig
which can be compiled from the source code or downloaded as a precompiled
binary for your system. Instructions for downloading source code and binaries can be found
here.
The tools can also be used to obtain features confined to a given range, e.g.
bigWigToBedGraph -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/revel/a.bw stdout
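For scripted use, the same command line can be assembled in Python. This is a sketch only: the helper name `revel_bedgraph_command` is hypothetical, the URL pattern and options are copied from the command above with `$db` filled in by the caller, and nothing is executed here.

```python
import shlex


def revel_bedgraph_command(db, nucleotide, chrom, start, end):
    """Build the bigWigToBedGraph argument list for one REVEL subtrack.

    Hypothetical helper; the URL layout and flags mirror the example command.
    """
    if nucleotide not in ("a", "c", "g", "t"):
        raise ValueError("nucleotide must be one of a, c, g, t")
    url = f"http://hgdownload.soe.ucsc.edu/gbdb/{db}/revel/{nucleotide}.bw"
    return [
        "bigWigToBedGraph",
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        url,
        "stdout",
    ]


cmd = revel_bedgraph_command("hg38", "a", "chr1", 100000, 100500)
print(shlex.join(cmd))
```

If the `bigWigToBedGraph` binary is installed and on the `PATH`, the list could then be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)` to collect the bedGraph lines.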
Data were converted from the files provided on the REVEL Downloads website. As with all other tracks,