src/hg/makeDb/trackDb/human/revel.html 17ade49f24bb4fb6398e1d5b508a35a770329a6c

17ade49f24bb4fb6398e1d5b508a35a770329a6c
dschmelt
  Tue May 31 12:58:25 2022 -0700
Minor grammar edits to revel overlaps refs #29475

diff --git src/hg/makeDb/trackDb/human/revel.html src/hg/makeDb/trackDb/human/revel.html
index d504510..e31cfbb 100644
--- src/hg/makeDb/trackDb/human/revel.html
+++ src/hg/makeDb/trackDb/human/revel.html
@@ -20,70 +20,69 @@
 diagnostics. But to give an idea of the meaning of the score value, the REVEL
 authors note: "For example, 75.4% of disease mutations but only 10.9% of
 neutral variants (and 12.4% of all ESVs) have a REVEL score above 0.5,
 corresponding to a sensitivity of 0.754 and specificity of 0.891. Selecting a
 more stringent REVEL score threshold of 0.75 would result in higher specificity
 but lower sensitivity, with 52.1% of disease mutations, 3.3% of neutral
 variants, and 4.1% of all ESVs being classified as pathogenic". (Figure S1 of
 the reference below)
 </p>
 
 <h2>Display Conventions and Configuration</h2>
 <p>
 There are five subtracks for this track:
 <ul>
 <li>
-<p>Four subtracks, one for every nucleotide showing
-a score for the mutation represented by a mutation from the reference to that
-nucleotide. All subtracks show the REVEL ensemble score on mouseover, representing
-each of the possible &#126;9 billion SNVs in the genome. In rare cases, two scores
-are output for a genome position. This happens when there are two transcripts with
-different splicing patterns and since some input scores for REVEL take into account
-the sequence context, the same mutation can get two different scores. In these cases,
-only the maximum score is shown in the four per-nucleotide subtracks.
-</p>
-<p>
-For single nucleotide variants (SNV), at every
-genome position, there are three values per position, one for every possible
+<p>Four lettered subtracks, one for every nucleotide, showing
+scores for the mutation from the reference to that
+nucleotide. All subtracks show the REVEL ensemble score on mouseover. Across the exome, 
+there are three values per position, one for every possible
 nucleotide mutation. The fourth value, &quot;no mutation&quot;, representing
 the reference allele, e.g. A to A, is always set to zero, "0.0". REVEL only
 takes into account amino acid changes, so a nucleotide change that results in no
 amino acid change (synonymous) also receives the score "0.0". 
+</p><p>
+In rare cases, two scores are output for the same variant at a 
+genome position. This happens when there are two transcripts with
+different splicing patterns. Because some input scores for REVEL take into account
+the sequence context, the same mutation can get two different scores. In these cases,
+only the maximum score is shown in the four per-nucleotide subtracks. The complete set of
+scores is shown in the Overlaps track.
 </p>
 
 <li>
-<p>One subtrack for duplicated scores. There are rare cases where multiple scores are possible
-at a genome position, due to multiple, overlapping transcripts. For example, if there are 
-two transcript and one covers only half on an exon, then the first amino acids
-of this exon will get two different REVEL scores, since some of the underlying 
-scores (polyPhen, for example), take into account the amino acid sequence context, and 
-this context is different, depending on the transcript.
-For these cases, the last subtrack contains at least two
+<p>One subtrack, Overlaps, shows alternate REVEL scores when applicable. 
+In rare cases (0.05% of genome positions), multiple scores exist for a single variant,
+due to multiple, overlapping transcripts. For example, if there are 
+two transcripts and one covers only half of an exon, then the amino acids
+that overlap both transcripts will get two different REVEL scores, since some of the underlying 
+scores (PolyPhen, for example) take into account the amino acid sequence context, and
+this context is different depending on the transcript.
+For these cases, this subtrack contains at least two
 graphical features, for each affected genome position. Each feature is labeled
-with the mutation (A, C, T or G) and the transcript IDs and resulting score for
-this transcript is shown when hovering the mouse over the feature or clicking
-it. For the large majority of the genome, this subtrack has no features,
-because REVEL usually output only a single score per nucleotide, as the most
-genome positions the transcript-derived amino acid sequence context is 
-identical.
+with the mutation (A, C, T or G). The transcript IDs and resulting scores are
+shown when hovering over the feature or clicking
+it. For the large majority of the genome, this subtrack has no features.
+This is because REVEL usually outputs only a single score per nucleotide and 
+most transcript-derived amino acid sequence contexts are identical.
 </p>
 <p>
 Note that in most diagnostic assays, variants are called using WGS
 pipelines, not RNA-seq. As a result, variants are originally located on the
-genome, not on transcripts, and a choice of the transcript is possibly made by
+genome, not on transcripts, and the choice of transcript is made by
 a variant calling software using a heuristic. In addition, clinically, in the
-field, some transcripts have been agreed as more relevant for a disease, e.g.
+field, some transcripts have been agreed upon as more relevant for a disease, e.g.
 because only certain transcripts may be expressed in the relevant tissue. So
 the choice of the most relevant transcript, and as such the REVEL score, may be
 a question of manual curation standards rather than a result of the variant itself.
 </p>
 </ul>
 
 <p>
 When using this track, zoom in until you can see every basepair at the
 top of the display. Otherwise, there are several nucleotides per pixel under 
 your mouse cursor and no score will be shown on the mouseover tooltip.
 </p>
 
 <p>For hg38, note that the data was converted from the hg19 data using the UCSC
 liftOver program, by the REVEL authors. This can lead to missing values or
 duplicated values. When a hg38 position is annotated with two scores due to the