0906b687e064630d4f016448385f5742ddb409fe
jnavarr5
  Fri Apr 2 15:19:31 2021 -0700
Adding a FAQ about single or more than 3 alleles for a SNP, refs #27313

diff --git src/hg/htdocs/FAQ/FAQdownloads.html src/hg/htdocs/FAQ/FAQdownloads.html
index 6f886dd..c2e92cd 100755
--- src/hg/htdocs/FAQ/FAQdownloads.html
+++ src/hg/htdocs/FAQ/FAQdownloads.html
@@ -34,30 +34,31 @@
 <li><a href="#download28">Converting genome coordinates between assemblies</a></li>
 <li><a href="#download33">Linking gene name with accession number</a></li>
 <li><a href="#download31">Obtaining a list of Known Genes</a></li>
 <li><a href="#download16">Repeat-masking data</a></li>
 <li><a href="#download17">Availability of repeat-masked data</a></li>
 <li><a href="#download24">RepeatMasker version differences - UCSC vs. Repeatmasker website</a></li> 
 <li><a href="#download18">Obtaining promoter sequence</a></li>
 <li><a href="#download19">Data from Evolutionary Conservation Score tracks</a></li>
 <li><a href="#download20">Minus strand coordinates - axtNet files</a></li>
 <li><a href="#download21">Mapping UCSC STS marker IDS to those of other groups</a></li>
 <li><a href="#download22">deCODE map data</a></li>
 <li><a href="#download29">Direct MariaDB (MySQL) access to data</a></li>
 <li><a href="#download34">Name of fourth column in BED output</a></li>
 <li><a href="#download36">Track data access</a></li>
 <li><a href="#snp">How do I download dbSNP data?</a></li>
+<li><a href="#snpAlleles">Why doesn't this SNP have two alleles?</a></li>
 <li><a href="#download37">Known issues with Table Browser GTF output</a></li>
 <li><a href="#download38">Table Browser output file not ordered</a></li>
 <li><a href="#download39">'Permisssion denied' error when trying to use command-line utilities</a></li>
 <li><a href="#download40">Restricted Track Data</a></li>
 <li><a href="#downloadAnalysis">What is the genome analysis set?</a></li>
 </ul>
 <hr>
 <p>
 <a href="index.html">Return to FAQ Table of Contents</a></p>
 
 <a name="download1"></a>
 <h2>Downloading sequence and annotation data</h2>
 <h6>How do I obtain the sequence and/or annotation data for a release?</h6>
 <p> 
 Sequence and annotation data downloads are usually made available within the first week of the 
@@ -1031,30 +1032,57 @@
 that can be used to retrieve values from a particular chromosome range.
 A list of rs# IDs can also be pasted/uploaded in the
 <a href="/cgi-bin/hgVai" target=_blank>Variant Annotation Integrator</a>
 tool in order to find out which genes (if any) the variants are located in,
 as well as functional effect such as intron, coding-synonymous, missense, frameshift, etc.
 </p><p>
 See our searchable
 <A HREF="https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!search/download+snps"
 target=_blank>mailing list archives</a>
 for more information and example queries. We also have information on
 <a href="http://genome.ucsc.edu/blog/">our blog</a> about
 <a href="http://genome.ucsc.edu/blog/?s=programmatic"> Accessing the Genome Browser Programmatically</a>
 to acquire data.
 </p>
 
+<a name="snpAlleles"></a>
+<h2>Why doesn't this SNP have two alleles?</h2>
+<p>
+When using the SNP tracks, some records may list only one allele, or more than two alleles, instead
+of the usual two alleles for the SNP. The following information explains how this is
+possible.</p>
+<dl>
+  <dt>One allele (i.e. reference only):</dt>
+  <dd>
+    The human genome reference has gone through many different assembly versions. The reference
+    genome has always been a mosaic of sequences from multiple individuals, so it contains some
+    rare or singleton mutations and is not entirely free of errors. Some SNPs were discovered on
+    previous assembly versions, and the latest assembly version has the corrected or common allele,
+    which turns out to be the only observed allele (so the SNP was an artifact of the reference
+    assembly having a rare mutation or error in the past, not a real SNP).</dd>
+  <dt>Three alleles:</dt>
+  <dd>
+    It's rare, but possible, for the same base to be mutated to different values in different
+    people.</dd>
+  <dt>Four alleles:</dt>
+  <dd>
+    This would be even rarer than three alleles. In the past, it has often been a symptom of strand
+    errors: for example, the same variant is reported separately as A/G on the forward strand and
+    C/T on the reverse strand, but the strand information is then lost in processing and the
+    reports are merged into A/C/G/T.</dd>
+</dl>
+
 <a name="download37"></a>
 <h2>Obtaining GTF (Gene Transfer Format)</h2>
 <h6>What is the best method for obtaining GTF output?</h6>
 <p>
 Currently, the <a href="../cgi-bin/hgTables">Table Browser</a> option return data in
 <a href="../FAQ/FAQformat.html#format4">GTF format</a> is limited as explained below.
 To convert custom GenePred format data into GTF, the best method is to use the 
 command-line format conversion utility, <code>genePredToGtf</code>. This can optionally be set up 
 to automatically connect to the UCSC public SQL database and return GTF files in a few minutes using 
 <a href="http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format#Using_kent_commands_with_the_public_database_server">
 this short guide</a>.</p>
 <p>
 For simplicity, GTF files have been generated using the <code>genePredToGtf</code> method 
 described above and are available on our download server for the main gene transcript sets.
 These can be found at the following download server address: