d5f33c3bbb34ad7fd6c08714156b9dfbf292d3bf
jnavarr5
  Tue May 5 16:26:38 2020 -0700
Adding a FAQ about the asmEquivalent table, refs #21074

diff --git src/hg/htdocs/FAQ/FAQreleases.html src/hg/htdocs/FAQ/FAQreleases.html
index 59b3caa..c45e048 100755
--- src/hg/htdocs/FAQ/FAQreleases.html
+++ src/hg/htdocs/FAQ/FAQreleases.html
@@ -310,30 +310,64 @@
 Browser?</h6>
 <p> 
 All the assembly data displayed in the UCSC Genome Browser are obtained from external sequencing 
 centers. To determine the data source and version for a given assembly, see the assembly's 
 description on the Genome Browser <a href="../cgi-bin/hgGateway">Gateway</a> page or the 
 <a href="#release1">List of UCSC Genome Releases</a>.</p>
 <p>
 The annotations accompanying an assembly are obtained from a variety of sources. The UCSC Genome 
 Bioinformatics Group generates several of the tracks; the remainder are contributed by collaborators
 at other sites. Each track has an associated description page that credits the authors of the 
 annotation.</p> 
 <p>
 For detailed information about the individuals and organizations who contributed to a specific 
 assembly, see the <a href="../goldenPath/credits.html">Credits</a> page.</p>
 
+<a name="asmEquivalent"></a>
+<h6>Which UCSC assemblies are equivalent to Ensembl or NCBI assemblies?</h6>
+<p>
+The <code>asmEquivalent</code> table in the <code>hgFixed</code> database, available on the public
+MySQL server, shows which assembly versions are identical (or nearly identical) across UCSC,
+Ensembl, GenBank, and RefSeq.</p>
+<pre>
+mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -e 'desc asmEquivalent;' hgFixed
++----------------------+-------------------------------------------+
+| Field                | Type                                      |
++----------------------+-------------------------------------------+
+| source               | varchar(255)                              |
+| destination          | varchar(255)                              |
+| sourceAuthority      | enum('ensembl','ucsc','genbank','refseq') |
+| destinationAuthority | enum('ensembl','ucsc','genbank','refseq') |
+| matchCount           | bigint(20)                                |
+| sourceCount          | bigint(20)                                |
+| destinationCount     | bigint(20)                                |
++----------------------+-------------------------------------------+</pre>
+<p>
+The &quot;Count&quot; fields are counts of individual sequences in the assemblies. When all three
+counts are equal, <code>matchCount == sourceCount == destinationCount</code>, the two genome
+assemblies match exactly.</p>
+<p>
+Imperfect matches can arise for several reasons:</p>
+<ol>
+  <li>the chrMT (mitochondrial) sequence differs between the assemblies or is missing from one
+      of them</li>
+  <li>identical duplicated sequences are present in one assembly but absent from the other</li>
+  <li>some smaller contigs are not included in one assembly</li>
+  <li>the assembly versions differ slightly, so that one contains sequences not found in the
+      other</li>
+</ol>
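+<p>
+As a minimal sketch, the perfect-match condition described above could be turned into a query
+such as the following. The query is an illustration built only from the fields in the table
+description, not an official recipe; the <code>ucsc</code> filter and the <code>LIMIT 10</code>
+are assumptions chosen to keep the example short.</p>
+<pre>
+mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -e '
+  -- keep only rows where all three counts agree (a perfect match)
+  SELECT source, destination, destinationAuthority
+  FROM asmEquivalent
+  WHERE sourceAuthority = "ucsc"
+    AND matchCount = sourceCount
+    AND matchCount = destinationCount
+  -- the limit is arbitrary and only shortens the output
+  LIMIT 10;' hgFixed</pre>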
+
 <a name="release4"></a>
 <h2>Comparison of UCSC and NCBI human assemblies</h2>
 <h6>How do the human assemblies displayed in the UCSC Genome Browser differ from the NCBI human 
 assemblies?</h6>
 <p> 
 Human assemblies displayed in the Genome Browser (hg10 and higher) are near identical to the 
 NCBI assemblies when it comes to primary sequence. Minor differences may be present, however.
 Sources include:</p>
 <ul>
   <li>NCBI genomes are repeat masked with RepeatMasker, however, UCSC genomes are independently 
       masked with both RepeatMasker (with different flags) and WindowMasker, ultimately using 
       the program output with the highest percentage masked for the base sequence</li>
   <li>In genome download files, UCSC uses the 'chr1' nomenclature for sequence identifiers, 
       whereas the primary NCBI sequence identifiers are RefSeq accessions</li>
   <li>The mitochondrion for hg19 differs from the one in NCBI (GRCh37)</li>