src/hg/makeDb/trackDb/human/homoSapiensChainNet.html addf10d08062993a60b45e52df8ddcdba57a8778

addf10d08062993a60b45e52df8ddcdba57a8778
hiram
  Fri Apr 1 16:46:52 2022 -0700
documentation for the human chainNets refs #29189

diff --git src/hg/makeDb/trackDb/human/homoSapiensChainNet.html src/hg/makeDb/trackDb/human/homoSapiensChainNet.html
new file mode 100644
index 0000000..e2eb84f
--- /dev/null
+++ src/hg/makeDb/trackDb/human/homoSapiensChainNet.html
@@ -0,0 +1,194 @@
+<H2>Description</H2>
+<P>
+This track shows regions of the human genome that are alignable
+to other Homo spiens genomes (&quot;chain&quot; subtracks) or in synteny (&quot;net&quot; subtracks).
+The alignable parts are shown with thick blocks that look like exons. 
+Non-alignable parts between these are shown with thin lines like introns.
+More description on this display can be found below.
+</P>
+
+<H3>Chain Track</H3>
+<P>
+The chain track shows alignments of the human genome to other
+Homo sapiens genomes using a gap scoring system that allows longer gaps 
+than traditional affine gap scoring systems. It can also tolerate gaps in both
+source and target assemblies simultaneously. These 
+&quot;double-sided&quot; gaps can be caused by local inversions and 
+overlapping deletions in both species. 
+<P>
+The chain track displays boxes joined together by either single or
+double lines. The boxes represent aligning regions.
+Single lines indicate gaps that are largely due to a deletion in the
+$o_organism assembly or an insertion in the $organism 
+assembly.  Double lines represent more complex gaps that involve substantial
+sequence in both species. This may result from inversions, overlapping
+deletions, an abundance of local mutation, or an unsequenced gap in one
+species.  In cases where multiple chains align over a particular region of
+the $organism genome, the chains with single-lined gaps are often 
+due to processed pseudogenes, while chains with double-lined gaps are more 
+often due to paralogs and unprocessed pseudogenes.</P> 
+<P>
+In the &quot;pack&quot; and &quot;full&quot; display
+modes, the individual feature names indicate the chromosome, strand, and
+location (in thousands) of the match for each matching alignment.</P>
+
+<H3>Net Track</H3>
+<P>
+The net track shows the best Homo sapiens chain for 
+every part of this target human genome. It is useful for
+finding syntenic regions, possibly orthologs, and for studying genome
+rearrangement.
+</P>
+
+<H2>Display Conventions and Configuration</H2>
+<H3>Chain Track</H3>
+<P>By default, the chains to chromosome-based assemblies are colored
+based on which chromosome they map to in the aligning organism. To turn
+off the coloring, check the &quot;off&quot; button next to: Color
+track based on chromosome.</P>
+<P>
+To display only the chains of one chromosome in the aligning
+organism, enter the name of that chromosome (e.g. chr4) in box next to: 
+Filter by chromosome.</P>
+
+<H3>Net Track</H3>
+<P>
+In full display mode, the top-level (level 1)
+chains are the largest, highest-scoring chains that
+span this region.  In many cases gaps exist in the
+top-level chain.  When possible, these are filled in by
+other chains that are displayed at level 2.  The gaps in 
+level 2 chains may be filled by level 3 chains and so
+forth. </P>
+<P>
+In the graphical display, the boxes represent ungapped 
+alignments; the lines represent gaps.  Click
+on a box to view detailed information about the chain
+as a whole; click on a line to display information
+about the gap.  The detailed information is useful in determining
+the cause of the gap or, for lower level chains, the genomic
+rearrangement. </P> 
+<P> 
+Individual items in the display are categorized as one of four types
+(other than gap):</P>
+<P><UL>
+<LI><B>Top</B> - the best, longest match. Displayed on level 1.
+<LI><B>Syn</B> - line-ups on the same chromosome as the gap in the level above
+it.
+<LI><B>Inv</B> - a line-up on the same chromosome as the gap above it, but in 
+the opposite orientation.
+<LI><B>NonSyn</B> - a match to a chromosome different from the gap in the 
+level above.
+</UL></P>
+
+<H2>Methods</H2>
+<H3>Chain track</H3>
+<P>
+The target and query genomes were aligned with lastz.
+The resulting alignments were converted into axt format using the lavToAxt
+program. The axt alignments were fed into axtChain, which organizes all
+alignments between a single query chromosome and a single
+target chromosome into a group and creates a kd-tree out
+of the gapless subsections (blocks) of the alignments. A dynamic program
+was then run over the kd-trees to find the maximally scoring chains of these
+blocks.
+
+<pre>
+#       A     C     G     T
+# A    90  -330  -236  -356
+# C  -330   100  -318  -236
+# G  -236  -318   100  -330
+# T  -356  -236  -330    90
+</pre>
+
+Chains scoring below a minimum score of &quot;5,000&quot; were discarded;
+the remaining chains are displayed in this track.  The linear gap
+matrix used with axtChain:<BR>
+
+<pre>
+tableSize   11
+smallSize  111
+position  1   2   3   11  111 2111  12111 32111  72111 152111  252111
+qGap    350 425 450  600  900 2900  22900 57900 117900 217900  317900
+tGap    350 425 450  600  900 2900  22900 57900 117900 217900  317900
+bothGap 750 825 850 1000 1300 3300  23300 58300 118300 218300  318300
+</pre>
+
+</P>
+
+<H3>Net track</H3>
+<P>
+Chains were derived from lastz alignments, using the methods
+described on the chain tracks description pages, and sorted with the 
+highest-scoring chains in the genome ranked first. The program
+chainNet was then used to place the chains one at a time, trimming them as 
+necessary to fit into sections not already covered by a higher-scoring chain. 
+During this process, a natural hierarchy emerged in which a chain that filled 
+a gap in a higher-scoring chain was placed underneath that chain. The program 
+netSyntenic was used to fill in information about the relationship between 
+higher- and lower-level chains, such as whether a lower-level
+chain was syntenic or inverted relative to the higher-level chain. 
+The program netClass was then used to fill in how much of the gaps and chains 
+contained <em>N</em>s (sequencing gaps) in one or both species and how much
+was filled with transposons inserted before and after the two organisms 
+diverged.</P>
+
+<H2>Credits</H2>
+<P>
+Lastz (previously known as blastz) was developed at
+<A HREF="http://www.bx.psu.edu/miller_lab/" 
+TARGET=_blank>Pennsylvania State University</A> by 
+Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from
+Ross Hardison.</P>
+<P>
+Lineage-specific repeats were identified by Arian Smit and his 
+<A HREF="http://www.repeatmasker.org" TARGET=_blank>RepeatMasker</A>
+program.</P>
+<P>
+The axtChain program was developed at the University of California at 
+Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.</P>
+<P>
+The browser display and database storage of the chains and nets were created
+by Robert Baertsch and Jim Kent.</P>
+<P>
+The chainNet, netSyntenic, and netClass programs were
+developed at the University of California
+Santa Cruz by Jim Kent.</P>
+<P>
+
+<h2>References</h2>
+
+<p>
+Harris, R.S.
+<a href="http://www.bx.psu.edu/~rsharris/lastz/"
+target=_blank>(2007) Improved pairwise alignment of genomic DNA</a>
+Ph.D. Thesis, The Pennsylvania State University
+</p>
+
+<p>
+Chiaromonte F, Yap VB, Miller W.
+<A HREF="http://psb.stanford.edu/psb-online/proceedings/psb02/chiaromonte.pdf"
+TARGET=_blank>Scoring pairwise genomic sequence alignments</A>.
+<em>Pac Symp Biocomput</em>. 2002:115-26.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/11928468" target="_blank">11928468</a>
+</p>
+
+<P>
+Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.
+<A HREF="https://www.pnas.org/content/100/20/11484"
+TARGET=_blank>Evolution's cauldron:
+duplication, deletion, and rearrangement in the mouse and human genomes</A>.
+<em>Proc Natl Acad Sci U S A</em>. 2003 Sep 30;100(20):11484-9.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/14500911" target="_blank">14500911</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC208784/" target="_blank">PMC208784</a>
+</p>
+
+<P>
+Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,
+Haussler D, Miller W.
+<A HREF="https://genome.cshlp.org/content/13/1/103.abstract"
+TARGET=_blank>Human-mouse alignments with BLASTZ</A>.
+<em>Genome Res</em>. 2003 Jan;13(1):103-7.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/12529312" target="_blank">12529312</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC430961/" target="_blank">PMC430961</a>
+</p>