22f0c874d90f86f1bfa4014ee29f9c0ee0bff517 markd Tue Jun 25 23:17:29 2019 -0700 load transMap V5 diff --git src/hg/makeDb/trackDb/transMapAlnSplicedEst.html src/hg/makeDb/trackDb/transMapAlnSplicedEst.html deleted file mode 100644 index 8373a8e..0000000 --- src/hg/makeDb/trackDb/transMapAlnSplicedEst.html +++ /dev/null @@ -1,133 +0,0 @@ -<H2>Description</H2> - -<P> -This track contains GenBank spliced EST alignments produced by -the <em>TransMap</em> cross-species alignment algorithm -from other vertebrate species in the UCSC Genome Browser. -For closer evolutionary distances, the alignments are created using -syntenically filtered BLASTZ alignment chains, resulting in a prediction of the -orthologous genes in $organism. -</P> - -<!-- everything below here common to all transMap*.html pages --> - -<em>TransMap</em> maps genes and related annotations in one species to another -using synteny-filtered pairwise genome alignments (chains and nets) to -determine the most likely orthologs. For example, for the mRNA TransMap track -on the human assembly, more than 400,000 mRNAs from 23 vertebrate species were -aligned at high stringency to the native assembly using BLAT. The alignments -were then mapped to the human assembly using the chain and net alignments -produced using blastz, which has higher sensitivity than BLAT for diverged -organisms. -<P> -Compared to translated BLAT, TransMap finds fewer paralogs and aligns more UTR -bases. For closely related low-coverage assemblies, a reciprocal-best -relationship is used in the chains and nets to improve the synteny prediction. -<P> - -<H2>Display Conventions and Configuration</H2> - -<P> -This track follows the display conventions for -<A HREF="../goldenPath/help/hgTracksHelp.html#PSLDisplay" -TARGET=_blank>PSL alignment tracks</A>. </P> -<P> -This track may also be configured to display codon coloring, a feature that -allows the user to quickly compare cDNAs against the genomic sequence. 
For more -information about this option, click -<A HREF="../goldenPath/help/hgCodonColoringMrna.html" TARGET=_blank>here</A>. -Several types of alignment gap may also be colored; -for more information, click -<A HREF="../goldenPath/help/hgIndelDisplay.html" TARGET=_blank>here</A>. - -<H2>Methods</H2> - -<P> - <ol> - <li> Source transcript alignments were obtained from vertebrate organisms - in the UCSC Genome Browser Database. BLAT alignments of RefSeq Genes, GenBank - mRNAs, and GenBank Spliced ESTs to the cognate genome, along with UCSC Genes, - were used as available. - <li> For all vertebrate assemblies that had BLASTZ alignment chains and - nets to the $organism ($db) genome, a subset of the alignment chains were - selected as follows: - <ul> - <li> For organisms whose branch distance was no more than 0.5 - (as computed by <tt>phyloFit</tt>, see Conservation track description for details), - syntenic filtering was used. Reciprocal best nets were used if available; - otherwise, nets were selected with the <tt>netfilter -syn</tt> command. - The chains corresponding to the selected nets were used for mapping. - <li> For more distant species, where the determination of synteny is difficult, - the full set of chains was used for mapping. This allows for more genes to - map at the expense of some mapping to paralogus regions. The - post-alignment filtering step removes some of the duplications. - </ul> - <li> The <tt>pslMap</tt> program was used to do a base-level projection of - the source transcript alignments via the selected chains - to the $organism genome, resulting in pairwise alignments of the source transcripts to - the genome. - <li> The resulting alignments were filtered with <tt>pslCDnaFilter</tt> - with a global near-best criteria of 0.5% in finished genomes - (human and mouse) and 1.0% in other genomes. Alignments - where less than 20% of the transcript mapped were discarded. 
- </ol> -</P> - -<P> -To ensure unique identifiers for each alignment, cDNA and gene accessions were -made unique by appending a suffix for each location in the source genome and -again for each mapped location in the destination genome. The format is: -<pre> - accession.version-srcUniq.destUniq -</pre> - -Where <tt>srcUniq</tt> is a number added to make each source alignment unique, and -<tt>destUniq</tt> is added to give the subsequent TransMap alignments unique -identifiers. -</P> -<P> -For example, in the cow genome, there are two alignments of mRNA <tt>BC149621.1</tt>. -These are assigned the identifiers <tt>BC149621.1-1</tt> and <tt>BC149621.1-2</tt>. -When these are mapped to the human genome, <tt>BC149621.1-1</tt> maps to a single -location and is given the identifier <tt>BC149621.1-1.1</tt>. However, <tt>BC149621.1-2</tt> -maps to two locations, resulting in <tt>BC149621.1-2.1</tt> and <tt>BC149621.1-2.2</tt>. Note -that multiple TransMap mappings are usually the result of tandem duplications, where both -chains are identified as syntenic. -</P> - -<H2>Credits</H2> - -<P> -This track was produced by Mark Diekhans at UCSC from cDNA sequence data -submitted to the international public sequence databases by -scientists worldwide.</P> - -<H2>References</H2> -<p> -Siepel A, Diekhans M, Brejová B, Langton L, Stevens M, Comstock CL, Davis C, Ewing B, Oommen S, -Lau C <em>et al</em>. -<a href="https://genome.cshlp.org/content/17/12/1763.long" target="_blank"> -Targeted discovery of novel human exons by comparative genomics</a>. -<em>Genome Res</em>. 2007 Dec;17(12):1763-73. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/17989246" target="_blank">17989246</a>; PMC: <a -href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2099585/" target="_blank">PMC2099585</a> -</p> - -<p> -Stanke M, Diekhans M, Baertsch R, Haussler D. 
-<a href="https://academic.oup.com/bioinformatics/article/24/5/637/202844/Using-native-and- -syntenically-mapped-cDNA" target="_blank"> -Using native and syntenically mapped cDNA alignments to improve de novo gene finding</a>. -<em>Bioinformatics</em>. 2008 Mar 1;24(5):637-44. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/18218656" target="_blank">18218656</a> -</p> - -<p> -Zhu J, Sanborn JZ, Diekhans M, Lowe CB, Pringle TH, Haussler D. -<a href="http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0030247" -target="_blank"> -Comparative genomics search for losses of long-established genes on the human lineage</a>. -<em>PLoS Comput Biol</em>. 2007 Dec;3(12):e247. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/18085818" target="_blank">18085818</a>; PMC: <a -href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2134963/" target="_blank">PMC2134963</a> -</p>
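
Note on the identifier scheme described in the removed page: the `accession.version-srcUniq.destUniq` format can be split back into its parts with a small parser. The following is a minimal sketch, not code from the UCSC source tree. The names `TransMapId`, `parse_transmap_id`, and `source_alignment_id` are hypothetical, and the regular expression assumes that the accession itself contains no `.` and that `version`, `srcUniq`, and `destUniq` are all decimal integers, which matches the `BC149621.1-2.1` example but is not stated as a general rule in the page.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: accession.version-srcUniq[.destUniq]
# destUniq is optional so that source-only ids such as "BC149621.1-1" also parse.
_ID_RE = re.compile(
    r"^(?P<accession>[^.\s]+)"
    r"\.(?P<version>\d+)"
    r"-(?P<src_uniq>\d+)"
    r"(?:\.(?P<dest_uniq>\d+))?$"
)


class TransMapId(NamedTuple):
    """Hypothetical container for the parts of a TransMap alignment id."""
    accession: str
    version: int
    src_uniq: int
    dest_uniq: Optional[int]  # None for an id that has not been mapped yet


def parse_transmap_id(identifier: str) -> TransMapId:
    """Split an id such as 'BC149621.1-2.1' into its components."""
    match = _ID_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a TransMap-style identifier: {identifier!r}")
    dest = match.group("dest_uniq")
    return TransMapId(
        accession=match.group("accession"),
        version=int(match.group("version")),
        src_uniq=int(match.group("src_uniq")),
        dest_uniq=int(dest) if dest is not None else None,
    )


def source_alignment_id(identifier: str) -> str:
    """Return the id of the source alignment that a mapped id came from."""
    parts = parse_transmap_id(identifier)
    return f"{parts.accession}.{parts.version}-{parts.src_uniq}"
```

With the cow-to-human example from the page, `source_alignment_id("BC149621.1-2.1")` and `source_alignment_id("BC149621.1-2.2")` both give `BC149621.1-2`, which groups the two human mappings under the same cow alignment.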