bb98a72134bf91feda97ed3cb163a7b6880f2eeb lrnassar Fri Jul 5 10:41:19 2019 -0700 Adding versioning files for transMap track #23729 diff --git src/hg/makeDb/trackDb/transMapTailerV5.html src/hg/makeDb/trackDb/transMapTailerV5.html new file mode 100644 index 0000000..2d96382 --- /dev/null +++ src/hg/makeDb/trackDb/transMapTailerV5.html @@ -0,0 +1,139 @@ + +

Display Conventions and Configuration

+ +

+This track follows the display conventions for +PSL alignment tracks.

+

+This track may also be configured to display codon coloring, a feature that +allows the user to quickly compare cDNAs against the genomic sequence. For more +information about this option, click +here. +Several types of alignment gap may also be colored; +for more information, click +here. + +

Methods

+ +

+

    +
  1. Source transcript alignments were obtained from vertebrate organisms + in the UCSC Genome Browser Database. BLAT alignments of RefSeq Genes, GenBank + mRNAs, and GenBank Spliced ESTs to the cognate genome, along with UCSC Genes, + were used as available. +
  2. For all vertebrate assemblies that had BLASTZ alignment chains and + nets to the $organism ($db) genome, a subset of the alignment chains were + selected as follows: + +
  3. The pslMap program was used to do a base-level projection of + the source transcript alignments via the selected chains + to the $organism genome, resulting in pairwise alignments of the source transcripts to + the genome. +
  4. The resulting alignments were filtered with pslCDnaFilter + with a global near-best criterion of 0.5% in finished genomes + (human and mouse) and 1.0% in other genomes. Alignments + in which less than 20% of the transcript mapped were discarded. +
+
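The filtering in step 4 can be illustrated with a rough Python sketch. This is not the pslCDnaFilter implementation: the `MappedAlignment` record, its `score` field, and the order in which the two filters run are assumptions made for illustration only. The sketch keeps alignments that cover at least 20% of the transcript and whose score lies within the near-best fraction of the best score for the same transcript.

```python
from dataclasses import dataclass


@dataclass
class MappedAlignment:
    """Hypothetical, simplified stand-in for one mapped PSL record."""
    name: str            # source transcript identifier
    query_size: int      # transcript length in bases (assumed > 0)
    aligned_bases: int   # transcript bases covered by the alignment
    score: float         # hypothetical alignment score, higher is better


def filter_alignments(alignments, near_best=0.005, min_cover=0.20):
    """Apply a coverage filter, then a global near-best filter.

    near_best=0.005 corresponds to the 0.5% figure quoted for finished
    genomes; pass 0.01 for the 1.0% figure quoted for other genomes.
    """
    # Discard alignments where less than min_cover of the transcript mapped.
    covered = [
        a for a in alignments
        if a.aligned_bases / a.query_size >= min_cover
    ]

    # Find the best score seen for each transcript.
    best = {}
    for a in covered:
        best[a.name] = max(best.get(a.name, a.score), a.score)

    # Keep alignments scoring within the near-best fraction of that best score.
    return [a for a in covered if a.score >= best[a.name] * (1.0 - near_best)]
```

With `near_best=0.005`, an alignment scoring 898 is kept alongside a best alignment scoring 900 for the same transcript, while one scoring 500 is dropped.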

+ +

+To ensure unique identifiers for each alignment, cDNA and gene accessions were +made unique by appending a suffix for each location in the source genome and +again for each mapped location in the destination genome. The format is: +

+   accession.version-srcUniq.destUniq
+
+ +Here, srcUniq is a number added to make each source alignment unique, and +destUniq is a number added to give each subsequent TransMap alignment a unique +identifier. +

+

+For example, in the cow genome, there are two alignments of mRNA BC149621.1. +These are assigned the identifiers BC149621.1-1 and BC149621.1-2. +When these are mapped to the human genome, BC149621.1-1 maps to a single +location and is given the identifier BC149621.1-1.1. However, BC149621.1-2 +maps to two locations, resulting in BC149621.1-2.1 and BC149621.1-2.2. Note +that multiple TransMap mappings are usually the result of tandem duplications, where both +chains are identified as syntenic. +
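For scripts that need to split these identifiers back into their parts, a minimal Python sketch follows. The function name `parse_transmap_id`, the `TransMapId` tuple, and the regular expression are illustrative assumptions, not part of any UCSC tool. The sketch assumes every accession carries a version and treats the `.destUniq` suffix as optional, so it also accepts source-level identifiers such as BC149621.1-1.

```python
import re
from typing import NamedTuple, Optional


class TransMapId(NamedTuple):
    """Hypothetical container for the parts of a TransMap identifier."""
    accession: str
    version: str
    src_uniq: int
    dest_uniq: Optional[int]  # None for source-only IDs such as BC149621.1-1


# accession.version-srcUniq[.destUniq]; assumes the accession contains no dot.
_ID_RE = re.compile(
    r"^(?P<acc>[^.\s]+)\.(?P<ver>\d+)-(?P<src>\d+)(?:\.(?P<dest>\d+))?$"
)


def parse_transmap_id(identifier: str) -> TransMapId:
    """Split an identifier of the form accession.version-srcUniq.destUniq."""
    match = _ID_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a TransMap-style identifier: {identifier!r}")
    dest = match.group("dest")
    return TransMapId(
        accession=match.group("acc"),
        version=match.group("ver"),
        src_uniq=int(match.group("src")),
        dest_uniq=int(dest) if dest is not None else None,
    )
```

Grouping parsed identifiers by `(accession, version, src_uniq)` collects all destination mappings that derive from one source alignment, such as BC149621.1-2.1 and BC149621.1-2.2.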

+ +

Data Access

+ +

+The raw data for these tracks can be accessed interactively through the +Table Browser or the +Data Integrator. +For automated analysis, the annotations are stored in +bigPsl files (containing a +number of extra columns) and can be downloaded from our +download server, +or queried using our API. For more +information on accessing track data see our +Track Data Access FAQ. +The files are associated with these tracks in the following way: +

+Individual regions or the whole genome annotation can be obtained using our tool +bigBedToBed which can be compiled from the source code or downloaded as +a precompiled binary for your system. Instructions for downloading source code and +binaries can be found +here. +The tool can also be used to obtain only features within a given range, for example: +

+bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/$db/transMap/V5/$db.refseq.transMapV5.bigPsl +-chrom=chr6 -start=0 -end=1000000 stdout + + +
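For repeated queries, the same command can be assembled from a script. The Python sketch below only builds the argument list shown above, substituting an assembly name for $db. The function names are hypothetical, and `hg38` in the usage note is an assumed example assembly. Running the command requires the bigBedToBed binary to be on the PATH.

```python
import subprocess


def build_bigbedtobed_cmd(db, chrom, start, end, out="stdout"):
    """Build the bigBedToBed argument list for the RefSeq TransMap V5 file."""
    url = (
        f"http://hgdownload.soe.ucsc.edu/gbdb/{db}/transMap/V5/"
        f"{db}.refseq.transMapV5.bigPsl"
    )
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        out,
    ]


def run_query(db, chrom, start, end):
    """Run bigBedToBed and return its output as text (needs the binary)."""
    cmd = build_bigbedtobed_cmd(db, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run_query("hg38", "chr6", 0, 1000000)` would issue the same request as the command above for the assumed assembly hg38.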

Credits

+ +

+This track was produced by Mark Diekhans at UCSC from cDNA and EST sequence data +submitted to the international public sequence databases by +scientists worldwide and annotations produced by the RefSeq, +Ensembl, and GENCODE annotation projects.

+ +

References

+

+Siepel A, Diekhans M, Brejová B, Langton L, Stevens M, Comstock CL, Davis C, Ewing B, Oommen S, +Lau C et al. + +Targeted discovery of novel human exons by comparative genomics. +Genome Res. 2007 Dec;17(12):1763-73. +PMID: 17989246; PMC: PMC2099585 +

+ +

+Stanke M, Diekhans M, Baertsch R, Haussler D. + +Using native and syntenically mapped cDNA alignments to improve de novo gene finding. +Bioinformatics. 2008 Mar 1;24(5):637-44. +PMID: 18218656 +

+ +

+Zhu J, Sanborn JZ, Diekhans M, Lowe CB, Pringle TH, Haussler D. + +Comparative genomics search for losses of long-established genes on the human lineage. +PLoS Comput Biol. 2007 Dec;3(12):e247. +PMID: 18085818; PMC: PMC2134963 +