src/hg/makeDb/trackDb/wgEncodeNhgriBip.html 1.5

1.5 2009/07/17 22:26:25 tdreszer
Added data release policy
Index: src/hg/makeDb/trackDb/wgEncodeNhgriBip.html
===================================================================
RCS file: /projects/compbio/cvsroot/kent/src/hg/makeDb/trackDb/wgEncodeNhgriBip.html,v
retrieving revision 1.4
retrieving revision 1.5
diff -b -B -U 1000000 -r1.4 -r1.5
--- src/hg/makeDb/trackDb/wgEncodeNhgriBip.html	7 Jul 2009 18:23:26 -0000	1.4
+++ src/hg/makeDb/trackDb/wgEncodeNhgriBip.html	17 Jul 2009 22:26:25 -0000	1.5
@@ -1,124 +1,133 @@
 <h2> Description </h2>
 <p>Bidirectional promoters are the regulatory regions that fall between
 pairs of genes, where the 5' ends of the genes within a pair are positioned
 in close proximity to one another. This spacing facilitates the initiation
 of transcription of both genes, creating two transcription forks that advance
 in opposite directions. The formal definition of a bidirectional promoter
 requires that the transcription initiation sites are separated by no more than
 1,000 bp from one another. Using these criteria we have comprehensively
 annotated the human and mouse genomes for the presence of bidirectional
 promoters, using in silico approaches. The identification of these promoters
 is contingent upon the presence of adjacent, oppositely oriented pairs of
 genes, because few distinguishing features are available to uniquely identify
 bidirectional promoters de novo. Genomic annotations used for our
 identification phase include:
 <ul>
 <li><em>A</em>) <a href="hgTrackUi?g=knownGene">UCSC known genes</a>
 annotations (items with score=800).
 <li><em>B</em>) <a href="hgTrackUi?g=mrna">GenBank mRNA</a> annotations
 (score=600).
 <li><em>C</em>) <a href="hgTrackUi?g=intronEst">spliced ESTs</a> (score=400).
 </ul>
 The annotations for protein coding genes (<em>A</em>) are strongly supported
 and therefore provide a high quality dataset for mapping bidirectional
 promoters. In contrast, bidirectional promoters supported by spliced ESTs
 (<em>C</em>) alone have varying levels of evidence, ranging from one
 characterized transcript to hundreds of them. For this reason, the mRNA
 annotation (<em>B</em>) from GenBank provides a stringent level of validation
 for the start sites of the EST transcripts. As a large class of regulatory
 sequences, bidirectional promoters exemplify a rich source of unexplored
 biological information in the human genome. When compared to the mouse genome,
 these promoters are identifiable as truly orthologous locations, being
 maintained in regions of conserved synteny (including both genes and the
 intervening promoter region) that have undergone no rearrangements since the
 last common ancestor of mammals, and in some cases fish. We use this approach
 to annotate orthologous bidirectional promoters in nonhuman species until
 genomic annotations become available.
 </p>
 
 <h2> Methods </h2>
 
 <h3> Assigning Orthologous Regions </h3>
 <p>A multi-stage approach to mapping orthology at bidirectional promoters was
 developed. Orthology assignments are strongest in coding regions. Therefore we
 began by mapping single human genes regulated by bidirectional promoters from
 the Known Genes annotations onto the mouse genome. Orthology assignments were
 determined using the "chains and nets" data from the UCSC Human Genome Browser
 MySQL tables. Chains in the Genome Browser represent sequences of gapless
 aligned blocks. Nets provide a hierarchical ordering of those chains. Level 1
 chains contain the longest, best-scoring sequence chains that span any
 selected region. Subsequent levels in the net represent the results of
 rearrangements, duplications, insertions and deletions that may have disrupted
 the presence of conserved synteny derived from an ancestral sequence.
 </p>
 
 <h3> Confirming Orthologous Genes </h3>
 <p>After determining the orthology assignments using the UCSC chains and nets
 data, we used the Known Gene annotations or spliced ESTs to determine the identity
 of genes within the corresponding region. Known Genes represent protein-coding
 genes and therefore can be verified by chains and nets alignments, followed by
 confirmation of protein identity in both species. Spliced ESTs carry less
 descriptive information than protein coding genes and therefore were validated
 in the second species by their presence in an orthologous region, showing
 conserved synteny of the two genes within a pair, and meeting the criterion of
 less than 1,000 bp of intergenic distance between those transcripts. Our method
 for mapping bidirectional promoters in spliced EST datasets is described in
 more detail in a previous publication. If the program verified evidence for
 orthology and conserved-syntenic gene arrangement, then the orthologous
 bidirectional promoter was confirmed. After orthologous assignments were
 confirmed for pairs of human genes, the reciprocal assignments were analyzed
 from mouse to human.
 
 Currently, orthologous bidirectional promoter regions (those identified
 using UCSC known genes) have been mapped in the human, chimp, macaque, mouse,
 rat, dog and cow genomes.
 </p>
 
 <h2> Credits </h2>
 
 <p>These data were produced by Mary Q. Yang in the
 <a href="http://www.genome.gov/12514761" title="http://www.genome.gov/12514761"
  rel="nofollow" TARGET=_BLANK>Elnitski lab</a> at NHGRI, NIH. (contact:
  <A HREF="mailto:&#101;&#108;&#110;&#105;&#116;&#115;ki&#64;&#109;&#97;&#105;l.n&#105;&#104;.g&#111;&#118;">
  &#101;&#108;&#110;&#105;&#116;&#115;ki&#64;&#109;&#97;&#105;l.n&#105;&#104;.g&#111;&#118;</A>)
  <!-- above address is elnitski at mail.nih.gov -->
 </p>
 
 <h2> References </h2>
 <p>Yang MQ, Elnitski L.
  <a href="http://www.springerlink.com/content/q86486k52kr84j06/"
  title="http://www.springerlink.com/content/q86486k52kr84j06/"
  rel="nofollow" TARGET=_BLANK>
  A computational study of bidirectional promoters in the human genome</a>.
 <i>Springer Lecture Series: Notes in Bioinformatics</i> 2007.
 
 </p><p>Yang MQ, Elnitski L. Orthology of Bidirectional Promoters Enables Use of
  a Multiple Class Predictor for Discriminating Functional Elements in the
  Human Genome.
  <a href="http://www.world-academy-of-science.org/worldcomp07/ws/BIOCOMP07"
  title="http://www.world-academy-of-science.org/worldcomp07/ws/BIOCOMP07"
  rel="nofollow" TARGET=_BLANK>
  Proceedings of the 2007 International Conference on Bioinformatics &amp;
  Computational Biology.</a> pp. 218-228. 2007. ISBN: 1-60132-042-6.
 </p><p>Yang MQ, Koehly L, Elnitski L.
  <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0030072"
  title="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0030072"
  rel="nofollow" TARGET=_BLANK>
  Comprehensive annotation of human bidirectional promoters identifies
  co-regulatory relationships among somatic breast and ovarian cancer
  genes</a>. <i>PLOS Computational Biology</i> 2007.
 </p><p>Yang MQ, Taylor J, Elnitski L.
  <a href="http://www.biomedcentral.com/1471-2105/9/S6/S9"
  title="http://www.biomedcentral.com/1471-2105/9/S6/S9"
  rel="nofollow" TARGET=_BLANK>
  Comparative analyses of bidirectional promoters in vertebrates</a>.
  <i>BMC Bioinformatics</i> May 2008.
 </p><p>Piontkivska H, Yang MQ, Larkin DM, Lewin HA, Reecy J, Elnitski L.
 <a href="http://www.biomedcentral.com/1471-2164/10/189"
  title="http://www.biomedcentral.com/1471-2164/10/189"
  rel="nofollow" TARGET=_BLANK>
  Cross-species mapping of bidirectional promoters enables prediction of
  unannotated 5' UTRs and identification of species-specific transcripts</a>.
  <i>BMC Genomics</i>. 2009 Apr 24;10:189. PMID: 19393065.
 </p>
 
+<H2> Data Release Policy </H2>
+
+<P>Data users may freely use ENCODE data, but may not, without prior
+consent, submit publications that use an unpublished ENCODE dataset until
+nine months following the release of the dataset.  This date is listed in
+the table <EM>metadata</EM> as <EM>dateUnrestricted</EM> and on the
+download page.  The full data release policy for ENCODE is available
+<A HREF="../ENCODE/terms.html" TARGET=_BLANK>here</A>.</P>
+