src/hg/makeDb/trackDb/knownGene.html 1.18
1.18 2010/01/23 00:53:44 ann
Finally decided to take the plunge and edit the citations for the genbank
references.
Index: src/hg/makeDb/trackDb/knownGene.html
===================================================================
RCS file: /projects/compbio/cvsroot/kent/src/hg/makeDb/trackDb/knownGene.html,v
retrieving revision 1.17
retrieving revision 1.18
diff -b -B -U 1000000 -r1.17 -r1.18
--- src/hg/makeDb/trackDb/knownGene.html 3 Apr 2007 23:12:25 -0000 1.17
+++ src/hg/makeDb/trackDb/knownGene.html 23 Jan 2010 00:53:44 -0000 1.18
@@ -1,100 +1,100 @@
<H2>Description</H2>
<P>
The UCSC Known Genes track shows known protein-coding genes based on
protein data from SWISS-PROT, TrEMBL, and TrEMBL-NEW and their
corresponding mRNAs from
<A HREF="http://www.ncbi.nlm.nih.gov/Genbank/index.html"
TARGET=_blank>GenBank</A>.</P>
<H2>Display Conventions and Configuration</H2>
<P>
This track follows the display conventions for
<A HREF="../goldenPath/help/hgTracksHelp.html#GeneDisplay">gene prediction
tracks</A>. Black coloring indicates features that have corresponding entries
in the Protein Data Bank (PDB). Blue indicates features associated with
mRNAs from NCBI RefSeq or (dark blue) items having associated proteins in
the SWISS-PROT database. The variation in blue shading of RefSeq items
corresponds to the level of review the RefSeq record has undergone:
predicted (light), provisional (medium), or reviewed (dark). </P>
<P>
This track contains an optional codon coloring
feature that allows users to quickly validate and compare gene predictions.
To display codon colors, select the <em>genomic codons</em> option from the
<em>Color track by codons</em> pull-down menu. Click
<A HREF="../goldenPath/help/hgCodonColoring.html">here</A> for more
information about this feature. </P>
<H2>Methods</H2>
<P>
mRNA sequences were aligned against the $organism genome using BLAT. When a
single mRNA aligned in multiple places, only alignments having at least 98%
base identity with the genomic sequence were kept. This set of mRNA
alignments was further reduced by keeping only those mRNAs referenced by a
protein in SWISS-PROT, TrEMBL, or TrEMBL-NEW.</P>
<P>
Among multiple mRNAs referenced by a single protein, the best mRNA was
selected, based on a quality score derived from its length, the level of the
match between its translation and the protein sequence, and its release date.
The resulting mRNA and protein pairs were further filtered by removing
short invalid entries and consolidating entries with identical CDS regions.
</P>
<P>
Finally, RefSeq entries derived from DNA sequences instead of
mRNA sequences were added to produce the final data set shown in this track.
Disease annotations were obtained from SWISS-PROT.</P>
<H2>Credits</H2>
<P>
The Known Genes track was produced at UCSC based primarily on cross-references
between proteins from
<A HREF="http://www.expasy.org/sprot/" TARGET=_blank>SWISS-PROT</A>
(including TrEMBL and TrEMBL-NEW) and mRNAs from
<A HREF="http://www.ncbi.nlm.nih.gov/Genbank/index.html"
TARGET=_blank>GenBank</A>
contributed by scientists worldwide.
<A HREF="http://www.ncbi.nlm.nih.gov/RefSeq/" TARGET=_blank>NCBI RefSeq</A>
data were also included in this track.</P>
<H2>Data Use Restrictions</H2>
<P>
The UniProt data are subject to the following terms of use (copyright (c)
2002-2004 UniProt Consortium):</P>
<P>
For non-commercial use, all databases and documents in the UniProt FTP
directory may be copied and redistributed freely, without advance
permission, provided that this copyright statement is reproduced with
each copy.</P>
<P>
For commercial use, all databases and documents in the UniProt FTP
directory except the files
<UL>
<LI>ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot_sprot.dat.gz
<LI>ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot_sprot.xml.gz
</UL>
may be copied and redistributed freely, without advance permission,
provided that this copyright statement is reproduced with each copy.
More information for commercial users can be found
<A HREF="http://www.expasy.org/announce/sp_98.html" TARGET=_blank>here</A>.
<P>
From January 1, 2005, all databases and documents in the UniProt FTP
directory may be copied and redistributed freely by all entities,
without advance permission, provided that this copyright statement is
reproduced with each copy.</P>
<H2>References</H2>
<P>
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J,
Wheeler DL.
-<A HREF="http://nar.oupjournals.org/cgi/content/full/32/suppl_1/D23"
+<A HREF="http://nar.oupjournals.org/cgi/content/abstract/32/suppl_1/D23"
TARGET=_blank>GenBank: update</A>.
<em>Nucleic Acids Res.</em> 2004 Jan 1;32:D23-6.</P>
<P>
Hsu F, Kent WJ, Clawson H, Kuhn RM, Diekhans M, Haussler D.
<A HREF="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btl048?ijkey=qbTw0578H4cXuFw&keytype=ref"
TARGET=_blank>The UCSC Known Genes</A>.
<em>Bioinformatics</em>. 2006 May 1;22(9):1036-46.</P>
<P>
Kent WJ.
<A HREF="http://www.genome.org/cgi/content/abstract/12/4/656"
TARGET=_blank>BLAT - the BLAST-like alignment tool</A>.
<em>Genome Res.</em> 2002 Apr;12(4):656-64.</P>