8c2f7318d8d821de9b2a25750586a94ab5e8c1bb lrnassar Fri Nov 15 18:50:19 2024 -0800 Giving the UI link cronjob some love by fixing all the 301 redirects. These are the bulk of the items listed on the cron. No RM. diff --git src/hg/makeDb/trackDb/human/encodeGencodeGeneOct.html src/hg/makeDb/trackDb/human/encodeGencodeGeneOct.html index 9bede4a..2304e3a 100755 --- src/hg/makeDb/trackDb/human/encodeGencodeGeneOct.html +++ src/hg/makeDb/trackDb/human/encodeGencodeGeneOct.html @@ -1,178 +1,178 @@ <H2>Description</H2> <P> The Gencode Gene track shows high-quality manual annotations in the ENCODE regions generated by the <A HREF="http://genome.imim.es/gencode/" TARGET=_blank>GENCODE project</A>. A companion track, Gencode Introns, shows experimental gene structure validations for these annotations.</P> <P> The gene annotations are colored based on the Havana annotation type. Known and validated transcripts are colored <em>dark green</em>, putative and unconfirmed are <em>light green</em>, pseudogenes are <em>blue</em>, and artifacts are <em>grey</em>. The transcript types are defined in more detail in the accompanying table. <P> The Gencode project recommends that the annotations with known and validated transcripts; i.e., the types <em>Known</em>, <em>Novel_CDS</em>, <em>Novel_transcript_gencode_conf</em>, and <em>Putative_gencode_conf</em> (which are colored dark green in the track display) be used as the reference annotation. 
<P> <TABLE BORDER=1 BORDERCOLOR="#aaaaaa" CELLPADDING=4> <TR> <TH align=left>Type</TH> <TH align=left>Color</TH> <TH align=left>Description</TH> </TR> <TR> <TD>Known</TD> <TD><FONT COLOR=#21i5B33>dark green</FONT></TD> <TD>Known protein coding genes (referenced in Entrez Gene, NCBI)</TD> </TR> <TR> <TD>Novel_CDS</TD> <TD><FONT COLOR=#21i5B33>dark green</FONT></TD> <TD>Novel protein coding genes annotated by Havana (not referenced in Entrez Gene, NCBI)</TD> </TR> <TR> <TD>Novel_transcript_gencode_conf</TD> <TD><FONT COLOR=#21i5B33>dark green</FONT></TD> <TD> Novel transcripts annotated by Havana (no ORF assigned) with at least one junction validated by RT-PCR</TD> </TR> <TR> <TD>Putative_gencode_conf</TD> <TD><FONT COLOR=#21i5B33>dark green</FONT></TD> <TD>Putative transcripts (similar to "novel transcripts", EST supported, short, no viable ORF) with at least one junction validated by RT-PCR</TD> </TR> <TR> <TD>Novel_transcript</TD> <TD><FONT COLOR=#54BC00>light green</FONT></TD> <TD>Novel transcripts annotated by Havana (no ORF assigned) not validated by RT-PCR</TD> </TR> <TR> <TD>Putative</TD> <TD><FONT COLOR=#54BC00>light green</FONT></TD> <TD>Putative transcripts (similar to "novel transcripts", EST supported, short, no viable ORF) not validated by RT-PCR</TD> </TR> <TR> <TD>TEC</TD> <TD><FONT COLOR=#54BC00>light green</FONT></TD> <TD>Single exon objects (supported by multiple ESTs with polyA sites and signals) undergoing experimental validation/extension. 
</TD> </TR> <TR> <TD>Processed_pseudogene</TD> <TD><FONT COLOR=005BBF> blue</FONT></TD> <TD>Pseudogenes arising via retrotransposition (exon structure of parent gene lost) </TD> </TR> <TR> <TD>Unprocessed_pseudogene</TD> <TD><FONT COLOR=005BBF> blue</FONT></TD> <TD>Pseudogenes arising via gene duplication (exon structure of parent gene retained)</TD> </TR> <TR> <TD>Artifact</TD> <TD><FONT COLOR=#636863>grey</FONT></TD> <TD>Transcript evidence and/or its translation equivocal</TD> </TR> </TABLE></P> <H2>Methods</H2> <P> The Human and Vertebrate Analysis and Annotation manual curation process -(<A HREF="http://www.sanger.ac.uk/HGP/havana/" TARGET=_blank>HAVANA</A>) was +(<A HREF="https://www.sanger.ac.uk/HGP/havana/" TARGET=_blank>HAVANA</A>) was used to produce these annotations. <P> Finished genomic sequence was analyzed on a clone-by-clone basis using a combination of similarity searches against DNA and protein databases, as well as a series of <em>ab initio</em> gene predictions. Nucleotide sequence databases were searched with WUBLASTN and significant hits were realigned to the unmasked genomic sequence by EST2GENOME. WUBLASTX was used to search the Uniprot protein database, and the accession numbers of significant hits were retrieved from the Pfam database. Hidden Markov models for Pfam protein domains were aligned against the genomic sequence using Genewise to provide annotation of protein domains. <P> A number of <em>ab initio</em> prediction algorithms were also run: Genscan and Fgenesh for genes, tRNAscan to find tRNA genes, and Eponine TSS for transcription start site predictions. <P> The annotators used the (AceDB-based) Otterlace interface to create and edit gene objects, which were then stored in a local database named <em>Otter</em>. In cases where predicted transcript structures from Ensembl are available, these can be viewed from within the Otterlace interface and may be used as starting templates for gene curation. 
Annotation in the Otter database is submitted to the EMBL/GenBank/DDBJ nucleotide database.</P> <H2>Verification</H2> <P> The gene objects selected for verification came from various computational prediction methods and HAVANA annotations. <P>RT-PCR and RACE experiments were performed on them, using a variety of human tissues, to confirm their structure. Human cDNAs from 24 different tissues (brain, heart, kidney, spleen, liver, colon, small intestine, muscle, lung, stomach, testis, placenta, skin, peripheral blood leucocytes, bone marrow, fetal brain, fetal liver, fetal kidney, fetal heart, fetal lung, thymus, pancreas, mammary gland, prostate) were synthesized using 12 poly(A)+ RNAs from Origene, eight from Clemente Associates/Quantum Magnetics and four from BD Biosciences as described in [Reymond <em>et al.</em>, 2002a,b]. The relative amount of each cDNA was normalized by quantitative PCR using SYBR Green as the intercalator and an ABI Prism 7700 Sequence Detection System.</P> <P> Predictions of human gene junctions were assayed experimentally by RT-PCR as previously described and modified [Reymond, 2002b; Mouse Genome Sequencing Consortium, 2002; Guigo, 2003]. <P> Similar amounts of <em>Homo sapiens</em> cDNAs were mixed with JumpStart REDTaq ReadyMix (Sigma) and four ng/ul primers (Sigma-Genosys) with a BioMek 2000 robot (Beckman). The first ten cycles of PCR amplification were performed with touchdown annealing temperatures decreasing from 60 to 50°C; the next 30 cycles were carried out with an annealing temperature of 50°C. Amplimers were separated on "Ready to Run" precast gels (Pharmacia) and sequenced. RACE experiments were performed with the BD SMART RACE cDNA Amplification Kit following the manufacturer's instructions (BD Biosciences).</P> <H2>Credits</H2> <P> Click <A HREF="http://genome.imim.es/gencode/participants.html" TARGET=_blank>here</A> for a complete list of people who participated in the GENCODE project. </P> <H2>References</H2> <P> Ashurst, J.L. 
<em>et al</em>. <A HREF="http://nar.oupjournals.org/cgi/content/abstract/33/suppl_1/D459" TARGET=_blank>The Vertebrate Genome Annotation (Vega) database</A>. <em>Nucleic Acids Res</em> <B>33</B> (Database Issue), D459-65 (2005).</P> <P> Guigo, R. <em>et al</em>. <A HREF="http://www.pnas.org/cgi/content/abstract/100/3/1140" TARGET=_blank>Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes</A>. <em>Proc Natl Acad Sci U S A</em> <B>100</B>(3), 1140-5 (2003). </P> <P> Mouse Genome Sequencing Consortium. <A HREF="http://www.nature.com/nature/journal/v420/n6915/abs/nature01262_fs.html" TARGET=_blank>Initial sequencing and comparative analysis of the mouse genome</A>. <em>Nature</em> <B>420</B>(6915), 520-62 (2002).</P> <P> Reymond, A. <em>et al</em>. <A HREF="http://www.nature.com/nature/journal/v420/n6915/full/nature01178_fs.html" TARGET=_blank>Human chromosome 21 gene expression atlas in the mouse</A>. <em>Nature</em> <B>420</B>(6915), 582-6 (2002).</P> <P> Reymond, A. <em>et al</em>. <A HREF="http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WG1-4626YG7-F&_coverDate=06%2F30%2F2002&_alid=279986794&_rdoc=1&_fmt=&_orig=search&_qd=1&_cdi=6809&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=e60a075f4aceda0fde09effcbca4fe8d" TARGET=_blank>Nineteen additional unpredicted transcripts from human chromosome 21</A>. <em>Genomics</em> <B>79</B>(6), 824-32 (2002).</P>