9f975b6fc35649e71cfbf7c63bb8cd06c2244598 max Tue May 2 02:59:23 2023 -0700 adding more details to refseq docs page, and preparing the removal of the hg19-specific refseq docs page, refs #31118 diff --git src/hg/makeDb/trackDb/refSeqComposite.html src/hg/makeDb/trackDb/refSeqComposite.html index 80b8859..e115de7 100644 --- src/hg/makeDb/trackDb/refSeqComposite.html +++ src/hg/makeDb/trackDb/refSeqComposite.html @@ -22,50 +22,73 @@ This track is a composite track that contains differing data sets. To show only a selected set of subtracks, uncheck the boxes next to the tracks that you wish to hide. <b>Note:</b> Not all subtracts are available on all assemblies. </p> The possible subtracks include: <dl> <dt><em><strong>RefSeq aligned annotations and UCSC alignment of RefSeq annotations </strong></em></dt> <ul> <li> <em>RefSeq All</em> – all curated and predicted annotations provided by RefSeq.</li> <li> <em>RefSeq Curated</em> – subset of <em>RefSeq All</em> that includes only those annotations whose accessions begin with NM, NR, NP or YP. <small>(NP and YP are used only for - protein-coding genes on the mitochondrion; YP is used for human only.)</small></li> + protein-coding genes on the mitochondrion; YP is used for human only.)</small> They were + manually curated, based on publications describing transcripts and manual reviews of + evidence which includes EST and full-length cDNA alignments, protein sequences, splice sites + and any other evidence available in databases or the scientific literature. The + resulting sequences can differ from the genome, they exist independently + from a particular human genome build, and so must be aligned to the genome to create a track. + The "RefSeq Curated" track is NCBI's mapping of these transcripts to the genome. 
+ Another alignment track exists for these, the "UCSC RefSeq" track (see below).</li> <li> <em>RefSeq Predicted</em> – subset of RefSeq All that includes those annotations whose - accessions begin with XM or XR.</li> + accessions begin with XM or XR. They were predicted based on protein, cDNA, EST + and RNA-seq alignments to the genome assembly by the NCBI Gnomon prediction software.</li> <li> <em>RefSeq Other</em> – all other annotations produced by the RefSeq group that do not fit the requirements for inclusion in the <em>RefSeq Curated</em> or the - <em>RefSeq Predicted</em> tracks.</li> + <em>RefSeq Predicted</em> tracks. Examples are untranscribed pseudogenes or gene clusters, such as HOX or protocadherin alpha. They were manually curated from + publications or databases but are not typical transcribed genes.</li> <li> <em>RefSeq Alignments</em> – alignments of RefSeq RNAs to the $organism genome provided by the RefSeq group, following the display conventions for <a href="../goldenPath/help/hgTracksHelp.html#PSLDisplay" target="_blank">PSL tracks</a>.</li> <li> <em>RefSeq Diffs</em> – alignment differences between the $organism reference genome(s) - and RefSeq transcripts. <small>(Track not currently available for every assembly.)</small> + and RefSeq curated transcripts. <small>(Track not currently available for every assembly.)</small> </li> <li> <em>UCSC RefSeq</em> – annotations generated from UCSC's realignment of RNAs with NM and NR accessions to the $organism genome. This track was previously known as the "RefSeq Genes" track.</li> + <li> + <em>RefSeq Select (subset, only on hg38)</em> – Subset of RefSeq Curated, transcripts marked as + part of the RefSeq Select dataset. + A single <em>Select</em> transcript is chosen as representative for each protein-coding gene. + See <a target="_blank" + href="https://www.ncbi.nlm.nih.gov/refseq/refseq_select/">NCBI RefSeq Select</a>. 
+ </li> + <li> + <em>RefSeq HGMD (subset)</em> – Subset of RefSeq Curated, transcripts annotated by the Human + Gene Mutation Database. This track is only available on the human genomes hg19 and hg38. + It is the most restricted RefSeq subset, targeting clinical diagnostics. + </li> + </ul> +</dl> <p> The <em>RefSeq All</em>, <em>RefSeq Curated</em>, <em>RefSeq Predicted</em>, and <em>UCSC RefSeq</em> tracks follow the display conventions for <a href="../goldenPath/help/hgTracksHelp.html#GeneDisplay" target="_blank">gene prediction tracks</a>. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), or reviewed (dark), as defined by <a target=_blank href="https://www.ncbi.nlm.nih.gov/books/NBK21091/table/ch18.T.refseq_status_codes/?report=objectonly">RefSeq</a>. </p> <p> <table> <thead> <tr> <th style="border-bottom: 2px solid #6678B1;">Color</th> <th style="border-bottom: 2px solid #6678B1;">Level of review</th>