2f106a9cd51707e6772b96064b2fcfc30bca95b0 ccpowell Thu Jul 18 14:54:51 2019 -0700 Switching YP with NP mention in all organism execept human, refs #23818 diff --git src/hg/makeDb/trackDb/human/refSeqCompositeHuman.html src/hg/makeDb/trackDb/human/refSeqCompositeHuman.html new file mode 100644 index 0000000..4a56d12 --- /dev/null +++ src/hg/makeDb/trackDb/human/refSeqCompositeHuman.html @@ -0,0 +1,278 @@ +
+The NCBI RefSeq Genes composite track shows $organism protein-coding and non-protein-coding +genes taken from the NCBI RNA reference sequences collection (RefSeq). All subtracks use +coordinates provided by RefSeq, except for the UCSC RefSeq track, which UCSC produces by +realigning the RefSeq RNAs to the genome. This realignment may result in occasional differences +between the annotation coordinates provided by UCSC and NCBI. See the +Methods section for more details about how the different tracks were +created.
++Please visit NCBI's Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, +submit additions and corrections, or ask for help concerning RefSeq records.
+ ++For more information on the different gene tracks, see our Genes FAQ.
+ ++This track is a multi-view composite track that contains differing data set views. +Instructions for configuring multi-view tracks are +here. +To show only a selected set of subtracks, uncheck the boxes next to the tracks that you wish to +hide.
+ +The views available for this track include: ++The RefSeq All, RefSeq Curated, RefSeq Predicted, RefSeq Clinical +and UCSC RefSeq tracks follow the display conventions for +gene prediction tracks. +The color shading indicates the level of review the RefSeq record has undergone: +predicted (light), provisional (medium), or reviewed (dark), as defined by RefSeq.
+ ++
Color | +Level of review | +
---|---|
+ | Reviewed: the RefSeq record has been reviewed by NCBI staff or by a collaborator. The NCBI review process includes assessing available sequence data and the literature. Some RefSeq records may incorporate expanded sequence and annotation information. | +
+ | Provisional: the RefSeq record has not yet been subject to individual review. The initial sequence-to-gene association has been established by outside collaborators or NCBI staff. | +
+ | Predicted: the RefSeq record has not yet been subject to individual review, and some aspect of the RefSeq record is predicted. | +
+The item labels and codon display properties for features within this track can be configured +through the controls at the top of the track description page. Click the view name +(NCBI RefSeq or UCSC RefSeq) to globally modify the settings for all subtracks in +the view. To adjust the settings for an individual subtrack, click the wrench icon next to the +track name in the subtrack list (available only for views containing more than one track).
+The RefSeq Diffs track contains five different types of inconsistency between the +reference genome sequence and the RefSeq transcript sequences. The five types of differences are +as follows: +
+When reporting HGVS with RefSeq sequences, to make sure that results from +research articles can be mapped to the genome unambiguously, +please specify the RefSeq annotation release displayed on the transcript's +Genome Browser details page and also the RefSeq transcript ID with version +(e.g. NM_012309.4 not NM_012309). +
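The versioning advice above can be checked mechanically before an identifier goes into a report. The following is a minimal sketch, not part of the track itself: the regular expression is an assumed approximation of RefSeq accession syntax (two uppercase letters, an underscore, digits, optional ".version"), and the function names are hypothetical.

```python
import re

# Assumed approximation of a RefSeq accession such as NM_012309.4:
# two-letter prefix, underscore, digits, and an optional ".version" suffix.
ACCESSION_RE = re.compile(r"^(?P<accession>[A-Z]{2}_\d+)(?:\.(?P<version>\d+))?$")


def split_refseq_id(refseq_id):
    """Return (accession, version); version is an int, or None if absent."""
    match = ACCESSION_RE.match(refseq_id.strip())
    if match is None:
        raise ValueError(f"not a recognized RefSeq accession: {refseq_id!r}")
    version = match.group("version")
    parsed_version = int(version) if version is not None else None
    return match.group("accession"), parsed_version


def has_version(refseq_id):
    """True when the identifier carries an explicit version suffix."""
    return split_refseq_id(refseq_id)[1] is not None
```

With the example from the text, `has_version("NM_012309.4")` is true and `has_version("NM_012309")` is false, so unversioned IDs can be flagged before publication.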
+ + + ++Tracks contained in the RefSeq annotation and RefSeq RNA alignment views were created at UCSC using +data from the NCBI RefSeq project. Data files were downloaded from RefSeq in GFF file format and +converted to the genePred and PSL table formats for display in the Genome Browser. Information about +the NCBI annotation pipeline can be found +here.
+ +The RefSeq Diffs track is generated by UCSC using NCBI's RefSeq RNA alignments.
++The UCSC RefSeq Genes track is constructed using the same methods as previous RefSeq Genes tracks. +RefSeq RNAs were aligned against the $organism genome using BLAT. Those with an alignment of +less than 15% were discarded. When a single RNA aligned in multiple places, the alignment +having the highest base identity was identified. Only alignments having a base identity +level within 0.1% of the best and at least 96% base identity with the genomic sequence were +kept.
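The filtering rules described above can be sketched in a few lines. This is an illustration under stated assumptions, not the UCSC pipeline code: it assumes the 15% threshold refers to the fraction of the RNA that aligned, that "within 0.1% of the best" means an absolute difference of 0.001 in base identity, and that alignments arrive as simple records. The `Alignment` type and `filter_alignments` function are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    rna_id: str              # RefSeq RNA accession
    chrom: str               # where the alignment landed
    aligned_fraction: float  # assumed: fraction of the RNA that aligned (0..1)
    identity: float          # base identity with the genome (0..1)


def filter_alignments(alignments, min_aligned=0.15, min_identity=0.96,
                      near_best=0.001):
    """Keep near-best, high-identity alignments for each RNA."""
    # Step 1: discard alignments covering less than 15% (assumed: of the RNA).
    by_rna = defaultdict(list)
    for aln in alignments:
        if aln.aligned_fraction >= min_aligned:
            by_rna[aln.rna_id].append(aln)

    kept = []
    for group in by_rna.values():
        # Step 2: find the highest base identity among this RNA's alignments.
        best = max(aln.identity for aln in group)
        # Step 3: keep alignments within 0.1% of the best and at least 96%.
        for aln in group:
            if aln.identity >= min_identity and best - aln.identity <= near_best:
                kept.append(aln)
    return kept
```

An RNA that aligns equally well in two places keeps both alignments under this rule, while a clearly weaker secondary hit is dropped.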
+ ++The raw data for these tracks can be accessed in multiple ways. It can be explored interactively +using the Table Browser or +Data Integrator. The tables can also be accessed programmatically through our +public MySQL server or downloaded from our +downloads server for local processing.
++The data in the RefSeq Other and RefSeq Diffs tracks are organized in +bigBed file format; more +information about accessing the information in this bigBed file can be found +below. The other subtracks are associated with database tables as follows:
++The first column of each of these tables is "bin". This column is designed +to speed up access for display in the Genome Browser, but can be safely ignored in downstream +analysis. You can read more about the bin indexing system +here.
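Since the "bin" column only serves display indexing, it can be dropped when reading a tab-separated table dump. Below is a minimal sketch; it assumes the dump is tab-delimited with "bin" as the first field and that header or comment lines start with "#". The function name is hypothetical.

```python
def strip_bin_column(lines):
    """Yield each data row as a list of fields without the leading bin column."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # The first field is "bin"; everything after it is the annotation itself.
        yield fields[1:]
```

The function accepts any iterable of lines, so it can be fed an open file handle directly.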
++The annotations in the RefSeq Other and RefSeq Diffs tracks are stored in bigBed +files, which can be obtained from our downloads server here, +ncbiRefSeqOther.bb and +ncbiRefSeqDiffs.bb. +Individual regions or the whole set of genome-wide annotations can be obtained using our tool +bigBedToBed, which can be compiled from the source code or downloaded as a precompiled +binary for your system from the utilities directory linked below. For example, to extract only +annotations in a given region, you could use the following command:
++bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/$db/ncbiRefSeq/ncbiRefSeqOther.bb +-chrom=chr16 -start=34990190 -end=36727467 stdout
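When many regions need to be extracted, the command above can be assembled programmatically. The sketch below only builds the argument list, using the URL layout and the -chrom, -start and -end options shown in the command; the helper name and the idea of substituting an assembly name such as hg38 for $db are assumptions for illustration.

```python
def region_command(db, chrom, start, end, track_file="ncbiRefSeqOther.bb"):
    """Build the bigBedToBed argument list for one genomic region."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    # Same URL layout as the command above, with $db replaced by `db`.
    url = f"http://hgdownload.soe.ucsc.edu/gbdb/{db}/ncbiRefSeq/{track_file}"
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",  # write BED lines to standard output, as in the example
    ]
```

The returned list can be passed to `subprocess.run` on a system where the bigBedToBed binary is installed and on the PATH.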
++The genePred format tracks can also be downloaded in GTF format using the +genePredToGtf utility, available from the +utilities directory on the UCSC downloads +server. The utility can be run from the command line like so:
+genePredToGtf $db ncbiRefSeqPredicted ncbiRefSeqPredicted.gtf ++Note that using genePredToGtf in this manner accesses our public MySQL server, and you therefore +must set up your hg.conf as described on the MySQL page linked near the beginning of the Data Access +section.
++A file containing the RNA sequences in FASTA format for all items in the RefSeq All, RefSeq Curated, +and RefSeq Predicted tracks can be found on our downloads server +here.
++Please refer to our mailing list archives for questions.
+ ++This track was produced at UCSC from data generated by scientists worldwide and curated by the +NCBI RefSeq project.
+ ++Kent WJ. +BLAT - the BLAST-like +alignment tool. Genome Res. 2002 Apr;12(4):656-64. +PMID: 11932250; PMC: PMC187518
++Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, +Landrum MJ, McGarvey KM et al. +RefSeq: an update on mammalian reference sequences. +Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63. +PMID: 24259432; PMC: +PMC3965018
++Pruitt KD, Tatusova T, Maglott DR. + +NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts +and proteins. +Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. +PMID: 15608248; PMC: PMC539979