28aa0d3b6f9ec11895b49f1eb8117d868393fa42
lrnassar
Tue Oct 15 05:21:23 2024 -0700
Updating the description of the exon frame field, refs #32675

diff --git src/hg/htdocs/goldenPath/help/bigGenePred.html src/hg/htdocs/goldenPath/help/bigGenePred.html
index afcf5aa..98a6290 100755
--- src/hg/htdocs/goldenPath/help/bigGenePred.html
+++ src/hg/htdocs/goldenPath/help/bigGenePred.html
@@ -42,45 +42,47 @@
 string chrom; "Reference sequence chromosome or scaffold"
 uint chromStart; "Start position in chromosome"
 uint chromEnd; "End position in chromosome"
 string name; "Name or ID of item, ideally both human-readable and unique"
 uint score; "Score (0-1000)"
 char[1] strand; "+ or - for strand"
 uint thickStart; "Start of where display should be thick (start codon)"
 uint thickEnd; "End of where display should be thick (stop codon)"
 uint reserved; "RGB value (use R,G,B string in input file)"
 int blockCount; "Number of blocks"
 int[blockCount] blockSizes; "Comma separated list of block sizes"
 int[blockCount] chromStarts;"Start positions relative to chromStart"
 string name2; "Alternative/human readable name"
 string cdsStartStat; "Status of CDS start annotation (none, unknown, incomplete, or complete)"
 string cdsEndStat; "Status of CDS end annotation (none, unknown, incomplete, or complete)"
- int[blockCount] exonFrames; "Exon frame {0,1,2}, or -1 if no frame for exon"
+ int[blockCount] exonFrames; "Reading frame of the start of the CDS region of the exon, in the direction of transcription (0,1,2), or -1 if there is no CDS region."
 string type; "Transcript type"
 string geneName; "Primary identifier for gene"
 string geneName2; "Alternative/human-readable gene name"
 string geneType; "Gene type"
 )
 </code></pre>
-<p>The field <pre>exonFrames</pre> is a comma-separated list of the numbers 0, 1, 2 or -1, one per exon, in order of transcription.
-This order means that the first value for a transcript on the - strand is the exon most on the right of the screen
-on the Genome Browser. A value of zero means that the first codon of the exon starts at the first nucleotide of the
-exon. A value of one means that the first codon starts after the first nucleotide and a value of two means
-that it starts after the second nucleotide. UTRs are non-coding and their exonFrame value is -1.
-</p>
+<p>
+The field <code>exonFrames</code> is a comma-separated list of the numbers
+with the possible values 0, 1, 2 or -1, one per exon, in order of transcription.
+This order means that the first value for a transcript on the minus (-) strand is
+the exon on the right of the screen on the Genome Browser.
+A value of zero means that the first codon of the exon starts at the first nucleotide of the
+exon. A value of one means that the first codon starts after the first
+nucleotide and a value of two means that it starts after the second nucleotide.
+UTRs are non-coding and their exonFrame value is -1.</p>
-values indicate the offset of the first codon in nucleotides at the start of the exon.
 <p>The fields cdsStartStat and cdsEndStat have the following values: 'none' = none, 'unk' = unknown, 'incmpl' = incomplete, and 'cmpl' = complete. The values, however, are not used for our display and cannot be used to identify coding or non-coding genes. For most purposes, to get more information about a transcript, other tables will need to be used. For instance, in the case of hg38, the tables named wgEncodeGencodeAttrsVxx, where xx is the Gencode Version number. See this <a href="../../FAQ/FAQgenes.html#coding" target="_blank">coding/non-coding genes FAQ</a> for more information.</p>
 <p>
 The following bed12+8 is an example of a <a href="examples/bigGenePred.txt">pre-bigGenePred text file
 </a>.</p>
 <h2>Creating a bigGenePred track from a bed12+8 file</h2>
 <p>
 <strong>Step 1.</strong> Format your pre-bigGenePred file. The first 12 fields of pre-bigGenePred files are described by the