f3463c39dcd4e6f8aaff1f4142cf0101b567b435
max
Fri Mar 4 02:01:44 2022 -0800
updating cdsStartStat docs, refs #29030

diff --git src/hg/htdocs/goldenPath/help/bigGenePred.html src/hg/htdocs/goldenPath/help/bigGenePred.html
index 085f78f..edd1943 100755
--- src/hg/htdocs/goldenPath/help/bigGenePred.html
+++ src/hg/htdocs/goldenPath/help/bigGenePred.html
@@ -46,31 +46,32 @@
 uint thickEnd; "End of where display should be thick (stop codon)"
 uint reserved; "RGB value (use R,G,B string in input file)"
 int blockCount; "Number of blocks"
 int[blockCount] blockSizes; "Comma separated list of block sizes"
 int[blockCount] chromStarts;"Start positions relative to chromStart"
 string name2; "Alternative/human readable name"
 string cdsStartStat; "Status of CDS start annotation (none, unknown, incomplete, or complete)"
 string cdsEndStat; "Status of CDS end annotation (none, unknown, incomplete, or complete)"
 int[blockCount] exonFrames; "Exon frame {0,1,2}, or -1 if no frame for exon"
 string type; "Transcript type"
 string geneName; "Primary identifier for gene"
 string geneName2; "Alternative/human-readable gene name"
 string geneType; "Gene type"
 )
 </code></pre>
-<p><b>cdsStartStat/cdsEndStat</b>: If you want only protein-coding transcripts, then filter for cdsStartStat='cmpl' and cdsEndStat='cmpl'. Non-coding transcripts have either one of these set to 'incmpl'.
+<p>The fields cdsStartStat and cdsEndStat can have the values ('none','unk','incmpl','cmpl'). However, the values are not used for our display and can not be
+used to subset for coding or non-coding genes. For most purposes, to get more information about a transcript, other tables will need to be used e.g. in the case of hg38, the tables named wgEncodeGencodeAttrsVxx, where xx is the Gencode Version number. 
 </p>
 <p>
 The following bed12+8 is an example of a <a href="examples/bigGenePred.txt">pre-bigGenePred text file </a>.</p>
 <h2>Creating a bigGenePred track from a bed12+8 file</h2>
 <p>
 <strong>Step 1.</strong>
 Format your pre-bigGenePred file. The first 12 fields of pre-bigGenePred files are described by the <a href="../../FAQ/FAQformat.html#format1">BED file format</a>. Your file must also contain the 8 extra fields described in the autoSql file definition shown above: <code>name2, cdsStartStat, cdsEndStat, exonFrames, type, geneName, geneName2, geneType</code>. For example, you can use this bed12+8 input file, <a href="examples/bigGenePred.txt">bigGenePred.txt</a>. Your pre-bigGenePred file must be sorted