dbed701b3a98a3c91c0e03766090dea21cc2ebb3
mspeir
Wed Jan 21 10:35:18 2026 -0800
making corrections based on CR feedback, refs #36979

diff --git src/hg/htdocs/FAQ/FAQgenes.html src/hg/htdocs/FAQ/FAQgenes.html
index d7eaecbb263..7f2d9ef4f38 100755
--- src/hg/htdocs/FAQ/FAQgenes.html
+++ src/hg/htdocs/FAQ/FAQgenes.html
@@ -859,70 +859,73 @@
filtering for NR identifiers. Note that a pseudogene of mRNA is not an unambiguous concept, and there may be a desire to look further to select certain subset types as mentioned above.

If using the UCSC knownGene table, one can filter for where the coding start and coding end fields of the table are equivalent, e.g. knownGene.cdsStart = knownGene.cdsEnd, which would ensure the selected entries are non-coding genes.
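As a rough illustration of the cdsStart = cdsEnd filter, the following Python sketch selects rows from a tab-separated dump of the table. It assumes the dump begins with a header line naming the columns (optionally prefixed with "#") and that the columns `cdsStart` and `cdsEnd` are present; the function name `select_noncoding` and the sample rows are hypothetical and exist only for this example.

```python
def select_noncoding(lines):
    """Return rows (as dicts) whose cdsStart equals cdsEnd.

    Assumes `lines` is an iterable of tab-separated text lines whose
    first line is a header containing 'cdsStart' and 'cdsEnd'.
    """
    lines = iter(lines)
    # The header may start with '#'; strip it before splitting.
    header = next(lines).lstrip("#").rstrip("\n").split("\t")
    start_i = header.index("cdsStart")
    end_i = header.index("cdsEnd")

    selected = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        # Equal coding start and end means there is no coding region.
        if int(fields[start_i]) == int(fields[end_i]):
            selected.append(dict(zip(header, fields)))
    return selected


# Made-up rows for illustration only; these are not real transcripts.
sample = [
    "#name\tchrom\tstrand\ttxStart\ttxEnd\tcdsStart\tcdsEnd\n",
    "txA\tchr1\t+\t100\t900\t200\t800\n",
    "txB\tchr1\t-\t1000\t1500\t1500\t1500\n",
]
noncoding = select_noncoding(sample)
```

Looking the columns up by name rather than by position keeps the sketch independent of the exact column order in the dump.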

You can also search our mailing-list archives to read further details about only obtaining non-coding genes from the UCSC Genome Browser.

How do I interpret the exon frame information in the BED per-exon output for gene tracks?

-The per-exon option for BED output on the Table Browser outputs likes like so
-when using a gene track:
+The per-exon option for BED output on the Table Browser outputs lines like so
+when using a gene track:

chr1 1046829 1047018 NM_001077977_utr3_2_0_chr1_1046830_f 0 +
 chr1 1099124 1099325 NM_001077124_utr3_0_0_chr1_1099125_r 0 -
 
+

The name column contains several pieces of information separated by underscores: NM_001077124_utr3_0_0_chr1_1099125_r. Here's a breakdown of that information:

  1. NM_001077124 - Transcript accession
  2. utr3 - feature type, either cds or utr3/utr5
  3. 0 - exon inFrame - This indicates the offset at the beginning of a feature (like an exon) to reach the first base of the next complete codon.
  4. 0 - exon outFrame - This indicates the offset remaining at the end of a feature.
  5. chr1_1099125 - chromosome and start position of the exon
  6. r - strand, "r" for reverse ("-") and "f" for forward ("+")
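The breakdown above can be sketched as a small Python parser. This is a minimal sketch, not the Table Browser's own code: the names `ExonName` and `parse_exon_name` are hypothetical, and the pattern assumes the feature type is one of cds, utr3, or utr5 as listed above. Because the accession itself contains an underscore, the name is matched with a regular expression rather than a plain split.

```python
import re
from typing import NamedTuple


class ExonName(NamedTuple):
    accession: str
    feature: str
    in_frame: int
    out_frame: int
    chrom: str
    start: int
    strand: str  # "+" or "-"


# Assumption: feature is cds, utr3 or utr5, as described in the list above.
_NAME_RE = re.compile(
    r"^(?P<accession>.+)_(?P<feature>cds|utr3|utr5)"
    r"_(?P<in_frame>\d+)_(?P<out_frame>\d+)"
    r"_(?P<chrom>.+)_(?P<start>\d+)_(?P<strand>[fr])$"
)


def parse_exon_name(name: str) -> ExonName:
    """Split a per-exon BED name field into its documented parts."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized per-exon name: {name!r}")
    return ExonName(
        accession=m["accession"],
        feature=m["feature"],
        in_frame=int(m["in_frame"]),
        out_frame=int(m["out_frame"]),
        chrom=m["chrom"],
        # In the example lines, this value is one more than the BED start column.
        start=int(m["start"]),
        strand="+" if m["strand"] == "f" else "-",
    )


parsed = parse_exon_name("NM_001077124_utr3_0_0_chr1_1099125_r")
```

For the example name, `parsed.accession` is `NM_001077124`, `parsed.chrom` is `chr1`, and `parsed.strand` is `-`.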
+

+

Here we're going to focus on the inFrame and outFrame specifically. The values typically range from 0 to 2. These numbers are a representation of where in the frame the exon starts and ends. -
Value   Meaning
0       The feature starts/ends exactly on a codon boundary. No offset is required.
1       There is 1 "extra" nucleotide before/after the complete codons.
2       There are 2 "extra" nucleotides before/after the complete codons.
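To make the offsets concrete, here is a small Python sketch that splits an exon's bases into a leading partial codon, the complete codons, and a trailing partial codon for a given inFrame value. It is only an illustration of the description above: the function name `split_by_frame` and the example sequence are made up, the sequence is assumed to be written in the direction of transcription, and the code has not been checked against actual Table Browser output.

```python
def split_by_frame(exon_seq: str, in_frame: int):
    """Split exon bases into (leading extra, complete codons, trailing extra).

    `in_frame` is the number of bases to skip at the start of the exon
    to reach the first base of the next complete codon (0, 1 or 2).
    """
    if in_frame not in (0, 1, 2):
        raise ValueError("in_frame must be 0, 1 or 2")
    lead = exon_seq[:in_frame]          # bases finishing a codon from the previous exon
    body = exon_seq[in_frame:]
    n_full = len(body) // 3             # number of complete codons in this exon
    codons = [body[i * 3:(i + 1) * 3] for i in range(n_full)]
    trail = body[n_full * 3:]           # bases left over after the last complete codon
    return lead, codons, trail


# Made-up 9-base exon with one extra leading base.
lead, codons, trail = split_by_frame("GATGGCCTA", 1)
```

Here `lead` is `"G"`, `codons` is `["ATG", "GCC"]`, and `trail` is `"TA"`. Under the table's description, two leftover bases at the end of the exon would correspond to a value of 2.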
+

In the example broken down above, the exon has "0" for both inFrame and outFrame because it is a UTR exon.

Finally, it should be noted that when the amino acid output is split per exons (where a split codon is impossible to
-denote), the amino acid for split codon is placed in the exon with most of the bases.
+denote), the amino acid for split codon is placed in the exon with most of the bases.