9bfd58221b1539193cb7f0a317b4e959c1c7e49a
max
Thu May 21 01:00:45 2026 -0700
varFreqs: AI generated text sounds bad, hard to read, so remove typical AI language. "humanizer" pass on all 31 varFreqs description pages — cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how..."). Title-case headings and meaningful emphasis preserved. No facts/URLs/counts/versions changed. tpmi.html added as a new file (was previously uncommitted). refs #36642
Co-Authored-By: Claude Sonnet 4.6
stop_gained,frameshift is selected by either the "Stop Gained"
or the "Frameshift" filter. The "Other" bucket catches the less
common Sequence Ontology
consequence terms emitted by bcftools csq that don't fit the named
- buckets above — for example
+ buckets above. Examples include
splice_region (variant near a splice site but outside the canonical
donor/acceptor),
start_lost / stop_lost (variant disrupts the start codon
or replaces the stop codon with a coding amino acid),
stop_retained (variant changes the stop codon but keeps it a stop),
inframe_insertion / inframe_deletion (in-frame indel
- adding or removing whole codons), and
+ that adds or removes whole codons), and
coding_sequence (CDS variant where the precise impact is undetermined).
- Including "Other" in the filter selection guarantees that no records are
+ If you include "Other" in the filter selection, no records will be
hidden by the consequence filter.
How to find protein-truncating variants: Set the Consequence filter to include only "Stop Gained", "Frameshift", "Splice Donor", and "Splice Acceptor". These will appear as red items in the track display.
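The filter behaviour described above (a record with several consequence terms is selected if any one of them matches a chosen filter) can be sketched in a few lines of Python. This is only an illustration: the set `TRUNCATING_TERMS` and the function `is_protein_truncating` are hypothetical names, and the mapping from the filter labels to bcftools csq terms is an assumption, not something taken from the track's code.

```python
# Assumed mapping from the four filter labels to bcftools csq terms
# (illustrative; the track's real filter configuration may differ).
TRUNCATING_TERMS = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor"}


def is_protein_truncating(consequence: str) -> bool:
    """Return True if any comma-separated term is in the truncating set."""
    terms = {term.strip() for term in consequence.split(",")}
    return not terms.isdisjoint(TRUNCATING_TERMS)


# A combined value such as "stop_gained,frameshift" matches because
# at least one of its terms is selected.
examples = ["stop_gained,frameshift", "splice_region", "missense", "splice_donor"]
selected = [c for c in examples if is_protein_truncating(c)]
```

Here `selected` keeps only the first and last example strings; `splice_region` would fall under "Other" and is not part of the protein-truncating selection.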
The annotated VCF was converted to bigBed format using a custom Python script
(vcfToBigBed.py) that reads frequency data from each source VCF in parallel,
matches variants by position/ref/alt, and writes a BED file with consequence coloring,
per-database allele counts and frequencies, and population breakdowns.
The database configuration (which VCFs to include, field mappings, and population definitions)
is stored in two TSV files
(databases.tsv and
populations.tsv)
-to make future updates easy.
+so that future updates only require editing these files.
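The matching step described above, where variants from several source VCFs are joined on position/ref/alt, might look roughly like the following. This is a minimal sketch and not the contents of vcfToBigBed.py: the function `merge_by_variant`, the tuple layout, and the database names `dbA` and `dbB` are all hypothetical, and the sources are read sequentially here rather than in parallel.

```python
from collections import defaultdict


def merge_by_variant(sources):
    """Group per-database counts under a shared (chrom, pos, ref, alt) key.

    sources: dict mapping a database name to an iterable of
    (chrom, pos, ref, alt, allele_count, allele_number) tuples.
    """
    merged = defaultdict(dict)
    for db_name, records in sources.items():
        for chrom, pos, ref, alt, ac, an in records:
            key = (chrom, pos, ref, alt)
            # Guard against a zero allele number to avoid division by zero.
            af = ac / an if an else 0.0
            merged[key][db_name] = {"ac": ac, "an": an, "af": af}
    return dict(merged)


# Hypothetical input: two databases sharing one variant.
sources = {
    "dbA": [("chr1", 100, "A", "G", 1, 4)],
    "dbB": [("chr1", 100, "A", "G", 2, 8), ("chr1", 200, "C", "T", 0, 0)],
}
merged = merge_by_variant(sources)
```

Keying on the full (chrom, pos, ref, alt) tuple keeps different alternate alleles at the same position as separate records.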
-We provide documentation that indicates how all source files of the varFreqs track were
-converted in the
+The track's
-makeDoc file of the track.
+makeDoc file documents how each source VCF was converted.
Scripts are available from Github.
The data can be explored interactively with the Table Browser or the Data Integrator. For programmatic access, our REST API can be used; the track name is varFreqsAll.
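For programmatic access, a request to the REST API could be assembled as in the sketch below. The track name varFreqsAll comes from the text above; the genome `hg38`, the region, and the helper names `build_track_url` and `fetch_track` are assumptions made for illustration, so check the track's assembly before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(genome, chrom, start, end, track="varFreqsAll"):
    """Build a getData/track URL for one genomic region."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_track(genome, chrom, start, end, track="varFreqsAll"):
    """Download and decode the JSON response (requires network access)."""
    with urlopen(build_track_url(genome, chrom, start, end, track)) as response:
        return json.load(response)


# Assumed assembly and region, for illustration only.
url = build_track_url("hg38", "chr1", 1000000, 1001000)
```

Calling `fetch_track` with the same arguments would return the decoded JSON for that region; it is left uncalled here so the example runs without a network connection.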
Because the merged callset includes data from multiple sources whose redistribution