aa61ebc800429515f9ced7e28f669c6042219f43 max Wed Mar 18 09:09:13 2026 -0700

varFreqs supertrack: add GREGoR track, update all HTML docs, move scripts to varFreqs/, refs #36642

Add GREGoR R04 WGS track to varFreqs superTrack. Update Data Access and Methods sections for all 20+ subtrack HTML files with consistent formatting, sequencing methods from source papers, and links to makeDoc and Github scripts. Move all varFreqs conversion scripts into scripts/varFreqs/ subdirectory and update makeDoc paths accordingly.

Co-Authored-By: Claude Opus 4.6

diff --git src/hg/makeDb/trackDb/human/abraom.html src/hg/makeDb/trackDb/human/abraom.html
new file mode 100644
index 00000000000..eb13c7ed552
--- /dev/null
+++ src/hg/makeDb/trackDb/human/abraom.html
@@ -0,0 +1,63 @@
+

Description

+

+The Arquivo Brasileiro Online de
+Mutações (ABraOM) provides genomic variants obtained with whole-genome sequencing
+from SABE, a census-based sample of elderly individuals from São Paulo, Brazil's largest
+city. The Brazilian population reflects ~500 years of admixture between Africans,
+Europeans, and Native Americans. In addition, ~3% of the cohort has
+non-admixed Japanese ancestry (early 20th-century migration). Mean coverage is 38.6x. Data on
+transposable elements (TEs), HLA alleles, and new sequences are also available.
+

+ +

Data Access

+

+The data can be explored interactively with the
+Table Browser or the
+Data Integrator.
+For programmatic access, our REST API can be used; the
+track name is abraom.
+For bulk download, the VCF file can be obtained from
+our download server.
+
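As a minimal sketch of the programmatic access mentioned above, the following Python builds a request URL for the UCSC REST API `/getData/track` endpoint. The track name `abraom` comes from the text; the assembly name `hg38` is an assumption (the Methods state that reads were mapped to GRCh38), and the region in the usage note is an arbitrary illustrative one. The helper names `build_track_url` and `fetch_track` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"  # public UCSC REST API host


def build_track_url(genome, track, chrom, start, end):
    """Return a /getData/track URL for one region of one track."""
    params = [
        ("genome", genome),   # assumption: "hg38" for this track
        ("track", track),     # "abraom", as stated in the text
        ("chrom", chrom),
        ("start", str(start)),
        ("end", str(end)),
    ]
    # The UCSC API documentation separates parameters with semicolons.
    query = ";".join(f"{key}={quote(value)}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


def fetch_track(genome, track, chrom, start, end, timeout=30):
    """Download the region and return the decoded JSON response as a dict."""
    url = build_track_url(genome, track, chrom, start, end)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_track("hg38", "abraom", "chr1", 0, 100000)` would request the first 100 kb of chr1. The layout of the returned JSON is not described in this page, so inspect the response keys before relying on a particular field.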

+

+The original data can also be downloaded from the ABraOM website.
+

+ +

Methods

+

+For academic use only. Licensing for commercial use may be available upon request and agreement.
+By using this resource, you agree to cite the flagship paper (Naslavsky et al., Nat Commun 2022).
+

+

+Whole-genome sequencing was performed at Human Longevity Inc. using TruSeq Nano DNA HT libraries
+sequenced on Illumina HiSeqX instruments with 150 bp paired-end reads targeting 30x coverage, and
+reads were mapped to GRCh38 using ISIS software. Sample sex was validated by comparing counts per
+million (CPM) of X chromosome and male-specific Y (MSY) reads relative to autosomes, yielding the expected female
+(~55,000 X CPM, <200 MSY CPM) and male (~27,500 X CPM, >550 MSY CPM) patterns. Germline SNVs
+and indels were called following GATK Best Practices (GATK v3.7) via per-sample GVCFs
+(HaplotypeCaller), joint genotyping (CombineGVCFs, GenotypeGVCFs), and Variant Quality Score
+Recalibration (VQSR-AS); multiallelic variants were split with an in-house script, left-aligned with
+BCFtools, and annotated using ANNOVAR and custom scripts against dbSNP, 1000 Genomes, and gnomAD,
+with putative loss-of-function variants identified using LOFTEE v0.3-beta irrespective of confidence
+labels. Variant and genotype quality were further assessed using the in-house CEGH-Filter two-step
+algorithm based on depth and allele balance, and analyses retained only GATK VQSR-AS PASS variants
+and higher-confidence CEGH-Filter calls. Relatedness was assessed using KING and PC-Relate
+(GENESIS), retaining a single proband per related pair and excluding one contaminated sample
+(>3% by verifyBamID), resulting in a final dataset of 1,171 unrelated individuals. Final samples
+achieved mean coverages ranging from 31.3x to 64.8x, with an average of 38.65x and a median of
+36.6x.
+The makeDoc file of the varFreqs track documents how all of its source files were converted.
+For some tracks, Python scripts were necessary; these are also available from GitHub.
+
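The sex check described in the Methods can be illustrated with a short Python sketch. This is not the pipeline used by the ABraOM authors. Only the reference values (~55,000 and ~27,500 X CPM) and the MSY cutoffs (<200 and >550 CPM) come from the text. The function names `cpm` and `infer_sex`, and the rule of assigning X CPM to the nearer reference value, are assumptions made for illustration.

```python
# Reference values quoted in the Methods text.
FEMALE_X_CPM = 55_000.0
MALE_X_CPM = 27_500.0
FEMALE_MSY_MAX = 200.0   # females: MSY CPM below this value
MALE_MSY_MIN = 550.0     # males: MSY CPM above this value


def cpm(region_reads, autosomal_reads):
    """Counts per million of reads in a region, relative to autosomal reads."""
    if autosomal_reads <= 0:
        raise ValueError("autosomal_reads must be positive")
    return region_reads * 1_000_000 / autosomal_reads


def infer_sex(x_cpm, msy_cpm):
    """Return 'female', 'male', or 'ambiguous' from X and MSY CPM values.

    Assumption (not from the source): X CPM is matched to whichever
    reference value it is closer to, and a call is made only when the
    X and MSY signals agree.
    """
    x_looks_female = abs(x_cpm - FEMALE_X_CPM) < abs(x_cpm - MALE_X_CPM)
    if msy_cpm < FEMALE_MSY_MAX and x_looks_female:
        return "female"
    if msy_cpm > MALE_MSY_MIN and not x_looks_female:
        return "male"
    return "ambiguous"
```

For example, `infer_sex(cpm(55, 1000), 100)` gives `"female"`, while a sample with female-like X CPM but MSY CPM above 550 is reported as `"ambiguous"` so that it can be reviewed by hand.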

+ +

References

+

+Naslavsky MS, Scliar MO, Yamamoto GL, Wang JYT, Zverinova S, Karp T, Nunes K, Ceroni JRM, de
+Carvalho DL, da Silva Simões CE et al.
+
+Whole-genome sequencing of 1,171 elderly admixed individuals from São Paulo, Brazil.
+Nat Commun. 2022 Mar 4;13(1):1004.
+PMID: 35246524; PMC: PMC8897431
+