644eae7700eb62343b142a44258ee91a944f4853
max
  Tue Jun 9 15:17:03 2026 -0700
lrSv: rebuild merged lrSvAll track with HPRC v2.1 data

Repoint the HPRC entry in databases.tsv from the removed v2.0 hprc2.bb to
the new hprc2v21.bb and re-run the merge. The combined track now holds
2,359,011 variants (was 2,694,871); the drop reflects the smaller v2.1 HPRC
callset. Regenerated lrSvAll.ra and updated the overview table on the
container page. The HPRC Jasmine track is intentionally not part of the
merge. refs #36258
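
For readers unfamiliar with the merge, here is a minimal Python sketch of
what "merged on identical position+type+length, with per-database AC" could
look like, together with the count / min / median / max figures shown in the
overview table. It is an illustration only, not the script used for this
commit: the tuple layout, the choice of (chrom, start) as "position", and
the names merge_calls, len_stats, dbA and dbB are all assumptions.

```python
from collections import defaultdict
from statistics import median


def merge_calls(calls_by_db):
    """Merge SV calls that share chrom, start, type and length.

    calls_by_db maps a database name to a list of
    (chrom, start, sv_type, sv_len, ac) tuples (hypothetical layout).
    Returns {(chrom, start, sv_type, sv_len): {db_name: summed AC}}.
    """
    merged = defaultdict(dict)
    for db, calls in calls_by_db.items():
        for chrom, start, sv_type, sv_len, ac in calls:
            key = (chrom, start, sv_type, sv_len)
            # Keep one allele count per contributing database.
            merged[key][db] = merged[key].get(db, 0) + ac
    return dict(merged)


def len_stats(merged):
    """Return (count, min, median, max) of svLen over the merged sites."""
    lens = [key[3] for key in merged]
    return len(lens), min(lens), median(lens), max(lens)


# Toy input: two made-up databases sharing one deletion.
calls_by_db = {
    "dbA": [("chr1", 1000, "DEL", 200, 3), ("chr1", 5000, "INS", 94, 1)],
    "dbB": [("chr1", 1000, "DEL", 200, 7), ("chr2", 300, "INS", 0, 2)],
}

merged = merge_calls(calls_by_db)
stats = len_stats(merged)
print(merged[("chr1", 1000, "DEL", 200)])
print(stats)
```

In this toy input the shared deletion collapses into a single merged site
that carries one AC per database, and the svLen=0 site pulls the minimum
down to 0, which is the same effect as the Min column change in the
"All merged" row.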

diff --git src/hg/makeDb/trackDb/human/lrSv.html src/hg/makeDb/trackDb/human/lrSv.html
index 7d5560c1eac..803d8904725 100644
--- src/hg/makeDb/trackDb/human/lrSv.html
+++ src/hg/makeDb/trackDb/human/lrSv.html
@@ -1,500 +1,500 @@
 <h2>Description</h2>
 <p>
 This track collection contains structural variant (SV) calls derived from long-read sequencing
 studies. Structural variants are genomic rearrangements larger than ~50 bp, including
 deletions, insertions, duplications, inversions, and translocations. Long-read sequencing
 technologies can span repetitive regions and resolve complex rearrangements
 that are difficult to detect with short-read methods.
 </p>
 
 <h3>Available Datasets</h3>
 <p>
 SV length statistics (min / median / max) are computed from the <tt>svLen</tt>
 field of each track, in base pairs. Some tracks include sites with
 <tt>svLen=0</tt> (complex events where the reference and alternate alleles
 differ in sequence but not in length).
 </p>
 <p>
 For short-read structural-variant comparators (CCDG 17,795, 1KG 3202,
 ToMMo 48K CNV) see the companion
 <a href="hgTrackUi?g=srSv">Short-read SVs</a> supertrack.
 </p>
 <p>
 Polymorphic <b>Mobile Element Insertions</b> (Alu, L1, SVA, HERVK,
 snRNA) called from HGSVC3 long-read assemblies are released as a
 separate track collection; see the
 <a href="hgTrackUi?g=mei">Mobile Insertions</a> tracks. Those MEIs are
 the insertions identified in the 65 HGSVC3 samples relative to the
 reference, available on both GRCh38/hg38 and T2T-CHM13/hs1.
 </p>
 <table class="stdTbl">
 <tr>
   <th>Dataset</th>
   <th>N samples</th>
   <th>Cohort / disease</th>
   <th>Disease cases</th>
   <th>Coverage</th>
   <th>SV count</th>
   <th>Min</th>
   <th>Median</th>
   <th>Max</th>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=lrSvAll"><b>All merged</b></a></td>
   <td>&mdash;</td>
   <td>All long-read SV datasets merged on identical position+type+length, with per-database allele counts (AC)</td>
   <td>mixed</td>
   <td>mixed (PacBio HiFi, ONT)</td>
-  <td>2,694,871</td>
-  <td>50</td>
-  <td>200</td>
+  <td>2,359,011</td>
+  <td>0</td>
+  <td>147</td>
   <td>190,088,223</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=colorsDbSv">CoLoRSdb</a></td>
   <td>1,427</td>
   <td>Consortium of Long-Read Sequencing, joint callset</td>
   <td>No</td>
   <td>mixed (HiFi)</td>
   <td>426,239</td>
   <td>20</td>
   <td>33</td>
   <td>101,381</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=han945Sv">Han 945</a></td>
   <td>945</td>
   <td>Han Chinese, general population</td>
   <td>No</td>
   <td>~17x ONT</td>
   <td>111,288</td>
   <td>0</td>
   <td>254</td>
   <td>99,743</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=gustafsonSv">1KG ONT 100</a></td>
   <td>100</td>
   <td>1000 Genomes, 5 superpopulations / 19 subpopulations</td>
   <td>No</td>
   <td>~37x ONT (R9.4.1)</td>
   <td>113,696</td>
   <td>0</td>
   <td>164</td>
   <td>98,289</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=lrSv1kgOnt">1KG ONT Vienna</a></td>
   <td>1,019</td>
   <td>1000 Genomes, diverse</td>
   <td>No</td>
   <td>~17x ONT</td>
   <td>148,375</td>
   <td>2</td>
   <td>177</td>
   <td>49,171</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=tommoJpSv">ToMMo Japanese</a></td>
   <td>333 (111 trios)</td>
   <td>Japanese, general population</td>
   <td>No</td>
   <td>~22x ONT</td>
   <td>74,201</td>
   <td>51</td>
   <td>162</td>
   <td>99,980</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=aou1kSv">AoU 1K</a></td>
   <td>1,027</td>
   <td>All of Us, self-identified Black/African American; biobank includes a variety of conditions (diabetes, hearing loss, etc.)</td>
   <td>Yes (mixed)</td>
   <td>~8x HiFi</td>
   <td>541,049</td>
   <td>50</td>
   <td>152</td>
   <td>9,998</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=ga4kSv">GA4K</a></td>
   <td>502</td>
   <td>Children's Mercy, pediatric rare disease probands + families</td>
   <td>Yes (probands)</td>
   <td>~27x HiFi</td>
   <td>115,554</td>
   <td>50</td>
   <td>186</td>
   <td>809,711</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=decodeSv">deCODE 3,622</a></td>
   <td>3,622</td>
   <td>Icelandic general population</td>
   <td>No</td>
   <td>~17x ONT</td>
   <td>133,886</td>
   <td>0</td>
   <td>127</td>
   <td>861,080</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=hprc2v21Sv">HPRC v2.1</a></td>
   <td>233</td>
   <td>HPRC release-2 pangenome (CHM13 + diverse 1KG assemblies)</td>
   <td>No</td>
   <td>~60x HiFi + ~30x ONT (pangenome graph)</td>
   <td>596,063</td>
   <td>50</td>
   <td>276</td>
   <td>1,064,897</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=hgsvc2Sv">HGSVC2</a></td>
   <td>32</td>
   <td>HGSVC2 haplotype-resolved assemblies (5 superpopulations)</td>
   <td>No</td>
   <td>&gt;40x PacBio CLR + &gt;20x HiFi (+ Strand-seq)</td>
   <td>111,746</td>
   <td>50</td>
   <td>168</td>
   <td>57,207,414</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=hgsvc3Sv">HGSVC3</a></td>
   <td>65</td>
   <td>HGSVC3 diverse reference assemblies</td>
   <td>No</td>
   <td>~47x HiFi + ~56x ONT</td>
   <td>176,531</td>
   <td>50</td>
   <td>154</td>
   <td>30,176,500</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=aprSv">Arab APR</a></td>
   <td>53</td>
   <td>UAE-resident Arabs from 8 countries (Arab Pangenome Reference)</td>
   <td>No</td>
   <td>~35x HiFi + ~54x ONT (+ Hi-C, pangenome graph)</td>
   <td>72,656</td>
   <td>1</td>
   <td>21</td>
   <td>99,885</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=cpc1Sv">CPC</a></td>
   <td>58</td>
   <td>Chinese Pangenome Consortium, 36 minority ethnic groups (HPRC-specific SVs removed)</td>
   <td>No</td>
   <td>~30x HiFi (pangenome graph)</td>
   <td>36,030</td>
   <td>1</td>
   <td>53</td>
   <td>8,998,096</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=kwanhoSv">Kim PD Brain</a></td>
   <td>100</td>
   <td>Parkinson's disease, ILBD, controls (post-mortem brain)</td>
   <td>Yes (PD + ILBD)</td>
   <td>~17x HiFi</td>
   <td>74,552</td>
   <td>50</td>
   <td>160</td>
   <td>190,088,222</td>
 </tr>
 <tr>
   <td><a href="hgTrackUi?g=chirmade101Sv">SVatalog 101</a></td>
   <td>101</td>
   <td>Cystic fibrosis (CF) patients from the CF Canada-Sick Kids Program in Individual CF Therapy (CFIT). Long-read WGS used for GWAS LD fine-mapping</td>
   <td>Yes (all CF)</td>
   <td>~50x PacBio CLR (34, Sequel I) + ~76x HiFi (67, Sequel II)</td>
   <td>87,183</td>
   <td>4</td>
   <td>160</td>
   <td>1,321,484</td>
 </tr>
 </table>
 
 <p>
 Note: there is likely some overlap in sample composition across these collections.
 For example, 1000 Genomes samples are also included in HPRC and CoLoRSdb.
 </p>
 
 <h3><a href="hgTrackUi?g=colorsDbSv">CoLoRSdb SVs</a></h3>
 <p>
 Structural variants from the Consortium of Long-Read Sequencing database
 (CoLoRSdb), from 1,427 PacBio HiFi long-read whole-genome sequences.
 ~426k SVs (insertions, deletions, inversions) called with pbsv and
 merged with Jasmine, with allele frequencies, genotype counts and
 Hardy-Weinberg statistics across the cohort.
 </p>
 
 <h3><a href="hgTrackUi?g=han945Sv">Han 945 SVs</a></h3>
 <p>
 Structural variants from 945 Han Chinese individuals. ~111k SVs
 (deletions, insertions, duplications, inversions, translocations) merged with SURVIVOR.
 Includes allele frequencies and per-sample support.
 </p>
 
 <h3><a href="hgTrackUi?g=gustafsonSv">1KG ONT 100 SVs</a></h3>
 <p>
 Structural variants from Oxford Nanopore long-read sequencing of 100
 1000 Genomes samples (5 superpopulations, 19 subpopulations) released
 by the 1000 Genomes ONT Sequencing Consortium and described in
 Gustafson et al. 2024. ~114k SVs (insertions, deletions, duplications,
 inversions) called with five callers and merged with Jasmine. This is a
 separate dataset from the Vienna 1KG-ONT release below; the 100 samples
 here do not overlap with the 1,019 samples in the Vienna release.
 </p>
 
 <h3><a href="hgTrackUi?g=lrSv1kgOnt">1KG ONT Vienna SVs</a></h3>
 <p>
 Structural variants from 1,019 individuals across 26 populations (1000 Genomes ONT).
 ~161k SVs annotated with SVAN, classifying insertions and deletions by mechanism
 of origin (mobile elements, VNTRs, processed pseudogenes, etc.).
 Original coordinates are on T2T-CHM13 (hs1); the hg38 version was created via liftOver.
 This is a separate dataset from the 1KG ONT 100 (Gustafson et al.) track above;
 the 1,019 samples here do not overlap with the 100 samples in that release.
 </p>
 
 <h3><a href="hgTrackUi?g=tommoJpSv">ToMMo Japanese SVs</a></h3>
 <p>
 Structural variants from 333 Japanese individuals (111 trios) from the Tohoku Medical
 Megabank (ToMMo). ~74k SVs (deletions and insertions) with trio-based Mendelian
 error rates and allele frequencies.
 </p>
 
 <h3><a href="hgTrackUi?g=aou1kSv">AoU 1K SVs</a></h3>
 <p>
 Structural variants from 1,027 individuals from the All of Us (AoU) Research Program,
 sequenced with PacBio HiFi long reads. AoU is a deeply phenotyped biobank
 that includes participants with a range of conditions (e.g. diabetes,
 hearing loss, hypertension), so the cohort is not disease-free.
 ~541k SVs (insertions and deletions) with population-specific allele
 frequencies, gene annotations, and clinical trait associations.
 </p>
 
 <h3><a href="hgTrackUi?g=ga4kSv">GA4K SVs</a></h3>
 <p>
 Structural variants from 502 probands and family members enrolled in the
 Genomic Answers for Kids (GA4K) pediatric rare-disease program at Children's
 Mercy Research Institute, sequenced with PacBio HiFi long reads. ~116k
 replicated SVs (deletions, insertions, duplications, inversions) called with
 pbsv and merged with Jasmine. The matched GA4K small-variant callset (SNVs
 and short indels) lives alongside other population allele-frequency resources
 as <a href="hgTrackUi?g=ga4kSnv">GA4K 552 PacBio LR</a> in the Variant
 Frequencies track collection.
 </p>
 
 <h3><a href="hgTrackUi?g=decodeSv">deCODE 3,622 SVs</a></h3>
 <p>
 High-confidence structural variants from 3,622 Icelanders (deCODE genetics),
 sequenced with Oxford Nanopore long reads. ~134k SVs (deletions, insertions
 and combined insertion/deletion events). Site-only callset with annotated
 surrounding tandem-repeat regions.
 </p>
 
 <h3><a href="hgTrackUi?g=hprc2v21Sv">HPRC v2.1 SVs</a></h3>
 <p>
 Structural variants derived from the Human Pangenome Reference Consortium
 release-2.1 minigraph-cactus pangenome graph, built from 233 PacBio HiFi
 haplotype-resolved assemblies (CHM13 + diverse 1000 Genomes samples).
 About 596k SV-sized alleles (insertions and deletions) extracted from the
 graph with <tt>vg deconstruct</tt>.
 </p>
 
 <h3><a href="hgTrackUi?g=hgsvc2Sv">HGSVC2 32 SVs</a></h3>
 <p>
 Structural variants from 32 haplotype-resolved diploid genomes (HGSVC2
 freeze 4, Ebert et al. 2021). ~112k SVs (deletions, insertions and
 inversions) called from phased de novo assemblies with PAV, with
 per-variant 1000 Genomes population allele frequencies (insertions and
 deletions) and rich structural/gene annotations. This is an earlier HGSVC
 release, complementary to <a href="hgTrackUi?g=hgsvc3Sv">HGSVC3</a>.
 </p>
 
 <h3><a href="hgTrackUi?g=hgsvc3Sv">HGSVC3 65 SVs</a></h3>
 <p>
 Structural variants from 65 diverse individuals sequenced and de novo
 assembled by the Human Genome Structural Variation Consortium phase 3
 (HGSVC3). ~177k haplotype-resolved SVs (deletions, insertions and
 inversions) called with PAV and cross-validated with ten additional callers,
 with per-site carrier haplotype lists and structural annotations.
 </p>
 
 <h3><a href="hgTrackUi?g=kwanhoSv">Kim PD Brain SVs</a></h3>
 <p>
 Structural variants from 100 post-mortem brain samples (Parkinson's disease,
 incidental Lewy body disease, and healthy controls) sequenced with PacBio
 HiFi long reads. ~75k high-confidence SVs (deletions, insertions,
 duplications, inversions) with per-cohort allele frequencies and
 case-control carrier-rate differentials, from Kim et al. 2026.
 </p>
 
 <h3><a href="hgTrackUi?g=chirmade101Sv">SVatalog 101 SVs</a></h3>
 <p>
 Structural variants from 101 long-read whole-genome sequences released
 alongside the GWAS SVatalog tool (Chirmade et al. 2026). The samples come
 from the CF Canada-Sick Kids Program in Individual CF Therapy (CFIT), a
 cystic-fibrosis (CF) patient cohort assembled to model patient-specific
 responses to CFTR modulator therapies (most participants are F508del
 homozygotes or F508del / minimal-function compound heterozygotes; a smaller
 number carry rare nonsense or missense CFTR mutations). ~87k SVs
 (deletions, insertions, duplications, inversions and complex events)
 annotated with gene overlaps, ClinGen / gnomAD constraint scores,
 OMIM / ClinVar / DGV / Decipher regional annotations.
 </p>
 
 
 <h2>Data Access</h2>
 <p>
 Each subtrack has its own documentation page with details on how to download
 and intersect the underlying annotations.
 </p>
 
 <h2>References</h2>
 
 <p>
 Gong J, Sun H, Wang K, Zhao Y, Huang Y, Chen Q, Qiao H, Gao Y, Zhao J, Ling Y <em>et al</em>.
 <a href="https://doi.org/10.1038/s41467-025-56661-9" target="_blank">
 Long-read sequencing of 945 Han individuals identifies structural variants associated with
 phenotypic diversity and disease susceptibility</a>.
 <em>Nat Commun</em>. 2025 Feb 10;16(1):1494.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/39929826" target="_blank">39929826</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11811171/" target="_blank">PMC11811171</a>
 </p>
 
 <p>
 Schloissnig S, Pani S, Ebler J, Hain C, Tsapalou V, S&#246;ylev A, H&#252;ther P, Ashraf H, Prodanov T,
 Asparuhova M <em>et al</em>.
 <a href="https://doi.org/10.1038/s41586-025-09290-7" target="_blank">
 Structural variation in 1,019 diverse humans based on long-read sequencing</a>.
 <em>Nature</em>. 2025 Aug;644(8076):442-452.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/40702182" target="_blank">40702182</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12350158/" target="_blank">PMC12350158</a>
 </p>
 
 
 <p>
 Otsuki A, Okamura Y, Ishida N, Tadaka S, Takayama J, Kumada K, Kawashima J, Taguchi K, Minegishi N,
 Kuriyama S <em>et al</em>.
 <a href="https://doi.org/10.1038/s42003-022-03953-1" target="_blank">
 Construction of a trio-based structural variation panel utilizing activated T lymphocytes and
 long-read sequencing technology</a>.
 <em>Commun Biol</em>. 2022 Sep 20;5(1):991.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/36127505" target="_blank">36127505</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9489684/" target="_blank">PMC9489684</a>
 </p>
 
 
 
 <p>
 Garimella KV, Li Q, Wertz J, Lee SK, Cunial F, Huang Y, Mostovoy Y, Lorig-Roach R, English A, Su H
 <em>et al</em>.
 <a href="https://doi.org/10.1101/2025.10.02.25336942" target="_blank">
 Population-scale Long-read Sequencing in the All of Us Research Program</a>.
 <em>medRxiv</em>. 2025 Oct 5.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/41256123" target="_blank">41256123</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12622093/" target="_blank">PMC12622093</a>
 </p>
 
 
 
 <p>
 Cohen ASA, Farrow EG, Abdelmoity AT, Alaimo JT, Amudhavalli SM, Anderson JT, Bansal L, Bartik L,
 Baybayan P, Belden B <em>et al</em>.
 <a href="https://linkinghub.elsevier.com/retrieve/pii/S1098-3600(22)00653-0" target="_blank">
 Genomic answers for children: Dynamic analyses of &gt;1000 pediatric rare disease genomes</a>.
 <em>Genet Med</em>. 2022 Jun;24(6):1336-1348.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/35305867" target="_blank">35305867</a>
 </p>
 
 
 
 <p>
 Beyter D, Ingimundardottir H, Oddsson A, Eggertsson HP, Bjornsson E, Jonsson H, Atlason BA,
 Kristmundsdottir S, Mehringer S, Hardarson MT <em>et al</em>.
 <a href="https://doi.org/10.1038/s41588-021-00865-4" target="_blank">
 Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in
 human diseases and other traits</a>.
 <em>Nat Genet</em>. 2021 Jun;53(6):779-786.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33972781" target="_blank">33972781</a>
 </p>
 
 
 
 <p>
 Logsdon GA, Ebert P, Audano PA, Loftus M, Porubsky D, Ebler J, Yilmaz F, Hallast P, Prodanov T, Yoo
 D <em>et al</em>.
 <a href="https://doi.org/10.1038/s41586-025-09140-6" target="_blank">
 Complex genetic variation in nearly complete human genomes</a>.
 <em>Nature</em>. 2025 Aug;644(8076):430-441.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/40702183" target="_blank">40702183</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12350169/" target="_blank">PMC12350169</a>
 </p>
 
 
 
 <p>
 Kim K, Lin Z, Simmons SK, Parker J, Kearney M, Liao Z, Haywood N, Zhang J, Cline MP, Tuncali I
 <em>et al</em>.
 <a href="https://doi.org/10.64898/2026.03.20.713192" target="_blank">
 Integrating Long-Read Structural Variant Analysis with single-nucleus RNA-seq to Elucidate Gene
 Expression Effects in Disease</a>.
 <em>bioRxiv</em>. 2026 Mar 23.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/41929179" target="_blank">41929179</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13041997/" target="_blank">PMC13041997</a>
 </p>
 
 
 
 <p>
 Chirmade S, Wang Z, Mastromatteo S, Sanders E, Thiruvahindrapuram B, Nalpathamkalam T, Pellecchia G,
 Lin F, Keenan K, Patel RV <em>et al</em>.
 <a href="https://doi.org/10.1038/s41437-025-00809-2" target="_blank">
 GWAS SVatalog: a visualization tool to aid fine-mapping of GWAS loci with structural variations</a>.
 <em>Heredity (Edinb)</em>. 2026 Mar;135(3):199-210.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/41203876" target="_blank">41203876</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13031531/" target="_blank">PMC13031531</a>
 </p>
 
 
 
 <p>
 Gustafson JA, Gibson SB, Damaraju N, Zalusky MPG, Hoekzema K, Twesigomwe D, Yang L, Snead AA,
 Richmond PA, De Coster W <em>et al</em>.
 <a href="http://genome.cshlp.org/lookup/pmidlookup?view=long&amp;pmid=39358015" target="_blank">
 High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive
 catalog of human genetic variation</a>.
 <em>Genome Res</em>. 2024 Nov 20;34(11):2061-2073.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/39358015" target="_blank">39358015</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610458/" target="_blank">PMC11610458</a>
 </p>
 
 
 
 <p>
 Ebert P, Audano PA, Zhu Q, Rodriguez-Martin B, Porubsky D, Bonder MJ, Sulovari A, Ebler J, Zhou W,
 Serra Mari R <em>et al</em>.
 <a href="https://www.science.org/doi/10.1126/science.abf7117" target="_blank">
 Haplotype-resolved diverse human genomes and integrated analysis of structural variation</a>.
 <em>Science</em>. 2021 Apr 2;372(6537).
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33632895" target="_blank">33632895</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026704/" target="_blank">PMC8026704</a>
 </p>
 
 
 
 <p>
 Byrska-Bishop M, Evani US, Zhao X, Basile AO, Abel HJ, Regier AA, Corvelo A, Clarke WE, Musunuri R,
 Nagulapalli K <em>et al</em>.
 <a href="https://linkinghub.elsevier.com/retrieve/pii/S0092-8674(22)00991-6" target="_blank">
 High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602
 trios</a>.
 <em>Cell</em>. 2022 Sep 1;185(18):3426-3440.e19.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/36055201" target="_blank">36055201</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439720/" target="_blank">PMC9439720</a>
 </p>