a8e19c9657462c53e955bb58de7ef4093fda6d9c
hiram
  Wed Oct 9 11:43:23 2019 -0700
adding data access section and correct reference refs #21784

diff --git src/hg/makeDb/trackDb/human/platinumGenomes.html src/hg/makeDb/trackDb/human/platinumGenomes.html
new file mode 100644
index 0000000..53d8fe5
--- /dev/null
+++ src/hg/makeDb/trackDb/human/platinumGenomes.html
@@ -0,0 +1,57 @@
+<h2>Abstract</h2>
+
+<p>
+Improvement of variant calling in next-generation sequence data requires
+a comprehensive, genome-wide catalog of high-confidence variants called in
+a set of genomes for use as a benchmark. We generated deep, whole-genome
+sequence data of 17 individuals in a three-generation pedigree and called
+variants in each genome using a range of currently available algorithms.
+We used haplotype transmission information to create a phased "Platinum"
+variant catalog of 4.7 million single-nucleotide variants (SNVs)
+plus 0.7 million small (1-50 bp) insertions and deletions (indels) that are
+consistent with the pattern of inheritance in the parents and 11 children
+of this pedigree. Platinum genotypes are highly concordant with the current
+catalog of the National Institute of Standards and Technology for
+both SNVs (&gt;99.99%) and indels (99.92%) and add a validated truth catalog
+that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that
+were consistent between informatics pipelines yet inconsistent with haplotype
+transmission ("nonplatinum") revealed that the majority of these variants
+are de novo and cell-line mutations or reside within previously unidentified
+duplications and deletions. The reference materials from this study are a
+resource for objective assessment of the accuracy of variant calls
+throughout genomes.
+</p>
+
+<p>
+The 'hybrid' truthsets were generated by merging Genome in a Bottle
+high-confidence calls (hg001, v3.3.2) with those from the Platinum
+Genomes truthset for the same sample (NA12878, v2017-1.0). Merged
+records were validated by performing a k-mer test on alignments from
+the lower pedigree CEPH 1463 (11 children). Records with k-mer support
+via haplotype inheritance were added to the hybrid truthset.
+</p>
+
+<h2>Data Access</h2>
+<p>
+The VCF files for this track can be obtained from the download server:
+<a href="https://hgdownload.soe.ucsc.edu/gbdb/$db/platinumGenomes/" target="_blank">
+https://hgdownload.soe.ucsc.edu/gbdb/$db/platinumGenomes/</a>.<br>
+These files were obtained from the Platinum Genomes source archive:
+<a href="https://s3.eu-central-1.amazonaws.com/platinum-genomes/2017-1.0/ReleaseNotes.txt" target="_blank">https://s3.eu-central-1.amazonaws.com/platinum-genomes/2017-1.0/ReleaseNotes.txt</a>.
+</p>
+
+<h2>Reference</h2>
+
+<a href="https://genome.cshlp.org/content/27/1/157" target="_blank">
+A reference data set of 5.4 million phased human variants
+validated by genetic inheritance from sequencing a three-generation
+17-member pedigree</a><br>
+<em>Genome Research</em>. 2017 Jan;27(1):157-164. doi: 10.1101/gr.210500.116. Epub 2016 Nov 30.<br>
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/27903644" target="_blank">27903644</a><br>
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5204340/" target="_blank">PMC5204340</a>
+<p>
+Michael A. Eberle, Epameinondas Fritzilas, Peter Krusche, Morten Källberg,
+Benjamin L. Moore, Mitchell A. Bekritsky, Zamin Iqbal, Han-Yu Chuang,
+Sean J. Humphray, Aaron L. Halpern, Semyon Kruglyak, Elliott H. Margulies,
+Gil McVean and David R. Bentley
+</p>