src/hg/makeDb/trackDb/human/hg17/cnp.html 198c9b8daecc44fbda6a6494c566c723920f030a

198c9b8daecc44fbda6a6494c566c723920f030a
lrnassar
  Wed Mar 11 18:25:21 2026 -0700
Fixing a few hundred clear typos with the help of Claude. Some are less important in code comments, but the majority of them are in user-facing places. I manually approved 60%+ of the changes and didn't see any that were an incorrect suggestion; at worst a change was potentially unnecessary, like a code comment having cant instead of can't. No RM.

diff --git src/hg/makeDb/trackDb/human/hg17/cnp.html src/hg/makeDb/trackDb/human/hg17/cnp.html
index e26997edb24..d0140b6b60b 100644
--- src/hg/makeDb/trackDb/human/hg17/cnp.html
+++ src/hg/makeDb/trackDb/human/hg17/cnp.html
@@ -1,322 +1,322 @@
 <H2>Description</H2>
 <P>
 This annotation shows regions detected as putative copy number polymorphisms
 (CNP) and sites of detected intermediate-sized structural variation (ISV). 
 The CNPs and ISVs were determined by various methods, displayed in 
 individual subtracks within the annotation:</P>
 <UL>
 <LI>
 <B>BAC microarray analysis (Sharp):</B> 154 putative CNP regions detected by BAC
 microarray analysis in a population of 47 individuals comprised of 8 
 Chinese, 4 Japanese, 10 Czech, 2 Druze, 7 Biaka, 9 Mbuti, and 7 Amerindians. 
 <LI>
 <B>BAC microarray analysis (Iafrate):</B> 249 putative CNP regions detected by
 BAC microarray analysis in a population of 55 individuals, 16 of whom had
 previously-characterized chromosomal abnormalities. The group consisted of 10
 Caucasians, 4 Amerindians, 2 Chinese, 2 Indo-Pakistani, 2 Sub-Saharan
 African, and 35 of unknown ethnic origin.
 <LI>
 <B>Representational oligonucleotide microarray analysis (ROMA) (Sebat):</B> 72 putative
 CNP regions detected by ROMA in a population of 20 normal individuals comprised
 of 1 Biaka, 1 Mbuti, 1 Druze, 1 Melanesian, 4 French, 1 Venezuelan, 1 Cambodian,
 1 Mayan and 9 of unknown ethnicity.
 <LI>
 <B>Fosmid mapping (Tuzun):</B> 297 ISV sites detected by mapping paired-end sequences 
 from a human fosmid DNA library.
 <LI>
 <B>Deletions from genotype analysis (McCarroll):</B> 538 deletions detected
 by analysis of SNP genotypes, using the HapMap Phase I data, release 16a.
 <LI>
 <B>Deletions from genotype analysis (Conrad):</B> 910 deletions detected
 by analysis of SNP genotypes, using the HapMap Phase I data, release 16c.1, 
 CEU and YRI samples.
 <LI>
 <B>Deletions from haploid hybridization analysis (Hinds):</B> 100 deletions 
 from haploid hybridization analysis in 24 unrelated individuals from the 
 Polymorphism Discovery Resource, selected for SNP LD study.
 <LI>
 <B>SNP and BAC microarray analysis of HapMap data (Redon):</B> 1,447 copy number 
 variable regions found in the HapMap Phase II data.
 </UL></P>
 
 <H2>Display Conventions and Configuration</H2>
 <P>
 CNP and ISV regions are indicated by solid blocks that are color-coded to 
 indicate the type of variation detected:
 <UL>
 <LI>
 <B><FONT COLOR="green">Green</FONT>:</B> gain (duplications)
 <LI>
 <B><FONT COLOR="red">Red</FONT>:</B> loss (deletions)
 <LI>
 <B><FONT COLOR="blue">Blue</FONT>:</B> gain and loss (both deletion and duplication)
 <LI>
 <B>Black:</B> inversion
 <LI>
 <B><FONT COLOR="gray">Gray</FONT>:</B> gain or loss (unknown direction)
 </UL></P>
 <P>Note that display IDs are not preserved between assemblies.</P>
 
 <H3>Sharp subtrack </H3>
 <P>
 On the details pages for elements in this subtrack, 
 the table shows value/threshold data for each individual in the population.
 &quot;Value&quot; is defined as the log<sub>2</sub> ratio of fluorescence intensity of
 test versus reference DNA. &quot;Threshold&quot; is defined as 2 standard 
 deviations from the mean log<sub>2</sub> ratio of all autosomal clones per 
 hybridization. 
 The &quot;Disease Percent&quot; value reflects the percent of the BAC that lies 
 within a &quot;rearrangement hotspot&quot;, as defined in Sharp <em>et al</em>. 
 (2005). A 
 rearrangement hotspot is defined by the presence of flanking intrachromosomal 
 duplications &gt;10 kb in length with &gt;95% similarity and separated by 
 50 kb - 10 Mb of intervening sequence.</P>
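 <P>As a rough illustration of the value/threshold definitions above, the
 following Python sketch computes a log<sub>2</sub> ratio and flags clones whose
 value lies more than 2 standard deviations from the mean of one hybridization.
 It is not the authors' code: the function names, the use of the sample
 standard deviation, and the gain/loss labels are assumptions made for this
 example.</P>

```python
import math
import statistics


def log2_ratio(test_intensity, reference_intensity):
    """'Value': log2 ratio of test versus reference fluorescence intensity."""
    return math.log2(test_intensity / reference_intensity)


def call_clones(ratios, n_sd=2.0):
    """Flag clones beyond n_sd standard deviations from the mean log2 ratio.

    ratios: dict mapping a clone name to its log2 ratio, for the autosomal
    clones of a single hybridization (hypothetical input layout).
    """
    values = list(ratios.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    upper = mean + n_sd * sd
    lower = mean - n_sd * sd

    calls = {}
    for clone, value in ratios.items():
        if value > upper:
            calls[clone] = "gain"
        elif value < lower:
            calls[clone] = "loss"
        else:
            calls[clone] = "within threshold"
    return calls
```

 <P>Because the threshold is derived per hybridization, the same log<sub>2</sub>
 ratio can exceed the threshold in one experiment and not in another.</P>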
 
 <H3>Tuzun subtrack</H3>
 <P>
 Items are labeled using the following naming convention:
 <UL>
 <LI><B>First letter:</B> rearrangement type (<B>D</B>=deletion, <B>I</B>=insertion, 
 <B>V</B>=inversion).
 <LI><B>Second letter:</B> association with repeat or duplication
 (<B>R</B>=human-specific repeat, <B>D</B>=duplication, <B>N</B>=neither 
 (unique)).
 <LI><B>Third letter:</B> second haplotype support (<B>N</B>=variant site lacking
 support from the human genome reference, <B>S</B>=variant site with support 
 from the human genome reference). 
 </UL></P>
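 <P>The naming convention above can be decoded mechanically. The Python sketch
 below is a minimal illustration; it assumes that the first three characters of
 an item label carry the three letters, and the function and table names are
 hypothetical.</P>

```python
# Lookup tables transcribed from the naming convention described above.
REARRANGEMENT = {"D": "deletion", "I": "insertion", "V": "inversion"}
ASSOCIATION = {
    "R": "human-specific repeat",
    "D": "duplication",
    "N": "neither (unique)",
}
SUPPORT = {
    "N": "variant site lacking support from the human genome reference",
    "S": "variant site with support from the human genome reference",
}


def decode_label(label):
    """Decode the first three letters of a Tuzun item label.

    Assumes the label starts with the three code letters; anything after
    them is ignored. Raises ValueError for unknown or missing letters.
    """
    if len(label) < 3:
        raise ValueError("label too short: %r" % label)
    first, second, third = label[0], label[1], label[2]
    try:
        return (REARRANGEMENT[first], ASSOCIATION[second], SUPPORT[third])
    except KeyError as err:
        raise ValueError("unknown code letter in %r" % label) from err
```

 <P>For example, <code>decode_label("DRS")</code> describes a deletion associated
 with a human-specific repeat at a variant site with support from the human
 genome reference.</P>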
 
 <H3>Conrad subtrack</H3>
 <P>
 The method used to identify these deletions approximates the breakpoints of each
 event; therefore, a set of minimal and maximal endpoints is associated with each
 deletion. Thick lines delineate
 the minimally deleted region; thin lines delineate the maximally deleted region.
 
 <H2>Methods</H2>
 
 <H3>Sharp BAC microarray analysis</H3>
 <P>
 All hybridizations were performed in duplicate incorporating a dye-reversal 
 using a custom array consisting of 2,194 end-sequence or FISH-confirmed BACs, 
 targeted to regions of the genome flanked by segmental duplications. 
 The false positive rate was estimated at ~3 clones per 4,000 tested.</P>
 <P>
 Note that CNP intervals, as detailed by Sharp <em>et al</em>., were 
 converted from the July 2003 human genome assembly (NCBI Build 34) to the 
 May 2004 assembly (NCBI Build 35) using BLAT alignments of BAC End
 pairs and the UCSC 
 <A HREF="http://genome.soe.ucsc.edu/goldenPath/help/hgTracksHelp.html#Convert">liftOver</A>
 tool.</P>
 
 <H3>Iafrate BAC microarray analysis</H3>
 <P>
 All hybridizations were performed in duplicate incorporating a dye-reversal 
 using proprietary 1 Mb GenomeChip V1.2 Human BAC Arrays consisting of 2,632 BAC 
 clones (Spectral Genomics, Houston, TX). The false positive rate was estimated 
 at ~1 clone per 5,264 tested. </P>
 <P>
 Further information is available from the 
 <A HREF="http://projects.tcag.ca/variation/" TARGET=_blank>Database of Genomic
 Variants</A> website.</P>
 <P>
 Note that CNP intervals, as detailed by Iafrate <em>et al</em>., were 
 converted from the July 2003 human genome assembly (NCBI Build 34) to the 
 May 2004 assembly (NCBI Build 35) using the UCSC 
 <A HREF="http://genome.soe.ucsc.edu/goldenPath/help/hgTracksHelp.html#Convert">liftOver</A>
 tool.</P>
 
 <H3>Sebat ROMA</H3>
 <P>
 Following digestion with BglII or HindIII, genomic DNA was hybridized to a 
 custom array consisting of 85,000 oligonucleotide probes. The probes were 
 selected to be free of common repeats and have unique homology within the human 
 genome. The average resolution of the array was ~35 kb; however, only intervals 
 in which three consecutive probes showed concordant signals were scored as 
 CNPs. All hybridizations were performed in duplicate incorporating a 
 dye-reversal, with the false positive rate estimated to be ~6%.</P>
 <P>
 Note that CNP intervals, as detailed by Sebat <em>et al</em>., were 
 converted from the April 2003 human genome assembly (NCBI Build 33) to the 
 July 2003 assembly (NCBI Build 34) and the May 2004 assembly
 (NCBI Build 35) using the UCSC 
 <A HREF="http://genome.soe.ucsc.edu/goldenPath/help/hgTracksHelp.html#Convert">liftOver</A>
 tool.</P>
 
 <H3>Tuzun fosmid mapping</H3>
 <P>
 Paired-end sequences from a human fosmid DNA library were mapped to the assembly. 
 The average resolution of this 
 technique was ~8 kb, and included 56 sites of inversion not detectable by 
 the array-based approaches.  However, because of the physical constraints of 
 fosmid insert size, this technique was unable to detect insertions greater than 
 40 kb in size.</P>
 
 <H3>McCarroll genotype analysis</H3>
 <P>
 A segregating deletion can leave &quot;footprints&quot; in SNP genotype data, including
 apparent deviations from Mendelian inheritance, apparent deviations from
 Hardy-Weinberg equilibrium and null genotypes.  Using these clues to discover
 true variants is challenging, however, because the vast majority of such observations
 represent technical artifacts and genotyping errors.
 </P>
 <P>
 To determine whether a subset of &quot;failed&quot; SNP genotyping assays in the HapMap data
 might reflect structural variation, the authors examined whether such failures
 were physically clustered in a manner that is specific to individuals.  Consistent
 with this hypothesis, the rate of Mendelian-inconsistent genotypes was elevated
 near other Mendelian-inconsistent genotypes in the same individual but was unrelated to
 Mendelian inconsistencies in other individuals.
 </P>
 <P>
 The authors systematically looked for regions of the genome in which the
 same failure profile appeared repeatedly at nearby markers in a manner that
 was statistically unexpected based on chance.  A set of statistical thresholds was 
 tailored to each mode of failure, genotyping center and genotyping platform used in the
 project.  The same procedure could readily apply to dense SNP data from any
 platform or study.</P>
 <P>
 Note that deletions as detailed by McCarroll <em>et al</em>. were 
 converted from the July 2003 human genome assembly (NCBI Build 34) to the 
 May 2004 assembly (NCBI Build 35) using the UCSC 
 <A HREF="http://genome.soe.ucsc.edu/goldenPath/help/hgTracksHelp.html#Convert">liftOver</A>
 tool.</P>
 
 
 <H3>Conrad genotype analysis</H3>
 <P>
 SNPs in regions that are hemizygous for a deletion are generally miscalled as homozygous 
 for the allele that is present.  Hence, when a deletion is transmitted from parent to child, 
 the genotypes at SNPs within the deletion region will often appear to violate the rules of Mendelian 
 transmission.  The authors developed a simple algorithm for scanning trio data for unusual runs of 
 consecutive SNPs that, in a single family, have genotype configurations consistent with the presence of a deletion. 
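 <P>The idea can be sketched in Python as follows. This is a deliberately
 simplified illustration, not the algorithm published by Conrad <em>et al</em>.:
 it flags only SNPs at which the child and one parent appear homozygous for
 different alleles, it does not track which parent is implicated, and the
 <code>min_run</code> cutoff and the genotype encoding (two-character strings
 such as "AA" or "AB") are assumptions of this example.</P>

```python
def is_homozygous(genotype):
    """True for a two-allele genotype string such as 'AA'."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def looks_like_transmitted_deletion(father, mother, child):
    """True if the child and one parent appear homozygous for different alleles.

    A hemizygous SNP tends to be called homozygous for the allele that is
    present, so a transmitted deletion can produce this Mendelian violation.
    """
    if not is_homozygous(child):
        return False
    for parent in (father, mother):
        if is_homozygous(parent) and parent[0] != child[0]:
            return True
    return False


def find_candidate_runs(trio_genotypes, min_run=2):
    """Return (start, end) index pairs of runs of flagged consecutive SNPs.

    trio_genotypes: list of (father, mother, child) genotype strings for one
    family, ordered by genomic position. Indices are inclusive.
    """
    runs = []
    start = None
    for i, (father, mother, child) in enumerate(trio_genotypes):
        if looks_like_transmitted_deletion(father, mother, child):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(trio_genotypes) - start >= min_run:
        runs.append((start, len(trio_genotypes) - 1))
    return runs
```

 <P>Requiring a run of consecutive flagged SNPs, rather than a single one, is
 what separates a candidate deletion from an isolated genotyping error in this
 sketch.</P>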
 
 <P>
 Note that deletions as detailed by Conrad <em>et al</em>. were 
 converted from the July 2003 human genome assembly (NCBI Build 34) to the 
 May 2004 assembly (NCBI Build 35) using the UCSC 
 <A HREF="http://genome.soe.ucsc.edu/goldenPath/help/hgTracksHelp.html#Convert">liftOver</A>
 tool.</P>
 
 <H3>Hinds haploid hybridization analysis</H3>
 <P>
 Approximately 600 Mb of genomic DNA from 24 unrelated individuals
 were obtained from the Polymorphism Discovery Resource.
 Haploid hybridization was used to identify genomic intervals
 showing a reduced hybridization signal in comparison to the reference
 assembly.   PCR amplification was performed on 215 candidate deletions.
 100 deletions were selected that were unambiguously confirmed.
 <H3>Redon analysis of HapMap data</H3>
 <P>Experiments were performed with the International HapMap DNA and cell-line collection
 using two technologies: comparative analysis of hybridization intensities on
 Affymetrix GeneChip Human Mapping 500K early access arrays (500K EA)
 and comparative genomic hybridization with a Whole Genome TilePath (WGTP)
 array.
 
 <H2>Validation</H2>
 <H3>McCarroll genotype analysis</H3>
 <P>
 Four methods of validation were used: 
 fluorescent <em>in situ</em> hybridization (FISH), 
 two-color fluorescence intensity measurements, PCR amplification and quantitative PCR.
 </P>
 <P>
 The authors performed fluorescent <em>in situ</em> hybridization for five
 candidate deletions large enough to span available FISH probes.  In all five cases,
 FISH assays confirmed the deletions in the predicted individuals.
 </P>
 <P>
 The authors examined two-color allele-specific fluorescence data from SNP genotyping
 assays from a data subset available at the Broad Institute, looking for a 
 reduction in fluorescence intensity in individuals predicted to carry a 
 deletion.  At most SNPs
 in the genome, fluorescence intensity measurements clustered into two or three
 discrete groups corresponding to homozygous and heterozygous genotypes.
 At 15 of 17 candidate deletion loci, fluorescence intensity data for one or more
 SNPs clustered into additional groups that corresponded to the predicted deletion
 genotypes.
 </P>
 <P>
 The authors used PCR amplification to query 60 loci for which the pattern of genotypes
 suggested multiple individuals with homozygous deletions.  Variants were considered
 confirmed if the pattern of amplification success and failure matched prediction
 across a set of 12-24 individuals.   The authors confirmed 51 of 60 candidate
 variants by this criterion.
 </P>
 <P>
 The authors performed quantitative PCR in all 269 HapMap DNA samples for 11 candidate
 deletions that overlapped the coding exons of genes and that were discovered in
 many individuals. At 10/11 loci, the authors observed three discrete clusters, identifying 
 individuals with zero, one and two gene copies.
 All 60 trios displayed Mendelian inheritance for the ten deletions, as well as
 Hardy-Weinberg equilibrium in all four populations surveyed, and transmission rates
 close to 50%.   This suggests that the deletions behave as a stable, heritable
 genetic polymorphism.
 </P>
 
 <H3>Conrad genotype analysis</H3>
 <P>
 The authors first tested 12 predicted deletions using quantitative PCR. 
 For all 12 deletions, DNA concentrations consistent with transmission of a
 deletion from parent to child were observed.
 <P>To provide more extensive validation by comparative genome hybridization (CGH), the authors designed a 
 custom oligonucleotide microarray comprised of 380,000 probes that tile across all 134 candidate deletions 
 identified in 9 HapMap offspring (8 YRI and 1 CEU). 
 The results of this CGH analysis indicate that the majority (about 85%) of candidate deletions detected 
 by the method are real.
 
 <H3>Redon analysis of HapMap data</H3>
-The authors utilized numerous quality meaures, including
+The authors utilized numerous quality measures, including
 repeated experiments on the WGTP array for 82 individuals and on the 500K EA
 array for 15 individuals.
 The average false-positive rate per experiment was held beneath 5%.  Aberrant chromosomes were
 removed from the analysis.    Further details are available in the Nature paper cited below.
 <H2>References</H2>
 <P>
 Conrad, D., Andrews, T.D., Carter, N.P., Hurles, M.E., Pritchard, J.K.
 <A HREF="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16327808&query_hl=1"
 TARGET=_blank>A high-resolution survey of deletion polymorphism in the human genome</A>.
 <em>Nature Genet</em> <B>38</B>(1), 75-81 (2006). </P>
 <P>
 Hinds, D., Kloek, A.P., Jen, M., Chen, X., Frazer, K.A.
 <A HREF="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16327809&query_hl=1"
 TARGET=_blank>Common deletions and SNPs are in linkage disequilibrium in the human genome</A>.
 <em>Nature Genet</em> <B>38</B>(1), 82-85 (2006). </P>
 <P>
 Iafrate, J.A., Feuk, L., Rivera, M.N., Listewnik, M.L., Donahoe, P.K., Qi, Y., 
 Scherer, S.W. and Lee, C.  
 <A HREF="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=15286789&query_hl=1"
 TARGET=_blank>Detection of large-scale variation in the human genome</A>.
 <em>Nature Genet</em> <B>36</B>(9), 949-51 (2004). </P>
 <P>
 McCarroll, S.A., Hadnott, T.N., Perry, G.H., Sabeti, P.C., 
 Zody, M.C., Barrett, J.C., Dallaire, S., Gabriel, S., Lee, C., Daly, M.J., 
 Altshuler, D.M.
 <A HREF="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16468122&query_hl=1"
 TARGET=_blank>Common deletion polymorphisms in the human genome</A>.
 <em>Nature Genet</em> <B>38</B>(1), 86-92 (2006). </P>
 <P>
 Redon, R., Ishikawa, S., Fitch, K., Feuk, L., Perry, G., Andrews, T., Fiegler, H.,
 Lee, C., Jones, K., Scherer, S., Hurles, M. <em>et al</em>.
 <A HREF="https://www.nature.com/articles/nature05329" TARGET=_blank>
 Global variation in copy number in the human genome</A>.
 <em>Nature</em> <B>444</B>(7118), 444-454 (2006).
 <P>
 Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., 
 Maner, S., Massa, H., Walker, M., Chi, M. <em>et al</em>.
 <A HREF="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;dopt=Abstract&amp;list_uids=15273396&amp;query_hl=3"
 TARGET=_blank>Large-scale copy number polymorphism in the human genome</A>.
 <em>Science</em> <B>305</B>(5683), 525-8 (2004).</P>
 <P>
 Sharp, A.J., Locke, D.P., McGrath, S.D., Cheng, Z., Bailey, J.A., Samonte, R.V.,
 Pertz, L.M., Clark, R.A., Schwartz, S., Segraves, R. <em>et al</em>.
 <A HREF="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=15918152&query_hl=3"
 TARGET=_blank>Segmental duplications and copy number variation in the human 
 genome</A>. 
 <em>Am J Hum Genet</em> <B>77</B>(1), 78-88 (2005).</P>
 <P>
 Tuzun, E., Sharp, A.J., Bailey, J.A., Kaul, R., Morrison, V.A., Pertz, L.M., 
 Haugen, E., Hayden, H., Albertson, D., Pinkel, D. <em>et al</em>.
 <A HREF="https://www.nature.com/articles/ng1562"
 TARGET=_blank>Fine-scale structural variation of the human genome</A>. 
 <em>Nature Genet</em> <B>37</B>(7), 727-32 (2005). </P>