6b1940d180706a84a5cdbf53dd59cff08a21d402 dschmelt Mon Apr 22 17:00:26 2019 -0700 Updating sentence in vcfHelp doc #22039 diff --git src/hg/htdocs/goldenPath/help/hgVcfTrackHelp.html src/hg/htdocs/goldenPath/help/hgVcfTrackHelp.html index 45b4916..e70b51f 100755 --- src/hg/htdocs/goldenPath/help/hgVcfTrackHelp.html +++ src/hg/htdocs/goldenPath/help/hgVcfTrackHelp.html @@ -1,134 +1,135 @@ <!DOCTYPE html> <!--#set var="TITLE" value="Genome Browser VCF Tracks" --> <!--#set var="ROOT" value="../.." --> <!-- Relative paths to support mirror sites with non-standard GB docs install --> <!--#include virtual="$ROOT/inc/gbPageStart.html" --> <!-- VCF Track Help Page --> <a name="Topic5"></a> <h1>Configuring VCF tracks</h1> <p> Genome Browser VCF tracks may be configured in a variety of ways to highlight different aspects of the displayed information. By default, VCFs will display alleles with base-specific coloring. Homozygote data are shown as one letter, while heterozygotes will be displayed with both letters. </p> <p class='text-center'> <img src="../../images/vcfDefault.png" alt="VCF default display" width=600px> <p class='gbsCaption text-center'>The default VCF custom track will display colored bases and will not show clustering unless specified as VCF/tabix in the custom track page.</p> </p> <p> The following section describes configuration settings available to VCF files compressed and indexed in the Tabix format. This requires VCF manipulation, separate index files, and a web accessible directory to reference from the bigDataUrl track line. For more information on setting up and uploading VCF/Tabix data, click the link on <a href="vcf.html" target="_blank"> VCF custom track creation</a>. </p> <h2>Configuring the haplotype sorting display</h2> <p> If the VCF file contains genotype columns for at least two samples (four haplotypes), then a -haplotype sorting display can be configured. 
This can be useful for determining similarity -between the samples and possible inheritance at a particular loci. </p> +haplotype sorting display can be configured. This can be useful for determining the similarity +between the samples and inferring inheritance at a particular locus. +</p> <p> <strong>Enable Haplotype sorting display:</strong> When this option is checked, each sample's phased and/or homozygous genotypes are split into haplotypes, clustered by similarity around a central variant, and sorted for display by their position in the clustering tree. The tree (as space allows) is drawn in the label area next to the track image. Leaf clusters, in which all haplotypes are identical (at least for the variants used in clustering), are colored purple. </p> <p class='text-center'> <img src="../../images/vcfTree.png" alt="VCF tree diagram" width=600px> <p class='gbsCaption text-center'>The haplotype tree can be seen to the left of the track.</p> </p> <p> Each variant is drawn as a vertical column, using color to distinguish between reference alleles and alternate alleles of the horizontally running haplotypes. If unchecked, then the display is the same as for VCF without genotypes: a stacked bar graph of the top two alleles, showing the proportion of alleles if allele counts are available. This setting is enabled by default.</p> <p> The following options are applicable only when the haplotype sorting display is enabled:</p> <p> <strong>Haplotype sorting order:</strong> Haplotypes are sorted using a distance function that uses a central variant. Differences between haplotypes are penalized with weights that decrease for each successive variant away from the central variant. By default, the median variant in the window is used. 
Clicking a variant in the display gives you the option to always use that variant as the central variant when it is in the current view.</p> <p> <strong>Haplotype coloring scheme:</strong> There are three ways that reference and alternate alleles can be colored:</p> <ul> <li> By default, the reference allele is invisible and the alternate allele is black. When multiple haplotypes must be combined into the same pixel row, grayscale is used to shade according to the proportions of reference and alternate alleles. The central variant has a thin purple outline. Extra pixel rows at the top and bottom show the locations of variants in case they are hard to see when the invisible reference allele is the major allele. Variants used in clustering have purple marks in these rows; variants outside the clustered regions have black marks.</li> <li> The reference allele is blue and the alternate allele is red. Purple indicates a mix of reference and alternate alleles. The central variant has a thick black outline.</li> <li> Both alleles are colored using the same color scheme as when there are no genotypes: A is red, C is blue, G is green and T is magenta. Gray indicates a mix of reference and alternate alleles. The central variant has a thick black outline.</li> </ul> <p> In all coloring modes, if some alleles in a haplotype are undefined, a pale yellowish color is used for those alleles.</p> <p> <strong>Haplotype clustering leaf shape:</strong> Leaf clusters are collections of identical haplotypes. By default, they are drawn as open triangles (&lt;). They can also be displayed as open rectangles ([).</p> <p class='text-center'> <img src="../../images/vcfSquare.png" alt="VCF options" width=600px> </p> <p> <strong>Haplotype sorting display height:</strong> This number represents the track height in pixels. 
If the height in pixels is less than the number of haplotypes (2 * the number of genotype columns), some horizontal pixel rows must represent multiple haplotypes, with differing haplotypes' colors combined according to the selected coloring scheme.</p> <p class='text-center'> <img src="../../images/vcfOptions.png" alt="VCF options" width=600px> </p> <h2>Filtering out variants</h2> <p> Variants can be filtered out of the display according to several properties:</p> <ul> <li> <strong>Exclude items with QUAL score less than <em>N</em>:</strong> If the checkbox is checked, then all variants whose QUAL column has a non-numeric value (e.g. ".") or a value less than <em>N</em> are excluded from display. By default, the checkbox is not checked and <em>N</em> is 0.</li> <li> <strong>Exclude items with these FILTER values:</strong> This option appears only if the VCF header defines at least one FILTER code. There is a checkbox for each code defined in the header. If checked, then all variants with that code in the FILTER column are excluded from display. By default, no checkboxes are checked, so all variants are displayed regardless of FILTER column values.</li> <li> <strong>Minimum minor allele frequency (if INFO column includes AF or AC+AN):</strong> If a variant's INFO field includes AF (alternate allele frequency) or both AC and AN (alternate allele count and total number of alleles), then its minor allele frequency can be compared against this threshold. If the minor allele frequency is less than the threshold, the variant will not be displayed.</li> </ul> <p class='text-center'> <img src="../../images/vcfFilters.png" alt="VCF options" width=600px> </p> <p> When you have finished making your configuration changes, click the <button>Submit</button> button to return to the annotation track display page.</p> <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->