94beebccbdcae3b709516cdacd2f42e2372ebae2 jnavarr5 Wed Sep 25 16:26:39 2019 -0700 Adding a note to other help pages about the 'useOneFile on' hub setting, refs #19383 diff --git src/hg/htdocs/goldenPath/help/vcf.html src/hg/htdocs/goldenPath/help/vcf.html index 56b1398..fe119c1 100755 --- src/hg/htdocs/goldenPath/help/vcf.html +++ src/hg/htdocs/goldenPath/help/vcf.html @@ -1,191 +1,195 @@ <!DOCTYPE html> <!--#set var="TITLE" value="Genome Browser VCF+tabix Track Format" --> <!--#set var="ROOT" value="../.." --> <!-- Relative paths to support mirror sites with non-standard GB docs install --> <!--#include virtual="$ROOT/inc/gbPageStart.html" --> <h1>VCF+tabix Track Format</h1> <p> <a href="http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41" target="_blank">Variant Call Format</a> (VCF) is a flexible and extendable line-oriented text format developed by the <a href="http://www.1000genomes.org/" target="_blank">1000 Genomes Project</a> for releases of single nucleotide variants, indels, copy number variants and structural variants discovered by the project. When a VCF file is compressed and indexed using <a href="http://samtools.sourceforge.net/tabix.shtml" target="_blank">tabix</a>, and made web-accessible, the Genome Browser is able to fetch only the portions of the file necessary to display items in the viewed region. This makes it possible to display variants from files that are so large that the connection to UCSC would time out when attempting to upload the whole file to UCSC. Both the VCF file and its tabix index file remain on your web-accessible server (http, https, or ftp), not on the UCSC server. UCSC temporarily caches the accessed portions of the files to speed up interactive display. If you do not have access to a web-accessible server and need hosting space for your VCF files, please see the <a href="hgTrackHubHelp.html#Hosting">Hosting</a> section of the Track Hub Help documentation.</p> <p>The UCSC tools support VCF versions 3.3 and greater.</p> <h2>Generating a VCF track</h2> <p> The typical workflow for generating a VCF custom track is this:</p> <ol> <li> If you haven't done so already, <a href="http://sourceforge.net/projects/samtools/files/tabix/" target="_blank"> download</a> and build the <a href="http://samtools.sourceforge.net/tabix.shtml" target="_blank">tabix and bgzip</a> programs. Test your installation by running tabix with no command-line arguments; it should print a brief usage message. For help with tabix, please contact the <a href="https://lists.sourceforge.net/lists/listinfo/samtools-help" target="_blank">samtools-help mailing list</a> (tabix is part of the samtools project).</li> <li> Create VCF or convert another format to VCF. Items must be sorted by genomic position.</li> <li> Compress your <em>.vcf</em> file using the <code>bgzip</code> program: <pre><code>bgzip my.vcf</code></pre> For more information about the <code>bgzip</code> command, run it with no arguments to display the usage message.</li> <li> Create a tabix index file for the bgzip-compressed VCF (<em>.vcf.gz</em>): <pre><code>tabix -p vcf my.vcf.gz</code></pre> The tabix command appends <em>.tbi</em> to the <em>my.vcf.gz</em> filename, creating a binary index file named <em>my.vcf.gz.tbi</em> with which genomic coordinates can quickly be translated into file offsets in <em>my.vcf.gz</em>.</li> <li> Move both the compressed VCF file (<em>my.vcf.gz</em>) and tabix index file (<em>my.vcf.gz.tbi</em>) to an http, https, or ftp location.</li> <li> Construct a <a href="hgTracksHelp.html#CustomTracks">custom track</a> using a single <a href="hgTracksHelp.html#TRACK">track line</a>. The basic version of the track line will look something like this: <pre><code>track type=vcfTabix name="My VCF" bigDataUrl=<em>http://myorg.edu/mylab/my.vcf.gz</em></code></pre> Again, in addition to <em>http://myorg.edu/mylab/my.vcf.gz</em>, the associated index file <em>http://myorg.edu/mylab/my.vcf.gz.tbi</em> must also be available at the same location.</li> <li> Paste the custom track line into the text box in the <a href="../../cgi-bin/hgCustom" target="_blank">custom track management page</a>, click "submit" and view in the Genome Browser.</li> </ol> <h2>Parameters for VCF custom track definition lines</h2> <p> All options are placed in a single line separated by spaces (lines are broken only for readability here):</p> <pre><code><strong>track type=vcfTabix bigDataUrl=</strong><em>http://...</em> <strong>hapClusterEnabled=</strong><em>true|false</em> <strong>hapClusterColorBy=</strong><em>altOnly|refAlt|base</em> <strong>hapClusterTreeAngle=</strong><em>triangle|rectangle</em> <strong>hapClusterHeight=</strong><em>N</em> <strong>applyMinQual=</strong><em>true|false</em> <strong>minQual=</strong><em>Q</em> <strong>minFreq=</strong><em>F</em> <strong>name=</strong><em>track_label</em> <strong>description=</strong><em>center_label</em> <strong>visibility=</strong><em>display_mode</em> <strong>priority=</strong><em>priority</em> <strong>db=</strong><em>db</em> <strong>maxWindowToDraw=</strong><em>N</em> <strong>chromosomes=</strong><em>chr1,chr2,...</em> </code></pre> <p> Note if you copy/paste the above example, you must remove the line breaks. Click <a href="examples/vcfExample.txt">here</a> for a text version that you can paste without editing.</p> <p> The track type and bigDataUrl are REQUIRED:</p> <pre><code><strong>type=vcfTabix bigDataUrl=</strong><em>http://myorg.edu/mylab/my.vcf.gz</em></strong> </code></pre> <p> The remaining settings are OPTIONAL. Some are specific to VCF:</p> <pre><code><strong>hapClusterEnabled </strong><em>true|false </em> # if file has phased genotypes, sort by local similarity <strong>hapClusterColorBy </strong><em>altOnly|refAlt|base </em> # coloring scheme, default altOnly, conditional on hapClusterEnabled <strong>hapClusterTreeAngle </strong><em>triangle|rectangle </em> # draw leaves as < or [, default <, conditional on hapClusterEnabled <strong>hapClusterHeight </strong><em>N </em> # height of track in pixels, default 128, conditional on hapClusterEnabled <strong>applyMinQual </strong><em>true|false </em> # if true, don't display items with QUAL < minQual; default false <strong>minQual </strong><em>Q </em> # minimum value of Q column to display item, conditional on applyMinQual <strong>minFreq </strong><em>F </em> # minimum minor allele frequency to display item; default 0.0 </code></pre> <p> Other optional settings are not specific to VCF, but relevant:</p> <pre><code><strong>name </strong><em>track label </em> # default is "User Track" <strong>description </strong><em>center label </em> # default is "User Supplied Track" <strong>visibility </strong><em>squish|pack|full|dense|hide</em> # default is hide (will also take numeric values 4|3|2|1|0) <strong>priority </strong><em>N </em> # default is 100 <strong>db </strong><em>genome database </em> # e.g. hg19 for Human Feb. 2009 (GRCh37) <strong>maxWindowToDraw </strong><em>N </em> # don't display track when viewing more than N bases <strong>chromosomes </strong><em>chr1,chr2,... </em> # track contains data only on listed reference assembly sequences </code></pre> The <a target="_blank" href="hgVcfTrackHelp.html">VCF track configuration</a> help page describes the VCF track configuration page options.</p> <h3>Example #1</h3> <p> In this example, you will create a custom track for an indexed VCF file that is already on a public server — variant calls generated by the <a href="http://1000genomes.org/" target="_blank">1000 Genomes Project</a>. The line breaks inserted here for readability must be removed before submitting the track line:</p> <pre><code>browser position chr21:33,034,804-33,037,719 track type=vcfTabix name="VCF Example One" description="VCF Ex. 1: 1000 Genomes phase 1 interim SNVs" chromosomes=chr21 maxWindowToDraw=200000 db=hg19 visibility=pack bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/vcfExample.vcf.gz </code></pre> <p> The "browser" line above is used to view a small region of chromosome 21 with variants from the <em>.vcf.gz</em> file.</p> <p> Note if you copy/paste the above example, you must remove the line breaks (or, click <a href="examples/vcfExampleOne.txt">here</a> for a text version that you can paste without editing).</p> <p> Paste the "browser" line and "track" line into the <a href="../../cgi-bin/hgCustom" target="_blank">custom track management page</a> for the human assembly hg19 (Feb. 2009), then click the "submit" button. On the following page, click the "chr21" link in the custom track listing to view the VCF track in the Genome Browser.</p> <h3>Example #2</h3> <p> In this example, you will create compressed, indexed VCF from an existing VCF text file. First, save this VCF file -- <em><a href="examples/vcfExampleTwo.vcf" target="_blank">vcfExampleTwo.vcf</a></em> -- to your machine. Perform steps 1 and 3-7 in the workflow described above, but substitute <em>vcfExampleTwo.vcf</em> for <em>my.vcf</em>. On the <a href="../../cgi-bin/hgCustom" target="_blank">custom track management page</a>, click the "add custom tracks" button if necessary and make sure that the genome is set to "Human" and the assembly is set to "Feb. 2009 (hg19) " before pasting the track line and submitting. Remember to remove the line breaks that have been added to the track line for readability (or, click <a href="examples/vcfExampleTwo.txt">here</a> for a text version that you can paste without editing):</p> <pre><code>track type=vcfTabix name="VCF Example Two" bigDataUrl=<em>http://myorg.edu/mylab/vcfExampleTwo.vcf.gz</em> description="VCF Ex. 2: More variants from 1000 Genomes" visibility=pack db=hg19 chromosomes=chr21 browser position chr21:33,034,804-33,037,719 browser pack snp132Common</code></pre> <h3>Example #3</h3> <p> In this example, you will load a hub that has VCF data described in a hub's trackDb.txt file. First, navigate to the <a href="hubQuickStart.html" target="_blank">Basic Hub Quick Start Guide</a> and review an introduction to hubs.</p> <p> Visualizing VCF files in hubs involves creating three text files: <em>hub.txt</em>, <em>genomes.txt</em>, and <em>trackDb.txt</em>. The browser is passed a URL to the top-level <em>hub.txt</em> file that points to the related <em>genomes.txt</em> and <em>trackDb.txt</em> files. The <em>trackDb.txt</em> file contains stanzas for each track that outlines the details and type of each track to display, such as these lines for a VCF file located at the bigDataUrl location:</p> <pre><code>track vcf1 bigDataUrl http://hgdownload.soe.ucsc.edu/gbdb/hg19/1000Genomes/ALL.chr21.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz #Note: there is a corresponding fileName.vcf.gz.tbi in the same above directory shortLabel chr21 VCF example longLabel This chr21 VCF file is an example from the 1000 Genomes Phase 1 Integrated Variant Calls Track type vcfTabix visibility dense </code></pre> <p> +<b>Note:</b> there is now a <code>useOneFile on</code> hub setting that allows the hub +properties to be specified in a single file. More information about this setting can be found on the +<a href="./hgTracksHelp.html#UseOneFile" target="_blank">Genome Browser User Guide</a>.</p> +<p> Here is a direct link to the <em><a href="examples/hubDirectory/hg19/trackDb.txt" target="_blank">trackDb.txt</a></em> file to see more information about this example hub, and below is a direct link to visualize the hub in the browser, where this example VCF file displays in dense mode alongside the other tracks in this hub. You can find more Track Hub VCF display options on the <a href="trackDb/trackDbHub.html#vcfTabix" target="_blank">Track Database (trackDb) Definition Document</a> page.</p> <pre><code><a href="../../cgi-bin/hgTracks?db=hg19&hubUrl=http://genome.ucsc.edu/goldenPath/help/examples/hubDirectory/hub.txt" target="_blank">http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://genome.ucsc.edu/goldenPath/help/examples/hubDirectory/hub.txt</a> </code></pre> <h2>Sharing Your Data with Others</h2> <p> If you would like to share your VCF data track with a colleague, learn how to create a URL by looking at <strong>Example 6</strong> on the <a href="customTrack.html#SHARE">custom tracks</a> page.</p> <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->