d8f77b46c9ef3bbb0c62927eb86fe8aa3b1d494b
max
Thu Feb 18 06:31:01 2021 -0800
making some changes to the wiggle docs page, OKed by Hiram, no redmine yet
diff --git src/hg/htdocs/goldenPath/help/wiggle.html src/hg/htdocs/goldenPath/help/wiggle.html
index 90a1e30..3a60bd1 100755
--- src/hg/htdocs/goldenPath/help/wiggle.html
+++ src/hg/htdocs/goldenPath/help/wiggle.html
@@ -1,44 +1,62 @@
 <!DOCTYPE html>
 <!--#set var="TITLE" value="Genome Browser Wiggle Track Format" -->
 <!--#set var="ROOT" value="../.." -->
 <!-- Relative paths to support mirror sites with non-standard GB docs install -->
 <!--#include virtual="$ROOT/inc/gbPageStart.html" -->
-<h1>Wiggle Track Format (WIG)</h1>
+<h1>Wiggle Track ASCII Text Format (.wig)</h1>
 <p>
-The <a href="bigWig.html">bigWig</a> format is the recommended format for almost all graphing track
-needs (for more information, see the following
+<p>Wiggle files and their bedGraph variant allow you to plot quantitative data as either shades of
+color (dense mode) or bars of varying height (full and pack mode) on the
+genome. Both are text files that are easy to create, but need to be converted for actual use by the
+genome browser.</p>
+
+<p>The <a href="bedgraph.html">bedGraph</a> format is a very similar format for
+<a href="#sparse">sparse</a> data or data that contains elements of
+varying size. bedGraph can also be converted to compressed/indexed binary bigWig files.
+If you have other data to show in addition to the quantitative data, e.g. data you
+want to show on mouseover or when the user clicks the feature (like GWAS data),
+you should have a look at bigBed files with the "lollipop" type (contact us for more info).
+For a list of all possible formats for graphing, see the following
 <a href="http://genomewiki.ucsc.edu/index.php/Selecting_a_graphing_track_data_format"
-target="_blank">wiki page</a>). The wiggle (WIG) format is an older format for display of dense,
-continuous data such as GC percent, probability scores, and transcriptome data. Wiggle data elements
-must be equally sized. The <a href="bedgraph.html">bedGraph</a> format is also an older format used
-to display <a href="#sparse">sparse</a> data or data that contains elements of varying size.</p>
+target="_blank">wiki page</a>.</p>
+<p>Text files in wiggle format can be uploaded as custom tracks as-is to UCSC
+where they are compressed and stored for some time. But we recommend that you convert them
+on your own computer to the binary bigWig storage format. You then copy bigWig files onto
+your own webserver and they are referenced in custom tracks or track hubs via their URL.</p>
+
 <p>
-For speed and efficiency, wiggle data is compressed and stored internally in 128 unique bins. This
-compression means that there is a minor loss of precision when data is exported from a wiggle track
+Unlike bigWig binary files, wiggle ASCII text files can be uploaded as custom tracks onto our server.
+After the upload, wiggle data is compressed and stored
+internally in 128 unique bins. This compression means that there is a minor
+loss of precision when data is exported from a wiggle track
 (<em>i.e.</em>, with output format "data points" or "bed format" within the
-Table Browser). The <a href='bedgraph.html'>bedGraph</a> format should be used if it is important to
-retain exact data when exporting.</p>
+Table Browser). For custom tracks, use the <a href='bedgraph.html'>bedGraph</a>
+format if it is important to retain exact data when
+exporting. However, the size of all custom tracks is limited. For these reasons, we
+recommend always converting wiggle files to the <a href="bigWig.html">bigWig</a> storage format
+and referencing these from your custom tracks or track hubs via their URL.</p>
 <h2>General structure</h2>
 <p>
 Wiggle format is line-oriented.
 For wiggle custom tracks, <strong>the first line must be a
 <a href="customTrack.html#TRACK" target="_blank">track definition line</a>
 (<em>i.e.</em>, track type=wiggle_0)</strong>, which designates the track as a wiggle track and adds a number of options
-for controlling the default display.</p>
+for controlling the default display. For conversion to bigWig, the most common use case,
+this line must not be present.</p>
 <p>
 Wiggle format is composed of declaration lines and data lines, and require a separate wiggle track
 definition line. There are two options for formatting wiggle data: <strong>variableStep</strong> and
 <strong>fixedStep</strong>. These formats were developed to allow the file to be written as compactly
 as possible.</p>
 <h3>variableStep format</h3>
 <p>
 This format is used for data with irregular intervals between new data points, and is the more
 commonly used wiggle format. After the wiggle track definition line, variableStep begins with a
 declaration line and is followed by two columns containing chromosome positions and data values:
 <pre><code><strong>variableStep</strong> <strong>chrom=</strong><em>chrN</em> <strong>[span=</strong><em>windowSize</em><strong>]</strong>
 <em>chromStartA</em> <em>dataValueA</em>
 <em>chromStartB</em> <em>dataValueB</em>
@@ -105,34 +123,35 @@
 Note that for both variableStep and fixedStep formats, the same span must be used throughout the
 dataset. If no span is specified, the default span of 1 is used. As the name suggests, fixedStep
 wiggles require the same size step throughout the dataset. If not specified, a step size of 1 is
 used.</p>
 <a name="data"></a>
 <h2>Data values</h2>
 <p>
 Wiggle track data values can be integer or real, positive or negative values. Only positions
 specified have data. Positions not specified do not have data and will not be graphed. All positions
 specified in the input data must be in numerical order.
 NaN values are not supported by the browser and, if included, may have unforeseen effects.</p>
 <h3>1-start coordinate system in use for variableStep and fixedStep</h3>
 <p>
-BigWig files created from bedGraph format use "0-start, half-open" coordinates, but
-bigWigs that represent variableStep and fixedStep data are generated from wiggle files that
-use "1-start, fully-closed" coordinates.
-For example, for a chromosome of length N, the first
+The bedGraph format, like all BED-based formats and most file formats used by UCSC,
+uses "0-start, half-open" coordinates, but
+the wiggle ASCII text format for variableStep and fixedStep data uses "1-start, fully-closed" coordinates.
+Wiggle (variableStep and fixedStep) is the only format defined by UCSC that uses a 1-based
+format, for historical reasons. For example, for a chromosome of length N, the first
 position is 1 and the last position is N. For more information, see:
 <ul>
 <li>
 <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/20639541/" target="_blank">BigWig and BigBed: enabling browsing of large distributed datasets</a>
 (<em>Bioinformatics</em>)</li>
 <li>
 <a href="../../FAQ/FAQtracks.html#tracks1">Database/browser start coordinates differ by 1 base</a>
 (Genome Browser FAQ)</li>
 <li>
 <a href="http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/" target="_blank">The UCSC Genome Browser Coordinate Counting Systems</a>
 (Genome Browser blog)</li>
 </ul>
 <h3>Parameters for custom wiggle track definition lines</h3>
 <p>