5884b9f213c7d339352dda9c40a6466ce3e12a1c gperez2 Wed Jul 16 11:58:33 2025 -0700 Updating the bedMethyl/bigMethyl Track Format page, refs #36002 diff --git src/hg/htdocs/goldenPath/help/bedMethyl.html src/hg/htdocs/goldenPath/help/bedMethyl.html index 2b70c9ebc3f..491b668a7a5 100755 --- src/hg/htdocs/goldenPath/help/bedMethyl.html +++ src/hg/htdocs/goldenPath/help/bedMethyl.html @@ -1,51 +1,180 @@ <!DOCTYPE html> <!--#set var="TITLE" value="Genome Browser bedMethyl Track Format" --> <!--#set var="ROOT" value="../.." --> <!-- Relative paths to support mirror sites with non-standard GB docs install --> <!--#include virtual="$ROOT/inc/gbPageStart.html" --> -<h1>BedMethyl Track Format</h1> +<h1>bedMethyl and bigMethyl Track Format</h1> <p> -The bedMethyl format allows display of methylation sites. +The <a target="_blank" href="https://www.encodeproject.org/data-standards/wgbs/">bedMethyl</a> +format is an extension of the standard <a href="/FAQ/FAQformat.html#format1">BED 9 format</a> used +to display DNA methylation site data in a genome browser. This format is useful for base-resolution +methylation data generated by bisulfite sequencing or direct methylation detection methods such as +long-read sequencing. By including both methylation level and support (coverage), bedMethyl +provides a detailed view of methylation across the genome.</p> -<h2>General Structure</h2> -<p> -The bedMethyl format is line-oriented. 
BedMethyl data are preceded by a -<a href="customTrack.html#TRACK">track definition line</a>, which adds a number of options for -controlling the default display of this track.</p> +<p>The bedMethyl format includes the fields of a <b>BED 9</b> file along with nine additional fields:</p> +<ul> +<li>Valid Coverage: Number of reads with a valid modification call +<li>Percent Modified: Percent of valid calls that are modified +<li>Modified calls: Number of calls with a modified base +<li>Canonical calls: Number of calls with a canonical base +<li>Other modification calls: Number of calls with a modified base, other modifications +<li>Reads with a deletion: Number of reads with a deletion at this reference position +<li>Low-confidence calls: Number of calls where the probability of the call was below the threshold +<li>Reads with a base mismatch: Number of reads with a base other than the canonical base for this modification +<li>Reads with no modification call: Number of reads aligned to this reference position, with the correct canonical base, but without a base modification call +</ul> + +<p class="text-center"> +<img width="1080" height="148" src="../../images/bedMethylEx.jpg"> +</p> + +<p>The items are colored from <b><font color="#0000FF">0%</font></b> +modified (blue) to +<b><font color="#FF0000">100%</font></b> modified (red). Hovering over an item or +clicking it shows the additional details found in bedMethyl.</p> + +<a name=bedMethyl></a> +<h2>Creating a bedMethyl custom track</h2> +<a name="example1"></a> +<h3>Example #1</h3> <p> -Following the track definition line are the track data in 18 column BED format:</p> +In this example, you will create a bedMethyl custom track using bedMethyl data for the hg38 assembly.</p> +<ol> + <li>Paste the following track line and data into the + <a href="../../cgi-bin/hgCustom?db=hg38">custom track management + page</a> for the human assembly hg38. 
+ + <pre><code> + +track type=bedMethyl name="bedmethyl example" description="bedMethyl custom track" visibility="pack" +chr21 5010053 5010054 h 0 + 5010053 5010054 255,0,0 1 0.00 0 0 1 0 0 0 0 +chr21 5010053 5010054 m 0 + 5010053 5010054 255,0,0 1 0.00 1 0 0 0 0 0 0 +chr21 5010215 5010216 h 0 + 5010215 5010216 255,0,0 1 30.00 0 0 1 0 0 0 0 +chr21 5010215 5010216 m 0 + 5010215 5010216 255,0,0 1 30.00 1 0 0 0 0 0 0 +chr21 5010331 5010332 h 0 + 5010331 5010332 255,0,0 1 70.00 0 0 1 0 0 0 0 +chr21 5010331 5010332 m 0 + 5010331 5010332 255,0,0 1 70.00 1 0 0 0 0 0 0 +chr21 5010335 5010336 h 0 + 5010335 5010336 255,0,0 1 100.00 0 0 1 0 0 0 0 +chr21 5010335 5010336 m 0 + 5010335 5010336 255,0,0 1 100.00 1 0 0 0 0 0 0 + </code></pre></li> + + <li>Click the "submit" button.</li> + <li>Go to chr21:5,010,030-5,010,408 to see the data.</li> +</ol> +</p> + -<pre><code><em>chromA</em> <em>chromStartA</em> <em>chromEndA</em> <em>dataValueA</em> -<em>chromB</em> <em>chromStartB</em> <em>chromEndB</em> <em>dataValueB</em></code></pre> +<a name=bigMethyl></a> +<h2>bigMethyl Format</h2> +<p>The <b>bigMethyl</b> format is the indexed version of bedMethyl, created with bedToBigBed. +See <a href="bigBed.html">bigBed format</a>. The bigMethyl format is more efficient to display in +the Genome Browser, and it offers more trackDb options, which allow further customization. The +following autoSql definition is an example of how to specify bigMethyl files. 
-<h3>Parameters for bedMethyl track definition lines</h3> +This definition, contained in the file +<a href="examples/bigMethyl.as"><em>bigMethyl.as</em></a>, +is pulled in when the <code>bedToBigBed</code> utility is run with the +<code>-as=bigMethyl.as</code> option.</p> +<pre><code> +table bigMethyl +"bigMethyl bedMethyl" +( + string chrom; "Reference sequence chromosome or scaffold" + uint chromStart; "Start position in chrom" + uint chromEnd; "End position in chrom" + string name; "dbSNP Reference SNP (rs) identifier or :" + uint score; "Score from 0-1000, derived from p-value" + char[1] strand; "Unused. Always '.'" + uint thickStart; "Start position in chrom" + uint thickEnd; "End position in chrom" + uint color; "Red (positive effect) or blue (negative). Brightness reflects pvalue" + string nValidCov; "Valid Coverage" + double percMod; "Percent Modified" + uint nMod; "Number of calls with a modified base" + uint nCanon; "Number of calls with a canonical base" + uint nOther; "Number of calls with a modified base, other modification" + uint nDelete; "Number of reads with a deletion at this reference position" + uint nFail; "Number of calls where the probability of the call was below the threshold" + uint nDiff; "Number of reads with a base other than the canonical base for this modification" + uint nNoCall; "Number of reads aligned to this reference position, with the correct canonical base, but without a base modification call" +) </code></pre> +<p>The first 9 fields of this bigMethyl format are the same as the first 9 fields of the standard +BED format.</p> + + +<h2>Creating a bigMethyl custom track</h2> +<a name="example2"></a> +<h3>Example #2</h3> <p> -All options are placed in a single line separated by spaces:</p> -<pre><code><strong>track type=</strong>bedMethyl <strong>name=</strong><em>track_label</em> <strong>description=</strong><em>center_label</em> - </pre> +In this example, you will create a bigMethyl file to display as a custom track.</p> +<ol> + 
<li> + Save <a href="examples/bedMethyl.bed">this bedMethyl file</a> to your + computer.</li> + <li> + Save the autoSql file <a href="examples/bigMethyl.as"><em>bigMethyl.as</em></a> to your computer.</li> + <li> + Download the <code>bedToBigBed</code> + <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">utility</a>.</li> + <li> + Save the <a href="hg38.chrom.sizes"><em>hg38.chrom.sizes</em> text file</a> to your computer. This + file contains the chromosome sizes for the human hg38 assembly.</li> + <li> +Use the <code>bedToBigBed</code> utility to create a bigMethyl file from your sorted bedMethyl file, using +the <em>bedMethyl.bed</em> and <em>hg38.chrom.sizes</em> files saved above. </p> +<pre><code><strong>bedToBigBed</strong> -as=bigMethyl.as -type=bed9+9 bedMethyl.bed hg38.chrom.sizes bigMethyl.bb</code></pre></li> <p> -<strong>Note:</strong> if you copy/paste the above example, you must remove the line breaks.</p> + <li> +Move the newly created bigMethyl file (<em>bigMethyl.bb</em>) to a web-accessible http, https, or +ftp location. At this point you should have a URL to your data, such as "https://institution.edu/bigMethyl.bb", and the file should be accessible outside of your institution's or hosting provider's network. For more information on where to host your data, please see the <a href="hgTrackHubHelp.html#Hosting">Hosting</a> section of the Track Hub Help documentation. 
Construct a custom track line with a bigDataUrl parameter pointing to the newly created bigMethyl file.</p> + <pre><code>track type=bigMethyl name="bigMethyl Example" description="A bigMethyl file" bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigMethyl.bb visibility=pack</code></pre></li> + <li> + Go to chr21:5,010,030-5,010,408 to see the data.</li> +</ol> + + +<a name=sharing_data></a> +<h2>Sharing your data with others</h2> <p> -The track type is REQUIRED, and must be <em>bedMethyl</em>:</p> -<pre><code><strong>type=</strong>bedMethyl</code></pre> +Custom tracks can also be loaded via one URL line. +<a href="../../cgi-bin/hgTracks?ignoreCookie=1&db=hg38&position=chr21:5,010,030-5,010,408&hgct_customText=track%20type=bigMethyl%20name=Example%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigMethyl.bb%20visibility=pack" +target="_blank">This link</a> loads the same <em>bigMethyl.bb</em> track and sets additional display parameters from <a href="#example2">Example 2</a> in the URL:</p> +<pre><code>http://genome.ucsc.edu/cgi-bin/hgTracks?ignoreCookie=1&db=hg38&position=chr21:5,010,030-5,010,408&hgct_customText=track%20type=bigMethyl%20name=Example %20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigMethyl.bb %20visibility=pack</code></pre> + <p> +If you would like to share your bigMethyl data track with a colleague, learn how to create a URL +link to your data by looking at <a href="customTrack.html#EXAMPLE6">Example #6</a> on the +custom track help page.</p> -<h3>Data Values</h3> +<a name=extracting_data></a> +<h2>Extracting data from the bigMethyl format</h2> <p> -BedMethyl track data values can be integer or real, positive or negative values. The -chromosome coordinates are <a href="../../FAQ/FAQtracks.html#tracks1">zero-based, half-open</a>. -This means that the first chromosome position is 0, and the last position in a chromosome -of length <em>N</em> would be <em>N - 1</em>. 
The positions listed in the input data must be in -numerical order, and only the specified positions will be graphed. bedMethyl format has eighteen -columns of data: <pre><code><em>chrom chromStart chromEnd dataValue</em></code></pre></p> +Because the bigMethyl files are an extension of bigBed files, which are indexed binary files, it can +be difficult to extract data from them. UCSC has developed the following programs to assist +in working with bigBed formats, available from the +<a href="http://hgdownload.soe.ucsc.edu/admin/exe/">binary utilities directory</a>.</p> +<ul> + <li> + <code>bigBedToBed</code> &mdash; converts a bigBed file to ASCII BED format.</li> + <li> + <code>bigBedSummary</code> &mdash; extracts summary information from a bigBed file.</li> + <li> + <code>bigBedInfo</code> &mdash; prints out information about a bigBed file.</li> +</ul> <p> -<h2>Example</h2> +As with all UCSC Genome Browser programs, simply type the program name (with no parameters) at the +command line to view the usage statement.</p> + +<a name=troubleshooting></a> +<h2>Troubleshooting</h2> <p> -<strong>Note:</strong> -The above example is a custom track that includes a <code>track type=</code> line that is -specific for loading the data in the browser. This line will cause a raw bedMethyl data file to fail -validation by other tools, such as <code>validateFiles</code>, outside of the browser.</p> +If you encounter an error when you run the <code>bedToBigBed</code> program, check your input +file for data coordinates that extend past the end of the chromosome. If these are present, run +the <code>bedClip</code> program +(<a href="http://hgdownload.soe.ucsc.edu/admin/exe/">available here</a>) to remove the problematic +row(s) in your input file before running the <code>bedToBigBed</code> program.</p> + <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->
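Reviewer sketch (not part of the commit): the Troubleshooting paragraph asks the reader to check the input file for coordinates that extend past the end of a chromosome. A minimal Python sketch of that check is shown here, assuming whitespace-separated bedMethyl rows and a two-column chrom.sizes file. The function names `parse_chrom_sizes` and `find_out_of_range` and the `chrDemo` chromosome are hypothetical and chosen only for illustration. The sketch only reports rows; it does not reproduce what `bedClip` does.

```python
def parse_chrom_sizes(lines):
    """Build a {chromosome: length} map from 'name size' lines."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def find_out_of_range(bed_lines, sizes):
    """Return (line_number, reason) pairs for rows outside their chromosome."""
    problems = []
    for number, line in enumerate(bed_lines, start=1):
        fields = line.split()
        # Skip blank lines, comments, and custom-track header lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        chrom, chrom_end = fields[0], int(fields[2])
        if chrom not in sizes:
            problems.append((number, "chromosome %s not in chrom.sizes" % chrom))
        elif chrom_end > sizes[chrom]:
            problems.append(
                (number, "chromEnd %d exceeds length %d" % (chrom_end, sizes[chrom]))
            )
    return problems


# Hypothetical toy data: a 1000-base chromosome and one row that runs past it.
demo_sizes = parse_chrom_sizes(["chrDemo\t1000"])
demo_rows = [
    "chrDemo 10 11 m 0 + 10 11 255,0,0 1 0.00 0 1 0 0 0 0 0",
    "chrDemo 999 1001 m 0 + 999 1001 255,0,0 1 100.00 1 0 0 0 0 0 0",
]
for line_number, reason in find_out_of_range(demo_rows, demo_sizes):
    print("line %d: %s" % (line_number, reason))
```

In practice the two inputs would be read from the files used in Example #2 (for instance `open("hg38.chrom.sizes")` and `open("bedMethyl.bed")`). Reporting line numbers rather than silently dropping rows makes it easier to see why a conversion failed before deciding how to fix the input.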