63130a0a315c1c10d58d741cdf534d922cece886 dschmelt Thu Sep 16 15:44:09 2021 -0700 Adding S to http to eliminate warning when searching docs refs #28175 diff --git src/hg/htdocs/ENCODE/FAQ/index.html src/hg/htdocs/ENCODE/FAQ/index.html index 0c3b633..7711ea6 100755 --- src/hg/htdocs/ENCODE/FAQ/index.html +++ src/hg/htdocs/ENCODE/FAQ/index.html @@ -1,755 +1,755 @@ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> <title>ENCODE Resources and Frequently Asked Questions</title> <link rel="stylesheet" href="/style/HGStyle.css" type="text/css"> <link rel="stylesheet" href="/style/encodeProject.css" type="text/css"> </head> <body> <!--#include virtual="/inc/encodeProject.topbar.html"--> <div class="encodeHeader"> <a href="/ENCODE/index.html"><img src="/images/gbLogoOnly.png" style="height:80%" alt="ENCODE Project at UCSC" title="ENCODE Project at UCSC"></a> <span class="txt">ENCODE Resources and Frequently Asked Questions</span> </div> <div class="wrapper"> <div class="bar"><h4 class="title">About</h4></div> <div class="content"> <!--Content Resources Section Links--> <p> During the first decade of the ENCODE project (2003-2014), UCSC coordinated all project data, hosting genome browser tracks and download files for all Consortium experiments. UCSC also developed tools for locating and accessing ENCODE data as well as outreach and tutorial materials to help the user community. The <em>ENCODE Data at UCSC</em> resources below are those developed during this period. For newer data and outreach materials, consult the <em>ENCODE Project</em> section. 
<p> </div><!--end content--> </div><!--end wrapper--> <div class="wrapper"> <div class="bar"><h4 class="title">ENCODE Project Resources</h4></div> <div class="content"> <!--Content Resources Section Links--> <ul> <p> <li><a href="http://encodeproject.org/ENCODE/" target="_blank">ENCODE Portal (encodeproject.org)</a> <li><a href="http://genome.gov/10005107" target="_blank">ENCODE Project at NHGRI (genome.gov)</a> <li> <a href="http://www.nature.com/encode/" target="_blank">Nature ENCODE Explorer</a> and <a href="https://main.g2.bx.psu.edu/" target="_blank">Galaxy Biomedical Tools</a> <li><a href="https://www.ncbi.nlm.nih.gov/geo/info/ENCODE.html" target="_blank">ENCODE Data at GEO</a> <li><a href="http://www.ensembl.org/info/website/tutorials/encode.html" target="_blank">ENCODE Data at Ensembl</a> <p> </ul> </div><!--end content--> </div><!--end wrapper--> <div class="wrapper"> <div class="bar"><h4 class="title">Helpful Resources for ENCODE Data at UCSC</h4></div> <div class="content"> <!--Content Resources Section Links--> <ul> <p> <li><a href="/ENCODE/usageResources.html" target="_blank">Tutorials and User's Guide</a> <li><a href="/ENCODE/search.html" target="_blank">Track and File Search</a> <li><strong>Human: </strong><a href="/ENCODE/dataMatrix/encodeDataMatrixHuman.html" target="_blank">Experiment Matrix</a>, <a href="/ENCODE/dataSummary.html" title="formerly Data Summary" target="_blank">Experiment List</a>, <a href="/ENCODE/downloads.html" target="_blank">Downloads</a> and <a href="/ENCODE/cellTypes.html" target="_blank">Cell Types</a> <li><strong>Mouse: </strong><a href="/ENCODE/dataMatrix/encodeDataMatrixMouse.html" target="_blank">Experiment Matrix</a>, <a href="/ENCODE/dataSummaryMouse.html"title="formerly Data Summary" target="_blank">Experiment List</a>, <a href="/ENCODE/downloadsMouse.html" target="_blank">Downloads</a> and <a href="/ENCODE/cellTypesMouse.html" target="_blank">Cell Types</a> <li><a href="/ENCODE/antibodies.html" target="_blank">ENCODE 
Antibodies</a> and <a href="/ENCODE/otherTerms.html" target="_blank">ENCODE Registered Experiment Variables</a> <li><a href="/ENCODE/fileFormats.html" target="_blank">File Formats</a> and <a href="/ENCODE/dataStandards.html" target="_blank">Data Standards</a> and <a href="/ENCODE/softwareTools.html" target="_blank">Software Tools</a> <li><a href="../../FAQ/" target="_blank">UCSC Genome Browser FAQ Page</a> and <a href="http://groups.google.com/a/soe.ucsc.edu/group/genome?hl=en" target="_blank">Searchable Google Groups Mailing List</A> <p> </ul> <!--- Content Resources Section Search Forms --> - <form name="googleForm0" method="GET" action="http://www.google.com/search" onSubmit="document.googleForm0.q.value=document.googleForm0.qq.value+' site:genome.ucsc.edu/ENCODE/';"> + <form name="googleForm0" method="GET" action="https://www.google.com/search" onSubmit="document.googleForm0.q.value=document.googleForm0.qq.value+' site:genome.ucsc.edu/ENCODE/';"> Search all ENCODE web pages at UCSC: <input type="hidden" name="q" value=""> <input type="hidden" name="num" value="10"> <input type="hidden" name="filter" value="0"> <input type="text" name="qq" size="30" maxlength="255" value=""> <input type="submit" value="Submit"> </form> <p> - <form name="googleForm1" method="GET" action="http://www.google.com/search" onSubmit="document.googleForm1.q.value=document.googleForm1.qq.value+' site:genome.ucsc.edu/';"> + <form name="googleForm1" method="GET" action="https://www.google.com/search" onSubmit="document.googleForm1.q.value=document.googleForm1.qq.value+' site:genome.ucsc.edu/';"> Search the entire UCSC Genome Browser website: <input type="hidden" name="q" value=""> <input type="hidden" name="num" value="10"> <input type="hidden" name="filter" value="0"> <input type="text" name="qq" size="30" maxlength="255" value=""> <input type="submit" value="Submit"> </form> <p> </div><!--end content--> </div><!--end wrapper--> <!--Content FAQ Question Index 
section------------------------------------------------------> <div class="wrapper"> <a name="FAQ"></a> <div class="bar"><h4 class="title">ENCODE at UCSC Frequently Asked Questions</h4></div> <div class="content"> <!--Content FAQ Question items--> <p> DATA FILE AND TABLE FORMAT QUESTIONS <li><a href="#release6">How do I extract information about an ENCODE experiment from the filename?</a> <li><a href="#release5">What is the difference between a file xxx and the related file xxxV2?</a> <li><a href="#release7">How do I learn more about different ENCODE file formats?</a> <li><a href="#release15">What does xxx mean in a file in hgdownload/encodeDCC/hg19/wgEncode(track)?</a> <li><a href="#release9">How do I download ENCODE histone data in BED format?</a> <li><a href="#release11">How do I find the meaning of a column of a BED file?</a> <li><a href="#release8">What is the definition of "score" in ENCODE tables?</a> <li><a href="#release18">How are the columns signalValue and peak calculated in narrowPeak files?</a> <li><a href="#release10">What does the name column represent for DNase clustered BED files?</a> <li><a href="#release14">Can I convert WIG files into a variableStep format to use with SitePro?</a> <li><a href="#release19">How do I learn more about peak calling algorithms used to generate narrowPeak and broadPeak files?</a> <li><a href="#release20">What program reads ".bb" TFBS files from ENCODE?</a> </li> <p> OTHER QUESTIONS <li><a href="#release0">How do I display ENCODE data from GEO in the genome browser?</a> <li><a href="#release13">Where can I find ENCODE papers?</a> <li><a href="#release12">Is there a service providing ENCODE data on a hard drive?</a> <li><a href="#release17">May I use the ENCODE figure from your homepage?</a> <li><a href="#release1">Which cell types are used by ENCODE?</a> <li><a href="#release16">Which cell protocols were used in my track of interest?</a> <li><a href="#release2">Where can I find the ENCODE growth protocol for a 
specific cell type?</a> <li><a href="#release3">Has transcription factor xxx been mapped by ENCODE?</a> <li><a href="#release4">How do I find overlaps between my own ChIP-seq regions and available ENCODE transcription factors?</a> <li><a href="#release21">I am making a public hub for my paper, is there an example html file to use for my data description?</a> </p> <p> <a href="/ENCODE/contacts.html" target="_blank">Questions and feedback welcome</a>. </p> </div><!--end content--> </div><!--end wrapper--> <!--START FAQ Content Tables-------------------------------------------------------> <a name="release0"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td ALIGN="right"><a href="#FAQ"><img SRC="../../images/top.gif" alt="" ALIGN="right" BORDER=0></a> <div class="bar"><h4 class="title">GEO DATA</h4></div> <div class="content"> <p> <b><font COLOR="#006666">Question: </font></b><br> How do I display ENCODE data from GEO in the genome browser? <p> <b><font COLOR="#006666">Response:</font></b><br> Please avoid loading GEO data as a custom track! Rather, as nearly all ENCODE data at GEO are already hosted as tracks on the UCSC browser, load the existing corresponding track. </p> <p> Take note of the GEO sample accession (GSM) number and enter it into the Track Search tool accessible from the left side of the ENCODE portal page by clicking <a href="/ENCODE/search.html" target="_blank">Search</a>, for example <a href="../../cgi-bin/hgTracks?db=hg19&hgt_tSearch=1&tsCurTab=simpleTab&tsSimple=GSM999240" target="_blank">GSM999240</a>. Or use the Advanced Track Search page and select "GEO sample accession" from the pull down menu displaying "Cell, tissue or DNA sample". Click the box next to your track resulting from the search and the "View in Browser" button. </p> <p> If you have data that is not already in the browser we recommend converting your BED files to <a href="../../goldenPath/help/bigBed.html" target="_blank">bigBed format</a>. 
You could download our source tools for converting from BED to bigBed (as described in the previous link) or use the tools at the <a href="http://galaxyproject.org/" target="_blank">Galaxy</a> website. For questions regarding Galaxy you will have to contact them directly. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release1"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE CELL TYPES</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Which cell types are used by ENCODE? <p> <b><font color="#006666">Response:</font></b><br> On the left side of the <a href="/ENCODE/" target="_blank">ENCODE portal page</a> under <b><em>Human</em></b> and <b><em>Mouse</em></b> are links to the cell types used in all ENCODE experiments. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release2"></A> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE PROTOCOLS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Where can I find the ENCODE growth protocol for a specific cell type? For example RCC 7860? <p> <b><font color="#006666">Response:</font></b><br> To find a specific protocol, for example for human RCC 7860 cells, from the ENCODE portal navigate to the <a href="/ENCODE/cellTypes.html" target="_blank">Human Cell Types</a> page. Under the "Documents" column for RCC 7860, click the link to connect to see the growth protocol named after the lab that provided the document, in this case "Crawford". 
<p> Another path to ENCODE protocols is from the link <a href="/ENCODE/protocols" target="_blank">/ENCODE/protocols/</a>. Navigate to the cell protocols and then human directories to find the link to the same RCC 7860 protocol file as linked on the above <a href="/ENCODE/cellTypes.html" target="_blank">Human Cell Types</a> page. <p> If you have further questions about a protocol contact the lab that registered the protocol. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release3"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">TRANSCRIPTION FACTORS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Has transcription factor xxx been mapped by ENCODE? <p> <b><font color="#006666">Response:</font></b><br> A quick way to view the list of transcription factors mapped by ENCODE is to view the ChIP-seq matrix for either <a href="/ENCODE/dataMatrix/encodeChipMatrixHuman.html" target="_blank">human</a> or <a href="/ENCODE/dataMatrix/encodeChipMatrixMouse.html" target="_blank">mouse</a>. Targets are listed horizontally across the top, indicating available mapped transcription factor data. Clicking on the green highlighted boxes will bring you to experiment data specific to the corresponding cell type and target. <p> Another option is to use the <a href="../../cgi-bin/hgTracks?db=hg19&hgt_tSearch=1&tsCurTab=advancedTab" target="_blank">Track Search</a> or <a href="../../cgi-bin/hgFileSearch" target="_blank">File Search</a> tools and to search the "Antibody or target protein" field to see if the desired transcription factor is listed. 
</div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release4"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">MAPPING A CUSTOM TRACK TO TRANSCRIPTION FACTORS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How do I find overlaps between my own ChIP-seq regions and available ENCODE transcription factors? <p> <b><font color="#006666">Response:</font></b><br> By using the <a href="../../cgi-bin/hgTables" target="_blank">Table Browser tool</a> you can add your ChIP-seq information as a custom track and then use the "intersection" feature to intersect the Txn Factor ChIP track table listed under the Regulation group with your custom track. Note, your custom track should contain ChIP-seq regions in BED format, for more information visit our <a href="../../goldenPath/help/customTrack.html" target="_blank">custom tracks</a> page. <p> If you are unfamiliar with the Table Browser, please refer to our <a href="../../goldenPath/help/hgTablesHelp.html" target="_blank">help page</a> and the section on <a href="../../goldenPath/help/hgTablesHelp.html#Intersection" target="_blank">intersecting data</a>. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release5"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">FILES NAMED xxxV2</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> What is the difference between a file xxx and the related file xxxV2? Why is the xxx file not displayed in the browser? 
<p> <b><font color="#006666">Response:</font></b><br> For files with names like xxxV2, the "V2" often refers to a second version that revokes earlier versions, which are therefore not displayed in the browser. Revoked files are still available for download, but they are marked as "replaced" or "revoked" in the related metadata file named "files.txt" in the corresponding download directory. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release6"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE METADATA AND FILENAMES</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How do I extract information about an ENCODE experiment from the filename? <p> <b><font color="#006666">Response:</font></b><br> This is not recommended. While ENCODE filenames have some metadata embedded, the information there is neither complete nor easily extracted. Rather, use the file's metadata, which you can access in the following ways: <ol> <li>By opening "files.txt", the metadata file located in each track's corresponding download directory.</li> <li>By clicking the blue down-arrow next to each subtrack listed on a track's Track Settings page.</li> <li>By using <a href="/ENCODE/search.html" target="_blank">Track Search or File Search</a> to filter files by metadata.</li> <li>By using the <a href="../../cgi-bin/hgTables" target="_blank">Table Browser tool</a>, setting "Group" to "All Tables", and selecting the "metaDb" table. Click the "describe table schema" button to learn more about the metaDb table.</li> <li>By using the <a href="../../goldenPath/help/mysql.html" target="_blank">public MariaDB database</a> to query the metaDb table for each database.</li> 
</ol> The metadata uses controlled vocabulary (cv.ra), which can be downloaded as a text file <a href="http://hgdownload.soe.ucsc.edu/goldenPath/encodeDCC/cv.ra" target="_blank">here</a>. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release7"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE FILE FORMATS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How do I learn more about different ENCODE file formats? For example what is the difference between a file.bed and a file.bed9 in the ENCODE methylation data? <p> <b><font color="#006666">Response:</font></b><br> By clicking the <a href="../../ENCODE/fileFormats.html" target="_blank">File Formats</a> link from the ENCODE portal page you can reach a list of various file formats used in ENCODE. Every ENCODE file has metadata included under a "files.txt" file in the related downloads page. For example, from the <a href="http://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibMethylRrbs/">HudsonAlpha DNA methylation download page</a>, in the <a href="http://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibMethylRrbs/files.txt">files.txt</a> file, a line after the specific bed9 file in question, wgEncodeHaibMethylRrbsAg04449UwstamgrowprotSitesRep1.bed9, reads 'objstatus=replaced'. This metadata indicates this bed9 file was preliminary data that has since been replaced. A similar note in the automatically displayed README file states: "WARNING - Revoked and replaced data files may be present in this directory." 
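<p> The "files.txt" check described above can also be scripted. The following is a minimal Python sketch, not an official UCSC tool. It assumes each line of "files.txt" holds a file name, a tab, and then semicolon-separated key=value pairs; the sample line is hypothetical and shortened for illustration, and the helper names are invented here. </p>

```python
def parse_files_txt_line(line):
    """Split one files.txt line into (file name, metadata dict).

    Assumed layout: <file name><TAB>key=value; key=value; ...
    Keys are lowercased so 'objStatus' and 'objstatus' are treated alike.
    """
    file_name, _, meta_text = line.rstrip("\n").partition("\t")
    metadata = {}
    for pair in meta_text.split(";"):
        key, sep, value = pair.partition("=")
        if sep:  # ignore fragments that are not key=value
            metadata[key.strip().lower()] = value.strip()
    return file_name, metadata


def is_superseded(metadata):
    """True when the objStatus value starts with 'replaced' or 'revoked'."""
    status = metadata.get("objstatus", "").lower()
    return status.startswith(("replaced", "revoked"))


# Hypothetical, shortened sample line; real lines carry many more keys.
sample = (
    "wgEncodeHaibMethylRrbsAg04449UwstamgrowprotSitesRep1.bed9\t"
    "type=bed; objStatus=replaced"
)
name, meta = parse_files_txt_line(sample)
print(name, is_superseded(meta))
```

<p> Filtering a whole directory listing then amounts to calling <code>parse_files_txt_line</code> on every line of "files.txt" and skipping entries for which <code>is_superseded</code> is true. </p>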
</div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release8"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE SCORE DEFINITION</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> What is the definition of "score" in ENCODE tables? <p> <b><font color="#006666">Response:</font></b><br> The score (between 0-1000) determines how darkly an item is displayed in the browser (with 1000 being black). The darkness of an item's box is proportional to the maximum signal strength observed in any cell line. <p> To find out exactly how score has been calculated for a specific track, contact the lab that created the data. There are often several links to authors' labs in the credits section for each track at the bottom of a track's description page. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release9"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE DATA IN BED FORMAT</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How do I download ENCODE histone data in BED format? From the Table Browser I can select to download the file in BED format, but I am limited to just a few thousand lines. When I looked in the ENCODE Downloads directory I could only find the path to a bigWig file, for example wgEncodeBroadHistoneGm12878H3k27acStdSig for human build hg19. 
<p> <b><font color="#006666">Response:</font></b><br> The ENCODE BED files you are looking to download are the 'peak calls', which are in the extended broadPeak or narrowPeak formats, described <a href="../../FAQ/FAQformat.html#ENCODE">here</a>. For example, within the database mentioned (H3K27ac histone mark in GM12878 cells) there is a BED representation in the file: "wgEncodeBroadHistoneGm12878H3k27acStdPk.broadPeak.gz". Using the <a href="../../ENCODE/search.html">File Search</a> tool you can use the setting "Data Format: Peaks Broad" to narrow your results to only these types of files.</p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release10"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE BED FILE FORMAT</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> What does the name column represent for DNase clustered BED files? I downloaded the ENCODE BED file wgEncodeRegDnaseClustered.bed from the DNase footprinting assay. However, I am having trouble understanding the 4th column in this file. Usually this column, as I understand from the file format FAQ page, is assigned to name. <p> <b><font color="#006666">Response:</font></b><br> For the DNase cluster BED files, the name field represents the number of items in the cluster. To find out more information about each cluster, you can click on the item in the browser image and it will take you to a details page that will list all of the items in the cluster and the cell lines. <a href="../../cgi-bin/hgc?db=hg19&c=chr21&o=33032260&t=33033430&g=wgEncodeRegDnaseClustered&i=58&l=33032260&r=33033430">Here</a> is an example of a details page for a DNase item on chromosome 21. 
There are 58 items in this cluster and you can see the name value is 58. </p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release11"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE ChIP-seq BED FILES</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How do I find the meaning of a column of a BED file? I have downloaded ENCODE Chip-Seq BED files that have the following format:</p> <p>chr21 9825311 9827738 . 1000 . 4.51792 256.60845 261.34671 1809</p> <p>What is the meaning of the information from the fourth field forward? </p> <p> <b><font color="#006666">Response:</font></b><br> ENCODE has a number of <a href="../../FAQ/FAQformat.html#ENCODE">ENCODE-specific formats</a>. ENCODE ChIP-seq files are typically stored in the ENCODE <a href="../../FAQ/FAQformat.html#format12">narrowPeak</a> format. This format extends BED6 to include fields for signalValue, two measurements of statistical significance (pValue and qValue), and the offset of a single base 'point source' peak within the region. The dots are used for name and strand which are not applicable. </p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release12"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">DOWNLOAD ALL ENCODE DATA</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Is there a service providing ENCODE data on a hard drive? What is the total data volume? We have been trying FTP, but it takes too much bandwidth and time. 
<p> <b><font color="#006666">Response:</font></b><br> The total volume of ENCODE data is greater than 31 TB. Unfortunately, it is not possible for you to obtain a disk copy. However, there is a new protocol to try called UDR (UDT Enabled Rsync), which provides much faster download rates.</p> <p> Here is an example using UDR, once installed, to download all the mouse mm9 ENCODE information:</p> <pre class="code">$ udr rsync -avP hgdownload.soe.ucsc.edu::goldenPath/mm9/encodeDCC/ /my/local/mm9/</pre> <p> Please read more about the new UDR method <a href="../../ENCODE/newsarch.html#091213" target="_blank">here</a>.</p> <p> For those not downloading large amounts of data, we highly recommend using rsync. For example:</p> <pre class="code">$ rsync -a -P rsync://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeDir/wgEncodeFile ./</pre> <p> Using rsync has the advantage that, when run again after a failure, it picks up where it left off. </p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release13"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE PAPERS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Where can I find ENCODE papers? I would like a list of the principal ENCODE papers. Can you send a link to a list of the core 30 papers detailing ENCODE's results? <p> <b><font color="#006666">Response:</font></b><br> References to the ENCODE analysis publications of September 2012 can be found here: <a href="/ENCODE/analysis.html#tools" target="_blank">ENCODE Analysis Package Publications</a>. 
There is also a comprehensive set of ENCODE-related publications listed on the <a href="/ENCODE/pubs.html">Publications page</a> linked on the left side of the <a href="/ENCODE/">ENCODE portal page</a>. </p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release14"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">CONVERTING WIG FILES TO VARIABLESTEP</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Can I convert WIG files into a variableStep format to use with SitePro? I am trying to use a tool called SitePro within Cistrome. This tool uses WIG and BED files to compute score profiles on the BED regions. I have downloaded, through Cistrome/Galaxy, the ENCODE WIG files, which have a BED-like structure: </p> <p> chr1 3002700 3002800 0.17 </p> <p> However, this WIG file's BED-like structure is not accepted by SitePro. Is there a way to format the WIG files as variableStep and not BED-like? <p> <b><font color="#006666">Response:</font></b><br> There is no way to convert formats using the Genome Browser directly, but you can convert them with a script. There is an example script in our genomewiki, <a href="http://genomewiki.ucsc.edu/index.php/Wiggle_BED_to_variableStep_format_conversion">here</a>. 
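</p> <p> To illustrate what such a conversion involves, here is a minimal Python sketch. It is not the genomewiki script but an independent example, and it rests on assumptions: the input is whitespace-separated BED-like lines (chrom, start, end, value) sorted by position, starts are 0-based, and variableStep positions are 1-based. The function name is hypothetical. </p>

```python
def bedgraph_to_variable_step(lines):
    """Yield variableStep wiggle lines for BED-like (chrom start end value) input.

    A new declaration line is emitted whenever the chromosome or the
    span (end - start) changes, since a declaration fixes both.
    """
    current = None  # (chrom, span) of the most recent declaration line
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 4 or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        chrom, start, end, value = fields[0], int(fields[1]), int(fields[2]), fields[3]
        span = end - start
        if (chrom, span) != current:
            current = (chrom, span)
            yield f"variableStep chrom={chrom} span={span}"
        # Assumed conventions: BED-like starts are 0-based, wiggle positions 1-based.
        yield f"{start + 1} {value}"


for out_line in bedgraph_to_variable_step(["chr1 3002700 3002800 0.17"]):
    print(out_line)
```

<p> For the example line from the question, the sketch emits a declaration line with span=100 followed by the position 3002701 and the value 0.17. Check the output against the requirements of the tool you are using before relying on it.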
</p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release15"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">UNIQUE ENCODE DATA DETAILS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> What does xxx mean in a file in hgdownload/encodeDCC/hg19/wgEncode(track)? For example, downloadable files in the wgEncodeCaltechRnaSeq/ directory have a gene_id format like gene_id "GM12878-rep1.1045777", where the first part is the cell type. Do you know what the last number, 1045777, means? <p> <b><font color="#006666">Response:</font></b><br> Each download directory automatically displays a README.txt file at the top of the page. It provides a link to a user interface that lets you filter files by cell type and other parameters, and that includes additional information such as release status, restriction dates, track description, methods, and metadata that can answer such questions. </p> <p> For example, in the README.txt file displayed at the top of the page in the <a href="http://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/">Caltech RNA-seq directory</a> you can find the following link: "http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeCaltechRnaSeq" </p> <p> By navigating to the page above, <a href="../../cgi-bin/hgFileUi?db=hg19&g=wgEncodeCaltechRnaSeq">Caltech RNA-seq Downloadable Files</a>, you can scroll to the bottom (or click the "Description" link in the top right corner) and read the track description's "Methods" section. 
In the "Data Processing and Analysis" section there is information explaining that the numbers in gene_id, "GM12878-rep1.####", represent de novo identifiers output by the Cufflinks software. At the very bottom of the page is a "Credits" section where contacts are listed. You should send remaining process-specific questions about the data you are investigating to the appropriate contact listed. </p> </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release16"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE PROTOCOLS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> Which cell protocols were used in my track of interest? Did the Open Chromatin ENCODE tracks use standard ENCODE cell protocols? <p> <b><font color="#006666">Response:</font></b><br> Standard growth protocols were used for all ENCODE experiments, including the Open Chromatin ENCODE tracks. A directory of all ENCODE protocols is available here: <a href="/ENCODE/protocols" target="_blank">http://genome.ucsc.edu/ENCODE/protocols/</a>. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release17"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE GRAPHIC</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> May I use the ENCODE figure from your homepage? I am writing my PhD thesis and I would like to use it in both electronic and printed form. 
<p> <b><font color="#006666">Response:</font></b><br> The <a href="/ENCODE/aboutScaleup.html">ENCODE graphic</a> displaying how investigators employ a variety of assays and methods to identify functional elements can be used in publications as long as credit is given. Please credit Ian Dunham at EBI and Darryl Leja at NHGRI as noted below the image. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release18"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE TABLE SCHEMA</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How are the columns signalValue and peak calculated in narrowPeak files? For example, I want more information about UW Histone "wgEncodeUwHistone...PkRep1.narrowPeak" files. <p> <b><font color="#006666">Response:</font></b><br> The <a href="../../FAQ/FAQformat.html">File Format FAQ</a> provides explanations about various file formats. Also a file's related Track Description page may include important information, such as the <a href="../../cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUwHistone">UW Histone</a> "Methods section", which describes how data were processed to produce peaks. In the "Credits" section there is also a lab contact. To request further information for UW Histone data, for example, you could contact the lab to learn more about the peak calling algorithm or other methods involved. </p> <p> When using the <a href="../../cgi-bin/hgTables">Table Browser</a> there is a "describe table schema" button that gives information similar to that located in the File Format FAQ, plus the related Track Description. 
</p> <p> For example, with the settings "group: Regulation", "track: UW Histone", and "table: wgEncode...PkRep#", clicking the "describe table schema" button displays definitions for signalValue and peak. Scrolling down, you will find the related Track Description for UW Histone, with the explanation of peak calling under "Methods" and the laboratory contact under "Credits". </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release19"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE SOFTWARE TOOLS</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> How do I learn more about peak calling algorithms used to generate <a href="../../FAQ/FAQformat.html#format12">narrowPeak</a> and <a href="../../FAQ/FAQformat.html#format13">broadPeak</a> files? <p> <b><font color="#006666">Response:</font></b><br> An excellent resource to review is the ENCODE Software Tools page, located on the left-hand side of the <a href="/ENCODE/">ENCODE portal</a> under "Software Tools." Click through to <a href="/ENCODE/encodeTools.html">Software Tools Used to Create the ENCODE Resource</a>, where you can find references for the various peak calling algorithms under "ChIP-seq Peak Callers". <p> By visiting the description pages of ENCODE tracks such as <a href="../../cgi-bin/hgTrackUi?db=hg19&g=wgEncodeHaibTfbs">HAIB TFBS</a>, <a href="../../cgi-bin/hgTrackUi?db=hg19&g=wgEncodeSydhTfbs">SYDH TFBS</a>, or <a href="../../cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUwHistone">UW Histone</a>, you can learn more about the processes each lab used to generate peaks and pick a method suitable for your data. 
Since these data were not generated by the UCSC Browser group, questions about the data methods need to be directed to the corresponding lab. Under the "Credits" section you will find a contact for further questions left unanswered by reading the descriptions. </div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release20"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">ENCODE FILES</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> What program reads ".bb" TFBS files from ENCODE? I am interested in looking at the AWG TFBS data. I downloaded the files and one is called: spp.optimal.wgEncodeBroadHistoneGm12878CtcfStdAlnRep0_VS_wgEncodeBroadHistoneGm12878ControlStdAlnRep0.bb <p> However, I do not have a program that can open this file. What is the program for this file and where can I find it? <p> <b><font color="#006666">Response:</font></b><br> Files ending in ".bb" are <a href="../../FAQ/FAQformat.html#format1.5">bigBed</a> files. Click <a href="../../goldenPath/help/bigBed.html">here</a> for extensive information on the bigBed format and how to extract data with different binary utilities located in this <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">directory</a>. 
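<p> As a rough illustration of how a ".bb" download can be checked before choosing a utility, the sketch below reads the first four bytes of a file and compares them with the bigBed and bigWig signature values. The function name <code>sniff_bbi_format</code> is hypothetical, and the signature constants reflect our understanding of the bigBed/bigWig format description. Treat this as a minimal sketch under those assumptions, not as part of the UCSC utilities.

```python
# Signature ("magic number") values stored in the first four bytes of the file.
# Assumption: these constants follow the published bigBed/bigWig format description.
BIGBED_MAGIC = 0x8789F2EB
BIGWIG_MAGIC = 0x888FFC26


def sniff_bbi_format(path):
    """Return 'bigBed', 'bigWig', or 'unknown' based on the file's first four bytes."""
    with open(path, "rb") as handle:
        header = handle.read(4)
    if len(header) != 4:
        return "unknown"  # file too short to hold a signature
    # The signature may be written in either byte order, so try both.
    for byte_order in ("little", "big"):
        magic = int.from_bytes(header, byte_order)
        if magic == BIGBED_MAGIC:
            return "bigBed"
        if magic == BIGWIG_MAGIC:
            return "bigWig"
    return "unknown"
```

<p> If the function reports "bigBed", a utility such as bigBedToBed from the directory linked above can convert the file to plain-text BED for viewing in a text editor or spreadsheet.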
</div><!--end content--> </div><!--end wrapper--> <!--Content Tables-------------------------------------------------------> <a name="release21"></a> <!--outer table is for border purposes--> <div class="wrapper"> <td align="right"><a href="#FAQ"><img src="../../images/top.gif" alt="" align="right" border="0"></a> <div class="bar"><h4 class="title">HUB EXAMPLES</h4></div> <div class="content"> <p> <b><font color="#006666">Question: </font></b><br> I am making a public hub for my paper. Is there an example HTML file to use for my data description? <p> <b><font color="#006666">Response:</font></b><br> The browser's <a href="../../cgi-bin/hgHubConnect?" target="_blank">public hubs</a> provide excellent examples of hub documentation. Here are two examples of track description pages from the ENCODE Analysis hub: <p> <a href="http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformTfbs.html"> http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformTfbs.html</a><br> <a href="http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformRNA.html"> http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformRNA.html</a><br> <p> <b>Useful tips when writing your track descriptions:</b> <li>It is best to assume a broad audience of students as well as researchers. 
Spelling out common acronyms, for example, may be useful for those who are new to genomics.</li> <li>The paper's abstract may be a good start for your track's "Description" section.</li> <li>Provide as much detail as possible in the "Methods" section.</li> <li>An email address must be prominently displayed for questions relating to the track.</li> </p> <p> <b>Other Examples:</b><br> <p> Here are a few good examples of hub structure and configuration from the ENCODE Analysis hub: <p> <a href="http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt"> http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt</a><br> <a href="http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/genomes.txt"> http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/genomes.txt</a><br> <a href="http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt"> http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt</a><br> <p> Note: We recommend keeping the number of tracks visible by default in your trackDb.txt to a minimum, both to shorten hub loading time and to avoid overwhelming users. For more suggestions on hub structure, please see our <a href="http://genomewiki.soe.ucsc.edu/index.php/Public_Hub_Guidelines">Public Hub Guidelines</a> wiki page. For help with unfamiliar terms, see the Hub Track Database Definition's <a href="http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#toc" target="_blank">table of contents</a>. </p> <p> </div><!--end content--> </div><!--end wrapper--> <!--END FAQ Content Tables-------------------------------------------------------> <p class="date">Updated 15 August 2014</p> </body> </html>