fe2053eb5d1611d03b4c4f91e53aba4ac6e0178a
braney
Sat Sep 12 15:29:42 2020 -0700
placeholders for bigLolly help

diff --git src/hg/htdocs/goldenPath/help/bigLolly.html src/hg/htdocs/goldenPath/help/bigLolly.html
index 00ca533..acf8896 100755
--- src/hg/htdocs/goldenPath/help/bigLolly.html
+++ src/hg/htdocs/goldenPath/help/bigLolly.html
@@ -1,214 +1,122 @@
-
+
-

bigChain Track Format

bigLolly Track Format

-The bigChain format describes a pairwise alignment that allows gaps in both sequences simultaneously, -just as chain files do; however, bigChain files are compressed and indexed -as bigBeds. Chain files are converted to bigChain files using the program bedToBigBed, -run with the -as option to pull in a special -autoSql (.as) file -that defines the fields of the bigChain.

+The bigLolly format is a standard bigBed file that generates a lollipop graph. +bigLolly files are compressed and indexed as bigBeds. The trackDb defines the fields used to create +the graph.
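As a rough illustration of what such a trackDb entry might look like, the sketch below writes a hub-style stanza to a file. The setting names `lollyField` and `lollySizeField` are recalled from the UCSC trackDb documentation rather than taken from this page, so treat them as assumptions and check them there. The track name, labels, URL and output file name are hypothetical.

```shell
# Write a hypothetical trackDb stanza for a bigLolly track.
# ASSUMPTION: lollyField / lollySizeField are the settings that select the
# bigBed fields used for lollipop height and size; verify before relying on them.
cat > trackDb.example.txt <<'EOF'
track myLolly
type bigLolly
shortLabel My Lolly
longLabel Example lollipop track
bigDataUrl http://myorg.edu/mylab/bigLolly.bb
lollyField score
lollySizeField score
visibility full
EOF

# Show the stanza that was written.
cat trackDb.example.txt
```

The underlying file is an ordinary bigBed; only the stanza tells the browser to draw it as lollipops.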

-The bigChain files are in an indexed binary format. The main advantage of this format is that only +The bigLolly files are in an indexed binary format. The main advantage of this format is that only those portions of the file needed to display a particular region are transferred to the Genome -Browser server. Because of this, bigChain files have considerably faster display performance than -regular chain files when working with large data sets. The bigChain file remains on your local +Browser server. The bigLolly file remains on your local web-accessible server (http, https or ftp), not on the UCSC server, and only the portion needed for the currently displayed chromosomal position is locally cached as a "sparse file". If -you do not have access to a web-accessible server and need hosting space for your bigChain files, +you do not have access to a web-accessible server and need hosting space for your bigLolly files, please see the Hosting section of the Track Hub Help documentation.

- -

bigChain format definition

+ +

bigLolly format definition

-The following autoSql definition is used to specify bigChain pairwise alignment files. This -definition, contained in the file bigChain.as, will be -pulled in when the bedToBigBed utility is run with the -as=bigChain.as -option. - -

    table bigChain
-    "bigChain pairwise alignment"
-        (
-        string chrom;       "Reference sequence chromosome or scaffold"
-        uint   chromStart;  "Start position in chromosome"
-        uint   chromEnd;    "End position in chromosome"
-        string name;        "Name or ID of item, ideally both human readable and unique"
-        uint score;         "Score (0-1000)"
-        char[1] strand;     "+ or - for strand"
-        uint tSize;         "size of target sequence"
-        string qName;       "name of query sequence"
-        uint qSize;         "size of query sequence"
-        uint qStart;        "start of alignment on query sequence"
-        uint qEnd;          "end of alignment on query sequence"
-        uint chainScore;    "score from chain"
-        )

-Note that the bedToBigBed utility uses a substantial amount of memory: approximately -25% more RAM than the uncompressed BED input file.

+Any bigBed file can be displayed as a bigLolly.

Creating a bigChain track

-To create a bigChain track, follow these steps:

-Step 1. -If you already have a chain file you would like to convert to a bigChain, skip to Step 3. -Otherwise download this example -chain file for the human GRCh38 (hg38) assembly.

-Step 2. -Download these autoSql files needed by bedToBigBed: -bigChain.as and -bigLink.as.

-Step 3. -Download the bedToBigBed and hgLoadChain programs from the UCSC -binary utilities directory.

-Step 4. -Use the fetchChromSizes script from the -same directory to create a -chrom.sizes file for the UCSC database with which you are working (e.g., hg38). -Alternatively, you can download the -chrom.sizes file for any assembly hosted at UCSC from our -downloads page (click on "Full -data set" for any assembly). For example, the hg38.chrom.sizes file for the hg38 -database is located at -http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes.

Creating a bigLolly track

-Step 5. -Use the hgLoadChain utility to generate the chain.tab and link.tab -files needed to create the bigChain file:

hgLoadChain -noBin -test hg38 bigChain chr22_KI270731v1_random.hg38.mm10.rbest.chain

-Step 6. -Create the bigChain file from your input chain file using a combination of sed, -awk and the bedToBigBed utility: -

sed 's/.000000//' chain.tab | awk 'BEGIN {OFS="\t"} {print $2, $4, $5, $11, 1000, $8, $3, $6, $7, $9, $10, $1}' > chr22_KI270731v1_random.hg38.mm10.rbest.bigChain
-bedToBigBed -type=bed6+6 -as=bigChain.as -tab chr22_KI270731v1_random.hg38.mm10.rbest.bigChain hg38.chrom.sizes bigChain.bb

-Step 7. -To display your data in the Genome Browser, you must also create a binary indexed link file to -accompany your bigChain file:

awk 'BEGIN {OFS="\t"} {print $1, $2, $3, $5, $4}' link.tab | sort -k1,1 -k2,2n > bigChain.bigLink
-bedToBigBed -type=bed4+1 -as=bigLink.as -tab bigChain.bigLink hg38.chrom.sizes bigChain.link.bb

-Step 8. -Move the newly created bigChain (bigChain.bb) and bigLink (bigChain.link.bb) -files to a web-accessible http, https or ftp location.

-Step 9. -Construct a custom track using a single -track line. Note that any of the track attributes listed -here are applicable to tracks of type bigBed. The most basic -version of the track line will look something like this:

track type=bigChain name="My Big Chain" bigDataUrl=http://myorg.edu/mylab/bigChain.bb linkDataUrl=http://myorg.edu/mylab/bigChain.link.bb

-Step 10. -Paste the custom track line into the text box on the -custom track management page.

-The bedToBigBed program can be run with several additional options. For a full -list of the available options, type bedToBigBed (with no arguments) on the command line -to display the usage message.

- +To create a bigLolly track, follow these steps to build a bigBed here
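Since the steps themselves are still a placeholder here, the following is only a minimal sketch of building a bigBed from a small BED file, assuming a five-column BED whose score column would drive the lollipops. The file names and the two example rows are hypothetical, and the `bedToBigBed` call is skipped when the utility is not installed.

```shell
# Hypothetical input: two bed5 rows (chrom, start, end, name, score), unsorted.
printf 'chr1\t300\t301\tvarB\t40\nchr1\t100\t101\tvarA\t10\n' > lolly.bed
# Hypothetical chrom.sizes file with a single sequence.
printf 'chr1\t1000\n' > example.chrom.sizes

# bedToBigBed expects input sorted by chromosome, then start position.
sort -k1,1 -k2,2n lolly.bed > lolly.sorted.bed

# Convert only if the UCSC utility is available on this machine.
if command -v bedToBigBed >/dev/null 2>&1; then
    bedToBigBed -type=bed5 lolly.sorted.bed example.chrom.sizes lolly.bb
else
    echo "bedToBigBed not found; skipping conversion" >&2
fi
```

For real data, the chrom.sizes file would be the one for the assembly in use, such as hg38.chrom.sizes.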

Examples

Example #1

-In this example, you will create a bigChain custom track using an existing bigChain file, -bigChain.bb, located on the UCSC Genome Browser http server. This file contains data for +In this example, you will create a bigLolly custom track using an existing bigBed file, +bigBed.bb, located on the UCSC Genome Browser http server. This file contains data for the hg38 assembly.

-To create a custom track using this bigChain file: +To create a custom track using this bigLolly file:

Construct a track line that references the file:

track type=bigChain name="bigChain Example One" description="A bigChain file" bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigChain.bb linkDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigChain.link.bb

track type=bigLolly name="bigLolly Example One" description="A bigLolly file" bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigLolly.bb linkDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigLolly.link.bb

Paste the track line into the custom track management page for the human assembly hg38 (Dec. 2013).
Click the "submit" button.

Custom tracks can also be loaded via one URL line. -This link loads the same bigChain.bb track and sets additional display parameters in the URL:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr22_KI270731v1_random &hgct_customText=track%20type=bigChain%20name=Example %20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigChain.bb %20linkDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigChain.link.bb%20visibility=pack

+This link loads the same bigLolly.bb track and sets additional display parameters in the URL:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr22_KI270731v1_random &hgct_customText=track%20type=bigLolly%20name=Example %20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigLolly.bb %20linkDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigLolly.link.bb%20visibility=pack
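A link of this shape can be assembled by encoding the spaces of a track line as `%20` and appending the result to the `hgct_customText` parameter. The sketch below uses a shortened, illustrative track line rather than the exact one in the link, and it encodes spaces only, which is an assumption that holds for this simple line but not for arbitrary text.

```shell
# Illustrative track line (shortened; adjust to your own file).
track_line='track type=bigLolly name=Example bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigLolly.bb visibility=pack'

# Encode spaces as %20 so the line can travel inside a URL parameter.
encoded=$(printf '%s' "$track_line" | sed 's/ /%20/g')

# Assemble the hgTracks URL with database, position and the encoded track line.
url="http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr22_KI270731v1_random&hgct_customText=${encoded}"
printf '%s\n' "$url"
```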

-After this example bigChain is loaded in the Genome Browser, click into a chain on the browser's +After this example bigLolly is loaded in the Genome Browser, click into a chain on the browser's track display. Note that the details page displays information about the individual chains, similar to that which is available for a standard chain track.

Example #2

-In this example, you will create your own bigChain file from an existing chain input file.

+In this example, you will create your own bigLolly file from an existing chain input file.

Save this chain file to your - computer (Step 1 in Creating a bigChain track, above).

Step 1

Creating a bigLolly track

- Save the autoSql files bigChain.as and + Save the autoSql files bigLolly.as and bigLink.as to your computer (Step 2, above).
Download the bedToBigBed and hgLoadChain utilities (Step 3, above).
Save the hg38.chrom.sizes text file to your computer. This file contains the chrom.sizes for the human hg38 assembly (Step 4, above).
- Run the utilities in Steps 5-7, above, to create the bigChain and bigLink output + Run the utilities in Steps 5-7, above, to create the bigLolly and bigLink output files.
- Place the newly created bigChain (bigChain.bb) and bigLink - (bigChain.link.bb) files on a web-accessible server (Step 8).

bigLolly.bb

bigLolly.link.bb

Step 8

- Construct a track line that points to the bigChain file (Step 9, above).

Step 9

Create the custom track on the human assembly hg38 (Dec. 2013), and view it in the Genome Browser (Step 10, above).

Sharing your data with others

-If you would like to share your bigChain data track with a colleague, learn how to create a URL by +If you would like to share your bigLolly data track with a colleague, learn how to create a URL by looking at Example 6 on this page.

Extracting data from the bigChain format

Extracting data from the bigLolly format

-Because the bigChain files are an extension of bigBed files, which are indexed binary files, it can +Because the bigLolly files are an extension of bigBed files, which are indexed binary files, it can be difficult to extract data from them. UCSC has developed the following programs to assist in working with bigBed formats, available from the binary utilities directory.

bigBedToBed — converts a bigBed file to ASCII BED format.
bigBedSummary — extracts summary information from a bigBed file.
bigBedInfo — prints out information about a bigBed file.

As with all UCSC Genome Browser programs, simply type the program name (with no parameters) at the command line to view the usage statement.

Troubleshooting

If you encounter an error when you run the bedToBigBed program, check your input file for data coordinates that extend past the end of the chromosome. If these are present, run the bedClip program (available here) to remove the problematic row(s) in your input file before running the bedToBigBed program.
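Before reaching for bedClip, it can help to see which rows are out of range. The sketch below is one way to list them with awk, assuming a tab-separated BED file and a two-column chrom.sizes file. The file names and example rows are hypothetical.

```shell
# Hypothetical chrom.sizes and BED input; the second BED row runs past the end of chr1.
printf 'chr1\t1000\nchr2\t500\n' > example.chrom.sizes
printf 'chr1\t10\t20\ta\nchr1\t990\t1010\tb\nchr2\t0\t500\tc\n' > example.bed

# First file: remember each chromosome's size.
# Second file: print rows on unknown chromosomes or with chromEnd beyond the size.
awk 'BEGIN {FS = OFS = "\t"}
     NR == FNR {size[$1] = $2; next}
     !($1 in size) || $3 + 0 > size[$1] + 0' example.chrom.sizes example.bed > out_of_range.bed

cat out_of_range.bed

# The rows listed above are the ones bedClip would be expected to drop, e.g.:
#   bedClip example.bed example.chrom.sizes clipped.bed
```

An empty out_of_range.bed suggests the error comes from something other than coordinates past the chromosome end.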