src/hg/makeDb/trackDb/README 1.99
1.99 2010/02/03 23:25:31 kent
Clarifying some view info.
Index: src/hg/makeDb/trackDb/README
===================================================================
RCS file: /projects/compbio/cvsroot/kent/src/hg/makeDb/trackDb/README,v
retrieving revision 1.98
retrieving revision 1.99
diff -b -B -U 1000000 -r1.98 -r1.99
--- src/hg/makeDb/trackDb/README 25 Jan 2010 19:03:31 -0000 1.98
+++ src/hg/makeDb/trackDb/README 3 Feb 2010 23:25:31 -0000 1.99
@@ -1,965 +1,1008 @@
This directory contains the track database for
the UCSC human genome browser. The program
hgTrackDb converts the information here into
a 'trackDb' table in the hgN MySQL database.
SEE ALSO: source tree file: src/product/README.trackDb
for a general discussion of developing tracks.
http://genome.ucsc.edu/admin/hgFindSpecHowTo.html
for instructions on hgFindSpec.
The bulk of the information about a track is
contained in the trackDb.ra file. Tracks are
separated from each other in this file by
a blank line. Each line begins with an
attribute name and is followed by the attribute
value, separated by white space.
The html text about a track is in separate
files named xxx.html where xxx is a
track name. The contents of the HTML are subject
to variable substitution as described below.
Note that the html file in lower-level directories
can override the html in higher-level directories
without requiring an entry in trackDb.ra.
There may be subdirectories for each organism such
as 'mouse', and underneath these directories for
each database, such as 'hg7'. There are a trackDb.ra
file and xxx.html files in these subdirectories.
Track descriptions in the subdirectory will
over-ride track descriptions in the parent directory.
This allows you to customize a track for
each version of the database.
After editing a track definition here, run
hgTrackDb to bring it into the MySQL database. To update
your personal copy (on hgwdev-userName) do
make update
to do it on genome-test do
make alpha
You can also restrict a track to alpha or beta
by using the release field.
THE TRACK FIELD
Each entry in trackDb.ra should start with a track
field:
track trackName [override]
If override is specified, the entries are used in lower-level
trackDb.ra file to override fields in higher-level ones. Fields
not specified in the track override entry are not changed.
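As an illustrative sketch (the track name myTrack and the value are
hypothetical), a lower-level trackDb.ra, for example in a database
subdirectory, could change only the default visibility of a track that
is defined in a higher-level file:
    track myTrack override
    visibility dense
All other fields of myTrack keep the values from the higher-level entry.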
THE TYPE FIELD
One of the most important and complicated fields in the
trackDb.ra file is the type field. Here's an explanation
of that field. Explanations of several other less-than-obvious
fields follow.
Currently there are nineteen different track types:
axt, bed, chain, clonePos, ctgPos, expRatio, genePred,
maf, netAlign, psl, rmsk, sample, wigMaf, wig, bedGraph,
chromGraph, bigBed, bigWig, bam
(notation: <angle brackets> indicate required field,
[square brackets] indicate optional field)
The format of the type fields:
1. type axt <otherDb>
<otherDb> - other database this organism is aligned to
2. type bed <fieldCount> <extraInfo>
<fieldCount> - the number of standard fields
<extraInfo> - either '.' meaning no non-standard fields
or '+' if there are additional non-standard fields.
3. type chain <otherDb>
Alignment data
<otherDb> - other database this organism is aligned to
4. type clonePos
unique for Clone Coverage track on human assemblies
5. type ctgPos
unique for Physical Map Contigs track on human assemblies
6. type expRatio
DNA chip expression data
Nothing follows expRatio, but the following fields
must be defined elsewhere in the trackDb.ra record:
expTable - table in hgFixed with names of experiments etc.
expScale - maximum expression value
expColor - default coloring scheme: "redGreen", "redBlue", "yellowBlue",
"redBlueOnWhite", or "redBlueOnYellow"
expStep - amount to step in visible expression scale. Some
round number close to expScale/8 is best
expColorDense On - show avg. color in dense mode
chip - Name of microarray chip.
7. type genePred [pep_table] [mrna_table]
Gene prediction data, use '.' when a positional parameter must be skipped
[pep_table] - optional associated protein sequence table
[mrna_table] - optional associated representative mRNA table
[autoTranslate] - if set to 0 then translated protein won't be generated
in the details page.
Optional fields to be defined elsewhere in the trackDb.ra record:
itemAttrTbl - table used to color individual items in a track,
selected by name and locations.
8. type maf
A type of multiple alignment track, becoming obsolete,
to be replaced by wigMaf
9. type netAlign <otherDb> <otherDbChainTable>
<otherDb>
<otherDbChainTable>
Also requires an otherDb field to be defined in the trackDb.ra record
10. type psl <subtype> [otherDb]
<subtype> - one of: est, mrna, protein, xeno or .
Where . means regular human mRNA
[otherDb] - optional xeno subtype, database associated
with other organism. If present the display can be colored by
chromosome and the chromosome and position in kilobases is shown
as the item label.
Optional variable here, for psl tracks that also have sequence loaded.
The presence of this option in the trackDb entry enables its function.
pslSequence no # allows user to select the other two
pslSequence all # show nucleotide labels on all bases
pslSequence different # show nucleotide labels only on different bases
11. type rmsk
Unique type for Repeat Masker tracks
12. type sample [min] [max]
A continuous value graphing type of track.
Becoming obsolete, replaced by the 'type wig' tracks.
[min] - optional minimum limit of data, default 0. Used just for label.
[max] - optional maximum limit of data, default 1000. Used just for label.
13. type wigMaf <minVal> <maxVal>
A composite type of track to graph multiple and pairwise alignments.
track name is table name of the multiple alignment maf table,
with associated maf file in /gbdb (loaded with hgLoadMaf)
type line includes data value range for conservation wiggle,
as for wig track
standard wig track settings may be included (e.g.
maxHeightPixels, yLineOnOff, autoScale). These
apply to both conservation and pairwise wiggles
Settings in the trackDb.ra record:
wiggle - contains table names of conservation wiggles
(loaded with hgLoadWiggle)
in the format:
"wiggle <table1> <leftLabel1> <uiLabel1>... <tableN> <leftLabelN> <uiLabelN>"
where the first table is the default. The left label is used
to prefix the label 'Cons' in the left label area of the
conservation wiggle display. The UI label is displayed on
the trackUI page. If only one table is listed, and no label
is present, the default label "Conservation" will be displayed.
The labels cannot contain spaces -- underscores in the labels
are translated to spaces in the display.
speciesOrder - specifies the order of the pairwise items in the display
Each species is specified as in the MAF file
Embedded dots and/or spaces in organism names
are stripped and replaced with an underscore.
E.g. (C. elegans -> c_elegans).
speciesGroup - list of "clades" to group the species
This option is an alternative to speciesOrder, used when
there are many species. Each speciesGroup in the list must have its
own setting (sGroup_<group>), followed by a list of species,
specified as for speciesOrder.
speciesTree - the phylogenetic tree for the species represented
in the maf. This will be used for the "show shortest path" display
mode if "speciesTarget" is defined.
speciesTarget - the default species target for the "show shortest path"
mode of maf display. If this setting is present, then the default
maf display is "show shortest path".
treeImage - path from web server htdocs dir (normally /images)
to a file containing a picture of the phylogenetic tree for
this track
pairwise - contains suffix of the pairwise maf tables.
The prefix is generated from the species name, as specified above.
An underscore is used to separate (e.g. chimp_hmrg if
setting is "hmrg"). If there is a wig table named
<species>_<suffix>_wig, this is used for pairwise display.
In this case, the wiggle height for the pairwise can
be specified as the last word on this line.
(e.g. pairwise CFTR 20, will have 20 pixel height of
pairwise wiggle from tables named <species>_CFTR_wig).
pairwiseHeight N - sets pixel height of pairwise in full mode
summary - contains table name of maf summary table (used as
replacement for "pairwise" tables, above). A summary
table is created from a multiple alignment maf file
using the utility hgLoadMafSummary.
speciesDefaultOff - contains a list of species that are
not displayed in the track display unless explicitly
configured from the track config page.
itemFirstCharCase - this controls if species names in the multiple alignment
should be capitalized in the pairwise display.
mafFile - optional path to MAF file. If specified, this is used to
find a single MAF file for the track instead of looking up the
file in the extFile tables. Use hgLoadMaf -custom
when using this setting.
14. type wig [lower] [upper]
Continuous value graphing track.
[lower] - overall lower limit of the data, default 0.0
[upper] - overall upper limit of the data, default 127.0
trackDb record options:
autoScale on|off # default is off
gridDefault on|off # default is off (draw y=0.0 line)
maxHeightPixels max:default:min # default is 128:128:11
graphType bar|points # default is bar
viewLimits lower:upper # default is from the type line limits
yLineMark real-value # default is 0.0
yLineOnOff on|off # default is off (draw y=yLineMark line)
windowingFunction maximum|mean|minimum # default is mean
smoothingWindow off|[2-16] # default is off
wigColorBy <bed table> # use colors in bed for wiggle
# in overlapping regions
spanList s1,s2,s3... # list of spans in the loaded table
# you can find the spans by doing:
# "select span from <table> group by span"
# typically spanList is only one:
# spanList 1
# rarely it may be more:
# spanList 1,1000
# special efforts must be made to load extra spans
# into the table for special purposes.
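For illustration, a wig entry using a few of the options above might
look like the following. The track name, labels, and group are
hypothetical, and the values are examples rather than recommendations:
    track myWigTrack
    shortLabel My Wiggle
    longLabel Hypothetical continuous-value track
    group regulation
    priority 100
    visibility hide
    type wig 0.0 100.0
    autoScale off
    maxHeightPixels 128:32:11
    viewLimits 0.0:50.0
    windowingFunction mean
    spanList 1
Here the type line declares the overall data range as 0.0 to 100.0,
while viewLimits narrows the range that is displayed by default.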
15. type bedGraph [column]
Same type of graphing function as #14 above 'type wig'
In this case, the data table is a bed type of table loaded
with hgLoadBed. The [column] specified is a numeric column of data
in the table to be used for the graphing value. The default column
to graph would be column five, the 'score'. All graphing options as
described above in 'type wig' apply to this type of track.
Two extra options are used here to specify maximum graphing bounds:
minLimit <value> # default is 0
maxLimit <value> # default is 1000
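As a sketch (the track name is hypothetical and the labels are omitted),
a bedGraph entry that graphs the fourth column of its table and bounds
the graph at 0 to 50 could read:
    track myBedGraph
    type bedGraph 4
    minLimit 0
    maxLimit 50
The 'type wig' options, such as autoScale or maxHeightPixels, can be
added to the same entry.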
16. type chromGraph
This draws lines connecting sparse, variably spaced data points.
Use wiggle for regularly spaced or dense data. When using chromGraph
the following other settings may be used:
maxGapToFill <value> # default is 25000. No line will be drawn between
# data points further apart than this
maxHeightPixels <max:val:min> # default 100:32:8
# Specifies allowed and actual height in full mode.
minMax <min,max> # default is calculated from data in track
# Specifies displayed data range
linesAt <val1,val2,...> # Default is none
# If present labeled horizontal lines will be drawn
# at the given values in full mode.
17. type bigBed <fieldCount> <extraInfo>
This uses a binary indexed file rather than a database table. It is, other than
substituting "bigBed" for "bed" in the type line, the same as the "type bed"
explained above.
bigBed-specific optional field:
denseCoverage maxVal
In dense mode do a density plot based on maximum coverage seen under pixel. The maxVal
corresponds to the count at which it gets as dark as it can get. If maxVal is 0 then this
will be calculated from the data itself.
18. type bigWig [lower] [upper]
This uses a binary indexed file rather than a database table. It is, other
than substituting "bigWig" for "wig" in the type line, the same as the
"type wig" explained above. The database loading procedure is simply:
hgsql hg19 -e 'drop table if exists myLocalBigWig; \
create table myLocalBigWig (fileName varchar(255) not null); \
insert into myLocalBigWig values
("/gbdb/hg19/bbi/myLocalBigWig.bw");'
I believe this works even if the entry is a URL to a remote
bigWig file.
19. type bam
This uses a binary indexed file (which can be a URL) named in a database table.
See samtools.sourceforge.net for documentation of the SAM/BAM format.
In addition to the named .bam file/url, the index must be available at the
same path plus the ".bai" suffix.
BAM-specific settings:
pairEndsByName <placeholder e.g. .> # presence indicates paired-end alignments
pairSearchRange <N> # max distance between paired alignments,
# default 20,000 bases
bamColorMode strand|gray|tag|off # coloring method, default is strand
bamGrayMode aliQual|baseQual|unpaired # grayscale metric, default is aliQual
bamColorTag <XX> # optional tag for RGB color, default is "YC"
minAliQual <N> # display only items with alignment quality
# at least N, default 0
Additional general settings that may be useful to speed up display for extremely
dense data e.g. next-gen sequencing reads: maxWindowToDraw, chromosomes (see below)
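A hypothetical entry for paired-end reads, shown only as a sketch (the
track name, labels, group, and values are illustrative assumptions):
    track myBamReads
    shortLabel My Reads
    longLabel Hypothetical paired-end read alignments
    group regulation
    visibility hide
    type bam
    pairEndsByName .
    pairSearchRange 10000
    bamColorMode gray
    bamGrayMode aliQual
    minAliQual 10
    maxWindowToDraw 200000
With these settings, items with alignment quality below 10 are not
displayed, and nothing is drawn when the window is wider than 200000 bases.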
If something lacks a type field it needs to have custom
display routines. You can remove a track from the
browser by removing it from the .ra file and doing
a make update.
OTHER FIELDS
There are a number of generic attributes that can be used with any track type:
Attribute - Possible values - controls
========= =============== ========
visibility - hide dense squish pack full - default visibility
onlyVisibility - dense squish pack full - only this visibility and hide
- are possible for this track
maxWindowToDraw - a (large) positive number - if winEnd-winStart is larger, don't draw items
group - any "name" from the grp table - used to specify which group
e.g.: map, genes, rna, x - of track controls to place
regulation, compGeno, varRep - this track into
useScore - 1 - use score to shade color items
spectrum - on - same effect as useScore
thickDrawItem - on - keep width of bed item at least
- 3 pixels wide even at great
- zoom levels
color - r,g,b integer triplet - specifies primary color
- for items.
- red,green,blue values 0 to 255
altColor - r,g,b integer triplet - specifies secondary color
- for items
colorByStrand - r,g,b r,g,b - specifies plus and minus strand color as above
- first rgb is plus strand
- second rgb is minus strand
- this has no effect for elements w/out strand
priority - a decimal number - used to order this track
- within this track group
chromosomes - comma separated list - only these chroms have data
- for this track, this track is
- not shown on other chroms
metadata - space delimited name=val pairs - Purely informational. Gives additional information
- about a track which will be displayed in hgTrackUi
- and hgc. Especially useful for subtracks (see below)
boxedCfg - on - puts a box around setting controls, much like
- multi-view controls have.
scoreFilter - integer - default score filter value for a track
scoreFilterLimits - integer:integer - min:max range that score can take.
- (default 0:1000. Single value N implies N:1000)
scoreFilterByRange - on - Filter using both upper and lower bounds.
- (when used, set default bounds by 'scoreFilter N:M')
scoreFilterMax - integer - deprecated. Use scoreFilterLimits.
noScoreFilter . - to turn off Ui options for bed 5+ tracks, I don't know
- what the . is for, but it always appears to be used.
The shortLabel and longLabel fields and the associated HTML files
may have the following variables, which will be substituted:
$ORGANISM - all upper case organism, like 'MOUSE'
$Organism - initial capped organism, like 'Mouse'
$organism - all lower case organism, like 'mouse'
$db - database (like mm3, hg15, etc.)
$date - freeze date of underlying assembly
$blurb - If there is a blurb field in the .ra file this echos it.
$matrix - content of the matrix and optional matrixHeader trackDb setting
which will be converted to an HTML table. If there is no matrix
setting, an empty string is substituted.
$chainMinScore - value that gets substituted into this statement on the
chain or chainNet html page: 'Chains scoring below a minimum
score of "$chainMinScore" were discarded'.
$chainLinearGap - value for the -linearGap matrix used with axtChain
(e.g. loose, medium). Gets substituted into the chain or
chainNet html page.
$downloadsServer - the value of the hg.conf downloads.server variable, or
hgdownload.cse.ucsc.edu if not set.
In addition, if there is an $otherDb field set in the .ra file, these
variables are available:
$o_ORGANISM - all upper case other organism, like 'MOUSE'
$o_Organism - initial capped other organism, like 'Mouse'
$o_organism - all lower case other organism, like 'mouse'
$o_db - other database (like mm3, hg15, etc.)
$o_date - freeze date of underlying other assembly
Any other ra fields may be referenced as a variable.
The reference can be in the form $name or ${name}. Without
the braces, name is terminated by a character other than
[0-9A-Za-z_]. A literal $ is represented as $$.
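For illustration (a hypothetical fragment, not taken from an existing
track), consider these labels on a mouse database such as mm3:
    shortLabel $Organism ESTs
    longLabel ${organism}Est table of $db
Following the rules above, these should be displayed as "Mouse ESTs" and
"mouseEst table of mm3". The braces are needed in the second label
because $organismEst would otherwise be read as a reference to a
variable named organismEst.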
If there is a colorChromDefault field (values "on" or "off"), it
is used to set the default value for chromosome coloring
in affected tracks (type psl xeno <db>). Without the field,
the default is "on".
All ra attributes that are not explicitly handled in
trackDbCustom.c:trackDbAddInfo are added to the
trackDb->settingsHash table.
Some settings used by various tracks:
o exonArrows - if specified to draw strand arrows on exons.
o itemAttrTbl - table used to color individual items in a track, selected
by name and locations. Currently works for genePred tracks.
o url - If present puts up a link to an external URL on the details
page. The url includes everything else on the line after the first word.
The string $$ will be replaced by the name of an item. Examples:
url http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?form=4&db=n&term=$$
url http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?db=human&l=$$
Additional replacement item strings can be:
$T - database table name
$S - chromosome name (scaffold name on scaffold assemblies)
$[ - chromStart location (zero relative)
$] - chromEnd location
$s - chromosome name without chr prefix
(or without scaffold_/Scaffold_ on scaffold assemblies)
$D - database name
$P - item name portion before first : in name
$p - item name portion after first : in name up to next colon
o urlLabel - Works with url attribute. Replaces generic "outside link:" label
with the given label.
o idInUrlSql - Works with url attribute. A snippet of SQL with a %s in it somewhere.
The item name gets substituted in for the %s, and the SQL gets used in the current
database returning a single string, which is substituted into the url in place of $$.
o maxItems - Maximum number of items to be displayed individually in full
mode. If there are more items than this they will be drawn on top of
each other on the last line. In packed mode this refers to the number
of lines rather than number of items.
o directUrl - If present this will replace the "hgc" details page with
what's in the url. The URL is formatted as a printf line including
the following fields in this order:
%s - item name
%s - chromosome name
%d - chromosome start position
%d - chromosome end position
%s - track name
%s - database name
An example is:
directUrl /cgi-bin/hgGene?hgg_gene=%s&hgg_chrom=%s&hgg_start=%d&hgg_end=%d&hgg_type=%s&db=%s
Note that it is possible to only include the first field or first few
fields. The fields must be in the given order if present though.
o hgsid - if present the hgsid=XXXXXX CGI variable will be appended to the
end of the directUrl string.
o dataVersion - prints out as the Data version on the details page
(used chiefly by ENCODE)
o origAssembly - prints out a message on the details page, indicating
that the data was lifted from this assembly (used chiefly by ENCODE)
o release <alpha | beta> - restricts inclusion of a trackDb entry
in the database. TrackDb entries marked 'release beta' are
included in the trackDb database when 'make beta'
(or 'make strict') are used to create it. Those marked
'release alpha' are included when 'make alpha' (or 'make')
is used. This setting is used to avoid inadvertently changing
the configuration of an existing public track when it is
undergoing further development. It provides a temporary
development-only version of the trackDb. In this situation,
the developer should mark the existing trackDb entry (as on
the public server) 'release beta', and then add a parallel
entry marked 'release alpha' (the development version).
The existence of two entries should be noted in the pushQ.
Q/A will retire the duplicate entry and remove the release
labels when the track is published.
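A sketch of the parallel entries described above (the track name and
labels are hypothetical, and the remaining fields are omitted):
    track myTrack
    release beta
    shortLabel My Track
    longLabel My Track as currently on the public server

    track myTrack
    release alpha
    shortLabel My Track
    longLabel My Track with settings under development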
o html - specifies alternate filename for track description. Should only
be used when updating an existing track, in conjunction with the
'release alpha' setting (used chiefly by ENCODE).
o scoreMin, scoreMax - sets the range used to color the track if
colored by score. Allows use of the whole color range when data
has scores with a smaller range than 0-1000.
o minGrayLevel - sets a minimum color shade to display for a scored BED.
Range is 1-9 (indicates 11% - 99%).
o baseColorUseCds {genbank,given,table,none} - For genePred or psl+cds
(such as genbank mRNA, EST etc) tracks. Specifies where CDS coordinates
can be found (if any) so that codons can be drawn when viewing a
sufficiently small region. If `table' is specified, an additional
parameter of a table name, in cdsSpec format, is required.
o baseColorUseSequence
{genbank,seq,ss,extFile,nameIsSequence,seq1Seq2,hgPcrResult,lfExtra,none}
- For genePred or psl tracks, or bed/bigBed tracks.
Specifies where item sequence can be found (if any) so that
item sequence, or differences from genomic sequence, can be drawn when
viewing a sufficiently small region.
If `extFile' is specified, two additional parameters are required,
the name of the seq table followed by the name of the extFile
table to use in looking up the sequence.
These tables are loaded by hgLoadSeq.
If 'nameIsSequence' is specified then the 4th column ('name' or
'sequence') contains the sequence. (see hg/lib/encode/tagAlign.as)
If 'seq1Seq2' is specified then the 7th & 8th columns ('seq1' and
'seq2') contain the left and right pairs of the sequence.
(see hg/lib/encode/pairedTagAlign.as)
o baseColorDefault {genomicCodons,itemBases,itemCodons,diffBases,
diffCodons,none} - For tracks with CDS and/or sequence information.
(See baseColorUseSequence above)
Specifies the default drawing mode. itemBases, itemCodons, diffBases
and diffCodons are applicable only if the track has sequence
(baseColorUseSequence). genomicCodons, itemCodons and diffCodons
are applicable only if the track has CDS info (baseColorUseCds).
o baseColorTickColor {contrastingColor,lighterShade}
Choose a contrasting color (this is often white) or lighter shade
of color (should be the same color as would be chosen for the base
text if we were zoomed in to base level).
Only applies if baseColorDefault is set. Defaults to 'red'
(the CDS_STOP color) if this option is not supplied.
o indelDoubleInsert {on,off} - For psl tracks. If on, then highlight
alignment gaps in both target and query using double lines (like
chain tracks).
o indelPolyA {on,off} - For psl tracks that have sequence
(baseColorUseSequence). If on, then highlight an apparent valid
poly-a tail (a block of aaa's at the end of the item sequence, or
ttt's at the start, that are not aligned to the genome) by drawing
a vertical green line.
o indelQueryInsert {on,off} - For psl tracks that have sequence
(baseColorUseSequence). If on, then highlight an insert in the
query only (alignment gap in the target only) by drawing an
orange vertical line.
o showDiffBasesAllScales . - show base differences for PSL tracks at all
zoom levels.
o showDiffBasesMaxZoom basesPerPixel - only show base or codon
difference annotations for PSL tracks if currently zoomed at no more
than basesPerPixel (a float). showDiffBasesAllScales should also be
set to make this useful.
o showIndelMaxZoom basesPerPixel - only show PSL annotations if currently
zoomed at no more than basesPerPixel (a float). Setting to 0.0 disables
showing indels.
o showCdsAllScales . - show CDS for PSL tracks at all zoom levels.
o showCdsMaxZoom basesPerPixel - only show CDS for PSL tracks if currently
zoomed at no more than basesPerPixel (a float). showCdsAllScales should
be set and showDiffBasesMaxZoom should be set to a value not more
than showCdsMaxZoom to make this useful.
o nextItemButton off - For tracks that have next-item buttons that
shouldn't, turn the buttons off until whatever else is fixed.
on - This will override the user's UI settings and display the next-item
button for a track even if the user has specified that next/prev-item
navigation be disabled.
o defaultLinkedTables table1,table2,... - in hgTables, when selecting
output fields, display these all.joiner-linked tables by default.
o bedFilter - If the hgTrackUi call is using bedUi to support it and bedFilter
is set then the bed names will be filtered based on the settings entered by
the user on the hgTrackUi page.
Specific, optional settings used only by snp125, snp126, and so on:
o chimpMacaqueOrthoTable table - table contains orthologous
alleles from the reference assemblies of chimp and macaque.
o chimpOrangMacOrthoTable table - same as above, plus orangutan.
o {chimpDb,orangDb,macaqueDb} db - database used for the
respective species' alignments when generating the OrthoTable.
o codingAnnotations table1,table2,... - each table describes
changes in a particular gene set's coding sequence caused by SNPs.
o codingAnnoLabel_<tableN> - proper name of geneSet for tableN in
codingAnnotations' table list
o defaultGeneTracks table1,table2,... - hgc shows functional annotations
of SNPs relative to these genePred tracks by default.
o hapmapPhase {II,III} - older tracks used HapMap Phase II data
collected in a table "hapmapAllelesSummary". Newer tracks use
hapmapSnps* tables directly.
Note: We have a restriction on table names: Track names (i.e. table
names, but omitting any chr*_ prefix) should not contain any
underscores. We use underscores in track table names to find sequence
name prefixes.
Filter by
============
Score
-----
There are currently 4 different numerical filters for bed tables with the named fields:
score, signalValue, pValue and qValue
The field 'score' is an integer but the other three are floating point.
All should be declared with the field name followed by these 3 suffixes:
{fieldName}Filter n[:m]
{fieldName}FilterLimit N:M
{fieldName}FilterByRange on
Examples:
To filter a track by qValue you might set
qValueFilter 0.5
qValueFilterLimit 0:300
which sets up the limits of the filter as between 0 and 300 but offers the user a default value of 0.5.
Likewise
scoreFilter 500:700
scoreFilterLimit 200:1000
scoreFilterByRange on
sets up a filter on score with limits of from 200 to 1000, but offers from 500 to 700 as default.
NOTE: Filtering by range has so far only been tested in the 'score' filter.
Category
--------
Using the 'filterBy' setting you can set up a filter by 1 or more categories.
Categories are simply a list of values that appear in a drop down list box ("DD"). One or more
of these categories may be selected, to set up the filter. The setting format:
filterBy {field1}:{Title1}=[+]category1a,category1b,... [{field2}:{Title2}=[+]category2a,category2b,...]...
Here {field1} must be the name of a field in the table,
{Title1} is what the user will see as the title of the drop-down list box
[+] can be added to say that the table has numerical values (1,2,3...), while words are seen in the DD.
category1a,... a comma separated list of values that appear in the DD.
These values are found in the table unless the '+' is used.
Example:
filterBy level:Level=+Validated,Manual,Automatic class:Class=coding,snRNA,...
This sets up 2 filter by category drop downs. The first is on the table field 'level'. If the user selects
both 'Validated' and 'Manual', then an item must have a 1 or 2 in the level field in order to be seen.
And if the user also selects 'coding' in the second filter, an item would have to have level 1 or 2 and
class='coding'.
COMPOSITE TRACKS
================
Tracks that are somehow related can be grouped in a single "composite track",
using the "compositeTrack on" setting. The track name for the
composite track is a placeholder, that is referenced in entries for the
subtracks. The subtracks have a "subTrack" setting that references
the composite track name. The composite track entry contains all
normal settings for the common track type if there is one.
The subtrack track entries need contain only the short and long labels,
they may optionally have a color setting and a priority within the
composite track.
If the subtracks lack color settings, and if color is set on the composite
track, the subtracks will be displayed with different colors in
a gradient from color to altColor.
To hide a subtrack initially, append "off" to the end of the
subtrack setting.
To include a subtrack of a different type, specify
the complete settings for the subtrack type, and use the "noInherit on"
trackDb setting. NOTE: This is required for "Multi-View" (see below).
Also see "ClosestToHome" methods below.
For readability, please indent the subtrack entries in trackDb.ra by 4 spaces.
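A minimal sketch of a composite with two subtracks, using only the
settings described above (all track names and labels are hypothetical):
    track myComposite
    compositeTrack on
    shortLabel My Composite
    longLabel Hypothetical composite of two related tracks
    group regulation
    visibility hide
    type bed 3 .
    color 0,0,128
    altColor 128,0,0

        track myCompositeA
        subTrack myComposite
        shortLabel Sample A
        longLabel Hypothetical sample A
        priority 1

        track myCompositeB
        subTrack myComposite off
        shortLabel Sample B
        longLabel Hypothetical sample B
        priority 2
Because neither subtrack sets a color, they would be drawn in a gradient
from color to altColor. The "off" on myCompositeB hides it initially.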
Subtrack Groups
---------------
To allow flexibility in user configuration of subtrack
visibility, subtracks can be grouped. A subtrack group
(composite track subgroup) is defined with a trackDb setting,
in the format:
subGroup# name label tag1=title1 tag2=title2 ...
where # is assigned 1-N based on the number of groups.
The "name" and "tag" are identifiers used to tie subgroups together
in C and javascript code. They should have no special characters.
The "label" and "title" appear in the UI and are human readable.
Any "_" chars in a label or title will be replaced by " " on the
generated html page.
Subgroups for a subtrack are set as so:
subGroups name1=tag1 name2=tag2 ...
with the subgroup "name" associated with the one "tag" in that subgroup
that the subtrack actually belongs to. All subtracks should belong to
one tag in each of all the subgroups.
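As a sketch (the group names, tags, and titles are illustrative), a
composite track could define two subgroups:
    subGroup1 cellType Cell_Type ES=ES MEF=MEF
    subGroup2 factor Factor K4=H3K4me3 K9=H3K9me3
and one of its subtracks would then declare the tag it belongs to in
each subgroup:
        subGroups cellType=ES factor=K4
The label Cell_Type would appear as "Cell Type" on the generated html page.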
COMPOSITE SUBTRACK SELECTION - 4 ways
-------------------------------------
1) Select the check boxes for individual subtracks - no special settings needed
2) Add All [+][-] buttons - To the composite track add the setting below.
allButtonPair on
Not compatible with either of the next two methods
3) Matrix selection of subgroups. A 2 "dimension" grid of subgroups.
See MATRIX below.
4) Subgroup buttons. This happens if you have subgroups but no "allButtonPair"
or "dimensions" defined.
COMPOSITE SUBTRACK CONFIGURATION - 3 ways
-----------------------------------------
1) All subtracks have the same type.
trackDb cfg settings can be at composite level (noInherit off - default)
Optionally use "boxedCfg" to put a box around setting controls.
2) Subtrack settings have different types. "Multi-view" required (see below).
Cfg settings can be at level of composite, view, or subtrack.
3) Individual subtracks configurable - used in combination with 1 or 2.
"configurable on" setting, which can be at composite, view, or subtrack level.
Composite Track "ClosestToHome" paradigm
----------------------------------------
With a composite track with or without multi-view, there are many settings that
could be at the subtrack, view or composite level. The usual way of querying settings,
trackDbSetting(), will look for the setting in the subtrack first, then in the view,
and finally in the composite level. In cases where one only wants the settings at a
particular level, use trackDbLocalSetting() instead.
Composite Track ADDITIONAL OPTIONS
----------------------------------
MATRIX or DIMENSIONS
--------------------
Composite Tracks can be organized into a matrix of X and Y Dimensions of
subtracks. This method of organization also may include a "view" subGroup.
The organization is achieved with a dimensions setting which refers to
subgroups already defined. Example:
subGroup2 cellType Cell_Type ES=ES MEF=MEF NP=NP EShyb=ES-hybrid
subGroup3 factor Factor K4=H3K4me3 K9=H3K9me3 K20=H4K20me3 K27=H3K27me3 K36=H3K36me3 PAN=pan-H3 WCE=WCE RPOL=RPol-II
dimensions dimensionX=cellType dimensionY=factor
This will result in a composite track configuration page with a matrix of
check boxes with 'Cell Types' along the X axis and 'Factors' along the Y. All
individual subtracks will be displayed below, but the matrix of check boxes will
allow easy selection by subgrouping. The dimensions X&Y have no explicit defaults
(checked or unchecked) declared. This is because their state can be determined by
default settings of the tracks that belong to them.
NOTE: Additional dimensions 'A'-'W' and 'Z' (called 'ABC' dimensions) may be added.
However any 'ABC' dimensions will not be part of the matrix, but appear above it as
single rows of checkboxes. Unlike X&Y, any 'ABC' dimensions must have explicit
defaults set with the following:
dimensions dimensionX=cellType dimensionY=factor dimA=rep dimB=prot
dimensionAchecked rep1
dimensionBchecked protA,protB
This setting says that replicate 1 and protocols A and B are checked by default but
all other replicates and protocols are unchecked by default.
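For illustration, the subgroups that dimA and dimB refer to might be declared
as sketched below. The subGroup numbers, the labels, and the tags rep2 and protC
are assumptions invented for this sketch; only rep1, protA and protB come from
the example above.

    subGroup4 rep Replicate rep1=Replicate_1 rep2=Replicate_2
    subGroup5 prot Protocol protA=Protocol_A protB=Protocol_B protC=Protocol_C
    dimensions dimensionX=cellType dimensionY=factor dimA=rep dimB=prot
    dimensionAchecked rep1
    dimensionBchecked protA,protB

A subtrack would then name one tag from each subgroup, for example:

    subGroups factor=K4 cellType=ES rep=rep2 prot=protC

With the defaults above, the rep2 and protC checkboxes that this hypothetical
subtrack belongs to would start out unchecked.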
VIEWS or MULTI-VIEW
-------------------
A composite track may have subtracks holding similar data but with
different types or views. For instance, ChIP-seq tracks may cover raw aligns,
a signal track and called peaks. The aligns may be 'bed 3', peaks 'bed 5 +'
and signal 'wig'. These views need different controls, though all signal
subtracks may share the same controls (based upon subgroups). This organization
is achieved by inserting a "view" level of tracks in between the composite level and the
-subtracks. The view track needs to have a view tag. Settings that are common to all tracks in
-a view can also be put in the view level track. In addition a specialize specialized subgroup
-"views" needs to be put in the view or subtrack level. Example:
- subGroup1 view Views Hmm=Sites-HMM Win=Sites-Windowing Sig=Signal_Densities Aln=Alignments
+subtracks. Settings that are common to all tracks in a composite can be put at the
+composite level, settings that are common to all tracks in a view can be put in the view
+level, and individual track settings can be put at the lowest level. In addition
+the composite parent track needs to have a specialized subGroup1 setting 'view' and
+view must be contained in the subGroups setting at the lowest level. Example:
+
+ track compositeParent
+ shortLabel Composite Sample
+ subGroup1 view Views Sig=Signal_Densities Aln=Alignments
+
+ track sigViewTrack
+ view Sig
+ shortLabel Signal Densities
+ subTrack compositeParent
+
+ track brainSignalTrack
+ shortLabel Signals in Brain
+ subTrack sigViewTrack
+ subGroups view=Sig
+
+ track liverSignalTrack
+ shortLabel Signals in Liver
+ subTrack sigViewTrack
+ subGroups view=Sig
+
+ track alnViewTrack
+ view Aln
+ shortLabel Alignments
+ subTrack compositeParent
+
+ track brainAlignmentTrack
+ shortLabel Alignments in Brain
+ subTrack alnViewTrack
+ subGroups view=Aln
+
+ track liverAlignmentTrack
+ shortLabel Alignments in Liver
+ subTrack alnViewTrack
+ subGroups view=Aln
+
+Note that the view level track needs to have a view tag. There is some redundancy between
+information in the view track and the subGroup tag of the composite parent. Apologies. This
+is largely so that the same sorting machinery for non-view subGroups can be reused here. In
+particular the value of the view tag needs to be the same as the right hand side of the var=value pair
+of the subGroups setting, and the same as the left hand side of the var=val pair in the subGroup1 setting.
+
+Each view (Sig, Aln) will be controlled by a separate drop down
(hide,dense,...) and auxiliary controls. All subtracks for a given view must
be of the same track type. Each subtrack of a multi-view should have all the proper settings
for the subtrack type.
MULTI-VIEW VISIBILITY DEFAULTS
------------------------------
The setting
viewUi on
inside of a view, if present, will cause the view's configuration settings to be displayed by default.
MULTI-VIEW INDIVIDUAL SUBTRACK CONFIGURATION
--------------------------------------------
While in many cases all subtracks for a given view type should share the same
configuration settings (eg "Track height:"), in some cases it is desirable to
allow configuration of individual subtracks. To achieve this, add the following
setting to the subtrack (not to the composite track):
configurable on
It should be understood how individually configured settings work. Whenever
settings are configured by the user at the view level, that value will be set
for all subtracks. If a subtrack is then individually configured, the setting
will apply only to that subtrack. If the setting is then changed at the view
level again, the individual setting is lost.
SORT
----
The order of subtracks within a composite track is originally defined by the
"priority" setting. But the subtrack order can be made sortable with the
"sortOrder" setting. It defines the original sort order, but results in the
UI allowing the re-sorting of the subtracks, which will affect both the hgTrackUi
display and the hgTracks display of the track data. The sortOrder relies
upon subGroups:
subGroup1 view Views Hmm=Sites-Hmm Win=Sites-Windowing Sig=Signal_Densities Aln=Alignments
subGroup2 cellType Cell_Type ES=ES MEF=MEF NP=NP EShyb=ES-hybrid
subGroup3 factor Factor K04=H3K4me3 K09=H3K9me3 K20=H4K20me3 K27=H3K27me3 K36=H3K36me3
sortOrder cellType=+ factor=+ view=-
These settings result in an original order by cellType,factor,view, but view is
in reverse order. [why reverse order??] NOTE: the value sorted will be the subGroup Ids, NOT the values.
In the example, subGroup3 refers to factors with values displayed in the UI such as
"H3K9me3" and "H3K27me3". An alphabetic sort of these values would place H3K27me3
before H3K9me3. However the actual sort will be of the Ids K09 and K27, which will
result in a more desirable order. NOTE: At some time, sortOrder may be extended to
allow sorting by shortLabel and longLabel, but currently only subgroups are sortable.
NOTE: This setting is incompatible with dividers and hierarchy and supersedes those
settings.
DRAG AND DROP
-------------
Subtracks of a composite track may be reordered by dragAndDrop as well. To enable
this functionality add the "dragAndDrop subTracks" setting to the composite (parent)
track.
DIVIDERS
--------
In order to more easily recognize the organization of a large list of subtracks
in a composite configuration page, dividers can be placed between different
groups of subtracks. Example:
subGroup2 cellType Cell_Type ES=ES MEF=MEF NP=NP EShyb=ES-hybrid
subGroup3 factor Factor K4=H3K4me3 K9=H3K9me3 K20=H4K20me3 K27=H3K27me3 K36=H3K36me3 PAN=pan-H3 WCE=WCE RPOL=RPol-II
dividers cellType factor
This will place a simple dividing line anytime the cellType or factor changes
when listing the subtracks on the subtrack configuration page. In addition,
alternate groups will be displayed with slightly different background color.
NOTE: This setting is incompatible with and overridden by sortOrder.
HIERARCHY
---------
This setting allows one more means of visually organizing many subtracks on
the composite track configuration page. It is especially useful in easily
distinguishing views which are likely to be different within a single group
of subtracks. Example:
hierarchy view Hmm=0 Win=0 Sig=1 Aln=2
While a particular cell type and factor combination may be set off with dividers,
between the dividers the Hmm and Win type subtracks have no indentation, while
the signal track is indented by one and the alignment subtrack by two.
NOTE: This setting is incompatible with and overridden by sortOrder.
METADATA
--------
With many subtracks it can be important to give further distinguishing details.
In the subtracks list in hgTrackUi the long label can be followed by a "..."
which, when clicked will open a list of name=value pairs which have few requirements.
metadata name1=value1 name2="value 2" ...
Certain specific terms may carry special meanings in code:
(see controlledVocabulary below)
dateSubmitted,dateUnrestricted used by ENCODE
fileName is converted into a link to the downloads directory in wgEncode
dataVersion is shown separately from metadata
which is seen below.
CONTROLLED VOCABULARY
---------------------
A special feature of the whole-genome ENCODE consortium effort is that there
are standardized cellTypes, antibodies and other known entities that are
defined in a "controlled vocabulary". The definitions of the vocabulary will
be reachable from the hgTrackUi page but this special case must be defined in
the trackDb.ra file. To do this, use the controlledVocabulary setting. Its
contents will first refer to the cv.ra file to be referenced, then the
subGroups that are controlled vocabulary:
controlledVocabulary encode/cv.ra cellType factor
The term that is sought in the controlled vocabulary is defined in the metadata.
That is the following composite/subtrack settings
subGroup2 cellTreat Cell_Treatment ES=Estrogen_alone ESP=Estrogen_and_Progesterone
controlledVocabulary encode/cv.ra cellTreat=treatment cellType=cell factor=antibody
track yaleChipseqTreatmentsA43567
subGroups factor=CTCF cellType=K562 view=PK cellTreat=ESP
metadata cell=K562 antibody=CTCF treatment=estro-134/Pro-93a
will show "Estrogen and Progesterone" in the UI, but tie that to the specific treatment
term "estro-134/Pro-93a" in the controlled vocabulary file. A more complete definition
of what that term means, including a protocol, may be provided when the controlled
vocab term is looked up and presented to the user.
A special trick can be included in the label. In the example label if
"Estrogen_and_Progesterone" were replaced with "Estrogen and_Progesterone", then
only the first part of the label (Estrogen) would be represented as a link to the controlled
vocabulary.
EXAMPLE of Composite Track with ADDITIONAL OPTIONS
-------------------------------------------------
track broadChromatinChIPSeq
compositeTrack on
shortLabel Broad Chromatin ChIP-Seq
longLabel Broad Institute Chromatin State Mapping using ChIP-Seq
group regulation
subGroup1 view Views Hmm=Sites-HMM Win=Sites-Windowing Sig=Signal_Densities Aln=Alignments
subGroup2 cellType Cell_Type ES=ES MEF=MEF NP=NP EShyb=ES-hybrid
subGroup3 factor Factor K04=H3K4me3 K09=H3K9me3 K20=H4K20me3 K27=H3K27me3 K36=H3K36me3
dimensions dimensionX=cellType dimensionY=factor
sortOrder cellType=+ factor=+ view=-
dragAndDrop subTracks
#dividers cellType factor
#hierarchy view Hmm=0 Win=0 Sig=1 Aln=2
noInherit on
visibility hide
controlledVocabulary encode/cv.ra cellType=cell factor=antibody
priority 130
type bed 3
track broadChromatinChIPSeqViewHmm
shortLabel Sites-HMM
view Hmm
visibility pack
subTrack broadChromatinChIPSeq
track broadStemChipHmmSitesH3K4me3Es
shortLabel H3K4me3-ES HMM
longLabel Broad Stem Cell Chromatin IP Sites by HMM (H3K4me3 ab, Embryonic Stem (ES) cells)
subTrack broadChromatinChIPSeqViewHmm
subGroups factor=K04 cellType=ES view=Hmm
type bed 3
color 25,25,150
configurable on
metadata project=wgEncode grant=Bernstein lab=Broad dataType=ChipSeq cell=M9ES antibody=H3K4me3 softwareVersion="PeakSeq v2.1"
track broadChromatinChIPSeqViewWin
shortLabel Sites Windowing
view Win
viewUi on
visibility pack
subTrack broadChromatinChIPSeq
track broadStemChipWinSitesH3K4me3Es
shortLabel H3K4me3-ES Win
longLabel Broad Stem Cell Chromatin IP Sites by Windowing (H3K4me3 ab, Embryonic Stem (ES) cells)
subTrack broadChromatinChIPSeqViewWin
subGroups factor=K04 cellType=ES view=Win
type bed 5 +
useScore 1
color 25,25,150
configurable on
metadata project=wgEncode grant=Bernstein lab=Broad dataType=ChipSeq cell=M9ES antibody=H3K4me3 softwareVersion="Maq v2.1"
track broadChromatinChIPSeqViewSig
shortLabel Signal Densities
view Sig
visibility full
subTrack broadChromatinChIPSeq
track broadStemChipSignalH3K4Es
shortLabel H3K4me3-ES Sig
longLabel Broad Stem Cell Chromatin IP Signal (H3K4me3 ab, Embryonic Stem (ES) cells)
subTrack broadChromatinChIPSeqViewSig
subGroups factor=K04 cellType=ES view=Sig
type wig 0 35
yLineOnOff On
yLineMark 1.0
color 25,150,25
metadata project=wgEncode grant=Bernstein lab=Broad dataType=ChipSeq cell=M9ES antibody=H3K4me3 softwareVersion="Maq v2.1"
track broadChromatinChIPSeqViewAln
shortLabel Sites Alignments
view Aln
visibility hide
subTrack broadChromatinChIPSeq
track broadStemChipAlignmentsH3K4Me3Es
shortLabel H3K4me3-ES Align
longLabel Broad Stem Cell Chromatin IP Alignments (H3K4me3 ab, Embryonic Stem (ES) cells)
subTrack broadChromatinChIPSeqViewAln
subGroups factor=K04 cellType=ES view=Aln
type bed 9 +
useScore 1
metadata project=wgEncode grant=Bernstein lab=Broad dataType=ChipSeq cell=M9ES antibody=H3K4me3 softwareVersion="Maq v2.1, unique, mismatch<=2"
. . .
This example sets up a composite track configuration page with an X and Y
dimension matrix of check boxes for Cell Types by Factors, and a multi-view.
The subtracks will be sortable on either of the 2 dimensions and the view type.
Additionally, the subtracks can be dragged and dropped into a preferred order
by the user. Notice that dividers and hierarchy are commented out, since they
are incompatible and ignored when sort is enabled. Also notice that the
entire composite track defaults to 'hide' when the user first visits the
browser. However, if the user then sets this to 'full' then 2 of 4 types
of subtracks will be displayed as packed, the signal will be displayed as
full and the alignments subtracks will still be hidden. Also, the configurable
settings for "Sites-Windowing" will be displayed by default. Additionally all
of the subtracks are individually configurable except alignments. Notice this is
accomplished using "ClosestToHome" methods for 2 of the views (Hmm and Win), but
directly for the other 2 (Sig and Aln). The wigs have many default settings but
only the one wig shown has yLine. Notice this example defines a controlled
vocabulary for both cellTypes and factors with definitions found in the
encode/cv.ra file (off apache/cgi-bin/). (Don't be confused though, this
example mm8 composite track doesn't really have controlled vocabulary.)
Finally, metadata settings were added to provide further subtrack details.
---------------------
SUPER TRACKS
================
Tracks can be organized in configuration groupings called 'super tracks'.
A 'super track' is just a container for a group of related tracks that
can be made visible or hidden as a unit. There is an overall
description page displayed for the super track, and the super track
has a track control with special visibilities (hide/show) on the
hgTracks main page. The tracks which are contained in the super track
(super track members) do not have track controls on the hgTracks
main page. To define super tracks and their members, use the
'superTrack' setting as follows:
* Super track: 'superTrack on [show]'
* Member tracks: 'superTrack <super> [vis]'
The Super track entry is very limited -- just the
track name, labels, group, priority, and the superTrack setting.
The optional 'show' field indicates that the default visibility
for the super track is not hidden.
The Member track entry is a full trackDb entry -- either for a regular
track or a composite. The <super> field gives the track name of
the super track for this member. The optional 'vis' field
indicates the default visibility for this track _only_ when it's a
member of a supertrack. The normal visibility field applies
when the track is not in a super track. (Note that a track can be
configured to be viewable as a supertrack member or not -- it will
only be part of a super track if the super track to which it refers
is configured in the assembly trackDb).
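A minimal sketch of a super track and one member, following the two forms
above. The track names (exampleSuper, exampleMember), labels, group and
priorities are hypothetical and chosen only for illustration:

    track exampleSuper
    superTrack on show
    shortLabel Example Super
    longLabel Example super track grouping related tracks
    group regulation
    priority 100

    track exampleMember
    superTrack exampleSuper dense
    shortLabel Example Member
    longLabel Example member track of exampleSuper
    group regulation
    priority 100.1
    visibility hide
    type bed 3

Here 'show' makes the super track visible by default, and 'dense' is the
default visibility of exampleMember while it is a member of exampleSuper.
The 'visibility hide' line applies only when the track is not in a super track.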