4a6170904fe3901af94b7cf0494e9e991e40115e jcasper Wed Apr 22 03:39:05 2026 -0700 First pass at a faceted composite help doc, refs #36320 diff --git src/hg/htdocs/goldenPath/help/facetedComposite.html src/hg/htdocs/goldenPath/help/facetedComposite.html new file mode 100755 index 00000000000..ce3bf4e4eba --- /dev/null +++ src/hg/htdocs/goldenPath/help/facetedComposite.html @@ -0,0 +1,250 @@ + + + + + + + +

Faceted Composite Tracks

+ +

Overview

+

Composite tracks are a standard way to collect a large number of related tracks within +the browser and interact with them in a unified interface. For example, our Conservation tracks +are often organized as composites as are each of our "All GENCODE" tracks (see +here). +The standard user interface for these composite tracks presents a plain list of all of the +subtracks where each one can be configured individually. More structure can be added to that +interface by adding "Views" to the composite, which group subtracks together and provide a +track selection matrix. That system works for intermediate numbers of tracks (around 20-40), +but the matrix approach fails when the number of subtracks scales into the thousands. +Faceted composites provide an alternate user interface for composite tracks that is designed for +these situations.

+

In a faceted composite, the list of subtracks is presented as a plain list with a list of +facets and text filters that can be used to narrow down the view to tracks of interest. This +is particularly useful for data sets that include information on a wide variety of cell types, +for example, where only a few of them may be of interest for any particular user. Because the +focus is on simply helping users identify which subtracks are relevant to them, the subtrack +configuration options are reduced to "is this subtrack displayed or not". Users can then +alter the display of individual subtracks using the right-click Configure menu from the main +hgTracks browser display.

+

+ +

+ Heatmap track showing a color-coded grid of expression values across genomic positions +
+ +

Contents

+ +
Quick Start +
TrackDb Settings for Building a Faceted Composite
+
Troubleshooting
+ + +

Quick Start

+

+The TL;DR version of this is that a faceted composite is like any other composite, +except that you don't add views or subgroups. All tracks, whatever their mix of types, live under the same +composite parent. The metadata file (which must be web-accessible) describes the +facet data for the tracks. The composite's trackDb settings must include a "primaryKey" +setting that names one of the fields in the metadata file. Child tracks must have +names that match "<parent_name>_<primaryKey>", where <primaryKey> is that track's value +in the primaryKey field.
+Brief example:
+TrackDb entries
+

+  track myComposite
+  compositeTrack faceted
+  metaDataUrl https://url/to/metadata.tsv
+  primaryKey name
+  shortLabel Blood tests
+  longLabel Blood tests
+
+  track myComposite_ex1
+  parent myComposite
+  type bigBed
+  bigDataUrl https://url/to/ex1.bb
+  shortLabel ex1 peaks
+  longLabel ex1 Blood data peaks
+
+  track myComposite_ex2
+  parent myComposite
+  type bigBed
+  bigDataUrl https://url/to/ex2.bb
+  shortLabel ex2 peaks
+  longLabel ex2 Blood data peaks
+
+metadata.tsv
+
+name	collection_date	cell_type	lab
+ex1	2026-01-01	erythrocyte	Richter
+ex2	2026-01-03	erythrocyte	Helsing
+
+

+ + +

TrackDb Settings for Building a Faceted Composite

+

+Our trackDb documentation +includes details about settings relevant to faceted composites, but some of them +deserve a bit more exposition. +

+

+view and subGroups
+These settings are not used in faceted composites. Instead, the UI for a faceted +composite is governed by the dataTypes and metaDataUrl +settings. Most composite track needs can be addressed without using the +dataTypes setting at all, so we'll first consider the case where it +is not in use, and then a case where it might be helpful and what changes are +required. +

+In most situations, the desired user interface for a faceted composite track +presents a table where each row is a separate subtrack from the composite. The +user has full flexibility to decide which subtracks they want to see. +Clicking on individual rows adds them to the list of displayed subtracks; +clicking again deselects that track, removing it from the display. Facet filters +are provided to help narrow down the list interactively, as the list of subtracks +is often too long to easily scroll all the way through. +

+metaDataUrl
+In order to set up the facets, however, the track needs to include a description +of which facets exist and what the associated values for each track are. This +data comes from a separate web-accessible TSV (tab-separated value) file named +in the metaDataUrl setting of the track.
+Example:
+

+accession	tissue	protocol	treatment	_date	__count
+SRR11111	blood	ATAC-seq	control	2026-01-01	12
+SRR11112	blood	ATAC-seq	IFNg6h	2026-01-01	31
+SRR11113	spleen	ATAC-seq	control	2024-08-21	8
+SRR11114	spleen	ATAC-seq	IFNg6h	2026-08-22	17
+
+

+This data would be pasted into a file called something like "myTrackMetadata.tsv", +and it would be added to your faceted composite by adding

+
+metaDataUrl https://url/to/myTrackMetadata.tsv
+
+

+to the trackDb settings for the faceted composite track. Note two of the field names +in this example file: the "date" field begins with one +underscore, and the "count" field begins with two underscores. These prefixes +modify the facet interface for the track. By default, each field apart from +the primaryKey field will have an associated facet created for it on the page, +and a search box will be provided in the table header. When a field name begins with +one underscore, however, no facet will be created (a search box will still +be provided in the table header). When a field name begins with two underscores, +there will be no facet for it and no search box in the table header. +
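As a rough illustration of the prefix rules above, the following Python sketch sorts the columns of the example header into three groups. The function name `classify_fields` and the group names are hypothetical and are not part of the browser code. The sketch only restates the rules described here, and it skips the primaryKey column because no facet is created for it.

```python
import csv
import io

# First two lines of the example metadata file shown above.
SAMPLE_TSV = (
    "accession\ttissue\tprotocol\ttreatment\t_date\t__count\n"
    "SRR11111\tblood\tATAC-seq\tcontrol\t2026-01-01\t12\n"
)


def classify_fields(header, primary_key):
    """Group metadata columns by the UI controls the prefix rules imply.

    Hypothetical helper: returns (facet_fields, search_only_fields,
    no_control_fields). The primaryKey column is left out of all three.
    """
    facet_fields, search_only_fields, no_control_fields = [], [], []
    for field in header:
        if field == primary_key:
            continue
        if field.startswith("__"):
            # Two underscores: no facet and no search box.
            no_control_fields.append(field)
        elif field.startswith("_"):
            # One underscore: no facet, but still a search box.
            search_only_fields.append(field)
        else:
            # No prefix: a facet and a search box.
            facet_fields.append(field)
    return facet_fields, search_only_fields, no_control_fields


header = next(csv.reader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
facet_fields, search_only_fields, no_control_fields = classify_fields(
    header, "accession"
)
print(facet_fields, search_only_fields, no_control_fields)
```

The two-underscore test comes first because a name such as `__count` also starts with a single underscore.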

+dataTypes
+In the above examples, the assumption is that there is one track for each accession. +In some situations, however, there may be multiple tracks associated with each +accession in a formulaic way. For example, each accession could have a raw counts +bigWig track, a scaled counts bigWig track, and a peak calls bigBed track. +Instead of having three entries in the table that all share the same metadata +(one for each track), you can use the dataTypes setting to describe +which data types (raw counts, scaled counts, and peaks) are available for each +sample accession.
+When this setting is used, an additional selector is placed near the top of the +page to permit users to identify which data types they want to display. The +selected data types will be turned on for every selected sample in the table, +so the interface is a bit less flexible than the plain one-row-per-track table. +In this alternate setup, however, the one-row-per-sample arrangement can save +significant space both in the configuration UI and in the metadata TSV file.
+A very important note here: the rules for subtrack names change when +the dataTypes setting is active. Without dataTypes, subtrack names +are expected to match <parent track name>_<primary key>. An example +of that can be seen in the quick start near the top of the page. When +dataTypes are used, however, subtrack names are expected to match +<parent track name>_<primary key>_<dataType>. For example, +if the data types "signal" and "peaks" are in use for that same quick start +example, then the tracks look like this:
+

+  track myComposite
+  compositeTrack faceted
+  metaDataUrl https://url/to/metadata.tsv
+  primaryKey name
+  shortLabel Blood tests
+  longLabel Blood tests
+  dataTypes signal peaks
+
+  track myComposite_ex1_signal
+  parent myComposite
+  type bigWig
+  bigDataUrl https://url/to/ex1.bw
+  shortLabel ex1 signal
+  longLabel ex1 Blood data signal
+
+  track myComposite_ex1_peaks
+  parent myComposite
+  type bigBed
+  bigDataUrl https://url/to/ex1.bb
+  shortLabel ex1 peaks
+  longLabel ex1 Blood data peaks
+
+  track myComposite_ex2_signal
+  parent myComposite
+  type bigWig
+  bigDataUrl https://url/to/ex2.bw
+  shortLabel ex2 signal
+  longLabel ex2 Blood data signal
+
+  track myComposite_ex2_peaks
+  parent myComposite
+  type bigBed
+  bigDataUrl https://url/to/ex2.bb
+  shortLabel ex2 peaks
+  longLabel ex2 Blood data peaks
+
+

+metadata.tsv
+

+name	collection_date	cell_type	lab
+ex1	2026-01-01	erythrocyte	Richter
+ex2	2026-01-03	erythrocyte	Helsing
+
+
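The naming rule can be written out as a short Python sketch that lists the subtrack names expected for a given parent, set of primary key values, and optional list of data types. The helper `expected_subtrack_names` is hypothetical and only mirrors the two patterns described above.

```python
def expected_subtrack_names(parent, keys, data_types=None):
    """Return the subtrack names the naming rule calls for.

    Hypothetical helper. Without data types the pattern is
    <parent>_<key>; with data types it is <parent>_<key>_<dataType>.
    """
    if not data_types:
        return [f"{parent}_{key}" for key in keys]
    return [f"{parent}_{key}_{dt}" for key in keys for dt in data_types]


plain_names = expected_subtrack_names("myComposite", ["ex1", "ex2"])
typed_names = expected_subtrack_names(
    "myComposite", ["ex1", "ex2"], ["signal", "peaks"]
)
print(plain_names)
print(typed_names)
```

For the two-sample example this gives two names without dataTypes and four names with "signal" and "peaks", one per sample and data type.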

+One other note: sometimes you may wish to have more descriptive text than just "peaks" +or "signal" for the selector, but the better labels aren't compatible with being used +as part of a track name (maybe because they include spaces). This can be handled +by specifying each data type as <name>|"<label>". The "name" +value will be used to generate track names, while the label will be used for display.
+Example:
+

+  dataTypes signal|"Methylation signal (scaled)" peaks|"Highly methylated regions"
+
+
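To check which part of each entry ends up in track names and which part is shown to users, a setting value like the one above can be split with a few lines of Python. This is a sketch only: it assumes shell-style quoting (via the standard `shlex` module), which may differ from how the browser itself parses the setting, and `parse_data_types` is a hypothetical name. In particular, `shlex` also treats apostrophes as quotes, so labels containing them would need different handling.

```python
import shlex


def parse_data_types(value):
    """Split a dataTypes value into (name, label) pairs.

    Hypothetical helper. Entries without a |"label" part use the
    name as the label.
    """
    pairs = []
    for token in shlex.split(value):
        # shlex has already removed the double quotes around the label.
        name, sep, label = token.partition("|")
        pairs.append((name, label if sep else name))
    return pairs


labeled = parse_data_types(
    'signal|"Methylation signal (scaled)" peaks|"Highly methylated regions"'
)
unlabeled = parse_data_types("signal peaks")
print(labeled)
print(unlabeled)
```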

+subtrackUrls
+It can also be useful to have certain fields provide links out to external resources, +particularly when accessions are in use. The subtrackUrls setting describes +which fields are to be used to generate links out and what the format of those URLs should be. +Bringing back this example metadata file:
+

+accession	tissue	protocol	treatment	_date	__count
+SRR11111	blood	ATAC-seq	control	2026-01-01	12
+SRR11112	blood	ATAC-seq	IFNg6h	2026-01-01	31
+SRR11113	spleen	ATAC-seq	control	2024-08-21	8
+SRR11114	spleen	ATAC-seq	IFNg6h	2026-08-22	17
+
+

+It might be helpful to provide links out from the accession column to SRA, and the protocol +column to a description page of the protocol. This could be achieved by adding the following +subtrackUrls setting to the composite's trackDb block: +

+subtrackUrls accession=https://www.ncbi.nlm.nih.gov/sra/$$ protocol=https://www.protocols.io/view/$$
+
+

+For each of these URLs, $$ will be replaced with the relevant value from that field (whether one +of the SRR strings for the accession field, or "ATAC-seq" for the protocol field). +
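The substitution can be pictured with the following Python sketch, which splits the setting into one template per field and fills in `$$` for a single metadata row. The function names are hypothetical, and the sketch assumes a plain text replacement with no URL encoding, which is not confirmed here.

```python
SETTING = (
    "accession=https://www.ncbi.nlm.nih.gov/sra/$$ "
    "protocol=https://www.protocols.io/view/$$"
)


def parse_subtrack_urls(setting):
    """Map each field name to its URL template (hypothetical helper)."""
    templates = {}
    for item in setting.split():
        # Split on the first "=" only, so "=" inside a URL is preserved.
        field, _, template = item.partition("=")
        templates[field] = template
    return templates


def expand_links(templates, row):
    """Replace $$ with the row's value for every field that has a template."""
    return {
        field: template.replace("$$", row[field])
        for field, template in templates.items()
        if field in row
    }


row = {"accession": "SRR11111", "tissue": "blood", "protocol": "ATAC-seq"}
links = expand_links(parse_subtrack_urls(SETTING), row)
print(links)
```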

+ + +

Troubleshooting

+

+The most likely place to encounter problems when building a faceted composite is a mismatch +between the metadata TSV file and the subtrack names in the trackDb block. Check carefully +to ensure that the values in the primaryKey column match the names of the subtracks, +including capitalization. The hubCheck tool has not yet been updated to automate these +checks, but that work is in progress. +
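Until hubCheck covers this, a small script can do a first comparison by hand. The Python sketch below is not an official tool: the function names are hypothetical, the trackDb parsing is deliberately simplistic (it only reads `track` and `parent` lines), and it assumes that every metadata row is meant to have a matching subtrack. It reports expected names with no subtrack, and subtracks whose names match no metadata row.

```python
import csv
import io

# Small inline samples; the second subtrack has a capitalization mismatch.
TRACKDB = """\
track myComposite
compositeTrack faceted
primaryKey name

track myComposite_ex1
parent myComposite

track myComposite_Ex2
parent myComposite
"""

METADATA = "name\tcell_type\nex1\terythrocyte\nex2\terythrocyte\n"


def subtrack_names(trackdb_text, parent):
    """Collect names of tracks whose stanza has 'parent <parent>'."""
    names, current = [], None
    for line in trackdb_text.splitlines():
        words = line.split()
        if len(words) < 2:
            continue
        if words[0] == "track":
            current = words[1]
        elif words[0] == "parent" and words[1] == parent and current:
            names.append(current)
    return names


def check_names(trackdb_text, metadata_text, parent, primary_key, data_types=()):
    """Return (expected names with no subtrack, subtracks with no metadata row)."""
    rows = csv.DictReader(io.StringIO(metadata_text), delimiter="\t")
    keys = [row[primary_key] for row in rows]
    suffixes = [f"_{dt}" for dt in data_types] or [""]
    expected = {f"{parent}_{key}{suffix}" for key in keys for suffix in suffixes}
    found = set(subtrack_names(trackdb_text, parent))
    return sorted(expected - found), sorted(found - expected)


missing, unexpected = check_names(TRACKDB, METADATA, "myComposite", "name")
print("expected but not found:", missing)
print("found but not expected:", unexpected)
```

In this sample the mismatch is only in capitalization ("Ex2" versus "ex2"), so the same sample shows up once in each list.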

+ +