5de72d67529fe096500e1009c53a83afaec25b96
kate
  Mon Apr 27 18:26:26 2020 -0700
Update GTEx Gene track description for V8. refs #25130

diff --git src/hg/makeDb/trackDb/human/gtexGeneV8.html src/hg/makeDb/trackDb/human/gtexGeneV8.html
new file mode 100644
index 0000000..901a6d3
--- /dev/null
+++ src/hg/makeDb/trackDb/human/gtexGeneV8.html
@@ -0,0 +1,176 @@
+<H2>Description</H2>
+<P>
+The
+<a target="_blank" href="https://commonfund.nih.gov/GTEx/index">NIH Genotype-Tissue Expression (GTEx) project</a>
+was created to establish a sample and data resource for studies on the relationship between 
+genetic variation and gene expression in multiple human tissues. 
+This track shows median gene expression levels in 52 tissues and 2 cell lines, 
+based on RNA-seq data from the GTEx final data release (V8, August 2019).
+This release is based on data from 17,382 tissue samples obtained post-mortem from 948 adult donors.</P>
+
+<H2>Display Conventions</H2>
+<P>
+In Full and Pack display modes, expression for each gene is represented by a colored bargraph,
+where the height of each bar represents the median expression level across all samples for a 
+tissue, and the bar color indicates the tissue.
+Tissue colors were assigned to conform to the GTEx Consortium publication conventions.
+<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img border='1' src="../images/gtex/gtexGeneTcap.png"><br>
+The bargraph display has the same width and tissue order for all genes.
+Hovering the mouse over a bar shows the tissue and its median expression level.
+The Squish display mode draws a rectangle for each gene, colored to indicate the tissue
+with the highest expression level if it contributes more than 10% to the overall expression
+(and colored black if no tissue predominates).
+In Dense mode, the darkness of the grayscale rectangle displayed for the gene reflects the total
+median expression level across all tissues.</p>
+<p>
+The GTEx transcript model used to quantify expression level is displayed below the graph,
+colored to indicate the transcript class 
+(<span style='color: #0c0c78'>coding</span>, 
+<span style='color: #006400'>noncoding</span>, 
+<span style='color: #FF33FF'>pseudogene</span>, 
+<span style='color: #FE0000'>problem</span>), 
+following GENCODE conventions.
+</p>
+<P>
+Click-through on a graph displays a boxplot of expression level quartiles with outliers, 
+per tissue, along with a link to the corresponding gene page on the GTEx Portal.</P>
+<P>The track configuration page provides controls to limit the genes and tissues displayed,
+and to select raw or log-transformed expression level display.</P>
+
+<H2>Methods</H2>
+<P>Tissue samples were obtained using the GTEx standard operating procedures for informed consent
+and tissue collection, in conjunction with the 
+<a target="_blank" href="https://biospecimens.cancer.gov/resources/sops/gtex.asp">
+National Cancer Institute Biorepositories and Biospecimen Research Branch</a>.
+All tissue specimens were reviewed by pathologists to characterize and
+verify organ source.
+Images from stained tissue samples can be viewed via the 
+<a target="_blank" href="https://brd.nci.nih.gov/brd/image-search/searchhome">
+NCI histopathology viewer</a>.
+The Qiagen PAXgene non-formalin tissue preservation product was used to stabilize 
+tissue specimens without cross-linking biomolecules.</P>
+<P>
+RNA-seq was performed by the GTEx Laboratory, Data Analysis and Coordinating Center 
+(LDACC) at the Broad Institute.
+The Illumina TruSeq protocol was used to create an unstranded polyA+ library, which was sequenced
+on the Illumina HiSeq 2000 and HiSeq 2500 platforms to produce 76-bp paired-end reads with a coverage
+goal of 50M reads (the median achieved was ~82M total reads).
+</P>
+<P>Sequence reads were aligned to the hg38/GRCh38 human genome using STAR v2.5.3a
+assisted by the GENCODE 26 transcriptome definition. 
+The alignment pipeline is available
+<a target="_blank" href="https://github.com/broadinstitute/gtex-pipeline/tree/master/rnaseq">here</a>.
+</p>
+<P>
+Gene annotations were produced using a custom isoform collapsing procedure that excluded
+retained-intron and read-through transcripts, merged overlapping exon intervals, and then excluded
+exon intervals overlapping between genes.
+Gene expression levels in TPM were called via the RNA-SeQC tool (v1.1.9), after filtering for 
+unique mapping, proper pairing, and exon overlap.
+For further method details, see the 
+<a target="_blank" href="https://gtexportal.org/home/documentationPage#staticTexAnalysisMethods">
+GTEx Portal Documentation</a> page.</P>
+<P>
+UCSC obtained the gene-level expression files, gene annotations and sample metadata from the 
+GTEx Portal Download page.
+Median expression level in TPM was computed per gene/per tissue.</P>
+
+<H2>Subject and Sample Characteristics</H2>
+<P>
+The scientific goal of the GTEx project required that the donors and their biospecimens 
+present with no evidence of disease. 
+The tissue types collected were chosen based on their clinical significance, logistical 
+feasibility and their relevance to the scientific goal of the project and the 
+research community. 
+<!--
+Postmortem samples were collected from non-diseased donors with ages ranging from 20 to 79. 34.4% of donors were female and 65.6% male. 
+<div> <img border=1 src='/images/gtex/gtexSampleRin.V6.png'></div>
+<p></p>
+<div><img border=1 src='/images/gtex/gtexSampleAge.V6.png'></div></p>
+<p>
+-->
+Summary plots of GTEx sample characteristics are available at the 
+<a target="_blank" href="https://gtexportal.org/home/tissueSummaryPage">
+GTEx Portal Tissue Summary</a> page.</p>
+
+
+<h2>Data Access</h2>
+<p>
+The raw data for the GTEx Gene expression track can be accessed interactively through the 
+<a href="hgTables?db=$db&hgta_track=$db&hgta_group=allTables&hgta_table=gtexGeneV8">
+Table Browser</a> or <a href="hgIntegrator">Data Integrator</a>. Metadata can be 
+found in the connected tables below.
+<ul>
+<li><strong><a 
+href="hgTables?db=$db&hgta_track=$db&hgta_group=allTables&hgta_table=gtexGeneModelV8">
+gtexGeneModelV8</a></strong> describes the gene names and coordinates in genePred format.</li> 
+<li><strong><a 
+href="hgTables?db=$db&hgta_track=hgFixed&hgta_group=allTables&hgta_table=hgFixed.gtexTissueV8">
+hgFixed.gtexTissueV8</a></strong> lists each of the 54 tissues in alphabetical order,
+corresponding to the comma-separated expression values in gtexGeneV8.</li>
+<li><strong><a 
+href="hgTables?db=$db&hgta_group=allTables&hgta_track=hgFixed&hgta_table=hgFixed.gtexSampleDataV8">
+hgFixed.gtexSampleDataV8</a></strong> has TPM expression scores for each individual gene-sample 
+data point, connected to gtexSampleV8.</li>
+<li><strong><a 
+href="hgTables?db=$db&hgta_group=allTables&hgta_track=hgFixed&hgta_table=hgFixed.gtexSampleV8">
+hgFixed.gtexSampleV8</a></strong> contains metadata about sample time, collection site,
+and tissue, connected to the donor field in the gtexDonorV8 table.</li>
+<li><strong><a 
+href="hgTables?db=$db&hgta_group=allTables&hgta_track=hgFixed&hgta_table=hgFixed.gtexDonorV8">
+hgFixed.gtexDonorV8</a></strong> has anonymized information on the tissue donor.</li></ul></p>
+<p>
+For automated analysis and downloads, the track data files can be downloaded from 
+<a href="https://hgdownload.soe.ucsc.edu/gbdb/$db/gtex/">our downloads server</a>
+or <a href="../goldenPath/help/api.html">the JSON API</a>.
+Individual regions or the whole genome annotation can be accessed as text using our utility
+<code>bigBedToBed</code>. Instructions for downloading the utility can be found 
+<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">here</a>. 
+That utility can also be used to obtain features within a given range, e.g. 
+<code>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/gtex/gtexGeneV8.bb -chrom=chr21
+-start=0 -end=100000000 stdout</code></p>
+<p>
+Data can also be obtained directly from GTEx at the following link:
+<a href="https://gtexportal.org/home/datasets" target=_blank>
+https://gtexportal.org/home/datasets</a></p>
+
+<H2>Credits</H2>
+<P>
+Statistical analysis and data interpretation were performed by The GTEx Consortium Analysis 
+Working Group. 
+Data was provided by the GTEx LDACC at The Broad Institute of MIT and Harvard.</P>
+
+<H2>References</H2>
+<p>
+GTEx Consortium.
+<a href="https://www.nature.com/ng/journal/v45/n6/full/ng.2653.html" target="_blank">
+The Genotype-Tissue Expression (GTEx) project</a>.
+<em>Nat Genet</em>. 2013 Jun;45(6):580-5.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23715323" target="_blank">23715323</a>; 
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4010069/" target="_blank">PMC4010069</a> </p>
+
+<p>
+Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS, Peter-Demchok J, Gelfand ET <em>et al</em>.
+<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/26484571/" target="_blank">
+A Novel Approach to High-Quality Postmortem Tissue Procurement: The GTEx Project</a>.
+<em>Biopreserv Biobank</em>. 2015 Oct;13(5):311-9.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/26484571" target="_blank">26484571</a>; 
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4675181/" target="_blank">PMC4675181</a></p>
+
+<p>Mel&#233; M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, Young TR, Goldmann JM,
+Pervouchine DD, Sullivan TJ <em>et al</em>.
+<a href="https://science.sciencemag.org/content/348/6235/660" target="_blank">
+Human genomics. The human transcriptome across tissues and individuals</a>.
+<em>Science</em>. 2015 May 8;348(6235):660-5.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/25954002" target="_blank">25954002</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4547472/" target="_blank">PMC4547472</a></p>
+
+<p>
+DeLuca DS, Levin JZ, Sivachenko A, Fennell T, Nazaire MD, Williams C, Reich M, Winckler W, Getz G.
+<a href="https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/bts196"
+target="_blank">
+RNA-SeQC: RNA-seq metrics for quality control and process optimization</a>.
+<em>Bioinformatics</em>. 2012 Jun 1;28(11):1530-2.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/22539670" target="_blank">22539670</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3356847/" target="_blank">PMC3356847</a></p>
+