2bd5a55547bf3e1c1b95622b6bb7dcd7cb4d2369 max Wed Oct 13 07:31:15 2021 -0700
finishing up gtex rna-seq coverage track, refs #27964

diff --git src/hg/makeDb/trackDb/human/gtexCov.html src/hg/makeDb/trackDb/human/gtexCov.html
new file mode 100644
index 0000000..8f250b7
--- /dev/null
+++ src/hg/makeDb/trackDb/human/gtexCov.html
@@ -0,0 +1,144 @@
+<H2>Description</H2>
+<P>
+The
+<a target="_blank" href="https://commonfund.nih.gov/GTEx/index">
+NIH Genotype-Tissue Expression (GTEx) project</a>
+determined genetic variation and gene expression in 52 tissues and 2 cell lines
+using RNA-seq data (V8, August 2019) on 17,382 samples from 948 adults.
+This track focuses on the gene expression part. It shows read coverage from a
+single sample per tissue, selected for high quality and high read depth.
+The data are summarized as one number per base pair: the number of sequencing
+reads that cover that position. The plot shows whether a given exon is
+transcribed primarily in certain tissues and whether transcription is
+uniform over the length of a single exon.
+</P>
+
+<H2>Display Conventions</H2>
+<P>
+This track follows the display conventions for composite
+"wiggle" tracks. The subtracks of this track, one per tissue,
+may be configured in a variety of ways to highlight different aspects of the
+displayed data. The graphical configuration options are shown at the top of
+the track description page, followed by a list of subtracks. To display only
+selected subtracks, uncheck the boxes next to the tracks you wish to hide.
+For more information about the graphical configuration options, click the
+<A HREF="../goldenPath/help/hgWiggleTrackHelp.html" TARGET=_blank>Graph
+configuration help</A> link.
+Tissue colors were assigned to conform to the GTEx Consortium publication conventions.
+</P>
+
+<p>In Dense mode, the darkness of the grayscale rectangle displayed for the gene reflects the absolute
+read count.
+</p>
+
+<H2>Methods</H2>
+<p>For background information about GTEx sample selection, see our
+<a href="hgGtexTrackSettings?g=gtexGeneV8" target=_blank>GTEx gene expression
+track</a>. In short, samples were sequenced with the Illumina TruSeq protocol
+on unstranded polyA+ libraries to obtain 76-bp paired-end reads on
+HiSeq 2000 and 2500 machines.</p>
+
+<p>
+Sequence reads were aligned to the hg38/GRCh38 human genome using STAR v2.5.3a
+and the GENCODE 26 transcriptome.
+The alignment pipeline is available
+<a target="_blank" href="https://github.com/broadinstitute/gtex-pipeline/tree/master/rnaseq">here</a>.
+For further method details, see the
+<a target="_blank" href="https://gtexportal.org/home/documentationPage#staticTexAnalysisMethods">
+GTEx Portal Documentation</a> page.
+</p>
+
+<P>
+To obtain read coverage, the GTEx Laboratory, Data Analysis and Coordinating
+Center (LDACC) at the Broad Institute selected a single, high-quality
+representative sample for each tissue type, since aggregated tracks may
+obscure certain features or even introduce artifacts (e.g. intronic
+coverage). For each tissue, the selected sample has the highest RIN (RNA integrity number)
+value among samples with high coverage (&gt;80M reads) and a high exonic rate (&gt;85%).
+The alignment-to-coverage pipeline is available from GitHub:
+<a target="_blank" href="https://github.com/broadinstitute/gtex-pipeline/blob/master/rnaseq/src/bam2coverage.py">Python script</a>,
+<a target="_blank" href="https://github.com/broadinstitute/gtex-pipeline/blob/master/rnaseq/Dockerfile">Dockerfile</a> and
+<a target="_blank" href="https://github.com/broadinstitute/gtex-pipeline/blob/master/rnaseq/bam2coverage.wdl">Pipeline WDL description</a>.
+</p>
+<p>To see exactly which GTEx sample was used for each tissue,
+click the "Schema" link on the track configuration page (above); the filename
+under "bigDataUrl" includes the sample identifier.</p>
+
+<H2>Subject and Sample Characteristics</H2>
+<P>
+The scientific goal of the GTEx project required that the donors and their biospecimens
+present with no evidence of disease.
+The tissue types collected were chosen based on their clinical significance, logistical
+feasibility and their relevance to the scientific goal of the project and the
+research community.
+Summary plots of GTEx sample characteristics are available at the
+<a target="_blank" href="https://gtexportal.org/home/tissueSummaryPage">
+GTEx Portal Tissue Summary</a> page.</p>
+
+<h2>Data Access</h2>
+<p>
+The raw data for the GTEx Read Coverage track can be accessed interactively through the
+<a href="hgTables?db=$db&hgta_track=$db&hgta_group=allTables&hgta_table=gtexCov">Table Browser</a>.
+</p>
+<p>
+For automated analysis, the track data files can be downloaded from
+<a href="https://hgdownload.soe.ucsc.edu/gbdb/$db/gtex/cov/">our downloads server</a>
+or accessed through <a href="../goldenPath/help/api.html">the JSON API</a>.
+Individual regions or the whole genome annotation can be accessed as text using our utility
+<code>bigBedToBed</code>. Instructions for downloading the utility can be found
+<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">here</a>.
+That utility can also be used to obtain features within a given range, e.g.
+<code>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/gtex/gtexGeneV8.bb -chrom=chr21
+-start=0 -end=100000000 stdout</code></p>
+<p>
+Data can also be obtained directly from GTEx at the following link:
+<a href="https://gtexportal.org/home/datasets" target=_blank>
+https://gtexportal.org/home/datasets</a></p>
+
+<H2>Credits</H2>
+<P>
+Statistical analysis and data interpretation were performed by the GTEx Consortium Analysis
+Working Group.
+Data were provided by the GTEx LDACC at the Broad Institute of MIT and Harvard.</P>
+
+<H2>References</H2>
+<p>
+GTEx Consortium.
+<a href="https://www.researchgate.net/publication/336246636_The_GTEx_Consortium_atlas_of_genetic_regulatory_effects_across_human_tissues" target="_blank">
+The GTEx Consortium atlas of genetic regulatory effects across human tissues</a>.
+<em>In press.</em></p>
+
+<p>
+GTEx Consortium.
+<a href="https://www.nature.com/ng/journal/v45/n6/full/ng.2653.html" target="_blank">
+The Genotype-Tissue Expression (GTEx) project</a>.
+<em>Nat Genet</em>. 2013 Jun;45(6):580-5.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23715323" target="_blank">23715323</a>;
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4010069/" target="_blank">PMC4010069</a></p>
+
+<p>
+Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS,
+Peter-Demchok J, Gelfand ET <em>et al</em>.
+<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/26484571/" target="_blank">
+A Novel Approach to High-Quality Postmortem Tissue Procurement: The GTEx Project</a>.
+<em>Biopreserv Biobank</em>. 2015 Oct;13(5):311-9.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/26484571" target="_blank">26484571</a>;
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4675181/" target="_blank">PMC4675181</a></p>
+
+<p>Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, Young TR, Goldmann JM,
+Pervouchine DD, Sullivan TJ <em>et al</em>.
+<a href="https://science.sciencemag.org/content/348/6235/660" target="_blank">
+Human genomics. The human transcriptome across tissues and individuals</a>.
+<em>Science</em>. 2015 May 8;348(6235):660-5.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/25954002" target="_blank">25954002</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4547472/" target="_blank">PMC4547472</a></p>
+
+<p>
+DeLuca DS, Levin JZ, Sivachenko A, Fennell T, Nazaire MD, Williams C, Reich M, Winckler W, Getz G.
+<a href="https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/bts196"
+target="_blank">
+RNA-SeQC: RNA-seq metrics for quality control and process optimization</a>.
+<em>Bioinformatics</em>. 2012 Jun 1;28(11):1530-2.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/22539670" target="_blank">22539670</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3356847/" target="_blank">PMC3356847</a></p>
+