be4311c07e14feb728abc6425ee606ffaa611a58 markd Fri Jan 22 06:46:58 2021 -0800 merge with master diff --git src/hg/makeDb/trackDb/human/exomeProbesets.html src/hg/makeDb/trackDb/human/exomeProbesets.html new file mode 100755 index 0000000..3331019 --- /dev/null +++ src/hg/makeDb/trackDb/human/exomeProbesets.html @@ -0,0 +1,275 @@ +<h1>Description</h1> +<p> +This set of tracks shows the genomic positions of <b>probes</b> and <b>targets</b> from a full +suite of in-solution-capture target enrichment exome kits for <b>Next Generation Sequencing (NGS)</b> +applications. Also known as <b>exome sequencing</b> or <b>whole exome sequencing (WES)</b>, +this technique allows high-throughput parallel sequencing of all exons (e.g., the coding regions of genes, +which affect protein function), constituting about 1% of the human genome, or approximately 30 +million base pairs. +</p> +<p> +The tracks are intended to show the major differences in target genomic regions between the +different exome capture kits from the major players in the NGS market: +<a target=blank href="https://www.illumina.com"><b>Illumina Inc.</b></a>, +<a target=blank href="https://www.roche.com"><b>Roche NimbleGen Inc.</b></a>, +<a target=blank href="https://www.agilent.com"><b>Agilent Technologies Inc.</b></a>, +<a target=blank href="https://en.mgi-tech.com"><b>MGI Tech</b></a>, +<a target=blank href="https://www.twistbioscience.com"><b>Twist Bioscience</b></a>, and +<a target=blank href="https://www.idtdna.com"><b>Integrated DNA Technologies Inc.</b></a>.
+</p> + +<h1>Display Conventions and Configuration</h1> + +<p> +Items are shaded according to manufacturing company: +<ul> +<li><b><font color="#FFB000">IDT (Integrated DNA Technologies)</font></b></li> +<li><b><font color="#FE6100">Twist Bioscience</font></b></li> +<li><b><font color="#DC267F">MGI Tech (Beijing Genomics Institute)</font></b></li> +<li><b><font color="#648FFF">Roche NimbleGen</font></b></li> +<li><b><font color="#785EF0">Agilent Technologies</font></b></li> +<li><b><font color="#163EA4">Illumina</font></b></li> +</ul> +</p> + +<p> +Tracks labeled as <em><b>Probes (P)</b></em> indicate the footprint of the oligonucleotide probes +mapped to the human genome. This is the region that is technically targeted by the assay. However, +the sequenced region will be larger than this, since flanking sequences are sequenced as well. +Tracks labeled as <em><b>Target Regions (T)</b></em> indicate the genomic regions targeted by the +assay. This is the biologically relevant target region. It is not guaranteed that all targeted regions +will be sequenced perfectly; there may be capture bias at certain locations. The Target +Regions are those normally used for coverage analysis. +</p> + +<h1>Methods</h1> + +<p> +Capture of the genomic regions of interest using <b>in-solution capture</b> is achieved +through the hybridization of a set of probes (oligonucleotides) with a sample of fragmented genomic +DNA in a solution environment. The probes hybridize selectively to the genomic regions of interest +which, after a process that excludes the non-selected DNA material, can be pulled down and +sequenced, enabling selective DNA sequencing of the genomic regions (e.g. exons) of interest. +In-solution capture sequencing is a sensitive method to detect single nucleotide variants, +insertions and deletions, and copy number variations.
+</p> + +<style> +#kit, #kit table, #kit th, #kit td { + border: 1px solid black; + border-collapse: collapse; + padding: 2px; +} +</style> + +<table id="kit" width=74%> + <tr> + <th>Kit</th> + <th>Targeted Region</th> + <th>Databases Used for Design</th> + <th>Year of Release</th> + </tr> + <tr> + <td>IDT - xGen Exome Research Panel V1.0</td> + <td>39 Mb</td> + <td>Coding sequences from RefSeq (19,396 genes)</td> + <td>2015</td> + </tr> + <tr> + <td>IDT - xGen Exome Research Panel V2.0</td> + <td>34 Mb</td> + <td>Coding sequences from RefSeq 109 (19,433 genes)</td> + <td>2020</td> + </tr> + <tr> + <td>Twist - RefSeq Exome Panel</td> + <td>3.6 Mb</td> + <td>Curated subset of protein coding genes from CCDS</td> + <td>N/A</td> + </tr> + <tr> + <td>Twist - Core Exome Panel</td> + <td>33 Mb</td> + <td>Protein coding genes from CCDS</td> + <td>N/A</td> + </tr> + <tr> + <td>Twist - Comprehensive Exome Panel</td> + <td>36.8 Mb</td> + <td>Protein coding genes from RefSeq, CCDS, and GENCODE</td> + <td>2020</td> + </tr> + <tr> + <td>MGI - Easy Exome Capture V4</td> + <td>59 Mb</td> + <td>CCDS, GENCODE, RefSeq, and miRBase</td> + <td>N/A</td> + </tr> + <tr> + <td>MGI - Easy Exome Capture V5</td> + <td>69 Mb</td> + <td>CCDS, GENCODE, RefSeq, miRBase, and MGI Clinical Database</td> + <td>N/A</td> + </tr> + <tr> + <td>Agilent - SureSelect Clinical Research Exome</td> + <td>54 Mb</td> + <td>Disease-associated regions from OMIM, HGMD, and ClinVar</td> + <td>2014</td> + </tr> + <tr> + <td>Agilent - SureSelect Clinical Research Exome V2</td> + <td>63.7 Mb</td> + <td>Disease-associated regions from OMIM, HGMD, ClinVar, and ACMG</td> + <td>2017</td> + </tr> + <tr> + <td>Agilent - SureSelect Focused Exome</td> + <td>12 Mb</td> + <td>Disease-associated regions from HGMD, OMIM, and ClinVar</td> + <td>2016</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V4</td> + <td>51 Mb</td> + <td>Coding regions from CCDS, RefSeq, GENCODE v6, miRBase v17, TCGA v6, and UCSC known genes</td> +
<td>2011</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V4 + UTRs</td> + <td>71 Mb</td> + <td>Coding regions and 5' and 3' UTR sequences from CCDS, RefSeq, and GENCODE v6, regions from miRBase v17, TCGA v6, and UCSC known genes</td> + <td>2011</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V5</td> + <td>50 Mb</td> + <td>Coding regions from RefSeq, GENCODE, UCSC, TCGA, CCDS, and miRBase (21,522 genes)</td> + <td>2012</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V5 + UTRs</td> + <td>74 Mb</td> + <td>Coding regions and 5' and 3' UTR sequences from RefSeq, GENCODE, UCSC, TCGA, CCDS, and miRBase (21,522 genes)</td> + <td>2012</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V6 r2</td> + <td>60 Mb</td> + <td>Coding regions from RefSeq, CCDS, GENCODE, HGMD, and OMIM</td> + <td>2016</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V6 + COSMIC r2</td> + <td>66 Mb</td> + <td>Coding regions from RefSeq, CCDS, GENCODE, HGMD, and OMIM, and targets from both TCGA and COSMIC</td> + <td>2016</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V6 + UTR r2</td> + <td>75 Mb</td> + <td>Coding regions and 5' and 3' UTR sequences from RefSeq, GENCODE, CCDS, and UCSC known genes, and miRNA and lncRNA sequences</td> + <td>2016</td> + </tr> + <tr> + <td>Agilent - SureSelect All Exon V7</td> + <td>35.7 Mb</td> + <td>Coding regions from RefSeq, CCDS, GENCODE, and UCSC known genes</td> + <td>2018</td> + </tr> + <tr> + <td>Roche - KAPA HyperExome</td> + <td>43 Mb</td> + <td>Coding regions from CCDS, RefSeq, Ensembl, GENCODE, and variants from ClinVar</td> + <td>2020</td> + </tr> + <tr> + <td>Roche - SeqCap EZ Exome V3</td> + <td>64 Mb</td> + <td>Coding regions from RefSeq RefGene CDS, CCDS, and miRBase v14 databases, plus coverage of 97% Vega, 97% GENCODE, and 99% Ensembl</td> + <td>2018</td> + </tr> + <tr> + <td>Roche - SeqCap EZ Exome V3 + UTR</td> + <td>92 Mb</td> + <td>Coding sequences from RefSeq RefGene, CCDS, and miRBase v14, plus coverage
of 97% Vega, 97% GENCODE, and 99% Ensembl, and UTRs from the RefSeq RefGene table from UCSC (GRCh37/hg19, March 2012) and Ensembl (GRCh37 v64)</td> + <td>2018</td> + </tr> + <tr> + <td>Roche - SeqCap EZ MedExome</td> + <td>47 Mb</td> + <td>Coding sequences from CCDS 17, RefSeq, Ensembl 76, VEGA 56, GENCODE 20, and miRBase 21, and disease-associated regions from GeneTests and ClinVar, and regions based on customer input</td> + <td>2014</td> + </tr> + <tr> + <td>Roche - SeqCap EZ MedExome + Mito</td> + <td>47 Mb</td> + <td>Coding sequences and mitochondrial genes from CCDS 17, RefSeq, Ensembl 76, VEGA 56, GENCODE 20, and miRBase 21, disease-associated regions from GeneTests and ClinVar, and regions based on customer input</td> + <td>2014</td> + </tr> + <tr> + <td>Illumina - Nextera DNA Exome V1.2</td> + <td>45 Mb</td> + <td>Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v19</td> + <td>2015</td> + </tr> + <tr> + <td>Illumina - Nextera Rapid Capture Exome</td> + <td>37 Mb</td> + <td>212,158 targeted exonic regions with start and stop chromosome locations in GRCh37/hg19</td> + <td>2013</td> + </tr> + <tr> + <td>Illumina - Nextera Rapid Capture Exome V1.2</td> + <td>37 Mb</td> + <td>Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v12</td> + <td>2014</td> + </tr> + <tr> + <td>Illumina - Nextera Rapid Capture Expanded Exome</td> + <td>66 Mb</td> + <td>Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v12</td> + <td>2013</td> + </tr> + <tr> + <td>Illumina - TruSeq DNA Exome V1.2</td> + <td>45 Mb</td> + <td>Coding regions from RefSeq, CCDS, and Ensembl</td> + <td>2017</td> + </tr> + <tr> + <td>Illumina - TruSeq Rapid Exome V1.2</td> + <td>45 Mb</td> + <td>Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v19</td> + <td>2015</td> + </tr> + <tr> + <td>Illumina - TruSight ONE V1.1</td> + <td>12 Mb</td> + <td>Coding regions of 6700 genes from HGMD, OMIM, and GeneTests</td> + <td>2017</td> + </tr> + <tr> + <td>Illumina - TruSight Exome</td> + <td>7 Mb</td> + <td>Disease-causing mutations
as curated by HGMD</td> + <td>2017</td> + </tr> + <tr> + <td>Illumina - AmpliSeq Exome Panel</td> + <td>N/A</td> + <td>CCDS coding regions</td> + <td>2019</td> + </tr> +</table> + +<h1>Credits</h1> + +<p> +Thanks to Illumina (U.S.), Roche NimbleGen, Inc. (U.S.), Agilent Technologies (U.S.), MGI Tech +(Beijing Genomics Institute, China), Twist Bioscience (U.S.), and Integrated DNA Technologies (IDT), +Inc. (U.S.) for making this data available, and to Tiana Pereira, Pranav Muthuraman, Began Nguy, and Anna Benet-Pages for engineering these tracks. +</p> + + +