8143fc04dcf6eabe44a2ef8be2ba0f11223d3ddf dschmelt Tue Feb 2 16:42:53 2021 -0800 Modifying display and html for exon track refs #24598 diff --git src/hg/makeDb/trackDb/human/exomeProbesets.html src/hg/makeDb/trackDb/human/exomeProbesets.html index 3331019..41a822b 100755 --- src/hg/makeDb/trackDb/human/exomeProbesets.html +++ src/hg/makeDb/trackDb/human/exomeProbesets.html @@ -1,275 +1,292 @@
This set of tracks shows the genomic positions of probes and targets from a full suite of in-solution-capture target enrichment exome kits for Next Generation Sequencing (NGS) applications. Also known as exome sequencing or whole exome sequencing (WES), this technique allows high-throughput parallel sequencing of all exons (i.e., the coding regions of genes, which affect protein function), constituting about 1% of the human genome, or approximately 30 million base pairs.
The tracks are intended to show the major differences in target genomic regions between the different exome capture kits from the major players in the NGS market: Illumina Inc., Roche NimbleGen Inc., Agilent Technologies Inc., MGI Tech, Twist Bioscience, and Integrated DNA Technologies Inc.
Items are shaded according to manufacturing company:
Tracks labeled as Probes (P) indicate the footprint of the oligonucleotide probes mapped to the human genome. This is the region that the assay technically targets. However, the sequenced region will be larger than this, since flanking sequences are sequenced as well. Tracks labeled as Target Regions (T) indicate the genomic regions targeted by the assay. This is the biologically relevant target region. There is no guarantee that all targeted regions will be sequenced perfectly; capture bias may affect certain locations. The Target Regions are those normally used for coverage analysis.
The capture of the genomic regions of interest using in-solution capture is achieved through the hybridization of a set of probes (oligonucleotides) with a sample of fragmented genomic DNA in solution. The probes hybridize selectively to the genomic regions of interest, which, after the non-selected DNA material has been excluded, can be pulled down and sequenced, enabling selective DNA sequencing of the genomic regions (e.g. exons) of interest. In-solution capture sequencing is a sensitive method to detect single nucleotide variants, insertions and deletions, and copy number variations.
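The relationship between the Probes (P) and Target Regions (T) tracks can be illustrated with a small interval calculation. The following is a minimal sketch, not part of the track documentation: it assumes intervals on a single chromosome in BED-style half-open coordinates, and the function names and example coordinates are hypothetical. It reports how many target bases are overlapped by at least one probe footprint.

```python
from typing import List, Tuple

# BED-style half-open interval [start, end) on one chromosome (assumption).
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_bases(intervals: List[Interval]) -> int:
    """Total number of bases covered by the merged intervals."""
    return sum(end - start for start, end in merge(intervals))


def covered_bases(targets: List[Interval], probes: List[Interval]) -> int:
    """Number of target bases overlapped by at least one probe footprint."""
    t, p = merge(targets), merge(probes)
    i = j = 0
    covered = 0
    while i < len(t) and j < len(p):
        lo = max(t[i][0], p[j][0])
        hi = min(t[i][1], p[j][1])
        if lo < hi:
            covered += hi - lo
        # Advance whichever interval ends first.
        if t[i][1] < p[j][1]:
            i += 1
        else:
            j += 1
    return covered


# Hypothetical coordinates, for illustration only.
example_targets = [(100, 200), (300, 400)]
example_probes = [(90, 150), (140, 210), (350, 380)]
example_fraction = covered_bases(example_targets, example_probes) / total_bases(example_targets)
```

For real data, the same idea would be applied per chromosome to the BED files of a kit's P and T subtracks; dedicated interval tools are a more practical choice for whole-genome comparisons.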
Kit | Targeted Region | Databases Used for Design | Year of Release |
---|---|---|---|
IDT - xGen Exome Research Panel V1.0 | 39 Mb | Coding sequences from RefSeq (19,396 genes) | 2015 |
IDT - xGen Exome Research Panel V2.0 | 34 Mb | Coding sequences from RefSeq 109 (19,433 genes) | 2020 |
Twist - RefSeq Exome Panel | 3.6 Mb | Curated subset of protein coding genes from CCDS | N/A |
Twist - Core Exome Panel | 33 Mb | Protein coding genes from CCDS | N/A |
Twist - Comprehensive Exome Panel | 36.8 Mb | Protein coding genes from RefSeq, CCDS, and GENCODE | 2020 |
MGI - Easy Exome Capture V4 | 59 Mb | CCDS, GENCODE, RefSeq, and miRBase | N/A |
MGI - Easy Exome Capture V5 | 69 Mb | CCDS, GENCODE, RefSeq, miRBase, and MGI Clinical Database | N/A |
Agilent - SureSelect Clinical Research Exome | 54 Mb | Disease-associated regions from OMIM, HGMD, and ClinVar | 2014 |
Agilent - SureSelect Clinical Research Exome V2 | 63.7 Mb | Disease-associated regions from OMIM, HGMD, ClinVar, and ACMG | 2017 |
Agilent - SureSelect Focused Exome | 12 Mb | Disease-associated regions from HGMD, OMIM and ClinVar | 2016 |
Agilent - SureSelect All Exon V4 | 51 Mb | Coding regions from CCDS, RefSeq, and GENCODE v6, miRBase v17, TCGA v6, and UCSC known genes | 2011 |
Agilent - SureSelect All Exon V4 + UTRs | 71 Mb | Coding regions and 5' and 3' UTR sequences from CCDS, RefSeq, and GENCODE v6, regions from miRBase v17, TCGA v6, and UCSC known genes | 2011 |
Agilent - SureSelect All Exon V5 | 50 Mb | Coding regions from RefSeq, GENCODE, UCSC, TCGA, CCDS, and miRBase (21,522 genes) | 2012 |
Agilent - SureSelect All Exon V5 + UTRs | 74 Mb | Coding regions and 5' and 3' UTR sequences from RefSeq, GENCODE, UCSC, TCGA, CCDS, and miRBase (21,522 genes) | 2012 |
Agilent - SureSelect All Exon V6 r2 | 60 Mb | Coding regions from RefSeq, CCDS, GENCODE, HGMD, and OMIM | 2016 |
Agilent - SureSelect All Exon V6 + COSMIC r2 | 66 Mb | Coding regions from RefSeq, CCDS, GENCODE, HGMD, and OMIM, and targets from both TCGA and COSMIC | 2016 |
Agilent - SureSelect All Exon V6 + UTR r2 | 75 Mb | Coding regions and 5' and 3' UTR sequences from RefSeq, GENCODE, CCDS, and UCSC known genes, and miRNA and lncRNA sequences | 2016 |
Agilent - SureSelect All Exon V7 | 35.7 Mb | Coding regions from RefSeq, CCDS, GENCODE, and UCSC known genes | 2018 |
Roche - KAPA HyperExome | 43 Mb | Coding regions from CCDS, RefSeq, Ensembl, GENCODE, and variants from ClinVar | 2020 |
Roche - SeqCap EZ Exome V3 | 64 Mb | Coding regions from RefSeq RefGene CDS, CCDS, and miRBase v14 databases, plus coverage of 97% Vega, 97% GENCODE, and 99% Ensembl | 2018 |
Roche - SeqCap EZ Exome V3 + UTR | 92 Mb | Coding sequences from RefSeq RefGene, CCDS, and miRBase v14, plus coverage of 97% Vega, 97% GENCODE, and 99% Ensembl and UTRs from RefSeq RefGene table from UCSC GRCh37/hg19 March 2012 and Ensembl (GRCh37 v64) | 2018 |
Roche - SeqCap EZ MedExome | 47 Mb | Coding sequences from CCDS 17, RefSeq, Ensembl 76, VEGA 56, GENCODE 20, miRBase 21, and disease-associated regions from GeneTests, ClinVar, and based on customer input | 2014 |
Roche - SeqCap EZ MedExome + Mito | 47 Mb | Coding sequences and mitochondrial genes from CCDS 17, RefSeq, Ensembl 76, VEGA 56, GENCODE 20 and miRBase 21, disease-associated regions from GeneTests, ClinVar, and based on customer input | 2014 |
Illumina - Nextera DNA Exome V1.2 | 45 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v19 | 2015 |
Illumina - Nextera Rapid Capture Exome | 37 Mb | 212,158 targeted exonic regions with start and stop chromosome locations in GRCh37/hg19 | 2013 |
Illumina - Nextera Rapid Capture Exome V1.2 | 37 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v12 | 2014 |
Illumina - Nextera Rapid Capture Expanded Exome | 66 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v12 | 2013 |
Illumina - TruSeq DNA Exome V1.2 | 45 Mb | Coding regions from RefSeq, CCDS, and Ensembl | 2017 |
Illumina - TruSeq Rapid Exome V1.2 | 45 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v19 | 2015 |
Illumina - TruSight ONE V1.1 | 12 Mb | Coding regions of 6700 genes from HGMD, OMIM, and GeneTests | 2017 |
Illumina - TruSight Exome | 7 Mb | Disease-causing mutations as curated by HGMD | 2017 |
Illumina - AmpliSeq Exome Panel | N/A | CCDS coding regions | 2019 |
+The raw data can be explored interactively with the Table Browser
+or cross-referenced with Data Integrator. The data can be
+accessed from scripts through our API, with track names
+found in the Table Schema page for each subtrack after "Primary Table:".
+
+
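Script access through the API can be sketched as follows. This is a hedged illustration rather than an official client: it assumes the public `getData/track` endpoint at `api.genome.ucsc.edu` and its `genome`, `track`, `chrom`, `start`, and `end` parameters. The track name `exampleExomeSubtrack` is a hypothetical placeholder; the real name is the one shown after "Primary Table:" on the subtrack's Table Schema page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the track data endpoint.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(genome: str, track: str, chrom: str, start: int, end: int) -> str:
    """Build a request URL for the items of one track in one region."""
    query = urlencode(
        {"genome": genome, "track": track, "chrom": chrom, "start": start, "end": end}
    )
    return f"{API_BASE}?{query}"


def fetch_track(genome: str, track: str, chrom: str, start: int, end: int) -> dict:
    """Perform the request and decode the JSON response (needs network access)."""
    with urlopen(build_track_url(genome, track, chrom, start, end)) as response:
        return json.load(response)


# Hypothetical track name and region, for illustration only.
example_url = build_track_url("hg19", "exampleExomeSubtrack", "chr1", 1000000, 1010000)
print(example_url)
```

Calling `fetch_track` with a real primary table name returns the parsed JSON response for that region; the request is kept separate from URL construction so the latter can be checked without network access.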
+For downloading the data, the annotations are stored in bigBed files that
+can be accessed at
+
+our download directory.
+Regional or whole-genome text annotations can be obtained using our utility
+bigBedToBed. Instructions for downloading utilities can be found
+here.
+
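A regional extraction with bigBedToBed can be scripted along these lines. This is a minimal sketch under stated assumptions: the `bigBedToBed` binary is already installed and on the `PATH`, the `-chrom`, `-start`, and `-end` options restrict the output to one region, and the input may be a local file or a URL. The file URL below is a hypothetical placeholder, not the actual location in the download directory.

```python
import subprocess
from typing import List, Optional


def bigbed_to_bed_command(
    source: str,
    output: str,
    chrom: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[str]:
    """Assemble a bigBedToBed command line; omit the region for the whole genome."""
    command = ["bigBedToBed"]
    if chrom is not None:
        command.append(f"-chrom={chrom}")
    if start is not None:
        command.append(f"-start={start}")
    if end is not None:
        command.append(f"-end={end}")
    command += [source, output]
    return command


def run_bigbed_to_bed(source: str, output: str, **region) -> None:
    """Run the utility; raises CalledProcessError if it exits with an error."""
    subprocess.run(bigbed_to_bed_command(source, output, **region), check=True)


# Hypothetical bigBed location, for illustration only.
example_command = bigbed_to_bed_command(
    "https://example.org/path/to/subtrack.bb",
    "subtrack.chr1.bed",
    chrom="chr1",
    start=1000000,
    end=1010000,
)
print(" ".join(example_command))
```

Leaving out `chrom`, `start`, and `end` produces a command that converts the whole file to BED text.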
+Thanks to Illumina (U.S.), Roche NimbleGen, Inc. (U.S.), Agilent Technologies (U.S.), MGI Tech (Beijing Genomics Institute, China), Twist Bioscience (U.S.), and Integrated DNA Technologies (IDT), Inc. (U.S.) for making this data available and to Tiana Pereira, Pranav Muthuraman, Began Nguy and Anna Benet-Pages for engineering these tracks.