1c1e57e11e060937653c32bed082331dcaa93107 ann Tue Jun 4 14:17:57 2019 -0700
3-way browser genome release agreement

diff --git src/hg/htdocs/browserAgreement.html src/hg/htdocs/browserAgreement.html
new file mode 100755
index 0000000..e52df18
--- /dev/null
+++ src/hg/htdocs/browserAgreement.html
@@ -0,0 +1,98 @@
+ + + + + + +

Browser Genome Release Agreement

+ +

Purpose

+

The purpose of this document is to establish a common set of minimum requirements for public +display of genome data by the Ensembl, NCBI and UCSC browsers/annotation groups. This is a +follow-up document to informal discussions held at the Biology of Genomes meeting at Cold +Spring Harbor, NY in May of 2008.

+ +

Background

+

Previously, the only agreement among the major browsers was to display the same set of +reference coordinates for the human genome reference assembly. This has largely extended to +other organisms as well, but issues remain that can lead to differences in the data provided +by the browsers. The issue that likely causes the largest number of problems is the annotation +and display of genome assembly data prior to deposition of the genome assembly to the +International Nucleotide Sequence Database Collaboration (INSDC), commonly referred to as +DDBJ/EMBL/GenBank. The most common problems are (in increasing order of severity):

+ +
Inconsistent sequence identifiers amongst browsers.
+

Inconsistent sequence identifiers increase the level of difficulty when trying to exchange +annotation sets. This has been apparent as NCBI and Ensembl have tried to exchange gene model +datasets for organisms other than human and mouse. Of note, all browsers get these two assemblies +from a single source.

+ +
Inconsistent assembly identifiers amongst browsers.
+

Inconsistent assembly identifiers make it difficult for users to know which coordinate system +is being displayed, regardless of the data source.

+ +
Different sequence data amongst browsers.
+

Upon deposition of the data to the INSDC, quality control exercises will often uncover problems +with the assembly that is initially submitted. In many cases, the submitter is interested in +correcting these errors but the corrections may not get propagated to any browser that has already +picked up the data. An addendum to this item is the inconsistent handling of unplaced sequences. +Some groups choose to concatenate these sequences into a pseudo molecule, while others leave these +as independent sequences. The inconsistent use of sequence identifiers increases the difficulty of +mapping annotations amongst sources.

+ +
Some assemblies are never submitted to the INSDC.
+

Once assemblies get picked up, annotated and displayed in a browser, the initial sequencing and +assembly group may have little incentive to submit this assembly to the INSDC. For example, the +Xenopus tropicalis assembly has been 'available' since August, 2005 but has never been +submitted to the INSDC.

+ +

Agreement

+

Beginning in the spring of 2009, with the release of the Genome Reference Consortium Human +Build 37, Ensembl, NCBI and UCSC agree that:

+ + + +

The terms of this document are not meant to be retroactive, and data currently displayed in any +of the browsers that do not meet these criteria do not need to be removed. However, we should +endeavor to begin bringing all genome assembly data into compliance moving forward.

+ +