1c1e57e11e060937653c32bed082331dcaa93107
ann
  Tue Jun 4 14:17:57 2019 -0700
3-way browser genome release agreement

diff --git src/hg/htdocs/browserAgreement.html src/hg/htdocs/browserAgreement.html
new file mode 100755
index 0000000..e52df18
--- /dev/null
+++ src/hg/htdocs/browserAgreement.html
@@ -0,0 +1,98 @@
+<!DOCTYPE html>
+<!--#set var="TITLE" value="Browser Genome Release Agreement" -->
+<!--#set var="ROOT" value="." -->
+
+<!-- Relative paths to support mirror sites with non-standard GB docs install -->
+<!--#include virtual="$ROOT/inc/gbPageStart.html" -->
+
+<h1>Browser Genome Release Agreement</h1>
+
+<h2>Purpose</h2>
+<p>The purpose of this document is to establish a common set of minimum requirements for public 
+display of genome data by the Ensembl, NCBI and UCSC browsers/annotation groups. This is a 
+follow-up document to informal discussions held at the Biology of Genomes meeting at Cold 
+Spring Harbor, NY in May of 2008.</p> 
+
+<h2>Background</h2>
+<p>Previously, the only agreement among the major browsers was to display the same set of 
+reference coordinates for the human genome reference assembly. This has largely extended to 
+other organisms as well, but issues remain that can lead to differences in the data provided 
+by the browsers. The issue that likely causes the largest number of problems is the annotation 
+and display of genome assembly data prior to deposition of the genome assembly to the 
+International Nucleotide Sequence Database Collaboration (INSDC), commonly referred to as 
+DDBJ/EMBL/GenBank. The most common problems are (in increasing order of severity):</p>
+
+<h6>Inconsistent sequence identifiers amongst browsers.</h6>
+<p>Inconsistent sequence identifiers increase the level of difficulty when trying to exchange 
+annotation sets. This has been apparent as NCBI and Ensembl have tried to exchange gene model 
+datasets for organisms other than human and mouse. Of note, all browsers get these two assemblies 
+from a single source.</p>
+
+<h6>Inconsistent assembly identifiers amongst browsers.</h6>
+<p>Inconsistent assembly identifiers make it difficult for users to know which coordinate system 
+is being displayed, regardless of the data source.</p>
+
+<h6>Different sequence data amongst browsers.</h6>
+<p>Upon deposition of the data to the INSDC, quality control exercises will often uncover problems 
+with the assembly that is initially submitted. In many cases, the submitter is interested in 
+correcting these errors but the corrections may not get propagated to any browser that has already 
+picked up the data. An addendum to this item is the inconsistent handling of unplaced sequences. 
+Some groups choose to concatenate these sequences into a pseudo molecule, while others leave these 
+as independent sequences. The inconsistent use of sequence identifiers increases the difficulty of 
+mapping annotations amongst sources.</p>
+
+<h6>Some assemblies are never submitted to the INSDC.</h6>
+<p>Once assemblies get picked up, annotated and displayed in a browser, the initial sequencing and 
+assembly group may have little incentive to submit this assembly to the INSDC. For example, the 
+<i>Xenopus tropicalis</i> assembly has been 'available' since August, 2005 but has never been 
+submitted to the INSDC.</p>
+
+<h2>Agreement</h2>
+<p>Beginning in the spring of 2009, with the release of the Genome Reference Consortium Human 
+Build 37, Ensembl, NCBI and UCSC agree that:</p>
+
+<ul> 
+  <li>Data will be displayed only after it has been released by the INSDC.</li>
+    <ul>
+      <li>This document deals solely with the deposition of the genome assembly (contigs + 
+      scaffolds). Submission of annotation is not a requirement for public display of genome 
+      assembly data in any of the browsers.</li>
+      <li>It is anticipated that most genome assemblies can be deposited to the INSDC. 
+      However, in the event that a genome assembly does not meet the INSDC criteria for submission,
+      the genome browsers will be free to show this data. It is anticipated that the browsers will 
+      work together to provide a consistent view of the assembly and its identifiers.</li>
+      <li>Assembly submitters can use the Hold Until Publication (HUP) mode of submission. Once the
+      assembly is accessioned, even if it has a HUP status, the submitter can distribute the 
+      assembly to any third party browser/annotation group. However, the data for these assemblies 
+      should not be made public by the browsers until the HUP status has been removed and the 
+      assembly data are public in the INSDC.</li>  
+    </ul>
+  <li>The sequence identifiers used in the browser and publicly distributed via FTP should be 
+  correlated with the INSDC records.</li>
+    <ul>
+      <li>Browsers can use alternate sequence identifiers but it should be clear how these 
+      identifiers map to the INSDC record. Ideally, this will cause minimal disruption to dataflow
+      but still provide a framework for easy data exchange between the various groups. This implies
+      that the starting AGP files should use the INSDC accession.version to identify all of the 
+      objects and components describing the assembly. For a reminder of AGP definitions, please see
+      the specification found 
+      <a href="http://www.ncbi.nlm.nih.gov/projects/genome/assembly/agp/AGP_Specification.shtml" target="_blank">here</a>.</li>
+    </ul>
+  <li>All browsers will refer to any given assembly by the same name, preferably a submitter 
+  approved name. This should be collected at the time of assembly submission and guidance should 
+  be given to the submitting group in terms of selecting an appropriate name.</li>
+    <ul>
+      <li>There are several assemblies submitted that have no real submitter approved name. In 
+      these cases, every effort should be made by the browsers to reconcile the names/assemblies 
+      so that it is clear to users what data is being supplied at each browser and that data 
+      exchange between the browsers/annotation groups is facilitated.</li>
+      <li>Browser-specific assembly names are permitted only as an adjunct to the official, 
+      submitter-approved name, not as a replacement for the official name.</li>
+    </ul>
+</ul>
+
+<p>The terms of this document are not meant to be retroactive, and data currently displayed in any 
+of the browsers that do not meet these criteria do not need to be removed. However, we should 
+endeavor to begin bringing all genome assembly data into compliance moving forward.</p>
+
+<!--#include virtual="$ROOT/inc/gbPageEnd.html" -->