src/hg/htdocs/goldenPath/history.html 2321e33442cb48591aae4c29b40424a402b182a7

2321e33442cb48591aae4c29b40424a402b182a7
jnavarr5
  Mon Jan 4 15:38:16 2021 -0800
Adding images and captions to the history page. Still need to fix the format/style. refs #20314

diff --git src/hg/htdocs/goldenPath/history.html src/hg/htdocs/goldenPath/history.html
index 54b4c6d..4378f13 100755
--- src/hg/htdocs/goldenPath/history.html
+++ src/hg/htdocs/goldenPath/history.html
@@ -122,30 +122,53 @@
 (HMMs) to the task of computer gene-finding. This application of HMMs had quickly become the
 dominant gene-finding methodology and was used successfully on the <i>Drosophila melanogaster</i>
 (fruit fly) genome.</p>
 <p>
 At the time UCSC entered the International Human Genome Project (IHGP), the IHGP was assembling the
 sequence one piece (or, in the jargon of molecular biology, one &quot;clone&quot;) at a time, and
 intending to string the pieces together based on a precisely constructed clone map. This approach
 had been shown to work very well with <i>Caenorhabditis elegans</i> (a roundworm) and human
 chromosome 22. But the process of making sure every last part of the sequence is read and put
 together properly is quite labor-intensive.</p>
 <p>
 Haussler enlisted Jim Kent, then a graduate student at UCSC's Department of Molecular, Cell, &amp;
 Developmental Biology, along with systems engineer Patrick Gavin, and graduate students Terrence
 Furey and David Kulp (who had led the gene-finding effort on the Drosophila genome). This was the
 birth of the UCSC Genome Browser group.</p>
+<div class="row">
+  <div class="col-md-6">
+    <img class="text-center" alt="Jim next to his computer" src="/images/jim-in-garage.jpg"
+      style="margin-bottom:5px; width:500px">
+      <div style="text-align:center; line-height:1">
+        <font SIZE=-1>
+          Jim in his garage sitting next to the computer where he wrote the 10,000 lines of computer
+          code to assemble the first draft assembly of the human genome.
+        </font>
+      </div>
+  </div>
+  <div class="col-md-6">
+    <img class="text-center" alt="Jim Kent, David Haussler, Scot Free Kennedy, and Patrick Gavin at UCSC."
+      src="/images/Jim-Kent-David-Haussler-Scot-Kennedy-Patrick-Gavin-genome-assembly-era-Group-2000.jpg"
+      style="margin-bottom:5px; width:400px">
+      <div style="text-align:center; line-height:1">
+        <font SIZE=-1>
+          Jim Kent, David Haussler, Scot Free Kennedy, and Patrick Gavin at UCSC.
+        </font>
+      </div>
+   </div>
+</div>
+  
 
 <a name="celera"></a>
 <h3>New challenger, Celera Genomics</h3>
 <p>
 It was a crucial time for the international project. A private company, <a target="_blank"
 href="https://en.wikipedia.org/wiki/Celera_Corporation">Celera Genomics</a>, had announced its
 intention to assemble the human genome sequence well in advance of the public effort, raising the
 fear that the sequence would be protected by patents and thus not be freely available to scientists.
 Celera Genomics was using an alternative approach, a so-called whole genome &quot;shotgun&quot;
 method, where small bits of the sequence are read at random from the genome, and then a computer
 program assembles these bits into an approximation of the genome as a whole. By using this approach,
 Celera's assembly would still have numerous gaps and ambiguities, but the entire project from start
 to finish could be done in less than half the time the IHGP planned for their effort. A further
 complication was the fact that Celera had access to the fruits of the public project, while keeping
 their own results private.</p>
@@ -159,65 +182,97 @@
 outstanding mapping information provided by Bob Waterston's group at Washington University, the
 second-stage assembly turned out to be like an extremely difficult jigsaw puzzle, with many layers
 of conflicting evidence having similar-looking, non-contiguous, overlapping pieces.</p>
 <p>
 At least partly in response to competition from Celera, the IHGP changed its focus from producing
 finished clones to producing draft clones. To sequence a clone, the IHGP adopted a shotgun approach
 in miniature. Bits of a clone were read at random, and the bits were stitched together by a computer
 program into pieces called &quot;contigs.&quot; After the shotgun phase, a clone was typically in
 5-50 contigs, but the relative order of the contigs was not known. This was the state of the genome
 when David Haussler first attempted to locate the genes computationally, and he quickly discovered
 that computational gene-finding was nearly impossible, since the average size of a contig was
 considerably smaller than the average size of a human gene.</p>
 
 <a name="push"></a>
 <h3>Push to the Finish Line</h3>
+<div class="row">
+  <div class="col-md-6">
 <p>
 Motivated to prevent Celera and its clients from locking up significant portions of the human genome
 in patents, Jim Kent dropped his other work in May of 2000 to focus on the assembly problem. In a
 remarkable display of energy and talent, Kent developed within four weeks a 10,000-line computer
 program that assembled the working draft of the human genome. The program, called GigAssembler,
 constructed the first working draft of the human genome on June 22, 2000, just days before Celera
 completed its first assembly. The IHGP working draft combined anonymous genomic information from
 human volunteers of diverse backgrounds, accepted on a first-come, first-taken basis. The Celera
 sequence was of a single individual. Since the public consortium finished the genome ahead of the
 private company, the genome and the information it contains are available free to researchers
 worldwide. Kent's assembly was celebrated at a White House ceremony on June 26, 2000, announcing the
 completion of the first drafts of the human genome by the IHGP and Celera.</p>
+  </div>
+  <div class="col-md-6">
+    <img class="text-center" alt="Copy of first draft of the human genome on a CD"
+      src="/images/genome_cd.jpg"
+      style="margin-bottom:5px; width:400px">
+      <div style="text-align:center; line-height:1">
+        <font SIZE=-1>
+           Copy of first draft of the human genome sequence presented to
+           <a href="https://www.soe.ucsc.edu/news/article/1020" target="_blank">President
+           Clinton</a> and deposited in the Smithsonian.
+        </font>
+      </div>
+   </div>
+</div>
+
+<div class="row">
+  <div class="col-md-6">
 <p>
 On July 7, 2000, after further examination by the principal scientists of the public genome project,
 and to facilitate the annotation process, the UCSC Genome Browser group released this first working
 draft on the web at <a href="../" target="_blank">https://genome.ucsc.edu</a>.
 In the first 24 hours of free and unrestricted access to the human genome, the scientific community
 downloaded one-half trillion bytes of information from the assembled blueprint of our human
 species. The initial assembled human genome sequence was referred to as a working draft because
 there remained gaps where DNA sequence was missing, due either to a lack of raw sequence data or
 ambiguities in the positions of the fragments. With the gene assembly 90% complete, the assembled
 genome was published along with the findings of hundreds of researchers worldwide in the
 <a href="https://www.nature.com/nature/volumes/409/issues/6822" target="_blank">February 15, 2001
 issue of <i>Nature</i></a>, which was largely devoted to the human genome. In the months
 following the release of the working draft, the UCSC team worked with other researchers worldwide to
 fill in the gaps. The resulting sequence made its debut in April of 2003. It encompasses
 99% of the gene-containing regions of the human genome and is 99.99% accurate.</p>
 <p>
 The UCSC Genome Browser was designated as the official repository of the early human genome assembly
 iterations. Once the human genome sequence became available, other genome browsers also came online,
 most notably those at the <a href="https://www.ncbi.nlm.nih.gov/" target="_blank">National Center
 for Biotechnology Information (NCBI)</a> and at the <a href="https://www.ebi.ac.uk/"
 target="_blank">European Bioinformatics Institute (EBI)</a>. Reciprocal links provided on each of
 the three browsers allow researchers to jump from any place in the human genome to the same region
 on either of the other two browsers.</p>
+  </div>
+  <div class="col-md-6">
+    <img class="text-center" alt="David in front of Dell cluster"
+      src="/images/david_cluster.jpg"
+      style="margin-bottom:5px; width:400px">
+      <div style="text-align:center; line-height:1">
+        <font SIZE=-1>
+           David Haussler next to the original Dell computer cluster used for the assembly of the
+           first human genome.
+        </font>
+      </div>
+   </div>
+</div>
 
 <a name="ENCODE"></a>
 <h2>The ENCODE Project</h2>
 <p>
 The human genome contains vast amounts of information, and all of the functions of a human cell are
 implicitly coded in the human genome. With the molecular sequence known, researchers have been
 mining it for clues as to how the body works in health and in disease, ultimately laying out the
 plan for the complex pathways of molecular interactions that the sequence orchestrates. The UCSC
 Genome Browser aids the worldwide scientific community in its challenge to understand the genome, to
 probe it with new experimental and informatics methodologies, and to decode the genetic program of
 the cell.</p>
 <p>
 After the sequence of the genome was first available, a researcher's ability to decode that sequence
 and tap into the wealth of information it holds was still quite limited. The next step beyond
 viewing the genome is gaining an understanding of the instructions encoded in it. Toward this end,
@@ -227,48 +282,62 @@
 components in the human genome.</p>
 <p>
 ENCODE is a scientific reconnaissance mission aimed at discovering all regions of the human genome
 crucial to biological function. Before ENCODE, scientists focused on finding the genes, or
 protein-coding regions, in DNA sequences; but these account for only about 1.5% of the genetic
 material of humans and other mammals. Non-coding regions of the genome have important functions
 serving as the instruction set for when and in which tissues genes are turned on and off.
 The ENCODE project is developing a comprehensive &quot;parts list&quot; by identifying and precisely
 locating all functional elements in the human genome. This project, sponsored by the
 <a href="https://www.genome.gov/" target="_blank">National Human Genome Research Institute
 (NHGRI)</a>, involves an international consortium of scientists from government, industry, and
 academia.</p>
 
 <a name="ucsc"></a>
 <h3>UC Santa Cruz's Role</h3>
+<div class="row">
+  <div class="col-md-6">
 <p>
 UC Santa Cruz developed and ran the data coordination center for the ENCODE project from its
 inception in 2003 through the <a href="../ENCODE/" target="_blank">end of the
 first production phase in 2012</a>. During that time, the UCSC Genome Browser group, directed by
 Jim Kent with technical management by Kate Rosenbloom, provided the database and web interface for
 all sequence-related data to the ENCODE project. This included integrating the data into the UCSC
 Human Genome Browser (where it continues to reside) on specialized tracks, and providing further
 in-depth information on detail pages. UC Santa Cruz also developed, performed, and presented
 computational and comparative analyses to glean further genomic and functional information from the
 collective data.</p>
 <p>
 UC Santa Cruz worked closely with labs producing data for the ENCODE project and with data analysis
 groups to define data and metadata reporting standards for a broad range of genomics assays. They
 implemented data submission and validation pipelines, created and maintained the
 <a href="https://www.encodeproject.org/" target="_blank">encodeproject.org</a> website, developed
 user access tools for ENCODE data, exported all ENCODE data to repositories at the National Center
 for Biotechnology Information (NCBI), and provided outreach and tutorial support for the project.
 </p>
+  </div>
+  <div class="col-md-6">
+    <img class="text-center" alt="Picture of Kate Rosenbloom from 2003" src="/images/kate2003.jpg"
+      style="margin-bottom:5px; width:250px">
+      <div style="text-align:center; line-height:1">
+        <font SIZE=-1>
+          <p class="gbsCaption text-center">
+          Picture of Kate Rosenbloom from 2003 while working on the ENCODE Project.</p>
+        </font>
+      </div>
+   </div>
+</div>
 <p>
 The ENCODE data coordination was passed on to the Michael Cherry laboratory at Stanford University
 in late 2012. UC Santa Cruz, however, continues to support existing ENCODE data and resources on the
 UCSC Genome Browser website. Newer ENCODE data of broad interest, particularly integrative and
 summary data, will be incorporated into the browser.</p>
 <p>
 <em>The following paper describes ENCODE resources at UC Santa Cruz:</em></p>
 <p>
 Rosenbloom KR, Sloan CA, Malladi VS, Dreszer TR, Learned K, Kirkup VM, Wong MC, Maddren M, Fang R,
 Heitner SG <em>et al</em>.
 <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gks1172" target="_blank">
 ENCODE data in the UCSC Genome Browser: year 5 update</a>.
 <em>Nucleic Acids Res</em>. 2013 Jan;41(Database issue):D56-63.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193274" target="_blank">23193274</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531152/" target="_blank">PMC3531152</a>