659fdce4ba2cc6eea5b7555344d4155047125abc kuhn Wed Jan 6 17:49:14 2021 -0800
added paragraph written by david haussler. it is more jargony than most of the page.

diff --git src/hg/htdocs/goldenPath/history.html src/hg/htdocs/goldenPath/history.html
index 7774fd8..ca4d729 100755
--- src/hg/htdocs/goldenPath/history.html
+++ src/hg/htdocs/goldenPath/history.html
@@ -244,54 +244,67 @@
 there remained gaps where DNA sequence was missing, due either to a lack of raw sequence
 data or ambiguities in the positions of the fragments. With the assembly 90% complete,
 the assembled genome was published along with the findings of hundreds of researchers
 worldwide in the <a href="https://www.nature.com/nature/volumes/409/issues/6822"
 target="_blank">February 15, 2001 issue of <i>Nature</i></a>, which was largely devoted
 to the human genome. In the months following the release of the working draft, the UCSC
 team worked with other researchers worldwide to fill in the gaps. The resulting sequence
 made its debut in April of 2003. It encompasses 99% of the gene-containing regions of
 the human genome and is 99.99% accurate. UCSC was designated as the official repository
 of the early human genome assembly iterations. Eventually, the
 <a href="https://www.ncbi.nlm.nih.gov/" target="_blank">National Center for
 Biotechnology Information (NCBI)</a> and then the
 <a href = "https://www.ncbi.nlm.nih.gov/grc" target = _blank>Genome Reference
 Consortium (GRC)</a> would take over the assembly and official release of improvements
 on the genome assembly.</p>
 <p>
-The genome sequence at the time of release, however, was simply a few billion characters
-of Gs, As, Ts and Cs, many of them assigned to chromsomes. As indicated above, however,
-without landmarks it is unintelligible.
-During this time, Kent was also working on a computer program that would allow him
-to view genes of <em>C. elegans</em> and show via a web interface which parts of the genes
-are ultimately used by the cell to encode proteins. The process of "splicing"
-removes sequence called introns and was visualizable using Jim's program, The Intronerator.
+The UCSC team was a key part of the Hard Core Analysis Group that published in the
+Feb 15, 2001 issue of Nature. We linked the genome sequence to previous genetic,
+cytogenetic, and radiation hybrid maps, and to the new physical clone map. We did
+this both to refine and validate the sequence assembly, and to explore phenomena
+such as positional and gender variation in recombination rate, regional isochore
+structure and repeat structure at the single base resolution for the first time.
+David Kulp performed the mapping of STS markers, messenger RNAs and ESTs,
+Terry Furey mapped the chromosome band positions, cytogenetic markers (~8,000 gene
+regions mapped by Fluorescence In-Situ Hybridization) and isochores, and integrated
+these data with the radiation hybrid and genetic maps.
 </p>
+
 </div>
 <div class="col-md-6">
 <img class="text-center" alt="David in front of Dell cluster" src="/images/david_cluster.jpg" style="margin-botton:5px; width:400px">
 <div style="text-align:center; line-height:1">
 <font SIZE=-1>
 David Haussler next to the original Dell computer cluster used for the assembly of the
 first human genome.
 </font>
 </div>
 </div>
 <div class="col-md-12">
 <p>
+The genome sequence at the time of release, however, was simply a few billion characters
+of Gs, As, Ts and Cs, many of them assigned to chromsomes. As indicated above, however,
+without landmarks it is unintelligible.
+During this time, Kent was also working on a computer program that would allow him
+to view genes of <em>C. elegans</em> and show via a web interface which parts of the genes
+are ultimately used by the cell to encode proteins. The process of "splicing"
+removes sequence called introns and was visualizable using Jim's program, The Intronerator.
+</p>
+<p>
 The Intronerator evolved into Genome Browser and ultimately became a tool to provide
 information about the functional significance of many other parts of the genome sequence.
 The process of annotation, as it is called, identifies sequences that represent not only
 the genes and which parts of the genes encode proteins, but also the control sequences
 that tell cells when and where to activate genes, which regions of the genome are
 conserved through evolution and can be found in other animals, and many other
 significant regions. Essentially, in the Browser the genome became a coordinate system
 upon which to hang any functionally significant annotation.
 </p>
 <p>
 Once the human genome sequence became available and the Browser built to visualize it,
 other genome browsers also came online, most notably those at NCBI and at the
 <a href="https://www.ebi.ac.uk/" target="_blank">European Bioinformatics Institute (EBI)</a>.
 Reciprocal links provided on each of