59be73feaf94cd72a1dd32ccaff5b479ccfce015 jnavarr5 Wed Jan 6 16:25:58 2021 -0800 Capitalizing 'genome browser' when talking about the UCSC Genome Browser. Editing a sentence from Bob's suggestions, refs #20314 diff --git src/hg/htdocs/goldenPath/history.html src/hg/htdocs/goldenPath/history.html index f1d1647..7774fd8 100755 --- src/hg/htdocs/goldenPath/history.html +++ src/hg/htdocs/goldenPath/history.html @@ -53,66 +53,66 @@ <p> Genome sequences are difficult to read because they consist of letter strings with no breaks or punctuation. The example below contains 7 different letters (genomes contain only 4). Can you understand what it is saying? (Line borrowed from the movie, <em><a href = "https://en.wikipedia.org/wiki/Charly" target = _blank>Charly</a></em>.)</p> <pre> THATTHATISISTHATTHATISNOTISNOTISTHATITITIS</pre> <p> With word breaks and punctuation, it starts to make sense:</p> <pre> THAT THAT IS, IS. THAT THAT IS NOT, IS NOT. IS THAT IT? IT IS!</pre> <p> The UCSC Genome Browser group played a pivotal role in bringing this extraordinary life script into the light of science. The browser presents both experimentally validated and computer-predicted genes along with dozens of lines of evidence that help scientists recognize the key features of -genes and predict their function. The databases for the genome browser are updated nightly with new +genes and predict their function. The databases for the Genome Browser are updated nightly with new information generated by researchers throughout the world.</p> <a name="tools"></a> <h3>Genomic Tools</h3> <p> When directed to focus on a particular segment of the genome, the browser displays a range of data that are stacked vertically. At the top, it shows the chromosome number and the current position on the chromosome. Underneath, it shows several rows of data about genes that have been found experimentally or have been predicted by a number of different methods. Below those are lines of information about gene expression and regulation, followed by comparisons with the genomes of other species and other information, such as single-nucleotide polymorphisms (SNPs).</p> <p> Far from simply displaying the genetic code, the UCSC browser brings the code to life by aligning relevant areas with experimental and computational data and images. It also links to international databases, giving researchers instant access to deeper information about the genome. An experienced user can form a hypothesis and verify it in minutes using this tool. Together this information represents an extremely comprehensive view of the genome, helping scientists recognize important -features of the sequence and providing strong evidence of function. For instance, the genome browser +features of the sequence and providing strong evidence of function. For instance, the Genome Browser helps unravel the varied splicing patterns whereby one gene can make many different proteins. This process of alternative splicing is thought to explain how a human can be so complex, yet have only about twice as many genes as a roundworm.</p> <p> -The UCSC Genome Browser group continues to add functions to the genome browser, such as the Track +The UCSC Genome Browser group continues to add functions to the Genome Browser, such as the Track Collection Builder, which allows multiple continuous-value graphing tracks to be copied and grouped into one composite track or "collection." Once the tracks are inside of a collection, the Track Collection Builder tool allows you to sort by similarity and magnitude, as well as alter the aggregate/overlay graphing view options to compare results. By merging experimental results from multiple sources, this powerful tool allows researchers to better understand how genes function.</p> <p> Today, the UCSC Genome Browser group continues to make genome sequences even more useful for science and medicine by facilitating the visualization of aggregate data so that it is easily accessible to researchers. This process of discovery and categorization is a critical step toward fully understanding the workings of the human genome, a project that will occupy science and medicine for many years. The browser platform has multiple potential uses that can aid in disease prevention, diagnostics, and the search for cures. The usefulness of the UCSC Genome Browser lead to -spin-offs, or genome browser mirrors, such as the following:</p> +spin-offs, or Genome Browser mirrors, such as the following:</p> <ul> <li><a href="https://news.ucsc.edu/2008/05/2242.html" target="_blank">The HIV Data Browser</a></li> <li><a href="https://xena.ucsc.edu/welcome-to-ucsc-xena/" target="_blank">The UCSC Cancer Genomics Browser</a></li> <li><a href="../ENCODE/" target="_blank">The data collection center for the international ENCODE project</a></li> <li><a href="../ebolaPortal/" target="_blank">The UCSC Ebola Virus Genome Browser</a></li> <li><a href="../covid19.html" target="_blank">The UCSC SARS-CoV-2 coronavirus Browser</a></li> </ul> <a name="race"></a> <h2>Human Genome Project — The Race</h2> @@ -175,31 +175,32 @@ program uses sequence overlap of these bits to assemble an approximation of the genome as a whole. Using this approach, Celera's assembly would still have numerous gaps and ambiguities, but the entire project from start to finish could be done in less than half the time the IHGP planned for their effort. A further complication was the fact that Celera had access to the fruits of the public project, while keeping their own results private.</p> <p> An approach resulting in numerous gaps and ambiguities was necessary if the IHGP's draft sequence was to have similar utility to Celera's sequence, and in particular to prevent Celera and its clients from locking up significant portions of the human genome under patents. A number of groups within the IHGP were working on the second stage of assembly that would merge the approximately 400,000 pieces into larger pieces and order them along the human chromosomes so that research groups could find the human genes. However, the process was slow and arduous. Even with the outstanding mapping information provided by Bob Waterston's group at Washington University, the second-stage assembly turned out to be like an extremely difficult jigsaw puzzle, with many layers -of conflicting evidence arising from similar-looking, non-contiguous, overlapping pieces.</p> +of conflicting evidence arising from similar-looking, non-contiguous pieces caused by repeats +scattered throughout the genome.</p> <p> At least partly in response to competition from Celera, the IHGP changed its focus from producing finished clones to producing draft clones. To sequence a clone, the IHGP adopted a shotgun approach in miniature. Bits of a clone were read at random, and the bits were stitched together by a computer program into pieces called "contigs." After the shotgun phase, a clone was typically in 5-50 contigs, but the relative order of the contigs was not known. This was the state of the genome when David Haussler first attempted to locate the genes computationally, and he quickly discovered that computational gene-finding was nearly impossible, because the average size of a contig was considerably smaller than the average size of a human gene.</p> <a name="push"></a> <h3>Push to the Finish Line</h3> <div class="row"> <div class="col-md-6"> <p> @@ -391,31 +392,31 @@ <a href="https://www.encodeproject.org/" target="_blank">ENCODE data portal</a></li> <li> <a href="https://www.nytimes.com/2008/11/11/science/11gene.html" target="_blank">New York Times article: "Now: the rest of the genome"</a></li> <li> <a href="https://www.genome.gov/11009066/" target="_blank">NHGRI announcement of the ENCODE Project</a></li> </ul> <a name="primer"></a> <h2>UCSC Genome Research Primer</h2> <a name="comparative"></a> <h3>Comparative Genomics</h3> <p> -Besides developing, supporting, and continuing to improve the genome browser, the UCSC Genome +Besides developing, supporting, and continuing to improve the Genome Browser, the UCSC Genome Browser group conducts research into the functional elements of the human genome that have evolved under natural selection. Since the first assembly of the human genome, a growing number of species have been added to the UCSC Genome Browser, including roundworm, pufferfish, chicken, mouse, and chimpanzee. In 2018, the UCSC Genome Browser surpassed 200 assemblies for the various species hosted on the browser. Interspecies alignments allow researchers to compare human genes to similar genes in other species. The UCSC Genome Browser allows rapid comparisons between species, which can lead to many different types of new discoveries:</p> <ul> <li> New gene discoveries can result from searching the human genome for sequences that match those with known functions in other organisms. The molecular genetics behind disease development and progression in model organisms can be leveraged to discover potential disease-related genes in humans, moving us closer to diagnostic advances and targeted treatments.</li> <li> We can reconstruct the evolutionary history of the human genome by identifying the origins of