3081e8a9fcda2142e033075b48ddce002d742f02
lrnassar
Fri Apr 3 07:52:18 2020 -0700
First data release announcement for coronavirus browser refs #25267

diff --git src/hg/htdocs/goldenPath/newsarch.html src/hg/htdocs/goldenPath/newsarch.html
index 41e6fd1..1556ebc 100755
--- src/hg/htdocs/goldenPath/newsarch.html
+++ src/hg/htdocs/goldenPath/newsarch.html
@@ -39,30 +39,139 @@
+ + +

Apr. 2, 2020    First data release for SARS-CoV-2 genome browser

+

+In recent months, we have seen the beginning of the global effort against the
+coronavirus. Here at the UCSC Genome Browser, we have also been directing
+some work towards that front and will continue to do so.

+

+This has brought new users to our site who may have previously been unfamiliar
+with this resource. The Genome Browser aims to facilitate genome
+research by offering data visualization, genome annotations, and other tools.
+We encourage anyone who would like to learn more to see our
+user guide.

+

+With this in mind, we would like to announce a new landing page as well as the first release
+of novel coronavirus annotation data for the SARS-CoV-2 genome assembly browser released
+in early February. The SARS-CoV-2 genome browser includes displays of the virus's
+molecular evolution in other species and its further evolution during this human
+pandemic. We have also added multiple lung datasets to the UCSC Single Cell Browser.
+
+This information is made freely available to researchers everywhere,
+with the goal of advancing our knowledge of SARS-CoV-2 (COVID-19).

+

+These latest data were primarily sourced from outside groups and include different
+kinds of information such as gene annotations, variant data, and locally produced
+multiple genome alignments.

+

+Note: Genome Browser data is often referred to as 'tracks', and the
+term 'track' and 'data annotation track' can be used interchangeably.


+ + + +

+More information on any of these data annotation tracks can be found by clicking
+on the names. This will lead to the track description page, which also allows
+for configuration of various display options. Additional information on how to use
+the UCSC Genome Browser can be found on our training page.

+

+We would like to thank NCBI, UniProt, Nextstrain, the Sgourakis Research Group,
+Tomer Altman, and Jason Fernandes as well as the scientists from UCSC for providing these data.

+

+The SARS-CoV-2 genome browser and data annotation tracks are funded by generous individual
+donors including Pat & Rowland Rebele.

+ +

Mar. 17, 2020    New mitochondrial sequence for human (hg19)

We are pleased to announce the release of a patch to the hg19 assembly that will introduce a new mitochondrial sequence, chrMT, to the assembly. We used GenBank sequence NC_001807 for chrM in hg19 and earlier, but the sequence preferred by the community is the revised Cambridge Reference Sequence (rCRS), NC_012920. The new chrMT is the rCRS, NC_012920. Patch sequences from GRCh37.p13 have also been added to hg19.

More information on how patch sequences are incorporated can be found on the Patching up the Genome blog post.
@@ -72,30 +181,88 @@ sequences to hg19, we can expect to see BLAT return more matches to the genome.

Also, with these patches, the hg19 genome is not optimal anymore for aligners. So we added an "analysis set" version of the hg19 genome fasta file to our bigZips directory, and indexes for BWA, Bowtie2, and Hisat2.

We would like to thank the Genome Research Consortium for creating the patches to hg19. We would also like to thank Angie Hinrichs and Jairo Navarro at UCSC for implementing and testing the latest patch to hg19.

Feb. 27, 2020    New NCBI RefSeq Select + MANE track for Human (hg19/hg38)

+mitochondrial sequence, chrMT, to the assembly. We used GenBank sequence NC_001807 for chrM in hg19 and
+earlier, but the sequence preferred by the community is the revised Cambridge Reference Sequence
+(rCRS), NC_012920. The
+new chrMT is the rCRS, NC_012920. Patch sequences from
+GRCh37.p13 have
+also been added to hg19.

+

+More information on how patch sequences are incorporated can be found on the
+Patching up the Genome blog post.
+The blog post contains details about the new
+/latest download directory on the downloads server. With the addition of new
+sequences to hg19, we can expect to see BLAT return more matches to the genome.

+

+Also, with these patches, the hg19 genome is not optimal anymore for aligners. So we added an
+"analysis set" version of the hg19 genome fasta file to our
+bigZips
+directory, and indexes for BWA, Bowtie2, and Hisat2.

+

+We would like to thank the Genome Research Consortium for creating the patches to hg19. We would
+also like to thank Angie Hinrichs and Jairo Navarro at UCSC for implementing and testing the latest
+patch to hg19.
+
+
+

Feb. 27, 2020    New NCBI RefSeq Select + MANE track for Human (hg19/hg38)

+

We are pleased to announce a new track, RefSeq Select+MANE, for the GRCh37/hg19 and GRCh38/hg38 human assemblies. This track is a combination of NCBI transcripts with the RefSeq Select tag, as well as transcripts with the MANE Select tag. The result is a track with a single representative transcript for every protein-coding gene. The track can be found as part of the NCBI RefSeq composite track.

RefSeq Select transcripts are chosen by an NCBI pipeline to be representative of every protein-coding gene. As part of the MANE project, however, RefSeq Select transcripts that have a 100% identical match to a transcript in the Ensembl annotation are given the MANE Select designation. In this way, MANE Select is a subset of RefSeq Select. It should be noted, however, that there are
@@ -12565,88 +12732,30 @@

The Aug. 6 freeze is progressing through the pipeline. We've recently received an updated accession map from Wash U. Ensembl will shortly be integrating this with chromosome specific maps from the sequencing centers. We are still on track for an early September next release.

Aug. 28, 2001    Custom annotation track functionality added

Meanwhile we've been continuing work on the Genome Browser. It's now possible to upload your own annotations to be displayed alongside the built-in tracks. Please scroll to the bottom of the browser gateway pages for further information. The browser has also been sped up, particularly on the larger chromosomes by using a "binning" technique suggested by Lincoln Stein and Richard Durbin.

Aug. 23, 2001    New annotation tracks on Apr. 2001 browser

-Tracks continue to be added to the Apr.
-2001 browser. Our old friend the Exofish track is back. The blat mouse homology track is now up
-as well, computed at somewhat more sensitive settings than it was in the Dec. 2000 browser.

-

-We've recently received some significant funding from NHGRI to maintain and extend this site. This
-has allowed us among other things to hire an artist, Jenny Draper, who is responsible for the new
-look.

- -

Apr. 1, 2001 Freeze

-

-Jul. 13, 2001: fixed bug where some UTRs were mis-annotated in the known genes on the minus
-strand.

- -

Dec. 12, 2000 Freeze

-

-Jul. 13, 2001: fixed bug where some UTRs were mis-annotated in the known genes on the minus
-strand.

-

-Apr. 5, 2001: chromosome level files (but not contig level files) updated to fix bug where some of
-the centromeres were misplaced

-

-Apr. 1, 2001: chromosome Y updated to fix a bug that put a large gap between each clone. This bug
-was limited to the Y chromosome.

- -

Oct. 7, 2000 Freeze

-

-Apr. 7, 2001: Affymetrix gene predictions updated and available for bulk download.

-

-Jan. 9, 2001: all files were updated after a bug that had caused some finished clones to be flipped
-in the assembly was caught and fixed. Our apologies for any inconvenience this has caused.

-

-Nov. 10, 2000: the sequence (.fa) files for two contigs: X/ctg18523/ctg18523.fa and
-7/ctg15082/ctg15082.fa were updated. These files had null (zero valued) characters that have been
-replaced with N characters. These characters were a result of a mismatch between clone sizes in the
-map and in finished NT contigs. The sequence for chromosomes X and 7, which contain these contigs,
-has also been updated.

-

-

Sep. 5, 2000 Freeze

-

-Oct. 11, 2000: chr21.agp and chr22.fa were updated. chr21.agp was a version which went with the UCSC
-draft assembly rather than the Sanger/NCBI final assembly of this chromosome. chr21.agp and chr21.fa
-are now in sync. chr22.fa and chr22.agp were also previously out of sync. chr22.fa was obtained
-from Sanger while chr22.agp had been obtained by NCBI. With this update they are consistent, both
-NCBI versions. Apologies for any rework this causes you.

-

-Oct. 9, 2000: chr21_random.* and chr22_random.* were removed from the zip-files in the Sep. 5
-freeze. These files were relics that should not have been included in the first place. The files
-chr9_random.agp, chr10_random.agp, chr11_random.agp, chr12_random.agp, chr13_random.agp and
-chr14_random.agp were updated. There was a bug where the initial field of the initial lines in these
-files was "null)" rather than "chrN_random" as it should have been.

-

-

Jul. 17, 2000 Freeze

-

-Sep. 22, 2000: The zip-files chromFa.zip and chromAgp.zip under the Jul. 17th full data set were
-updated to fix some duplications of clone contigs that occurred in the chrN_random.agp and
-chrN_random.fa files contained within these zip-files. These "_random" files contain
-clone contigs that were mapped to a particular chromosome, but could not be placed at a specific
-position within that chromosome. They correspond to the "RANDOM" sections of the WashU map. None of the regular chrN.agp or chrN.fa files were affected by this update, nor was any of the information in the contigAgp.zip or contigFa.zip files changed. For convenience, we include two new files, chromRandAgp.zip and chromRandFa.zip, for users who would like to download only the data that has changed. These zips consist of the updated chrN_random.agp and chrN_random.fa files, respectively.

Sep. 4, 2000: The files chromFa.zip and contigFa.zip under the July 17th full data set and the files under July 17th data by individual clone contig were updated to fix some incorrect (null(0)) characters that needed to be replaced by 'n' characters in some Fasta files. The following contig Fasta files on chromosomes 1,8,11,12,16, 17 and 19 were affected:

     1/ctg14250/ctg14250.fa
     8/ctg17325/ctg17325.fa
     8/ctg16307/ctg16307.fa
     8/ctg25150/ctg25150.fa
     8/ctg15216/ctg15216.fa