d93c426ef1ad5fbb32b754408599eaf380a199e5 max Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059

- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
  makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is now
  "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
  CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and add two
  new subtracks built from the parallel unique_concordant GFFs at
  ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
    CH73  (99,141 placements, 23 oversize)
    CH211 (70,231 placements, 46 oversize)
  CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register searchTable /
  searchType bigBed stanzas with searchIndex name on each subtrack, so clone
  names (CH1073-100A1, CH73-1A1, CH211-1A1, ...) resolve from the Genome
  Browser position box.
- Single shared HTML description page; Methods now links to the NCBI FTP
  source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context)

diff --git src/hg/makeDb/doc/danRer11/ncbiCloneEndsCH1073.txt src/hg/makeDb/doc/danRer11/ncbiCloneEndsCH1073.txt
deleted file mode 100644
index a8631027908..00000000000
--- src/hg/makeDb/doc/danRer11/ncbiCloneEndsCH1073.txt
+++ /dev/null
@@ -1,34 +0,0 @@
-# NCBI CH1073 clone end placements track, refs #35059
-# 2026-04-21 Claude max
-
-mkdir -p /hive/data/genomes/danRer11/bed/ncbiCloneEndsCH1073
-cd /hive/data/genomes/danRer11/bed/ncbiCloneEndsCH1073
-
-# NCBI assembly report (has the UCSC-style name in column 10, so we just
-# project col 7 (RefSeq accession) onto col 10 (UCSC name))
-curl -sS -o GCF_000002035.6.assembly.txt \
-    'https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/002/035/GCF_000002035.6_GRCz11/GCF_000002035.6_GRCz11_assembly_report.txt'
-
-# CH1073 unique_concordant placements (210777 clone_insert rows, ~48 MB)
-curl -sS -o CH1073.unique_concordant.gff \
-    'https://ftp.ncbi.nlm.nih.gov/repository/clone/reports/Danio_rerio/CH1073.GCF_000002035.6.105.unique_concordant.gff'
-
-# RefSeq acc -> UCSC chrom name
-~/kent/src/hg/makeDb/scripts/ncbiCloneEndsCH1073/refSeqNames.py \
-    GCF_000002035.6.assembly.txt > refSeq.ucscName.tab
-# 1923 mappings, all names present in /hive/data/genomes/danRer11/chrom.sizes
-
-# Parse GFF -> BED 6+7 (matches cloneEnds.as). All 210777 clone_insert rows
-# map to UCSC names; 26 are flagged as oversize (insertSize > 500 kb).
-~/kent/src/hg/makeDb/scripts/ncbiCloneEndsCH1073/makeBed.py \
-    refSeq.ucscName.tab /hive/data/genomes/danRer11/chrom.sizes \
-    CH1073.unique_concordant.gff \
-    > ncbiCloneEndsCH1073.bed 2> makeBed.log
-
-# Sort and convert to bigBed
-sort -k1,1 -k2,2n ncbiCloneEndsCH1073.bed > ncbiCloneEndsCH1073.sorted.bed
-bedToBigBed -type=bed6+7 \
-    -as=~/kent/src/hg/makeDb/scripts/ncbiCloneEndsCH1073/cloneEnds.as \
-    -tab ncbiCloneEndsCH1073.sorted.bed \
-    /hive/data/genomes/danRer11/chrom.sizes \
-    danRer11.ncbiCloneEndsCH1073.bb
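
Note on the name-mapping step: the makeDoc comment describes projecting column 7 (RefSeq accession) of the NCBI assembly report onto column 10 (UCSC-style name). The sketch below illustrates that projection only. It is not the contents of refSeqNames.py, which this diff does not show; the function name `refseq_to_ucsc`, the decision to skip rows with `na` in either column, and the sample rows (invented placeholders, not real accessions) are all assumptions.

```python
import io


def refseq_to_ucsc(lines):
    """Yield (RefSeq accession, UCSC name) pairs from assembly-report lines.

    Assumes the standard tab-separated assembly report layout, where
    column 7 is RefSeq-Accn and column 10 is UCSC-style-name.
    """
    for line in lines:
        # Header and metadata lines in the report start with '#'.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 10:
            continue
        refseq, ucsc = cols[6], cols[9]
        # Assumption: rows lacking either name (reported as "na") are skipped.
        if refseq == "na" or ucsc == "na":
            continue
        yield refseq, ucsc


# Invented placeholder rows, for illustration only.
sample = io.StringIO(
    "# Sequence-Name\tSequence-Role\t(remaining header columns omitted)\n"
    + "\t".join(["1", "assembled-molecule", "1", "Chromosome", "XX000001.1",
                 "=", "NC_000001.1", "Primary Assembly", "1000", "chr1"]) + "\n"
    + "\t".join(["unplaced1", "unplaced-scaffold", "na", "na", "XX000002.1",
                 "=", "NW_000002.1", "Primary Assembly", "500", "na"]) + "\n"
)
mapping = dict(refseq_to_ucsc(sample))

if __name__ == "__main__":
    # Emit the two-column tab file in the shape refSeq.ucscName.tab suggests.
    for refseq, ucsc in mapping.items():
        print(f"{refseq}\t{ucsc}")
```

Checking that every mapped name also appears in chrom.sizes, as the makeDoc comment reports for the 1923 mappings, would be a natural follow-up step.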