d93c426ef1ad5fbb32b754408599eaf380a199e5 max Tue Apr 21 13:34:58 2026 -0700

choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059

- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML, makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and add two new subtracks built from the parallel unique_concordant GFFs at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
    CH73 (99,141 placements, 23 oversize)
    CH211 (70,231 placements, 46 oversize)
  CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register searchTable / searchType bigBed stanzas with searchIndex name on each subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...) resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI FTP source and to the UCSC makeDoc and scripts dir on GitHub.

Co-Authored-By: Claude Opus 4.7 (1M context)

diff --git src/hg/makeDb/trackDb/zebrafish/danRer11/choriCloneEnds.html src/hg/makeDb/trackDb/zebrafish/danRer11/choriCloneEnds.html
new file mode 100644
index 00000000000..22770b44520
--- /dev/null
+++ src/hg/makeDb/trackDb/zebrafish/danRer11/choriCloneEnds.html
@@ -0,0 +1,142 @@
+

Description

+Bacterial artificial chromosomes (BACs) are vectors that carry large
+inserts of genomic DNA (typically 150–300 kb) in bacteria. Sequencing a
+single short read from each end of a BAC and mapping those end sequences
+to a reference genome yields the approximate start and stop of the full
+BAC insert. These BAC end placements are useful for confirming the order,
+orientation, and span of the reference assembly, for identifying large
+structural variants that disrupt concordant pair placement, and for
+locating a BAC containing a gene of interest for downstream laboratory
+work. The individual clones in all three libraries shown here can be
+ordered from
+BACPAC Resources
+(CHORI/BACPAC) for use at the bench.
+

+This track container shows three CHORI (Children's Hospital Oakland
+Research Institute) zebrafish BAC libraries:
+

CH1073 – also known as RZPD-1073 / DanioKey; 210,777
+unique-concordant placements.
CH73 – RZPD-73 / DanioKey Pilot; 99,141 placements.
CH211 – 70,231 placements.

+All three libraries were end-sequenced and placed on the GRCz11
+(danRer11) assembly by the NCBI Clone DB group; only
+unique concordant placements are shown, i.e. clones whose
+two end reads place uniquely, in the expected orientation, and at the
+expected approximate distance. Each row represents one clone insert
+inferred from a pair of mapped ends; one clone may have several
+placements if its ends also map to an alt haplotype scaffold.
+

+ +

Display Conventions and Configuration

+Each item is drawn as a single block spanning the inferred BAC insert
+(start of the upstream end to end of the downstream end). Clicking an
+item opens a details page showing the clone name, NCBI placement ID,
+insert size, concordance and uniqueness flags, assembly unit
+(Primary Assembly, ALT_DRER_TU_1, etc.), and an
+oversize flag set for placements larger than 500 kb,
+which is far longer than a typical BAC, so users can filter out
+likely-spurious mappings.
+

+The clone name links out to a ZFIN search for cross-reference information on the
+clone. Clone names (e.g. CH1073-100A1, CH73-1A1,
+CH211-1A1) are indexed and can be entered directly in the
+Genome Browser position/search box to jump to a clone.
+

+Three categorical filters are available in each subtrack:
+

End-pair concordance – TRUE/FALSE
Unique placement – TRUE/FALSE
Oversize placement (>500kb) – TRUE/FALSE

+By default no filter is applied.
+

+ +

Methods

+The source data were produced by the NCBI Clone DB group from end
+sequences of the three CHORI libraries. NCBI maps each end sequence to
+the reference assembly and categorizes the pair as concordant (expected
+orientation and insert size) or discordant, and as uniquely placed or
+multiply placed. The full set of per-library placement reports for
+zebrafish is available from the NCBI FTP server at
+ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/.
+

+To build the UCSC tracks, the three
+*.GCF_000002035.6.105.unique_concordant.gff files were
+downloaded and converted to BED. RefSeq contig accessions in the GFFs
+(e.g. NC_007114.7, NW_018394540.1) were mapped to
+UCSC-style chromosome names (e.g. chr3,
+chr1_KZ114997v1_alt) using the NCBI GRCz11 assembly report.
+An oversize flag was set on any insert longer than 500 kb;
+these records are retained so researchers can inspect them but are
+easy to exclude via the track filter. The resulting BEDs were converted
+to bigBed with bedToBigBed, using -extraIndex=name to build a search
+index on the name field so clone names can be looked up from the
+browser position box.
+
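The conversion described above can be illustrated with a minimal Python sketch. This is not the converter in the scripts directory; it only shows the three ideas in the paragraph (name mapping, coordinate conversion, oversize flag). It assumes the standard NCBI assembly report layout (tab-separated, `#` comment lines, RefSeq accession in column 7, UCSC-style name in column 10), and the function names and the TRUE/FALSE encoding of the flag are hypothetical.

```python
OVERSIZE_BP = 500_000  # threshold from the text: inserts longer than 500 kb


def load_refseq_to_ucsc(report_lines):
    """Build a RefSeq accession -> UCSC name map from assembly report lines.

    Assumes the usual NCBI assembly report columns: index 6 is RefSeq-Accn
    and index 9 is UCSC-style-name; '#' lines are comments.
    """
    mapping = {}
    for line in report_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        refseq_acc, ucsc_name = cols[6], cols[9]
        if refseq_acc != "na" and ucsc_name != "na":
            mapping[refseq_acc] = ucsc_name
    return mapping


def placement_to_bed(refseq_acc, gff_start, gff_end, clone_name, mapping):
    """Convert one GFF placement (1-based, inclusive) to a BED-like tuple."""
    chrom = mapping[refseq_acc]          # KeyError if the accession is unmapped
    chrom_start = gff_start - 1          # BED starts are 0-based
    chrom_end = gff_end                  # BED ends are exclusive
    oversize = (chrom_end - chrom_start) > OVERSIZE_BP
    return (chrom, chrom_start, chrom_end, clone_name,
            "TRUE" if oversize else "FALSE")
```

Oversize records are flagged rather than dropped, matching the text's statement that they are retained for inspection and excluded only through the track filter.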

+The step-by-step track build commands (downloads, RefSeq-to-UCSC
+mapping, BED conversion, bigBed build) are recorded in the UCSC
+makeDoc for this track:
+src/hg/makeDb/doc/danRer11/choriCloneEnds.txt.
+The GFF-to-BED converter, the RefSeq-to-UCSC mapping script, and the
+autoSql schema live in
+src/hg/makeDb/scripts/choriCloneEnds/.
+

+ +

Data Access

+The data can be explored interactively in table format with the
+Table Browser or the
+Data Integrator and exported
+from there to spreadsheet or tab-separated tables. From scripts, the data
+can be accessed through our API, with
+track=choriCloneEndsCH1073,
+track=choriCloneEndsCH73, or
+track=choriCloneEndsCH211.
+
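As a rough illustration of script access, the sketch below builds a request URL for the public UCSC REST API `getData/track` endpoint using the track names listed above. The helper names are hypothetical, and the sketch assumes the tracks are live on the public server under exactly these names.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_url(track, chrom=None, start=None, end=None, genome="danRer11"):
    """Return a getData/track URL, optionally restricted to a region."""
    params = {"genome": genome, "track": track}
    if chrom is not None:
        params["chrom"] = chrom
        # The API only accepts start/end together with a chrom.
        if start is not None and end is not None:
            params["start"] = start
            params["end"] = end
    return f"{API_BASE}?{urlencode(params)}"


def fetch_track(track, **region):
    """Fetch and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(track_url(track, **region)) as resp:
        return json.load(resp)
```

For example, `track_url("choriCloneEndsCH73", "chr1", 0, 10000000)` yields a URL for CH73 placements in the first 10 Mb of chr1.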

+For automated download and analysis, each library's annotation is
+stored in a bigBed file that can be downloaded from
+our download server: CH1073.bb,
+CH73.bb, CH211.bb. Individual regions or the whole
+genome annotation can be obtained using our tool bigBedToBed,
+which can be compiled from the source code or downloaded as a
+precompiled binary for your system. Instructions for downloading source
+code and binaries can be found
+here. The tool can also be used to obtain features
+within a given range, e.g.
+bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/danRer11/choriCloneEnds/CH1073.bb -chrom=chr1 -start=0 -end=10000000 stdout.
+
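The range query above could be wrapped from Python roughly as follows. This is a sketch only: the helper names are hypothetical, and it assumes that CH73.bb and CH211.bb sit in the same directory as the CH1073.bb URL given in the example and that `bigBedToBed` is on the PATH.

```python
import shutil
import subprocess

# Only the CH1073 URL appears in the text; the shared directory is an assumption.
BB_URL = "http://hgdownload.soe.ucsc.edu/gbdb/danRer11/choriCloneEnds/{lib}.bb"


def bigbed_cmd(lib, chrom, start, end):
    """Build the bigBedToBed argument list for one library and region."""
    return [
        "bigBedToBed",
        BB_URL.format(lib=lib),
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def fetch_region(lib, chrom, start, end):
    """Run bigBedToBed and return its output as a list of tab-split rows."""
    if shutil.which("bigBedToBed") is None:
        raise RuntimeError("bigBedToBed not found on PATH")
    result = subprocess.run(
        bigbed_cmd(lib, chrom, start, end),
        check=True, capture_output=True, text=True,
    )
    return [line.split("\t") for line in result.stdout.splitlines()]
```

Passing the arguments as a list avoids shell quoting issues; `fetch_region("CH1073", "chr1", 0, 10000000)` corresponds to the command shown in the text.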

+ +

Credits

+Clone placements were produced by the NCBI Clone DB group. The CHORI
+zebrafish BAC libraries (CH73, CH211, CH1073) were constructed by
+Pieter de Jong
+and colleagues at BACPAC Resources (CHORI/BACPAC).
+

+ +

References

+Schneider VA, Chen HC, Clausen C, Meric PA, Zhou Z, Bouk N, Husain N, Maglott DR, Church DM.
+
+Clone DB: an integrated NCBI resource for clone-associated data.
+Nucleic Acids Res. 2013 Jan;41(Database issue):D1070-8.
+PMID: 23193260; PMC: PMC3531087
+