d93c426ef1ad5fbb32b754408599eaf380a199e5
max
Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context)
+Bacterial artificial chromosomes (BACs) are large inserts of genomic DNA
+(typically 150–300 kb) carried in bacteria. Sequencing a single
+short read from each end of a BAC and mapping those end sequences to a
+reference genome yields the approximate start and stop of the full BAC
+insert. These BAC end placements are useful for confirming the order,
+orientation, and span of the reference assembly, for identifying large
+structural variants that disrupt concordant pair placement, and for
+locating a BAC containing a gene of interest for downstream laboratory
+work. The individual clones in all three libraries shown here can be
+ordered from
+BACPAC Resources
+(CHORI/BACPAC) for use at the bench.
+
+This track container shows three CHORI (Children's Hospital Oakland
+Research Institute) zebrafish BAC libraries (CH1073, CH73, and CH211):
+Description
+
+
+All three libraries were end-sequenced and placed on the GRCz11
+(danRer11) assembly by the NCBI Clone DB group; only
+unique concordant placements are shown, i.e. clones whose
+two end reads place uniquely and at the expected orientation and
+approximate distance. Each row represents one clone insert inferred
+from a pair of mapped ends; one clone may have several placements if
+its ends also map to an alt haplotype scaffold.
+
+Each item is drawn as a single block spanning the inferred BAC insert
+(start of the upstream end to end of the downstream end). Clicking an
+item opens a details page showing the clone name, NCBI placement ID,
+insert size, concordance and uniqueness flags, assembly unit
+(Primary Assembly, ALT_DRER_TU_1, etc.), and an
+oversize flag set for placements larger than 500 kb,
+far longer than a typical BAC, so users can filter out
+likely-spurious mappings.
+
+
+The clone name links out to a ZFIN search for cross-reference information on the
+clone. Clone names (e.g. CH1073-100A1, CH73-1A1,
+CH211-1A1) are indexed and can be entered directly in the
+Genome Browser position/search box to jump to a clone.
+
+
+Three categorical filters are available in each subtrack:
+
+The source data were produced by the NCBI Clone DB group from end
+sequences of the three CHORI libraries. NCBI maps each end sequence to
+the reference assembly and categorizes the pair as concordant (expected
+orientation and insert size) or discordant, and as uniquely placed or
+multiply placed. The full set of per-library placement reports for
+zebrafish is available from the NCBI FTP server at
+ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/.
+
+
+To build the UCSC tracks, the three
+*.GCF_000002035.6.105.unique_concordant.gff files were
+downloaded and converted to BED. RefSeq contig accessions in the GFFs
+(e.g. NC_007114.7, NW_018394540.1) were mapped to
+UCSC-style chromosome names (e.g. chr3,
+chr1_KZ114997v1_alt) using the NCBI GRCz11 assembly report.
+An oversize flag was set on any insert longer than 500 kb;
+these records are retained so researchers can inspect them but are
+easy to exclude via the track filter. The resulting BEDs were converted
+to bigBed with bedToBigBed using a name search index
+so clone names can be looked up from the browser position box.
+
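The conversion described above can be sketched as follows. This is a minimal illustration, not the converter in the scripts directory: it assumes the standard nine-column GFF layout, assumes the clone name is carried in a `Name=` attribute, and assumes the NCBI assembly report has the RefSeq accession in its seventh column and the UCSC-style name in its tenth. The function names and the yes/no encoding of the oversize flag are hypothetical.

```python
OVERSIZE_BP = 500_000  # threshold stated in the Methods text


def load_refseq_to_ucsc(report_lines):
    """Map RefSeq accessions to UCSC-style names from assembly report lines.

    Assumes tab-separated rows with RefSeq-Accn in column 7 and
    UCSC-style-name in column 10; '#' lines are header comments.
    """
    mapping = {}
    for line in report_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        refseq, ucsc = cols[6], cols[9]
        if refseq != "na" and ucsc != "na":
            mapping[refseq] = ucsc
    return mapping


def gff_to_bed(gff_line, mapping):
    """Convert one GFF placement line to a BED-like row with an oversize flag."""
    cols = gff_line.rstrip("\n").split("\t")
    seqid, start, end = cols[0], int(cols[3]), int(cols[4])
    strand, attrs = cols[6], cols[8]
    # Parse 'key=value;key=value' attributes (Name= is an assumed key).
    fields = dict(kv.split("=", 1) for kv in attrs.split(";") if "=" in kv)
    bed_start = start - 1  # GFF is 1-based inclusive, BED is 0-based half-open
    oversize = "yes" if end - bed_start > OVERSIZE_BP else "no"
    return [mapping[seqid], str(bed_start), str(end),
            fields.get("Name", "."), "0", strand, oversize]
```

Looking up `mapping[seqid]` raises `KeyError` for an accession missing from the report, which surfaces unmapped contigs rather than silently dropping them.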
+
+The step-by-step track build commands (downloads, RefSeq-to-UCSC
+mapping, BED conversion, bigBed build) are recorded in the UCSC
+makeDoc for this track:
+src/hg/makeDb/doc/danRer11/choriCloneEnds.txt.
+The GFF-to-BED converter, the RefSeq-to-UCSC mapping script, and the
+autoSql schema live in
+src/hg/makeDb/scripts/choriCloneEnds/.
+
+
+
+The data can be explored interactively in table format with the
+Table Browser or the
+Data Integrator and exported
+from there to spreadsheet or tab-sep tables. From scripts, the data
+can be accessed through our API, with
+track=choriCloneEndsCH1073,
+track=choriCloneEndsCH73, or
+track=choriCloneEndsCH211.
+
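As an illustration of script access, the sketch below builds a request URL for one of the three track names listed above. The endpoint (`https://api.genome.ucsc.edu/getData/track`) and the semicolon-separated `genome`, `track`, `chrom`, `start`, and `end` parameters are assumptions taken from the public UCSC REST API documentation, not from this commit; the helper name `track_url` is hypothetical.

```python
from urllib.parse import quote

# Assumed endpoint from the public UCSC REST API documentation.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_url(track, chrom=None, start=None, end=None, genome="danRer11"):
    """Build a getData/track URL, optionally restricted to a region."""
    params = {"genome": genome, "track": track}
    if chrom is not None:
        params["chrom"] = chrom
    # start and end are only meaningful together, so add them as a pair.
    if start is not None and end is not None:
        params["start"] = start
        params["end"] = end
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params.items())
    return f"{API_BASE}?{query}"
```

The returned URL can be fetched with any HTTP client, for example `urllib.request.urlopen(track_url("choriCloneEndsCH73", "chr1", 0, 10000000))`; the response is expected to be JSON.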
+
+For automated download and analysis, each library's annotation is
+stored in a bigBed file that can be downloaded from
+our download server: CH1073.bb,
+CH73.bb, CH211.bb. Individual regions or the whole
+genome annotation can be obtained using our tool bigBedToBed,
+which can be compiled from the source code or downloaded as a
+precompiled binary for your system. Instructions for downloading source
+code and binaries can be found
+here. The tool can also be used to obtain features
+within a given range, e.g.
+bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/danRer11/choriCloneEnds/CH1073.bb -chrom=chr1 -start=0 -end=10000000 stdout.
+
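The bigBedToBed command above can be wrapped for scripted use along these lines. This sketch assumes bigBedToBed is installed and on the PATH, and that CH73.bb and CH211.bb sit in the same directory as the CH1073.bb URL shown above; the function names are hypothetical.

```python
import subprocess

# Directory taken from the CH1073.bb example; assumed to hold all three files.
GBDB_BASE = "http://hgdownload.soe.ucsc.edu/gbdb/danRer11/choriCloneEnds"


def bigbed_region_cmd(library, chrom, start, end):
    """Return the bigBedToBed argument list for one library and region."""
    return [
        "bigBedToBed",
        f"{GBDB_BASE}/{library}.bb",
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def fetch_region(library, chrom, start, end):
    """Run bigBedToBed and return its output as a list of BED field lists."""
    result = subprocess.run(
        bigbed_region_cmd(library, chrom, start, end),
        check=True, capture_output=True, text=True,
    )
    return [line.split("\t") for line in result.stdout.splitlines()]
```

For example, `fetch_region("CH1073", "chr1", 0, 10000000)` runs the same command as the one given in the text.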
+
+
+Clone placements produced by the NCBI Clone DB group. The CHORI
+zebrafish BAC libraries (CH73, CH211, CH1073) were constructed by
+Pieter de Jong
+and colleagues at BACPAC Resources (CHORI/BACPAC).
+
+
+
+Schneider VA, Chen HC, Clausen C, Meric PA, Zhou Z, Bouk N, Husain N, Maglott DR, Church DM.
+
+Clone DB: an integrated NCBI resource for clone-associated data.
+Nucleic Acids Res. 2013 Jan;41(Database issue):D1070-8.
+PMID: 23193260; PMC: PMC3531087
+