d93c426ef1ad5fbb32b754408599eaf380a199e5
max
Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context)
-Bacterial artificial chromosomes (BACs) are large inserts of genomic DNA
-(typically 150–300 kb) carried in bacteria. Sequencing a single
-short read from each end of a BAC and mapping those end sequences to a
-reference genome yields the approximate start and stop of the full BAC
-insert. These BAC end placements are useful for confirming the order,
-orientation, and span of the reference assembly, for identifying large
-structural variants that disrupt concordant pair placement, and for
-locating a BAC containing a gene of interest for downstream laboratory
-work.
-
-This track shows the NCBI CH1073 zebrafish BAC library (also
-known as RZPD-1073 / DanioKey) placements labeled by NCBI as
-unique concordant—clones whose two end reads place
-uniquely in GRCz11 and at the expected orientation and approximate
-distance. Each row represents one clone insert inferred from the two
-mapped ends; one clone may have several placements when the ends map
-to an alt haplotype scaffold in addition to the primary assembly.
-
-Each item is drawn as a single block spanning the inferred BAC insert
-(start of the upstream end to end of the downstream end). Clicking an
-item opens a details page showing the clone name, NCBI placement ID,
-insert size, concordance and uniqueness flags, assembly unit
-(Primary Assembly, ALT_DRER_TU_1, etc.), and an
-oversize flag that is set for placements larger than
-500 kb—far longer than a typical BAC—so users can
-filter out likely-spurious mappings.
-
-The clone name links out to a ZFIN search for cross-reference information on the
-clone.
-
-Three categorical filters are available in the track configuration
-interface:
-Description
-Display Conventions and Configuration
-
-
-By default no filter is applied.
-
-The source data were produced by the NCBI Clone DB group from end
-sequences of the CH1073 library. NCBI maps each end sequence to the
-reference assembly and categorizes the pair as concordant (expected
-orientation and insert size) or discordant, and as uniquely placed or
-multiply placed. The full set of per-library placement reports is
-available from the NCBI FTP server at
-ftp.ncbi.nlm.nih.gov/repository/clone/reports/Danio_rerio/.
-
-
-To build the UCSC track, the
-CH1073.GCF_000002035.6.105.unique_concordant.gff file was
-downloaded and converted to BED. RefSeq contig accessions in the GFF
-(e.g. NC_007114.7, NW_018394540.1) were mapped to
-UCSC-style chromosome names (e.g. chr3,
-chr1_KZ114997v1_alt) using the NCBI GRCz11 assembly report.
-A fixed oversize flag was set on any insert longer than
-500 kb; these records are retained so researchers can inspect
-them but are easy to exclude via the track filter. The resulting
-BED was converted to bigBed with bedToBigBed.
-
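The GFF-to-BED step described in the Methods text can be sketched roughly as follows. This is a minimal illustration, not the script used to build the track: the function name `gff_line_to_bed`, the `Name` attribute key, the `yes`/`no` flag values, and the five-column output layout are all assumptions made for the example. Only the 500 kb threshold, the accession-to-chromosome mapping idea, and the 1-based GFF versus 0-based BED coordinate conventions come from the text or the formats themselves.

```python
OVERSIZE_BP = 500_000  # threshold stated in the Methods text (500 kb)


def gff_line_to_bed(line, acc_to_ucsc):
    """Convert one GFF feature line to a BED-like tuple, or None to skip it.

    acc_to_ucsc is a dict from RefSeq accession to UCSC chromosome name,
    e.g. built from the NCBI assembly report (hypothetical input here).
    """
    if not line.strip() or line.startswith("#"):
        return None  # blank line or GFF comment/pragma
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9:
        return None  # not a complete GFF feature line
    chrom = acc_to_ucsc.get(cols[0])
    if chrom is None:
        return None  # accession not present in the mapping
    start = int(cols[3]) - 1  # GFF is 1-based inclusive; BED is 0-based half-open
    end = int(cols[4])
    # Parse "key=value;key=value" attributes; the "Name" key is an assumption.
    attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
    name = attrs.get("Name", ".")
    # Oversize records are kept and flagged, not dropped, as the text describes.
    oversize = "yes" if end - start > OVERSIZE_BP else "no"
    return (chrom, start, end, name, oversize)
```

For instance, with a one-entry mapping `{"NC_007114.7": "chr3"}`, a feature spanning GFF positions 1001 to 151000 becomes a BED interval of 1000 to 151000 on chr3 with the flag unset, since the 150 kb insert is below the threshold.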
-
-
-The data can be explored interactively in table format with the
-Table Browser or the
-Data Integrator and exported
-from there to spreadsheet or tab-sep tables. From scripts, the data
-can be accessed through our API, track=ncbiCloneEndsCH1073.
-
-
-For automated download and analysis, the annotation is stored in a
-bigBed file that can be downloaded from
-our download server. The file for this track is
-CH1073.bb. Individual regions or the whole genome annotation
-can be obtained using our tool bigBedToBed, which can be
-compiled from the source code or downloaded as a precompiled binary
-for your system. Instructions for downloading source code and
-binaries can be found
-here. The tool can also be used to obtain features
-within a given range, e.g.
-bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/danRer11/ncbiCloneEndsCH1073/CH1073.bb -chrom=chr1 -start=0 -end=10000000 stdout.
-
-
-The original annotation can be downloaded from
-NCBI's clone reports FTP directory.
-
-
-
-Clone placements produced by the NCBI Clone DB group. CH1073
-(RZPD-1073 / DanioKey) is a zebrafish BAC library originally
-constructed and end-sequenced in the context of large-scale
-zebrafish genome and clone resources.
-The CH1073 library was constructed by
-Pieter de Jong
-and colleagues at BACPAC Resources.
-
-
-
-Schneider VA, Chen HC, Clausen C, Meric PA, Zhou Z, Bouk N, Husain N, Maglott DR, Church DM.
-
-Clone DB: an integrated NCBI resource for clone-associated data.
-Nucleic Acids Res. 2013 Jan;41(Database issue):D1070-8.
-PMID: 23193260; PMC: PMC3531087
-