src/hg/utils/automation/genbank/README.txt 62b1428915cc2c8b9acbf46a86c1248833257efe

62b1428915cc2c8b9acbf46a86c1248833257efe
mspeir
  Mon Nov 17 18:50:32 2025 -0800
changing rest of genome-source references to github, refs #34485

diff --git src/hg/utils/automation/genbank/README.txt src/hg/utils/automation/genbank/README.txt
index 8b92b7fa2ca..a93819a0601 100644
--- src/hg/utils/automation/genbank/README.txt
+++ src/hg/utils/automation/genbank/README.txt
@@ -1,71 +1,71 @@
 ############################################################################
 
 UCSC procedure to mirror genbank/refseq assemblies and generate assembly hubs
 
 This README.txt file is from:
 
-http://genome-source.soe.ucsc.edu/gitlist/kent.git/blob/master/src/hg/utils/automation/genbank/README.txt
+http://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/utils/automation/genbank/README.txt
 
 ############################################################################
 # mirror genbank/refseq hierarchy
 
 There is a cron job running once a day M-F that runs an rsync command
 to fetch a select set of files from genbank/refseq.  The script here is:
   cronUpdate.sh
 with one argument specifying either the refseq or genbank assembly
 hierarchy, e.g.:
   cronUpdate.sh refseq
 
 The accumulating hierarchy at UCSC is in:
    export outside="/hive/data/outside/ncbi/genomes/refseq"
 Or, if genbank assemblies:
    export outside="/hive/data/outside/ncbi/genomes/genbank"
 
 The rsync command has a number of include statements to pick up
 a select set of files:
 
   cd "${outside}"
   rsync -avPL --include "*/" --include="chr2acc" --include="*chr2scaf" \
     --include "*placement.txt" --include "*definitions.txt" \
     --include "*_rm.out.gz" --include "*agp.gz" --include "*regions.txt" \
     --include "*_genomic.fna.gz" --include "*gff.gz" \
     --include "*_assembly_report.txt" --exclude "*" \
         rsync://ftp.ncbi.nlm.nih.gov/genomes/genbank/${D}/ ./${D}/ \
      || /bin/true
 
 The "${D}" directory name is one of the directories
 at ftp.ncbi.nlm.nih.gov/genomes/{refseq|genbank}/${D}/
 
 Slightly different directory listings for refseq vs. genbank.
 For refseq assemblies:
 for D in archaea bacteria fungi invertebrate mitochondrion plant plasmid \
   plastid protozoa vertebrate_mammalian vertebrate_other viral
 
 For genbank assemblies:
   for D in archaea bacteria fungi invertebrate other plant protozoa \
      vertebrate_mammalian vertebrate_other
 
 The cron script also maintains file listings and logs in:
    ${outside}/fileListings/yyyy/mm/
    ${outside}/logs/yyyy/mm/
 to track changes over time and provide a record of activity
 
 ############################################################################
 # processing NCBI files into assembly hub files
 
 Processing takes place in the two hierarchies:
    /hive/data/inside/ncbi/genomes/refseq/
    /hive/data/inside/ncbi/genomes/genbank/
 
 The scripts from this source directory:
      kent/src/hg/utils/automation/genbank/
 are used to process the NCBI files into assembly hub files.
 This processing is done with a cluster computer set up since there
 are a large number of steps to perform.
 
 The primary driving script is: ncbiToUcsc.sh
 It takes two arguments:
     1. genbank|refseq - to specify which hierarchy to work on
     2. <pathTo>/*_genomic.fna.gz - fasta sequence file in the 'outside' path
 
 ############################################################################