fc0444e2770896dfa3e5d4c60b3ef4d506036183
gperez2
  Thu Jan 29 11:41:50 2026 -0800
Updating the makedoc to describe the script change: it now automatically detects and uses the most recent ncbiRefSeq patch version available, refs #36779

diff --git src/hg/makeDb/doc/hg38/hgmd.txt src/hg/makeDb/doc/hg38/hgmd.txt
index 4f110d0df28..02e8092528f 100644
--- src/hg/makeDb/doc/hg38/hgmd.txt
+++ src/hg/makeDb/doc/hg38/hgmd.txt
@@ -22,32 +22,34 @@
 # Made a script using claude.ai that automates HGMD data processing for hg38 and hg19.
 
 # Location: ~/kent/src/hg/makeDb/scripts/hgmd/process_hgmd.py
 
 # What the script does:
 # 1. Creates BED files from HGMD TSV data with variant classifications
 # 2. Converts BED to BigBed format
 # 3. Creates symlinks in /gbdb/{db}/bbi/
 # 4. Registers BigBed files with hgBbiDbLink
 # 5. Extracts transcript IDs from hg38 HGMD file (column 7)
 # 6. Filters ncbiRefSeq gene predictions to HGMD transcripts only
 # 7. Loads filtered gene predictions into ncbiRefSeqHgmd table
 #
 # Key features:
 # - Always uses hg38 file for transcript extraction (hg19 file lacks column 7)
-# - Auto-detects ncbiRefSeq version: p13 for hg19, p14 for hg38
+# - Auto-detects and selects the latest available ncbiRefSeq patch version (e.g. p15, p14, p13)
 # - Falls back to previous years if specified year's ncbiRefSeq not found
+# - Uses regex pattern matching to extract version and date from directory names
+#   Example: "ncbiRefSeq.p14.2025-08-13" -> extracts "p14" and "2025-08-13"
 
 # wc -l:
 # 332094 /hive/data/genomes/hg38/bed/hgmd/hgmd.bed
 
 # wc -l:
 # 15691 /hive/data/genomes/hg38/bed/hgmd/ncbiRefSeq.p14.2025-08-13/hgmd.curated.gp
 
 # Usage:
 python3 ~/kent/src/hg/makeDb/scripts/hgmd/process_hgmd.py --year 2025 --db hg38
 
 # Sample output:
 # hg38 BigBed completed successfully!
 # Output files: /hive/data/genomes/hg38/bed/hgmd/hgmd.bed, /hive/data/genomes/hg38/bed/hgmd/hgmd.bb
 # Symlink created: /gbdb/hg38/bbi/hgmd.bb
 # hgBbiDbLink run: hgBbiDbLink hg38 hgmd /gbdb/hg38/bbi/hgmd.bb
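
The directory-name parsing and "latest patch" selection described in the key features could look roughly like the sketch below. It is not taken from process_hgmd.py: the names DIR_PATTERN, RefSeqDir, parse_refseq_dir and pick_latest are hypothetical, the directory layout is inferred only from the "ncbiRefSeq.p14.2025-08-13" example, and both the tie-breaking rule (highest patch number, then most recent date) and the three-year fallback limit are assumptions.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Assumed layout, inferred from the makedoc example "ncbiRefSeq.p14.2025-08-13".
DIR_PATTERN = re.compile(r"^ncbiRefSeq\.(p(\d+))\.(\d{4}-\d{2}-\d{2})$")


class RefSeqDir(NamedTuple):
    name: str       # full directory name
    patch: str      # e.g. "p14"
    patch_num: int  # e.g. 14, used for numeric comparison
    date: str       # e.g. "2025-08-13" (ISO dates sort correctly as strings)


def parse_refseq_dir(name: str) -> Optional[RefSeqDir]:
    """Return the patch version and date from a directory name, or None."""
    m = DIR_PATTERN.match(name)
    if m is None:
        return None
    patch, num, date = m.groups()
    return RefSeqDir(name, patch, int(num), date)


def pick_latest(names: Iterable[str], year: int,
                max_lookback: int = 3) -> Optional[RefSeqDir]:
    """Pick the newest directory for `year`, falling back to earlier years.

    Assumption: "latest" means highest patch number, then most recent date.
    """
    parsed = [d for d in map(parse_refseq_dir, names) if d is not None]
    for y in range(year, year - max_lookback - 1, -1):
        candidates = [d for d in parsed if d.date.startswith(f"{y}-")]
        if candidates:
            return max(candidates, key=lambda d: (d.patch_num, d.date))
    return None


# Made-up directory listing for illustration; only the p14 2025-08-13 name
# appears in the makedoc.
example_names = [
    "ncbiRefSeq.p13.2024-01-10",
    "ncbiRefSeq.p14.2025-02-01",
    "ncbiRefSeq.p14.2025-08-13",
    "README",
]
best = pick_latest(example_names, 2025)
print(best.patch, best.date)  # prints: p14 2025-08-13
```

Comparing the patch number as an integer rather than as a string avoids ranking "p9" above "p14", which is one reason to capture the digits separately in the pattern.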