cb718a9f9f9d2c9d43a92a618da91be0de85ed3d
hiram
  Mon Jul 21 16:32:16 2025 -0700
document the findGenome endpoint and reveal the /list/genarkGenomes endpoint function refs #35468
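
The search strings documented for /findGenome use +, - and * as operators, so they need percent-encoding before they go into a URL. A minimal sketch of that encoding follows, assuming Python's standard urllib. The helper name find_genome_url is hypothetical and not part of the API, and joining parameters with ';' follows the separator described in the help text.

```python
from urllib.parse import quote

API_BASE = "https://api.genome.ucsc.edu"


def find_genome_url(q, **optional):
    """Build a /findGenome request URL (hypothetical helper, not part of the API)."""
    # Percent-encode the search string so '+', '*' and spaces survive in the URL.
    params = ["q=" + quote(q, safe="")]
    # Optional parameters named in the help text: browser, year, category,
    # status, level, statsOnly, maxItemsOutput.
    for name, value in optional.items():
        params.append(f"{name}={quote(str(value), safe='')}")
    # The help page describes ';' as the separator between parameters.
    return f"{API_BASE}/findGenome?" + ";".join(params)


print(find_genome_url("+white +rhino* -southern"))
print(find_genome_url("dog", browser="mayExist", statsOnly=1))
```

The first call reproduces the encoded query used in the '+white +rhino* -southern' example link added by this commit.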

diff --git src/hg/htdocs/goldenPath/help/api.html src/hg/htdocs/goldenPath/help/api.html
index b712d0a1358..95bda4c5860 100755
--- src/hg/htdocs/goldenPath/help/api.html
+++ src/hg/htdocs/goldenPath/help/api.html
@@ -96,30 +96,31 @@
 <h2>What is the access URL?</h2>
 <p>
 This access URL: <b>https://api.genome.ucsc.edu/</b> is used to access
 the endpoint functions.  For example:
 <pre>
     wget -O- 'https://api.genome.ucsc.edu/list/publicHubs'
 </pre>
 </p>
 
 <!-- ========== What type of data can be accessed? ===================== -->
 <a id="Return"></a>
 <h2>What type of data can be accessed?</h2>
 <p>
 The following data sets can be accessed at this time:
 <ul>
+<li>Find a genome in the UCSC browser with a search string</li>
 <li>List of available public hubs</li>
 <li>List of available UCSC Genome Browser genome assemblies</li>
 <li>List of files available for download for UCSC Browser genome assemblies</li>
 <li>List of genomes from a specified assembly or track hub</li>
 <li>List of available data tracks from a specified hub or UCSC Genome Browser genome assembly
 (see also: <a
  href='trackDb/trackDbHub.html' target=_blank>track definition help</a>)</li>
 <li>List of chromosomes contained in an assembly hub or UCSC Genome Browser genome assembly</li>
 <li>List of chromosomes contained in a specific track of an assembly or track hub, or UCSC Genome
 Browser genome assembly</li>
 <li>Return DNA sequence from an assembly hub 2bit file, or UCSC Genome Browser assembly</li>
 <li>Return track data from a specified assembly or track hub, or UCSC Genome Browser assembly</li>
 <li>Return search matches to words in track data, track names, track descriptions, public hub
 track names, and public hub descriptions within a UCSC Genome Browser genome assembly</li>
 </ul>
@@ -127,100 +128,107 @@
 <a href="/FAQ/FAQblat.html#blat14">BLAT FAQ</a> for more info.
 </p>
 
 <!-- ========== Endpoint functions ======================= -->
 <a id="Endpoint"></a>
 <h2>Endpoint functions to return data</h2>
 <p>
 The URL <b>https://api.genome.ucsc.edu/</b> is used to access
 the endpoint functions.  For example:
 <pre>
     curl -L 'https://api.genome.ucsc.edu/list/ucscGenomes'
 </pre>
 </p>
 <p>
 <ul>
+<li><b>/findGenome</b> - search for a genome in the UCSC browser</li>
 <li><b>/list/publicHubs</b> - list public hubs</li>
 <li><b>/list/ucscGenomes</b> - list UCSC Genome Browser database genomes from database host</li>
-<!-- not yet
 <li><b>/list/genarkGenomes</b> - list UCSC Genome Browser database genomes from assembly hub host</li>
--->
 <li><b>/list/hubGenomes</b> - list genomes from specified hub</li>
 <li><b>/list/files</b> - list download files available for specified genome</li>
 <li><b>/list/tracks</b> - list data tracks available in specified hub or database genome
 (see also: <a href='trackDb/trackDbHub.html' target=_blank>track definition help</a>)</li>
 <li><b>/list/chromosomes</b> - list chromosomes from a data track in specified hub or database genome</li>
 <li><b>/list/schema</b> - list the schema for a data track in specified hub or database
 genome</li>
 <li><b>/getData/sequence</b> - return sequence from specified hub or database genome</li>
 <li><b>/getData/track</b> - return data from specified track in hub or database genome</li>
 <li><b>/search</b> - return search matches within a UCSC Genome Browser genome assembly</li>
 </ul>
 </p>
 
 <!-- ========== Parameters to endpoint functions ======================= -->
 <a id="Parameters"></a>
 <h2>Parameters to endpoint functions</h2>
 <p>
 <ul>
+<li>maxItemsOutput=1000000 - limit number of items to output, default: 1,000,000, maximum limit:
+1,000,000 (use <em>-1</em> to get maximum output)</li>
 <li>hubUrl=&lt;url&gt; - specify track hub or assembly hub URL</li>
 <li>genome=&lt;name&gt; - specify genome assembly in UCSC Genome Browser or track/assembly hub.  Use with /list/genarkGenomes to test for existence.</li>
 <li>track=&lt;trackName&gt; - specify data track in track/assembly hub or UCSC database genome
 assembly</li>
 <li>chrom=&lt;chrN&gt; - specify chromosome name for sequence or track data</li>
 <li>start=&lt;123&gt; - specify start coordinate (0 relative) for data from track or sequence
 retrieval (start and end required together). See also: <a
  href='https://genome-blog.gi.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/'
 target=_blank>UCSC browser coordinate counting systems</a></li>
 <li>end=&lt;456&gt; - specify end coordinate (1 relative) for data from track or sequence
 retrieval (start and end required together). See also: <a
  href='https://genome-blog.gi.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/'
 target=_blank>UCSC browser coordinate counting systems</a></li>
-<li>revComp=1 - on <em>/getData/sequence</em> function, return reverse complement of sequence data</li>
-<li>maxItemsOutput=1000 - limit number of items to output, default: 1,000, maximum limit:
-1,000,000 (use <em>-1</em> to get maximum output)</li>
+<li>q=&lt;search word(s)&gt; - used with <em>/findGenome</em>, a search string</li>
+<li>browser=&lt;mustExist|mayExist|notExist&gt; - used with <em>/findGenome</em>: <em>mustExist</em> returns only assemblies in the UCSC browser, <em>mayExist</em> returns assemblies whether or not they are in the UCSC browser, <em>notExist</em> returns only assemblies not yet available in the browser.  Default is <em>mustExist</em></li>
+<li>statsOnly=1 - on <em>/findGenome</em> function, only show statistics about search result</li>
+<li>year=&lt;2025&gt; - on <em>/findGenome</em> function, only show search result for given year, default is any year</li>
+<li>category=&lt;reference|representative&gt; - on <em>/findGenome</em> function, show search result only for given NCBI category of assembly</li>
+<li>status=&lt;reference|representative&gt; - on <em>/findGenome</em> function, show search result only for given NCBI status of assembly</li>
+<li>level=&lt;complete|chromosome|scaffold|contig&gt; - on <em>/findGenome</em> function, show search result only for given NCBI level of assembly</li>
 <li>trackLeavesOnly=1 - on <em>/list/tracks</em> function, only show tracks, do not show
 composite container information</li>
+<li>revComp=1 - on <em>/getData/sequence</em> function, return reverse complement of sequence data</li>
 <li>jsonOutputArrays=1 - on <em>/getData/track</em> function, JSON format is array type
 for each item of data, instead of the default object type</li>
 <li>format=text - on <em>/list/files</em> function, return plain text listing
 of download files instead of JSON format output (which includes more meta-data information).  Text output contains less meta-data in comment lines prefixed by the '#' hash character.</li>
 <li>search=&lt;term&gt;&genome=&lt;name&gt; - on <em>/search</em> function, specify term to be
 searched within a UCSC Genome Browser genome assembly</li>
 <li>categories=helpDocs - on <em>/search?search=&lt;term&gt;&genome=&lt;name&gt;</em> function, restrict the search
 within the UCSC Genome Browser help documentation</li>
 <li>categories=publicHubs - on <em>/search?search=&lt;term&gt;&genome=&lt;name&gt;</em> function, restrict the search
 within the UCSC Genome Browser Public Hubs</li>
 <li>categories=trackDb - on <em>/search?search=&lt;term&gt;&genome=&lt;name&gt;</em> function, restrict the search
 within the track database (trackDb) settings</li>
 </ul>
 </p>
 <p>
 The parameters are added to the endpoint URL beginning with a
 question mark <b>?</b>, and multiple parameters are separated with
 the semi-colon <b>;</b>.  For example:
 <pre>
 https://api.genome.ucsc.edu/getData/sequence?genome=hg38;chrom=chrM
 </pre>
 </p>
 
 <!-- ========== Required and optional parameters  ======================= -->
 <a id="Parameter_use"></a>
 <h2>Required and optional parameters</h2>
 <p>
 <table>
 <tr><th>Endpoint function</th><th>Required</th><th>Optional</th></tr>
+<tr><th>/findGenome</th><td>q</td><td>statsOnly, browser, year, category, status, level, maxItemsOutput</td></tr>
 <tr><th>/list/publicHubs</th><td>(none)</td><td>(none)</td></tr>
 <tr><th>/list/ucscGenomes</th><td>(none)</td><td>(none)</td></tr>
 <tr><th>/list/genarkGenomes</th><td>(none)</td><td>genome, maxItemsOutput</td></tr>
 <tr><th>/list/hubGenomes</th><td>hubUrl</td><td>(none)</td></tr>
 <tr><th>/list/files</th><td>genome</td><td>format=text, maxItemsOutput</td></tr>
 <tr><th>/list/tracks</th><td>genome or (hubUrl and genome)</td><td>trackLeavesOnly=1</td></tr>
 <tr><th>/list/chromosomes</th><td>genome or (hubUrl and genome)</td><td>track</td></tr>
 <tr><th>/list/schema</th><td>(genome or (hubUrl and genome)) and track</td><td>(none)</td></tr>
 <tr><th>/getData/sequence</th><td>(genome or (hubUrl and genome)) and chrom</td><td>start, end, revComp=1</td></tr>
 <tr><th>/getData/track</th><td>(genome or (hubUrl and genome)) and track</td><td>chrom,
 (start and end), maxItemsOutput, jsonOutputArrays</td></tr>
 <tr><th>/search</th><td>search and genome</td><td>categories=helpDocs,
 categories=publicHubs, categories=trackDb</td></tr>
 </table>
 </p>
@@ -236,30 +244,39 @@
 to the single specified chromosome.  To limit the request to a specific
 position, both <b>start=4321</b> and <b>end=5678</b> must be given together.
 Using the <b>revComp=1</b> parameter returns the reverse complement.
 </p>
 <p>
 Use the <b>genome</b> argument with the <b>/list/genarkGenomes</b> function
 to test for the existence of a specific genome assembly in the
 <a href='https://hgdownload.soe.ucsc.edu/hubs/' target=_blank>Genark</a> set
 of assembly hubs.
 </p>
 <p>
 The <b>/list/files</b> endpoint only works for UCSC hosted genome assemblies,
 not for external hosted assembly hubs.
 </p>
 <p>
+The <b>/findGenome</b> endpoint can find genome assemblies in the browser, or
+any other assembly available at NCBI even when it is not in the browser.  Note
+that almost 4 million assemblies are available at NCBI.  All searches are
+<b>case insensitive</b>.  Force inclusion: prefix a word with a + sign
+(<b>+word</b>) to require it in the result.  Exclude words: prefix a word with
+a - sign (<b>-word</b>) to exclude it from the result.  Wildcard search: append
+an * (asterisk) to a word (<b>word*</b>) to match all terms starting with that prefix.
+</p>
+<p>
 Any extra parameters not allowed in a function will be flagged as an error.
 </p>
 
 <!-- ========== Supported track types ======================= -->
 <a id="Track_types"></a>
 <h2>Supported track types for getData functions</h2>
 <div class="row">
   <!-- Left column -->
   <div class="col-md-6">
     <p>
     <ul>
     <li>altGraphX (e.g. 'sibTxGraph' for hg19)</li>
     <li><a href='/goldenPath/help/barChart.html' target=_blank>barChart/bigBarChart</a></li>
     <li><a href='/FAQ/FAQformat.html#format1' target=_blank>bed</a></li>
     <li><a href='/goldenPath/help/bigBed.html' target=_blank>bigBed</a></li>
@@ -320,30 +337,38 @@
 <b>https://genome-asia.ucsc.edu/cgi-bin/hubApi/list/ucscGenomes</b></li>
 </ul>
 
 <!-- ========== Example data access ======================= -->
 <a id="list_examples"></a>
 <h2>Example data access</h2>
 <p>
 Your WEB browser can be configured to interpret JSON data and format
 in a convenient browsing format.  Firefox has this function built in,
 other browsers have add-ons that can be turned on to format JSON data.
 With your browser thus configured, the following links can demonstrate
 the functions of the API interface.
 </p>
 <h3>Listing functions</h3>
 <ol>
+<li><a href='https://api.genome.ucsc.edu/findGenome?q=dog' target=_blank>find any genome with the word 'dog'</a> -
+<b>api.genome.ucsc.edu/findGenome?q=dog</b></li>
+<li><a href='https://api.genome.ucsc.edu/findGenome?q=%2Bwhite%20%2Brhino%2A%20-southern' target=_blank>find any genome matching the search string: '+white +rhino* -southern'</a> -
+<b>api.genome.ucsc.edu/findGenome?q=%2Bwhite%20%2Brhino%2A%20-southern</b></li>
+<li><a href='https://api.genome.ucsc.edu/findGenome?q=GCF_028858775.2' target=_blank>find any genome with the accession id: <b>GCF_028858775.2</b></a> -
+<b>api.genome.ucsc.edu/findGenome?q=GCF_028858775.2</b></li>
+<li><a href='https://api.genome.ucsc.edu/findGenome?q=GRCh38' target=_blank>find any genome with the name: <b>GRCh38</b></a> -
+<b>api.genome.ucsc.edu/findGenome?q=GRCh38</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/publicHubs' target=_blank>list public hubs</a> -
 <b>api.genome.ucsc.edu/list/publicHubs</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/ucscGenomes' target=_blank>list UCSC database genomes</a> -
 <b>api.genome.ucsc.edu/list/ucscGenomes</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/genarkGenomes?maxItemsOutput=5' target=_blank>list GenArk assembly hub genomes</a> -
 <b>api.genome.ucsc.edu/list/genarkGenomes?maxItemsOutput=5</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/genarkGenomes?genome=GCF_028858775.2' target=_blank>test if genome GCF_028858775.2 exists in the GenArk assembly hub genomes</a> -
 <b>api.genome.ucsc.edu/list/genarkGenomes?genome=GCF_028858775.2</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/hubGenomes?hubUrl=http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt'
 target=_blank>list genomes from specified hub</a> -
 <b>api.genome.ucsc.edu/list/hubGenomes?hubUrl=http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/tracks?hubUrl=http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt;genome=CAST_EiJ'
 target=_blank>list tracks from specified hub and genome</a> -
 <b>api.genome.ucsc.edu/list/tracks?hubUrl=http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt;genome=CAST_EiJ</b></li>
 <li><a href='https://api.genome.ucsc.edu/list/tracks?genome=hg38' target=_blank>list tracks from UCSC database genome</a> -