7f7928d7115d32d6018254b9f8f241cc6e6c7716 dschmelt Tue Nov 2 16:03:21 2021 -0700 Adding doc about Batch Queries using positions refs #28436 diff --git src/hg/htdocs/goldenPath/help/hgTablesHelp.html src/hg/htdocs/goldenPath/help/hgTablesHelp.html index 72a3fa5..c9c48e3 100755 --- src/hg/htdocs/goldenPath/help/hgTablesHelp.html +++ src/hg/htdocs/goldenPath/help/hgTablesHelp.html @@ -1,36 +1,37 @@
The Table Browser is built on top of the Genome Browser database, which actually consists of several separate databases, one for each genome assembly.
Tables within the databases may be differentiated by whether the data are based on genome start-stop coordinates (positional tables) or are independent of position (non-positional tables). Some output formats and query options are applicable only to positional tables, hence the distinction.
- - - - --Non-positional tables contain data not tied to genomic location, for example a table that correlates -a Known Gene ID with a RefSeq accession ID. Some non-positional tables relate internal numeric mRNA -IDs to extended information such as author, tissue, or keyword. Some "meta" tables in -this category contain information about the structure of the database itself or describe external -files containing sequence data.
-Positional tables contain data associated with specific locations in the genome, such as mRNA alignments, gene predictions, cross-species alignments, and other annotations. Each of the annotation tracks displayed in the Genome Browser is based on a positional table. In some instances, data from other positional and non-positional tables may also be incorporated into the track. Data associated with custom annotation tracks active within the user's Table Browser session are also available as positional tables.
Positional tables can be further subdivided into several categories based on the type of data they describe. Alignment data can be best described by using a block structure to represent each element. Other tables require only start and end coordinate data for each element. Some tables specify a translation start and end in addition to the transcription start and end. Some tables contain strand information, others don't. Most tables, but not all, specify a name for each element. Based on the format of the data described by a table, different query and output formatting options may be offered.
+ + + ++Non-positional tables contain data not tied to genomic location, for example a table that correlates +a Known Gene ID with a RefSeq accession ID. Some non-positional tables relate internal numeric mRNA +IDs to extended information such as author, tissue, or keyword. Some "meta" tables in +this category contain information about the structure of the database itself or describe external +files containing sequence data.
+In its most basic form, the Table Browser can be used to retrieve a specific subset of records from a track or positional table in a selected genome assembly. The query may be based on a specific position or a set of one or more identifiers.
@@ -273,35 +273,35 @@
Select the RefSeq Genes option in the track
list.
position
box (the Table Browser will
automatically select the position
option button).Get Output
button.The Table Browser will display the records for the RefSeq accessions NM_005522, NM_153620, NM_006735, NM_153632, NM_030661, and NM_153631.
-In many cases, you may want to retrieve data based on a list of one or more accessions or names, +In many cases, you may want to retrieve data based on a list of one or more accessions, IDs, or names, rather than querying by genomic position. Many tracks in the Table Browser, such as those in the -Genes and Gene Prediction track group, support identifier queries. The identifier type used +Genes and Gene Prediction or Variation track groups, support identifier queries. The identifier type used in the query must match the kind of identifiers present in the track data, e.g., mRNA accession IDs -must be used to query the mRNA table.
+must be used to query the mRNA table and rsIDs must match those in the dbSNP table.Follow these steps to display a list of records that correspond to a set of accessions or names entered as query input.
Step 1. Pick the genome assembly, track, and table
Step 2. Select the genome region
setting
Step 3. Load the identifiers into the browser
Click the Paste List
button to type or paste in the identifiers or the Upload
List
button to load the data from a file existing on your local computer.
Clear List
button.
Step 4. Click the Get Output
button
See the Output formats section for information about configuring the
query output.
+
+
+
+If you have a list of genomic positions and want to retrieve information
+about their properties, you can use the Define Regions
button to input
+multiple positions to query a chosen table. In this example, you want to determine
+the dbSNP rsID names for your list of positions.
+
+
Step 1. Select genome assembly and track +To determine dbSNP rsIDs we will be using Human genome hg38 and dbSNP153.
+ +Step 2. Select the define regions
button, enter regions
+You can find the define regions
button under the Define region of
+interest
section. Upload, type, or paste in your regions of interest, making sure the coordinates use the
+intended 0-based or 1-based notation. Regions are accepted only in BED or position format.
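If your list is in position format and you want to supply BED instead, the coordinates need to be shifted. The following is a minimal sketch, assuming the usual UCSC convention that position format (chrom:start-end) is 1-based with both ends included, while BED is 0-based with the end excluded. The function names `position_to_bed` and `bed_line` are hypothetical helpers written for this illustration and are not part of the Table Browser.

```python
import re

# Assumed position format: chrom:start-end, 1-based, both ends included.
# Thousands separators (commas) in the numbers are tolerated.
_POSITION = re.compile(r"^([^:\s]+):([\d,]+)-([\d,]+)$")


def position_to_bed(position):
    """Convert 'chr1:11,008-11,010' to a (chrom, start, end) BED triple."""
    match = _POSITION.match(position.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {position!r}")
    # 1-based closed -> 0-based half-open: only the start moves.
    return chrom, start - 1, end


def bed_line(position):
    """Format one position as a tab-separated three-column BED line."""
    chrom, start, end = position_to_bed(position)
    return f"{chrom}\t{start}\t{end}"


if __name__ == "__main__":
    for pos in ["chr1:11,008-11,010", "chrX:100-100"]:
        print(bed_line(pos))
```

Only the start coordinate changes in the conversion, so a single-base position such as chrX:100-100 becomes a BED interval of length one.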
Step 3. Select output format and get output
+If you want all data from a table, you need not change the output format from the default.
+If you want only particular columns from the table, you can change it to selected fields
+from primary and related tables
. Once you hit the get output
button,
+you will be taken to a column selection page or, if you did not change the output format,
+directly to your output data.
Follow the example below to obtain gene symbols in your query:
output format
to selected fields from primary
and related tables. get output
to go to the next step of selecting fields from
related tables. get output
again to get the final query output.
The Table Browser filter
option can be used to:
Step 5. Click the Submit button to apply the filter
Note: In the current implementation of the Table Browser, the selected fields
from primary and related tables output format option must be used when including fields from
multiple tables in a filter. Check the boxes for all tables in the Linked Tables
list
on which filter constraints have been applied, then click the Allow Selection From Checked
Tables
button to include them in the output.
Strings
Text fields are compared to words or patterns containing wildcard characters. Valid wildcards are
"*" (matches 0 or more characters) and "?" (matches a single character). Each
space-separated word or pattern in a text field box is matched against the value of that field in
each record. If any word or pattern matches the value, then the record meets the constraint on that
field.
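The matching rule described above can be sketched in a few lines. This is an illustration only, not the Table Browser's own implementation: the helper names `wildcard_to_regex` and `field_matches` are hypothetical, and the sketch assumes that matching is case-sensitive and that a pattern must match the entire field value.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern with '*' and '?' into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # 0 or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def field_matches(filter_text, value):
    """Return True if any space-separated word or pattern matches the value."""
    return any(
        wildcard_to_regex(word).fullmatch(value)
        for word in filter_text.split()
    )


if __name__ == "__main__":
    print(field_matches("NM_0055* NR_*", "NM_005522"))
    print(field_matches("NM_0055?", "NM_005522"))
```

In the first call the value meets the constraint because one of the two patterns matches. In the second it does not, because "?" stands for exactly one character and two remain.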
Numbers
Numeric fields are compared to table data using an operator such as <, >, != (not equals) followed
by a number. To specify a range, enter two numbers (start and end) separated by white space and/or a
comma.
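As an illustration of the range syntax, the sketch below splits a range entry on white space and/or a comma. The helper names `parse_range` and `in_range` are hypothetical, and treating both endpoints as included in the range is an assumption of this sketch rather than documented Table Browser behavior.

```python
import re


def parse_range(text):
    """Parse 'start end', 'start,end' or 'start , end' into two numbers."""
    parts = [p for p in re.split(r"[,\s]+", text.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected two numbers, got {text!r}")
    start, end = (float(p) for p in parts)
    return start, end


def in_range(value, text):
    """Test a value against a range entry (endpoints assumed inclusive)."""
    start, end = parse_range(text)
    return start <= value <= end


if __name__ == "__main__":
    print(parse_range("100, 200"))
    print(in_range(150, "100 200"))
```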
Free-form queries
When the filters on individual fields aren't sufficiently flexible, the free-form query
text box allows the application of more complex constraints that typically relate two or more field
names of the selected table. Valid free-form queries use the syntax of the SQL
where clause