198c9b8daecc44fbda6a6494c566c723920f030a lrnassar Wed Mar 11 18:25:21 2026 -0700 Fixing a few hundred clear typos with the help of Claude. Some are less important in code comments, but the majority of them are in user-facing places. I manually approved 60%+ of the changes and didn't see any that were an incorrect suggestion; at worst it was potentially unnecessary, like a code comment having cant instead of can't. No RM. diff --git src/hg/htdocs/FAQ/FAQformat.html src/hg/htdocs/FAQ/FAQformat.html index 5197a05cb95..58a0bf6e26e 100755 --- src/hg/htdocs/FAQ/FAQformat.html +++ src/hg/htdocs/FAQ/FAQformat.html @@ -162,31 +162,31 @@ <td>834-944</td> <td>≥ 945</td> </tr> </table> <li> <strong>strand</strong> - Defines the strand. Either "." (=no strand) or "+" or "-".</li> <li> <strong>thickStart</strong> - The starting position at which the feature is drawn thickly (for example, the start codon in gene displays). When there is no thick part, thickStart and thickEnd are usually set to the chromStart position.</li> <li> <strong>thickEnd</strong> - The ending position at which the feature is drawn thickly (for example the stop codon in gene displays).</li> <li> <strong>itemRgb</strong> - An RGB value of the form R,G,B (e.g. 255,0,0). If the track line - <em>itemRgb</em> attribute is set to "On", this RBG value will determine the display + <em>itemRgb</em> attribute is set to "On", this RGB value will determine the display color of the data contained in this BED line. NOTE: It is recommended that a simple color scheme (eight colors or less) be used with this attribute to avoid overwhelming the color resources of the Genome Browser and your Internet browser.</li> <li> <strong>blockCount</strong> - The number of blocks (exons) in the BED line.</li> <li> <strong>blockSizes</strong> - A comma-separated list of the block sizes. The number of items in this list should correspond to <em>blockCount</em>.</li> <li> <strong>blockStarts</strong> - A comma-separated list of block starts. 
All of the <em>blockStart</em> positions should be calculated relative to <em>chromStart</em>. The number of items in this list should correspond to <em>blockCount</em>.</li> </ol> <p> In BED files with block definitions, the first <i>blockStart</i> value must be 0, so that the first @@ -436,55 +436,55 @@ Note that there is also a GFF3 specification that is not currently supported by the Browser. All GFF tracks must be formatted according to Sanger's GFF2 specification.</p> <p> If you would like to obtain browser data in GFF (GTF) format, please refer to <a href="http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format" target="_blank">Genes in gtf or gff format</a> on the Wiki.</p> <p> Here is a brief description of the GFF fields:</p> <ol> <li> <strong>seqname</strong> - The name of the sequence. Must be a chromosome or scaffold.</li> <li> <strong>source</strong> - The program that generated this feature.</li> <li> <strong>feature</strong> - The name of this type of feature. Some examples of standard feature - types are "CDS" "start_codon" "stop_codon" and "exon"li> + types are "CDS" "start_codon" "stop_codon" and "exon".</li> <li> <strong>start</strong> - The starting position of the feature in the sequence. The first base is numbered 1.</li> <li> <strong>end</strong> - The ending position of the feature (inclusive).</li> <li> <strong>score</strong> - A score between 0 and 1000. If the track line <em>useScore</em> attribute is set to 1 for this annotation data set, the <em>score</em> value will determine the level of gray in which this feature is displayed (higher numbers = darker gray). If there is no score - value, enter ".". + value, enter ".". <li> <strong>strand</strong> - Valid entries include "+", "-", or "." (for don't know/don't care).</li> <li> <strong>frame</strong> - If the feature is a coding exon, <em>frame</em> should be a number between 0-2 that represents the reading frame of the first base. 
If the feature is not a coding exon, the value should be ".".</li> <li> <strong>group</strong> - All lines with the same group are linked together into a single item.</li> </ol> <p> <strong><em>Example:</em></strong><br> -Here's an example of a GFF-based track. This data format require tabs and some operating systems convert tabs to spaces. If pasting doesn't work, this <a href="../goldenPath/help/regulatory.txt" +Here's an example of a GFF-based track. This data format requires tabs and some operating systems convert tabs to spaces. If pasting doesn't work, this <a href="../goldenPath/help/regulatory.txt" target="blank">example's</a> contents or the url itself can be pasted into the custom track text box. <pre><code>browser position chr22:10000000-10025000 browser hide all track name=regulatory description="TeleGene(tm) Regulatory Regions" visibility=2 chr22 TeleGene enhancer 10000000 10001000 500 + . touch1 chr22 TeleGene promoter 10010000 10010100 900 + . touch1 chr22 TeleGene promoter 10020000 10025000 800 - . touch2 </code></pre> <p> Click <a class="insideLink" href="../cgi-bin/hgTracks?org=human&position=chr22&hgt.customText=http://genome.ucsc.edu/goldenPath/help/regulatory.txt" target="_blank">here</a> to display this track in the Genome Browser.</p> <a name="format4"></a> <h2>GTF format</h2> <p> @@ -529,41 +529,41 @@ Also, review the enhanced <a href="../goldenPath/help/interact.html">interact</a> format for information on how to visualize pairwise interactions as arcs in the browser. </p> <a name="format5"></a> <h2>MAF format</h2> <p> The multiple alignment format stores a series of multiple alignments in a format that is easy to parse and relatively easy to read. This format stores multiple alignments at the DNA level between entire genomes. 
Previously used formats are suitable for multiple alignments of single proteins or regions of DNA without rearrangements, but would require considerable extension to cope with genomic issues such as forward and reverse strand directions, multiple pieces to the alignment, and so forth.</p> <p> <strong>General Structure</strong><br> -The <em>.maf</em> format is line-oriented. Each multiple alignment beigns with the reference genome +The <em>.maf</em> format is line-oriented. Each multiple alignment begins with the reference genome line and ends with a blank line. Each sequence in an alignment is on a single line, which can get quite long, but there is no length limit. Words in a line are delimited by any white space. Lines starting with # are considered to be comments. Lines starting with ## can be ignored by most programs, but contain meta-data of one form or another.</p> <p> The file is divided into paragraphs that terminate in a blank line. Within a paragraph, the first word of a line indicates its type. Each multiple alignment is in a separate paragraph that begins with an "a" line and contains an "s" line for each sequence in the multiple -alignment. The first sequence must be the reference genome on which the rest of the sequenes map. +alignment. The first sequence must be the reference genome on which the rest of the sequences map. Some MAF files may contain other optional line types: </p> <ul> <li> an "i" line containing information about what is in the aligned species DNA before and after the immediately preceding "s" line</li> <li> an "e" line containing information about the size of the gap between the alignments that span the current block</li> <li> a "q" line indicating the quality of each aligned base for the species</li> </ul> <p> Parsers may ignore any other types of paragraphs and other types of lines within an alignment paragraph. 
</p> <p> @@ -646,31 +646,31 @@ <p> <strong>Lines starting with "s" -- a sequence within an alignment block</strong></p> <pre><code> s hg16.chr7 27707221 13 + 158545518 gcagctgaaaaca s panTro1.chr6 28869787 13 + 161576975 gcagctgaaaaca s baboon 249182 13 + 4622798 gcagctgaaaaca s mm4.chr6 53310102 13 + 151104725 ACAGCTGAAAATA </code></pre> <p> The "s" lines together with the "a" lines define a multiple alignment. The first "s" line must be the reference genome, hg16 in the above example. The "s" lines have the following fields which are defined by position.</p> <ul> <li> <strong>src</strong> -- The name of one of the source sequences for the alignment. For sequences that are resident in a browser assembly, the form 'database.chromosome' allows automatic creation - of links to other assemblies. Non-browser sequences are typically reference by the species name + of links to other assemblies. Non-browser sequences are typically referenced by the species name alone.</li> <li> <strong>start</strong> -- The start of the aligning region in the source sequence. This is a zero-based number. If the strand field is "-" then this is the start relative to the reverse-complemented source sequence (see <a href="http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms" target=blank>Coordinate Transforms</a>).</li> <li> <strong>size</strong> -- The size of the aligning region in the source sequence. This number is equal to the number of non-dash characters in the alignment text field below.</li> <li> <strong>strand</strong> -- Either "+" or "-". 
If "-", then the alignment is to the reverse-complemented source.</li> <li> <strong>srcSize</strong> -- The size of the entire source sequence, not just the parts involved in @@ -821,31 +821,31 @@ <tr> <td align="center">0</td> <td align="center">98</td> <td align="center">Manually assigned</td> </tr> <tr> <td align="center">F</td> <td align="center">99</td> <td align="center">Finished</td> </tr> </table> </ul> <p> <strong>A Simple Example</strong></p> <p> -Here is a simple example of a three alignment blocks derived from five starting sequences. The +Here is a simple example of three alignment blocks derived from five starting sequences. The first <strong>track</strong> line is necessary for custom tracks, but should be removed otherwise. Repeats are shown as lowercase, and each block may have a subset of the input sequences. All sequence columns and rows must contain at least one nucleotide (no columns or rows that contain only insertions).</p> <pre><code>track name=euArc visibility=pack ##maf version=1 scoring=tba.v8 # tba.v8 (((human chimp) baboon) (mouse rat)) a score=23262.0 s hg18.chr7 27578828 38 + 158545518 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG s panTro1.chr6 28741140 38 + 161576975 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG s baboon 116834 38 + 4622798 AAA-GGGAATGTTAACCAAATGA---GTTGTCTCTTATGGTG s mm4.chr6 53215344 38 + 151104725 -AATGGGAATGTTAAGCAAACGA---ATTGTCTCTCAGTGTG s rn3.chr4 81344243 40 + 187371129 -AA-GGGGATGCTAAGCCAATGAGTTGTTGTCTCTCAATGTG a score=5062.0