198c9b8daecc44fbda6a6494c566c723920f030a
lrnassar
Wed Mar 11 18:25:21 2026 -0700
Fixing a few hundred clear typos with the help of Claude. Some are less important in code comments, but majority of them are in user-facing places. I manually approved 60%+ of the changes and didn't see any that were an incorrect suggestion, at worst it was potentially unnecessary, like a code comment having cant instead of can't. No RM.
diff --git src/hg/htdocs/FAQ/FAQformat.html src/hg/htdocs/FAQ/FAQformat.html
index 5197a05cb95..58a0bf6e26e 100755
--- src/hg/htdocs/FAQ/FAQformat.html
+++ src/hg/htdocs/FAQ/FAQformat.html
@@ -162,31 +162,31 @@
In BED files with block definitions, the first blockStart value must be 0, so that the first
@@ -436,55 +436,55 @@
Note that there is also a GFF3 specification that is not currently supported by the Browser. All GFF tracks must be formatted according to Sanger's GFF2 specification.
If you would like to obtain browser data in GFF (GTF) format, please refer to Genes in gtf or gff format on the Wiki.
Here is a brief description of the GFF fields:
Example:
-Here's an example of a GFF-based track. This data format require tabs and some operating systems convert tabs to spaces. If pasting doesn't work, this example's contents or the url itself can be pasted into the custom track text box.
browser position chr22:10000000-10025000
browser hide all
track name=regulatory description="TeleGene(tm) Regulatory Regions" visibility=2
chr22 TeleGene enhancer 10000000 10001000 500 + . touch1
chr22 TeleGene promoter 10010000 10010100 900 + . touch1
chr22 TeleGene promoter 10020000 10025000 800 - . touch2
Click here to display this track in the Genome Browser.
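As a rough illustration of why the tabs matter, the sketch below reads one data line of the example track. It assumes the nine GFF2 fields in the order seqname, source, feature, start, end, score, strand, frame, group. The names `GffRecord` and `parse_gff_line` are hypothetical and not part of any Browser tool.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    """One GFF2 data line (field names are an assumption of this sketch)."""
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: str
    group: str


def parse_gff_line(line: str) -> GffRecord:
    """Parse a single tab-separated GFF2 data line.

    Splitting on tabs only (not on arbitrary whitespace) makes a line whose
    tabs were converted to spaces fail loudly instead of being misread.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(
            f"expected 9 tab-separated fields, got {len(fields)}"
        )
    seqname, source, feature, start, end, score, strand, frame, group = fields
    return GffRecord(
        seqname, source, feature,
        int(start), int(end),
        None if score == "." else float(score),  # "." means no score
        strand, frame, group,
    )
```

A line pasted with spaces instead of tabs raises `ValueError` here, which is the failure mode the note above warns about.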
@@ -529,41 +529,41 @@
Also, review the enhanced interact format for information on how to visualize pairwise interactions as arcs in the browser.
The multiple alignment format stores a series of multiple alignments in a format that is easy to parse and relatively easy to read. This format stores multiple alignments at the DNA level between entire genomes. Previously used formats are suitable for multiple alignments of single proteins or regions of DNA without rearrangements, but would require considerable extension to cope with genomic issues such as forward and reverse strand directions, multiple pieces to the alignment, and so forth.
General Structure
-The .maf format is line-oriented. Each multiple alignment beigns with the reference genome
+The .maf format is line-oriented. Each multiple alignment begins with the reference genome
line and ends with a blank line. Each
sequence in an alignment is on a single line, which can get quite long, but there is no length
limit. Words in a line are delimited by any white space. Lines starting with # are considered to be
comments. Lines starting with ## can be ignored by most programs, but contain meta-data of one form
or another.
The file is divided into paragraphs that terminate in a blank line. Within a paragraph, the first word of a line indicates its type. Each multiple alignment is in a separate paragraph that begins with an "a" line and contains an "s" line for each sequence in the multiple
-alignment. The first sequence must be the reference genome on which the rest of the sequenes map.
+alignment. The first sequence must be the reference genome on which the rest of the sequences map.
Some MAF files may contain other optional line types:
Parsers may ignore any other types of paragraphs and other types of lines within an alignment paragraph.
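The paragraph structure described above can be sketched as a small reader. This is a minimal illustration rather than a reference parser: the function name `maf_alignment_blocks` is hypothetical, and it assumes that anything before the first "a" line of a paragraph (such as a custom-track `track` line) can be skipped.

```python
from typing import Iterable, Iterator, List


def maf_alignment_blocks(lines: Iterable[str]) -> Iterator[List[List[str]]]:
    """Yield each alignment paragraph as a list of whitespace-split lines.

    The first entry of every yielded block is the "a" line; the remaining
    entries are the "s" lines and any optional line types, left untouched.
    """
    block = None  # None while we are outside an alignment paragraph
    for raw in lines:
        words = raw.split()  # words are delimited by any white space
        if not words:
            # A blank line terminates the current paragraph.
            if block is not None:
                yield block
            block = None
        elif words[0].startswith("#"):
            continue  # comments (#) and meta-data (##) lines
        elif words[0] == "a":
            if block is not None:
                yield block  # tolerate a missing blank line
            block = [words]
        elif block is not None:
            block.append(words)
        # Lines outside an alignment paragraph (e.g. "track") are ignored.
    if block is not None:
        yield block  # file ended without a trailing blank line
```

Each yielded block keeps the line-type word in position 0, so a caller can filter for `"s"` lines or ignore types it does not recognize.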
@@ -646,31 +646,31 @@
Lines starting with "s" -- a sequence within an alignment block
s hg16.chr7 27707221 13 + 158545518 gcagctgaaaaca
s panTro1.chr6 28869787 13 + 161576975 gcagctgaaaaca
s baboon 249182 13 + 4622798 gcagctgaaaaca
s mm4.chr6 53310102 13 + 151104725 ACAGCTGAAAATA
The "s" lines together with the "a" lines define a multiple alignment. The first "s" line must be the reference genome, hg16 in the above example. The "s" lines have the following fields which are defined by position.
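Because the fields are positional, an "s" line can be read by splitting on white space. The sketch below assumes the six fields that follow the leading "s" are, in order, src, start, size, strand, srcSize and text, and that size equals the number of non-dash characters in text. The names `MafSeqLine` and `parse_s_line` are hypothetical.

```python
from typing import NamedTuple


class MafSeqLine(NamedTuple):
    """Positional fields of a MAF "s" line (names assumed for this sketch)."""
    src: str        # source sequence name, e.g. "hg16.chr7"
    start: int      # zero-based start of the aligning region
    size: int       # number of non-dash characters in text
    strand: str     # "+" or "-"
    src_size: int   # total length of the source sequence
    text: str       # alignment text, with "-" for gaps


def parse_s_line(line: str) -> MafSeqLine:
    """Parse one "s" line and check that size matches the alignment text."""
    words = line.split()
    if len(words) != 7 or words[0] != "s":
        raise ValueError("expected an 's' line with six positional fields")
    _, src, start, size, strand, src_size, text = words
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand {strand!r}")
    rec = MafSeqLine(src, int(start), int(size), strand, int(src_size), text)
    bases = len(rec.text) - rec.text.count("-")
    if rec.size != bases:
        raise ValueError(f"size {rec.size} does not match {bases} bases in text")
    return rec
```

For instance, `parse_s_line("s hg16.chr7 27707221 13 + 158545518 gcagctgaaaaca")` yields a record whose `size` of 13 matches the 13 bases in its text.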
A Simple Example
-Here is a simple example of a three alignment blocks derived from five starting sequences. The
+Here is a simple example of three alignment blocks derived from five starting sequences. The
first track line is necessary for custom tracks, but should be removed otherwise. Repeats are shown as lowercase, and each block may have a subset of the input sequences. All sequence columns and rows must contain at least one nucleotide (no columns or rows that contain only insertions).
track name=euArc visibility=pack
##maf version=1 scoring=tba.v8
# tba.v8 (((human chimp) baboon) (mouse rat))
a score=23262.0
s hg18.chr7 27578828 38 + 158545518 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG
s panTro1.chr6 28741140 38 + 161576975 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG
s baboon 116834 38 + 4622798 AAA-GGGAATGTTAACCAAATGA---GTTGTCTCTTATGGTG
s mm4.chr6 53215344 38 + 151104725 -AATGGGAATGTTAAGCAAACGA---ATTGTCTCTCAGTGTG
s rn3.chr4 81344243 40 + 187371129 -AA-GGGGATGCTAAGCCAATGAGTTGTTGTCTCTCAATGTG
a score=5062.0