909449391962ec9312052bf186158d9f5162fcc4
brianlee
Fri May 27 07:27:01 2022 -0700
Not for Code Review, see ticket #29356, check-in of temporary examples not fully working for bigRmsk -on T2T CHM13 and align.bb not working yet

diff --git src/hg/htdocs/goldenPath/help/bigRmsk.html src/hg/htdocs/goldenPath/help/bigRmsk.html
index 7e0361f..e33d738 100755
--- src/hg/htdocs/goldenPath/help/bigRmsk.html
+++ src/hg/htdocs/goldenPath/help/bigRmsk.html
@@ -204,30 +204,55 @@
 as this example does not include the optional supporting alignment file.</p>
 <p>
 This example can also be loaded in a Track Hub with a stanza such as the following:</p>
 <pre>
 track ExBigRmsk
 shortLabel Example bigRmsk
 longLabel This is an example Track Hub Stanza
 type bigRmsk
 visibility full
 bigDataUrl http://genome.ucsc.edu/goldenPath/help/examples/bigRmsk.bb
 </pre>
 NOTE: FOR WHEN REDOING PAGE, only Track Hubs now allow clicking into hgc.
 <!---
 NOTE: The below is innaccurate and just a holder for when <b>xrefDataUrl works</b> to give an example building it.
+
+Adding potential input file (this is from RobertH T2T hub), both the align.bb and bigRmsk.bb for a region are stashed below (not for hg38 though).
+ +$ bigBedToBed -chrom=chr1 -start=4513 -end=7608 https://hgdownload.soe.ucsc.edu/hubs/GCA/009/914/755/GCA_009914755.4/bbi/GCA_009914755.4_T2T-CHM13v2.0.t2tRepeatMasker/chm13v2.0_rmsk.align.bb stdout +chr14082453324838279596227.3212.421.00-LTR60BLTRERV126476503TA/GTTACT/CGGGG/AAGG/TGCT/GGA/GT/AG/ATCC+T+CA/GGTTCTT+A+GTT/CTA/TACTTGGA/GAGAAAGAT/ATTT/CC/GA/GCCAAGAGG/ACAG/ATA/TC/TAA/GA/CG/CATG/AG/AC/AAGAT/GAAC/TTT-C-ATTGAAA/GA/GG/AAAAC/TAC/GAGT/AGT/CAA/GAGAGC/TTTATT+TAAAGAGACA+GTA+CACTCT+GAAAA/GATA/GG/AGGA/CG/AGAGT/CGGGCTG+CTGAAAG+AGC/AGTGC/AA/GT/CT/CAA/G+C+AA/GCAGCCT/C+C+A/GAGAGTC/TCTGT/CT/GC/TA/TGGA/GA/GA/TTTTTATT/N+ATG+TG/CGGACTTC/TTTC+TTG+AC/AA/GTTCCT/CGCCTCTGTCTC/TAAG-T-CTCCA/GCCTG/TTTTTCTTTGTCTG/AG+T+TTTTC/TCT+TAA+GC/TT/CA/CCT/CGCCTT-AG-C/NTCCCCGA/CCT+AG+TG/TCCCC/GA/CCT/CT/CAGGCTTGTGGGACC+CT+T/CCCTC/TACTGTG/CG/AGTTGA/GG/TGT/CA/GCATGT/CG+CGGGCC+T/CGGTGA/TTC/GA/GATACGAATC/TCA/TC/AT/CCTG/AG/AC/TA/GC/GCA/NGC+GTTG+CTCC/ATTC/ACCGCCAT/CCCCAGGC/AAC/GGC/TT/CG+T+AC/TAGCGA/GTCAC/AG/ATT/CTGTACC/TTAC/TTGT/CGCCTGC+GTAT+CTCTTT/AT/GGAAT-G-TC/TCTT/CCTC/TTGCCCT +chr14533466024838266850520.477.870.00-LTR60BLTRERV11783144514AATCTGTACTTATG/TGG/CGCCA/TG+C+GTT/ATCTCTTAA/GGAATG/TTCC/TCT/CTTTG+CCCTCTT+G/TCCTT/CCTTAC/TCAA/GCATGTAGCTAGCA/TAT/CATTCTGACAT/GT/GTTT/AAT/CTGCAGAGG/TGAA/GT/CGATTG/A+CT+GGGCA/GTCTTC/AAGA/GGGA/CGTTC 
+chr146635139248382189130422.814.201.43+L1MC_orf2LINEL12804329225GGTGA/CG/TGGAAC-GATT-AAT/CTGGAA/CA+T+CCAT/CAA/TGA/CAAT/AG/AAT/AATGC/AATA/CTAGAT/CG/AA/CAA/GACT/CTTACAA/CCT/CC/TA/TCACAAC/AT/AAA/TTC/AACTCAAAAT+GGAT+CATC/A+GA+CT/CTAC/AAC/TT/GA/TAAAAT/CGCT/AAAACTATAC/AAAT/CTT/CCTAGAAGA+T+AACA-A-TAGA/GAGAAAAG/TCTAT/GG/ATGC/ACT/CTTGGGTTTGGT/CA/GATGAA/CTTTTA/TAC/GAA/TAT/CG/AAT/CACA/CAAAGGT/C+A+T/CGAT+CC+ATA/GC/AAC/AA/GAAAG/NAAA/TTGAC/TAT/AT/GG/CTGGT/AT/CTTC+A+TTAAT/AATTT/AAAAG/AT/CTTA/CTA/GCTCTG+CG+G/AAAGACAC-CT-TGTT/CAAGAGAAC/TA/GAAAAGACAAGCCACAT/GAT/CTGA/G+G+AGAAAATATTTGCAAAAT/GACAC/TATCTGAG/TAAAGA/GAT/CTT/GG/TTC/ATT/CCAAAATATAT/CAAAA/GAAA/CTA/CTTAAAACTA/CAACAATAAGT/A+AAA+T/CAAACAG/ACCCA/GAC/TT+N+AAAAATGC/GA/GCAC/AAC/A+G+AT/CCTGAACAGACACCTCACCAAAGAAGATC/ATACAGATGGCAAG/ATAAA/GCATAC/TA/GAAAAGATGCTCA/NACAT +chr14997526324838206588715.4612.782.74+L1MC3_3endLINEL129322395TTGTC/ATT/CCAAAATATAT/CAAAA/GAAA/CTA/CTTAAAACTA/CAACAATAAGT/A+AAA+T/CAAACAG/ACCCAAC/TTAAAAA+A+TGC/GA/GCAC/AAC/A+G+ATCTGAACAGACACCTCACCAAAGAAGATC/ATACAGATGGCAAG/ATAAA/GCATAC/TA/GAAAAGATGCTCAACAT+CATTTGTC+AC/TTAGA/GGAAC/TTG+CAAATT+AAAACC/AACAATGAGATAG/CCAC-AGCTGG-TC/AT/CAT/CAT/CCTC/ATTAGAAC/TT/GGCTAAAC/ATCC-CT-AAAAAA+C+TGACA+ATACC+AAT/NTGCTG+GCGAGGAT+GA/CGGAA/GA/CAACAA/GGAACTCTT/C+A+TTCATTGCC/TGGTGGA/GA +chr152745528248381800140314.681.970.78+MER34C_vLTRERV12633226AGA/NCCAA/GAATATGCCACCCCAAAATATA/GAT/CG/TGTAGGAA/GACCAGAATATGCCACCCCAAAATATGT/CCC/TCTTTGT/GCT/ATAAGA/GATTATTC/TC/TA/GAGCTGATTATTTTGAA/GAAAA/CTA/GA/CAT/GG/AC-TA-ACAA/GA/GG/AGAAGT/CTCTGAAAACAGAGTAGAAGTTACCCTTG/TTGTAAGGA/GAAATTTACATCTATAAAGGAAATCC/TCCATTTA/G+T+AAA/GGC/GTA/GC/TCT+CC+CTCTCTA/GC/TACCAA/GGAAGAGAAGGATA/GA+CT+CTAAATCACTAA/GAGAG/CTCTT 
+chr155285686248381642354424.566.810.65+L1MC3_3endLINEL181296815645TAATA/GGTGG-G-ATAT/CC/ATGACACA/TAC/TGCATTTA/GTCAAG/AAT/CA/CCAC/TAGAAT/CTTTAT/CG/AGC+A+CAAA-T-GG/AGTA/GAAT/CCA/TA/TAT/ATC/GTATT/GCAAATTA/TA/TAC/AAAAATT/CAC/TTC/TAGGAT/GGT+C+GGC/GGT/GATCCCAGGAC/TA/GGAATGCAT/GC/AA/NTGTGA+C+AAAAG/NAATT/CTA-T-G/ACTA/GC/TAA/T-A-T +chr156866131248381197244214.805.841.29+MSTA1LTRERVL-MaLR46507TA/GCTATGGTTTGGATGT-GGT-TTGTCCCCGCA/CAAAACTCATGTTGAAATTTGAC/TCCCCAATGTGGCAGTGTG/TGGG/A-C-GGTGGGGCCTAGTGGA/GT/AGGTGTTTGGGTCATGGGGA/GT/CGGATCCCTCATGAATAGATTAATGT/CCCTCC+CTCGNG+A/GTGGGG/NGTGAGTGAGTA/TCT+C+GCTCT+NN+CA/GT/CA/GGGAATGGATTAA/GTTCCT/CGCA/GG/AGAGT/CA/GGGTA/TA/GTTAAAAAGAGTCTGGC+GNC+TT/CCCTT/CG/CG/TCT+CTC+TCC/TCTT+GC+TTGCTTT/CCA/TCTT/CTT/CGCT/CATGTGATCTCTG-G-T/CG/ACACC/GCCT/C-T-GCTCCCCTTCC+NCTTC+GCTTTCCA/GCCATGAGG/TT/NGAAA/GA/CAGA/CCTGAA/GGCCC+T+CACCAGATGCAA/GCTGCCCA/GA/NT/ACT/CC/NG/TGA/CC/TA/TTTC+GNC+CAGCT/CACCAGT/AATT/CGTGAGCCAAATG/AAAT/CCTT/CTTTTA/CC/TTTATAAATTACCCAGCCTCAGGTATTCT/CGTTAC/TAGA/CAG/ACACAAG/AAT/CGGACTAAGACA 
+chr161317132248380196354424.566.810.65+L1MC3_3endLINEL196920414915CAAATGTAG/TGT/AAAA/CAAC+C+TCACTGAAGGT/GGG+TG+A/GGGGAAAAT/AGGTGT/CTGACCTAAGTC/AACTTTGA/GAAATGAA/GTA/GGAA/GTCTG+T+G/AAGG/ACTG/AAAGGCAC/AA+A+T/GGAACTA/GTACT/ATC/AAT/GA/CAT/CTGG/TAT/CTA/CC/TAT/GTTT/GATAAAGTTA/GTTTCCA/CACA/GGA/GA/GGC/TAA/CC/GT/GGTG/TAACAATTG/CTA/GAA/NACCA/GCA/TG/ATG/AT/CC/ATGTAT+A+CTGGAG/ATA/TA/GAACAATG/TAC/AT/GTAC/AATA/GA/GG/ATC/GGCA/GGATGGTGGGAA/GCCAGC/GTTTCTCACTGTTGA/GAGTGGGAGG/NTTACAA/GATT/AAGCAAGA/GC/GGAGA/GAGGCTAGAATGATT/CCC/ATGTGA/GTAG/ATA/GGATC/TAGAGG/TTGGAGACATCAA/GC/TG/ATA/GAACTT/CATGC/TTTAGT/CTTAATATAGATACAC/GAC/TA/GGTTC/AT/CAC/TATAGAAAA/TC/ATTTATAA/GT/ATAG/TGTGTG/ATG/ATAG/CG/AT/CA/GGGTTAG+T+AC/TACACACATATAC/TTTCCTA/TGCA/TT/CTGC/TT/CAA/GT/C+TGA+GAGGGA/CCA/TAGAT/AA/GCAAT/C+GACACCCCAGTAGCAACGA+GT/CG/ACAT/CT/CC/TAGCA/GG/CCCAC/GATG/CTA/TA/GGTTT+C+TC/AC/AC/TACCATTC+TCCAA+TG/AAAAGGAAT/CCA+G+GGCTCT/CTTGA/GAGAAATGT/GCTGATA/TCTAGA/GACTGGGA/GCAGT/GAAATAT+ACAAG+AG/TGAGCCA/TGGAT/GA/CATCTG/TGA/TAGTA/GT/CCAGAAAGT+AAGG+AAGTA/GCT+C+AAAAAAAT/CT/CA/CAA/CA+ATGATGGGGG+TATA/GTCAAAC/GA/GA/GAA/CAT/CAA/GA/GAGCCAAT/CA/TA/GAAAC/GAGCTA/CCCG/AATGGCCAAC/AA/GCA/TGGAAG/CG/AAA/TTTGT/AGCAACAT/AAAT+A+G/AC/ATA+AAG+TAGTG/ATC/TGA/GATA/TATAA+C+CT/CAAAGC/TT/ATAAAG/ATAAT/ATATCT/CAG/TGT/AGTCT/CG/ATAT/CTT/GG/ATATAC/AC/ATAG/AG/ATG+ATTGAATA+AATAAG/AC/TAAATGGA/GGT/GT/AGC/AAT/GAG+AC+AAATCTCCT/CT/GTGCAA/GAAGAATTCCAAATAAC/TTG/TATGTAGAC/TACTCA/CGCCA/CTCAAGA/GAGGTGGAGC+A+C/TAACTCCT/CCACTCCG/TTAAGTGTGGGCTC/GT/CGCATAGTGACTTG/CCTC/TCA/CAAAGA-ACAC-A/GTG/ACAGTATGGAC/AAA/GGGA/GGGAAAAA+AGAG+TAACTTC/TACAGTGGAGAAAT/CCTGACAAACAG/CTAG/CCTCT/AGCCAA/GA/GTGATCC/AAA/GGTG/CAACAC/TCAAA/CG/AC/GTGAC/TAG/AT/GTCAC/TC/GTTGAG/TAA/GC/TATG 
+chr171417533248379795121520.504.597.89+L1MC3_3endLINEL12150252935TGG/AGGGACATTCTACAAAAA/TT/ACCTGACCAA/GTC/ACTCCTCAG/AT/AG/ACTA/GTG/CAAGGTCATCAT/AG/AAG/A+C+AT/AGGAAAGC/TCTA/GAC/GAC/AACTGTCACAGCCAG/AGAA/GGAGCCTAT/AG+GAGACA+TGAT/CGT/ACTAC/AATGTC/AG/ATGC/TGGG/TATCCTGGATGGGATCCTGGG/AT/ACAGAG/AT/AAAGA/G+ACAT+TAG+GTAAA+AACTAAGGG/AAATCC/TA/GAATG/AAAA/GTATGA/GACTTTAGTTAATAAC/TAG/ATC/GTATCAG/ATATTGGTTCATTAAC/TTGTGG/ACAA-ATT-ATGTA-AGATATTAATAAG-CCAT-GTGAGACAC-ACTG/AATA/GG/TAAGATGTTAATAAG/TAGA/GGGAAACTA/GGGT+G+TG-C-GGC/GTAC/TATGGGAAA/CTCTCTG-CTTT-TT/AT/CTT/ATT/CTTG/CA/GCG/AATTTC/TTG/CTGTAAG/ATA/CA/TAAAAA/CA/TG/AA/TC/TG/CTAAAATAAAAC/A+G+TTTATTTA/TA/NAA + +That is matched with + +$ bigBedToBed -chrom=chr1 -start=4513 -end=7608 https://hgdownload.soe.ucsc.edu/hubs/GCA/009/914/755/GCA_009914755.4/bbi/GCA_009914755.4_T2T-CHM13v2.0.t2tRepeatMasker/chm13v2.0_rmsk.bb stdout +chr107536L1MC3#LINE/L1223+46637533094912,600,518,158,0,1001,108,392,3-1,4663,-1,5528,-1,6131,-1,7141,-151304 19.6 8.0 2.0 chr1 4664 5263 (248382065) + L1MC3 LINE/L1 4913 5546 (2239) 5 ,3544 24.6 6.8 0.7 chr1 5529 5686 (248381642) + L1MC3 LINE/L1 6065 6221 (1564) 5 ,3544 24.6 6.8 0.7 chr1 6132 7132 (248380196) + L1MC3 LINE/L1 6222 7294 (491) 5 ,1215 20.5 4.6 7.9 chr1 7142 7533 (248379795) + L1MC3 LINE/L1 7403 7782 (3) 5 +chr140824796LTR60B#LTR/ERV1273-40824533030,451,263-1,0,-1962 27.3 12.4 1.0 chr1 4083 4533 (248382795) C LTR60B LTR/ERV1 (0) 765 264 3 +chr140824837LTR60B#LTR/ERV1205-4533466003451,127,177-1,451,-505 20.5 7.9 0.0 chr1 4534 4660 (248382668) C LTR60B LTR/ERV1 (451) 314 178 4 +chr152675850MER34C_v#LTR/ERV1147+52745528036,254,322-1,7,-161403 14.7 2.0 0.8 chr1 5275 5528 (248381800) + MER34C_v LTR/ERV1 7 263 (322) 6 +chr156856131MSTA1#LTR/ERVL-MaLR148+56866131030,445,0-1,1,-12442 14.8 5.8 1.3 chr1 5687 6131 (248381197) + MSTA1 LTR/ERVL-MaLR 1 465 (0) 7 + +End of excerpt from 2 bigBed files in T2T that could be potential input in future examples (could be colors are wrong in this second file). 
+
 <h3 id="example2">Example #2</h2>
 <p>
 In this example, you will create a bigRmsk file from an existing bigRmsk input file, <em>bigRmsk.txt</em>, located on the UCSC Genome Browser http server.</p>
 <ol>
 <li>
 Save the bed3+1 example file, <a href="examples/bigRmsk.txt"><em>bigRmsk.txt</em></a>, to your computer (<em>Step 6</em>, above).</li>
 <li>
 Save the autoSql file <a href="examples/bigRmsk.as"><em>bigRmsk.as</em></a> to your computer (<em>Step 3</em>, above).</li>
 <li>
 Download the <a href="http://hgdownload.soe.ucsc.edu/admin/exe/"><code>bedToBigBed</code> utility</a> (<em>Step 4</em>, above).</li>