02336754147822f5aa61ba13277123b2cc629001 markd Thu May 20 08:38:55 2021 -0700 Moved pslMap, pslMapPostChain, pslRc, pslSwap to src/utils, as they do not have hg/lib dependencies. diff --git src/hg/utils/pslMap/README src/hg/utils/pslMap/README deleted file mode 100644 index abeab26..0000000 --- src/hg/utils/pslMap/README +++ /dev/null @@ -1,64 +0,0 @@ - -Overview: - -pslMap is an alignment tool than combines two alignment, sharing a common -sequence, to produce another alignment. This is an implementation of the -TransMap alignment algorithm on PSL files. - -Given alignments of - a to b - b to c -it produces an alignment of a to c by projecting through b. This differs from -liftOver in that it does a base-by-base mapping, which may insert or delete -bases within a block. The mappings produced by lifeOver are block-level, -which may expand or contract the size of blocks. - -Description: - - pslMap [options] inPsl mapFile outPsl - -The pslMap program takes alignments the PSL format alignment in the inPsl file -and projects the alignments through the overlapping mapFile alignments, which -can be in PSL or chain format. The resulting alignments are written to outPsl -file in PSL format. - -The target side of inPsl must be the same set of sequences as the query side -of mapFile. Input alignments with target sequence names that don't match any -mapping file query sequence names are discarded. If the matching target and -query name have different sequence sizes, an error will be generated. The -options -swapMap and -swapIn can be used to swap the query and target sides of -either alignment if the are not in the required orientation. - -Special handling is provide to support mapping proteins to a genome. If the -input PSL is a protein to DNA PSL, the protein coordinates will be converted -to CDS coordinates in the output PSL. That is, each coordinate in the protein -will be multiplied by three. This is used when mapping proteins to the genome -by mapping protein to mRNA alignments with mRNA to genome alignments. Unlike -most protein to genome alignment process, the resulting CDS to genome -alignment is able to represent amino acids that are coded for by spliced -codons. - - -Examples: - -Mapping cDNAs between organism using syntenic chains of genomic alignments: - - # map a PSL file of mouse cDNA to genomic (mm8) alignments to hg18 in - # mmCDna.mm8.psl - - # chains are mm8 query to hg18 target - chainDir=/cluster/data/hg18/bed/blastz.mm8/axtChain - - # create a subset of syntenic chains for - netFilter -syn $chainDir/hg18.mm8.net.gz >hg18.mm8.syn.net - netChainSubset -wholeChains hg18.mm8.syn.net $chainDir/hg18.mm8.all.chain.gz hg18.mm8.syn.chain - - pslMap -chainMapFile mmCDna.mm8.psl hg18.mm8.syn.chain mmCDna.hg18.psl - - # depending on desired results, pslCDnaFilter can be used to filter results - -Citation: - Jingchun Zhu, J. Zachary Sanborn, Mark Diekhans, Craig B. Lowe, Tom H. Pringle, and David Haussler. - Comparative genomics search for losses of long-established genes on the human lineage. - PLoS Computational Biology, 3:e247 EP , Dec 2007. - http://dx.doi.org/10.1371/journal.pcbi.0030247