2d795fa03e0ef57637885b4ddf9d13832143799b kent Fri Aug 16 11:55:57 2019 -0700
Adding tidy builtin function.
diff --git src/lib/strex.doc src/lib/strex.doc
index 537b98e..afa7368 100644
--- src/lib/strex.doc
+++ src/lib/strex.doc
@@ -39,30 +39,46 @@
     [3] = "3" - the fourth character (the fun of zero based indexes
     [-3] = "7" - the third character from the end
     [-3:] = "789" - last three characters of string
     [:-3] = "0123456" - everything up to the last three
 Python actually goes further than this and allows a third, step, specification that strex
 has not implemented.
 untsv(string, index) - separate by tab. Synonym for separate(string, '\t', index)
 uncsv(string, index) - do comma separated value extraction of string. Includes quote escaping.
 trim(string) - returns copy of string with leading and trailing spaces removed
 strip(string, toRemove) - remove all occurrences of any character in toRemove from string
+tidy(prefix, string, suffix) - helps trim unwanted ends off of a string. If prefix is present
+    in the string, the prefix and everything before it will be cut off. Blank prefixes
+    have no effect. Similarly if the suffix is present in what is left of the string
+    after the prefix is trimmed, then the parts of the string from where the suffix
+    starts will be cut off. BLank suffixes have no effect.
+    example: tidy("", "myreads.fastq.gz", ".gz")
+        returns "myreads.fastq"
+    example: tidy("", "myreads.fastq", ".gz")
+        returns "myreads.fastq"
+    example: tidy("my/", "deep/path/to/my/reads.fastq.gz", ".gz")
+        returns "reads.fastq"
+    example: tidy("my/", "deep/path/to/my/reads.fastq.gz", ".fastq")
+        returns "reads"
+    example: tidy("my/", "deep/path/to/your/reads.fastq.gz", ".fastq")
+        returns "deep/path/to/your/reads"
+
 upper(string) - returns all upper case version of string
 lower(string) - returns all lower case version of string
 md5(string) - returns an MD5 sum digest/hash of string.
 symbol(prefix, string) - turn string into a computer usable symbol that starts with the given
     prefix. To create the rest of the symbol, the string is mangled. First the spaces, tabs,
     and newlines are all turned into _ chars, then any remaining characters that aren't ascii
     letters or numerical digits are removed. If the result is 32 characters or less it's
     used, but if it's longer it's converted into an MD5 sum.
 replace(string, oldPart, newPart) - returns string with all instances of old replaced by new.
     The cases where either old or new are empty string are useful special cases.
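
The tidy() semantics documented in this change can be sketched in Python. This is only an illustration of the behavior the doc text describes, not the strex implementation itself. The function name and argument order follow the doc. Cutting at the first occurrence of the prefix and of the suffix is an assumption, since the doc does not say which occurrence is used when there are several.

```python
def tidy(prefix: str, string: str, suffix: str) -> str:
    """Sketch of the documented tidy() behavior (not the real strex code)."""
    # A blank prefix has no effect. Otherwise, if the prefix occurs in the
    # string, drop the prefix and everything before it.
    # Assumption: the first occurrence is the one that counts.
    if prefix:
        start = string.find(prefix)
        if start != -1:
            string = string[start + len(prefix):]

    # A blank suffix has no effect. Otherwise, if the suffix occurs in what
    # is left, drop everything from where the suffix starts.
    # Assumption: again the first occurrence.
    if suffix:
        end = string.find(suffix)
        if end != -1:
            string = string[:end]

    return string


# The five examples from the doc text, used as checks.
assert tidy("", "myreads.fastq.gz", ".gz") == "myreads.fastq"
assert tidy("", "myreads.fastq", ".gz") == "myreads.fastq"
assert tidy("my/", "deep/path/to/my/reads.fastq.gz", ".gz") == "reads.fastq"
assert tidy("my/", "deep/path/to/my/reads.fastq.gz", ".fastq") == "reads"
assert tidy("my/", "deep/path/to/your/reads.fastq.gz", ".fastq") == "deep/path/to/your/reads"
```

Note that the suffix is matched anywhere in the remaining string, not only at its end, which is why ".fastq" turns "reads.fastq.gz" into "reads" in the fourth example.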