c9199234bbc31055553ff6303cdaf338bc647f3a kent Tue Dec 21 14:09:19 2021 -0800
Fixing typo and documenting new lookup function

diff --git src/lib/strex.doc src/lib/strex.doc
index 597eadb..4b126f6 100644
--- src/lib/strex.doc
+++ src/lib/strex.doc
@@ -1,26 +1,26 @@
 The strex language is a small string expression evaluation language. This document
 describes its built in functions and operators:

 + returns the concatenation of the surrounding strings or the addition of surrounding numbers.
     Will convert a number to a string in mixed expressions
 = returns true if two items are the same. For strings ignores case
 != returns true if two items are not the same. For strings ignores case
 > Greater than. Converts mixed expressions to strings. Ignores case in strings
 >= Greater or equal. Converts mixed expressions to strings. Ignores case in strings
-> Less than. Converts mixed expressions to strings. Ignores case in strings
->= Less or equal. Converts mixed expressions to strings. Ignores case in strings
+< Less than. Converts mixed expressions to strings. Ignores case in strings
+<= Less or equal. Converts mixed expressions to strings. Ignores case in strings

 [index] - selects a character from string given an integer zero based index. As in Python
     if index is negative it selects characters from the end of the string. -1 corresponds to the
     last character of string, as 0 corresponds to first. Returns empty string if index out of range.

 between(prefix, string, suffix) - returns the part of string found between prefix and suffix
     example: between("abc", "01234abcHelloxyz56789", "xyz")
     fetches just "Hello"
     If there are multiple places the prefix occurs, it will choose the first one, and the then
     the first place the suffix matches after that.
     The biologist might think of it as a text oriented PCR, though the primer prefix and suffixes are
     not included in the output.
     The prefix "" corresponds to beginning of string and the suffix "" corresponds to
     end.
     Returns empty string if nothing found.
@@ -124,39 +124,44 @@
     Also, a lot of the time when you do this it is to fix a small inconsistency in the metadata.
     In general fix is faster to execute and quicker to type than replace and the effects are
     more specific.
     example to help clean up minor variations in vocabulary
         fix(fix(fix(fix( sex, "M","male"), "F","female"), "Male","male") "Female","female")
     example to give something a value if non is present
         fix(requiredField, "", "reasonable default value")

 pick(query ? key1:val1, key2:val2, ... keyN:valN, default:defaultVal)
     Looks through keys for one that matches query, and returns associated value if it finds it.
     Otherwise it returns the empty string unless a default is specified. The default key can
     appear in any position, but traditionally is last or first. Can be used to apply different
     expressions to parsing in different conditions:
         pick(species, "human" : ethnicity, "mouse" : strain, default:"unknown")

+lookup(string, twoColFileName, defaultVal)
+    Looks up a string in a table defined by twoColFileName, a tab separated file.
+    If string is in the first column, return the corresponding string from the second
+    column. Returns defaultVal if string not in lookup table.
+
 (boolean ? trueVal : falseVal)
     This is the trinary conditional expression found in C, Python and many other languages.
     If the boolean before the question mark is true, then the result is the trueVal before the colon,
     otherwise it's the falseVal after the colon. Empty strings and zeros as booleans are
     considered false, other strings and numbers true.

 in(string, query) - returns true if query is a substring of string, false otherwise

-same(a, b) - returns true if the two arguments are the same, false otherwise
+same(a, b) - returns true if the two arguments are the same, false otherwise. Same as a=b

 starts_with(prefix, string) - returns true if string starts with prefix

 ends_with(string, suffix) - returns true if string ends with suffix

 or - logical or operation extended to strings and numbers.
     for logic - if any or-separated-values are true, return true, else false
     for numbers - if any or-separated non-zero numbers exist, return first one else 0
     for strings - if any or-separated non-empty strings exists, return first one else ""
     In mixed operations result is converted to strings if strings are involved
     or the values "" and "true" if no strings are involved.
     For the pure string case this can be useful for setting defaults as well.
     For instance presuming you might or might not have filled in values for the city or country variables
     in the given expression that would return a location of some sort
         (city or country or "somewhere in the universe")
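
Illustration of the lookup() semantics documented in this commit. This is a rough Python sketch, not the strex implementation. The helper name load_two_col_table is hypothetical. The sketch assumes an exact, case-sensitive match on the first column and that the first occurrence of a repeated key wins; the documentation above specifies neither.

```python
def load_two_col_table(path):
    """Read a tab-separated file into a dict: first column -> second column."""
    table = {}
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or one-column lines (assumption)
            # Assumption: keep the first value seen for a repeated key.
            table.setdefault(fields[0], fields[1])
    return table


def lookup(string, two_col_file_name, default_val):
    """Return the second-column value for string, or default_val if absent."""
    table = load_two_col_table(two_col_file_name)
    # Assumption: exact, case-sensitive comparison against column one.
    return table.get(string, default_val)
```

With a file whose lines are "hs<TAB>human" and "mm<TAB>mouse", lookup("hs", path, "unknown") gives "human", and a key missing from the file gives "unknown". The sketch rereads the file on every call to stay short; a real evaluator would presumably cache the table.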