35d5fb35eb509dab6bbe84d0cf81a95d6122ee1d kent Mon Mar 19 15:35:15 2018 -0700
Updating a little in anticipation of training Chris Lee in our code conventions.

diff --git src/README src/README
index 328244e..be422cc 100644
--- src/README
+++ src/README
@@ -59,31 +59,31 @@
 The code follows an indentation convention that is a bit unusual for C.
 Opening and closing braces are on a line by themselves and are indented at
 the same level as the block they enclose:
     if (someTest)
         {
         doSomething();
         doSomethingElse();
         }
 Each block of code is indented by 4 from the previous block. As per Unix
 standard practice, tab stops are set to 8, not 4 as is the common practice
 in Windows, so some care must be taken when using tabs for indenting.
-Tabs continue to be a problem for the programmer even in 2012.
+Tabs continue to be a problem for the programmer even in 2018.
 Currently our makefiles require tabs, while our python code forbids
 them. The C code can go either way so long as tabs are treated as
 advancing to the next multiple-of-eight column. Please consult local
 users of your favorite editor for help configuring it with these
 indentation and tab standards.
 
 Lines should be no more than 100 characters wide. Lines that are longer than
 this are broken and indented at least 8 spaces more than the original line to
 indicate the line continuation. Where possible simplifying techniques should
 be applied to the code in preference to using line continuations, since
 line continuations obscure the logic conveyed in the indentation of the program.
 
 Line continuations may be unavoidable when calling functions with long parameter
 lists. In most other situations lines can be shortened
@@ -267,31 +267,53 @@
 of the module, just after the module opening comment and any includes.
 This is followed by broadly used module local (static) variables. Less
 broadly used structs and variables may be grouped with the functions they
 are used with.
 If a module is used by other modules, it will be represented in a header file.
 In the majority of cases one .h file corresponds to one .c file. Typically the
 opening comment is duplicated in .h and .c files, as are the public structure
 and function declarations and opening comments.
 
 In general we try, with mixed success, to keep modules less than 2000 lines.
 Sadly many of the Genome Browser specific modules are currently quite long.
 On the bright side the vast majority of the library modules are reasonably
 sized.
 
+PREVENTING STRING OVERFLOW
+
+One weakness of C is its string handling. It is very easy using standard C
+library functions like sprintf and strcat to write past the end of the
+character array that holds a string. For this reason instead of sprintf
+we use the "safef" function, which takes an additional parameter, the size
+of the character array. So
+    char buffer[50];
+    sprintf(buffer, "My name is %s", name);
+becomes
+    char buffer[50];
+    safef(buffer, sizeof(buffer), "My name is %s", name);
+If the string is too long, say "Sahar Barjesteh van Waalwijk van Doorn-Khosrovani",
+which is actually a real scientist's name, safef stops the program with an error
+message instead of just silently overflowing the buffer and, in many cases,
+crashing cryptically.
+
 PREVENTING SQL-INJECTION
 
 In order to prevent SQL-Injection (sqli), we use primarily a special function
 called sqlSafef() to construct properly escaped SQL strings. The main article
 about preventing sqli is found here on genomewiki:
     http://genomewiki.ucsc.edu/index.php/Sql_injection_protection
 
 There are several other related and supporting functions to defeat sqli.
 The function reference is found here:
     http://genomewiki.ucsc.edu/index.php/Sql-injection_safe_functions
 
-====================================================================
+CREATING NEW PROGRAMS
+
+By convention most of our command line programs follow a very simple structure. They are in
+a directory by themselves which initially will just contain a .c file and a makefile. It
+is easiest to use the program called "newProg" in our source tree to set this up. It'll create
+a proper makefile, which is not simple to write and is needed rarely enough that you're likely
+to forget how. It also creates a skeleton for the C program including a usage message.
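The size-checked formatting described under PREVENTING STRING OVERFLOW can be illustrated with a small stand-alone sketch. This is not the safef function from the source tree. The names checkedFormat and greetingFits are hypothetical, and reporting overflow through a -1 return value is an assumption made so the example can be tried without the library. The sketch only shows the idea of passing the array size along with the array and refusing to write past it.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a size-checked sprintf (NOT the real safef).
 * Formats into buffer, which holds bufSize bytes.  Returns the number of
 * characters written, or -1 if the text plus its terminating zero does
 * not fit. */
int checkedFormat(char *buffer, size_t bufSize, const char *format, ...)
{
va_list args;
int needed;

assert(buffer != NULL && bufSize > 0);
va_start(args, format);
/* vsnprintf never writes more than bufSize bytes, and returns the length
 * the complete string would have had. */
needed = vsnprintf(buffer, bufSize, format, args);
va_end(args);
if (needed < 0 || (size_t)needed >= bufSize)
    {
    buffer[0] = 0;      /* leave an empty, terminated string behind */
    return -1;
    }
return needed;
}

/* Hypothetical caller: returns 1 if the greeting fits in 50 bytes, else 0. */
int greetingFits(const char *name)
{
char buffer[50];
return checkedFormat(buffer, sizeof(buffer), "My name is %s", name) >= 0;
}
```

A caller would test the return value before using the buffer. With this sketch greetingFits("Kent") succeeds, while the long name quoted above is rejected instead of being written past the end of the 50 byte array.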