35d5fb35eb509dab6bbe84d0cf81a95d6122ee1d
kent
  Mon Mar 19 15:35:15 2018 -0700
Updating a little in anticipation of training Chris Lee in our code conventions.

diff --git src/README src/README
index 328244e..be422cc 100644
--- src/README
+++ src/README
@@ -59,31 +59,31 @@
 
 The code follows an indentation convention that is a bit
 unusual for C.  Opening and closing braces are on
 a line by themselves and are indented at the same
 level as the block they enclose:
     if (someTest)
 	{
 	doSomething();
 	doSomethingElse();
 	}
 Each block of code is indented by 4 from the previous block.
 As per Unix standard practice, tab stops are set to 8, not 4
 as is the common practice in Windows, so some care must be
 taken when using tabs for indenting.  
 
-Tabs continue to be a problem for the programmer even in 2012.
+Tabs continue to be a problem for the programmer even in 2018.
 Currently our makefiles require tabs, while our python code forbids
 them. The C code can go either way so long as tabs are treated
 as advancing to the next multiple-of-eight column. Please consult local
 users of your favorite editor for help configuring it with these
 indentation and tab standards.
 
 Lines should be no more than 100 characters wide.  Lines that are 
 longer than this are broken and indented at least 8 spaces
 more than the original line to indicate the line continuation.
 Where possible simplifying techniques should be applied to the code 
 in preference to using line continuations, since line continuations
 obscure the logic conveyed in the indentation of the program.
 
 Line continuations may be unavoidable when calling functions with long
 parameter lists.  In most other situations lines can be shortened 
@@ -267,31 +267,53 @@
 of the module, just after the module opening comment and any includes.  
 This is followed by broadly used module local (static) variables.  Less 
 broadly used structs and variables may be grouped with the functions they 
 are used with.
 
 If a module is used by other modules, it will be represented in a header 
 file.  In the majority of cases one .h file corresponds to one .c file.
 Typically the opening comment is duplicated in .h and .c files, as are
 the public structure and function declarations and opening comments. 
 
 In general we try, with mixed success, to keep modules less than 2000 lines.
 Sadly many of the Genome Browser specific modules are currently quite long.
 On the bright side the vast majority of the library modules are reasonably
 sized.
 
+PREVENTING STRING OVERFLOW
+
+One weakness of C is its string handling.  It is very easy using standard C 
+library functions like sprintf and strcat to write past the end of the
+character array that holds a string.  For this reason instead of sprintf
+we use the "safef" function, which takes an additional parameter, the size
+of the character array.  So
+   char buf[50];
+   sprintf(buf, "My name is %s", name);
+becomes
+   char buf[50];
+   safef(buf, sizeof(buf), "My name is %s", name);
+Instead of silently overflowing the buffer, which in many cases leads to a cryptic crash,
+safef stops the program with an error message if the string is too long, say
+"Sahar Barjesteh van Waalwijk van Doorn-Khosrovani", which is actually a real scientist's name!
+
 PREVENTING SQL-INJECTION
 
 In order to prevent SQL-Injection (sqli), we use primarily
 a special function called sqlSafef() to construct properly
 escaped SQL strings.  
 
 The main article about preventing sqli is found here on genomewiki:
 
 http://genomewiki.ucsc.edu/index.php/Sql_injection_protection
 
 There are several other related and supporting 
 functions to defeat sqli.  The function reference is found here:
 
 http://genomewiki.ucsc.edu/index.php/Sql-injection_safe_functions
 
-====================================================================
+CREATING NEW PROGRAMS
+
+By convention most of our command line programs follow a very simple structure.  They are in 
+a directory by themselves which initially will just contain a .c file and a makefile.  It
+is easiest to use the program called "newProg" in our source tree to set this up.  It'll create
+a proper makefile, which is not simple, and is done so rarely that you're likely to forget how.
+It also creates a skeleton for the C program including a usage message.