22059b2d2f20df8ee41bf31c62fedbe16496b650 vsmalladi Thu Mar 15 13:59:57 2012 -0700 Updating Style Guide and README to kent/src standards. Redmine #7162. diff --git python/style.txt python/style.txt index f84724c..d629e38 100644 --- python/style.txt +++ python/style.txt @@ -1,75 +1,210 @@ Style Guide for Python Code -Documentation Conventions +CODE CONVENTIONS -Use """ doc strings to embed comments for automated documentation generator: - http://epydoc.sourceforge.net/ +Follow the Python coding conventions laid out by the Python Style guide, except for the +UCSC Genomics Group specific conventions outlined below. + http://www.python.org/dev/peps/pep-0008/ + +INDENTATION AND SPACING + +Each block of code is indented by 4 spaces from the previous block. Do not use +tabs to separate blocks of code. The indentation convention differs from the C coding style +found in src/README, which uses 4-base indents/8-base tabs. Common editor configurations +for disallowing tabs are: + + vim: + Add "set expandtab" to .vimrc + + emacs: + Add "(setq-default indent-tabs-mode nil)" to .emacs + +Lines are no more than 100 characters wide. + +INTERPRETER DIRECTIVE + +The first line of any UCSC Python script should be: + #!/usr/bin/env python2.7 + +This line will invoke python2.7 found in the user's PATH. It ensures that scripts developed +by UCSC can be distributed and explicitly states which Python version was used to develop the scripts. + +NAMING CONVENTIONS + +Use mixedCase for symbol names: the leading character is not capitalized and all +successive words are capitalized. (Classes are an exception, see below.) Non-UCSC +Python code may follow other conventions and does not need to be adapted to +these naming conventions. + +Abbreviations follow rules in src/README: + + Abbreviation of words is strongly discouraged. Words of five letters and less should + generally not be abbreviated. 
If a word is abbreviated in general it is abbreviated + to the first three letters: + tabSeparatedFile -> tabSepFile + In some cases, for local variables abbreviating to a single letter for each word is okay: + tabSeparatedFile -> tsf + In rare, complex cases you may treat the abbreviation itself as a word, and only the + first letter is capitalized: + genscanTabSeparatedFile -> genscanTsf + Numbers are considered words. You would represent "chromosome 22 annotations" + as "chromosome22Annotations" or "chr22Ann." Note the capitalized 'A' after the 22. -Naming Conventions Packages and Modules - In Python, a package is represented as a directory with an __init__.py -file in it, and contains some number of modules, which are represented as -files with a .py extension. A module may in turn contain any number of related -classes and methods. This differs from Java, where one file correlates to one -class: in Python it is correct to treat one module similar to a whole -namespace in Java. - Packages and modules should have short names in lowercase, with no spaces or -underscores. An good example of this style is the ucscgenomics package: - - ucscgenomics/ + +In Python, a package is represented as a directory with an __init__.py file in it, +and contains some number of modules, which are represented as files with a .py extension. +A module may in turn contain any number of related classes and methods. This differs from Java, +where one file correlates to one class: in Python it is correct to treat one module similar to +a whole namespace in Java. + +Internal packages and modules should have short names in mixedCase, with no spaces or underscores. +A good example of this style is the ucscGenomics package: + + ucscGenomics/ __init__.py ra.py cv.py ...
For more information: http://docs.python.org/tutorial/modules.html Imports - The most correct way to import something in Python is so that must be -identified by its containing module: + +The most correct way to import something in Python is by specifying its containing module: import os - from ucscgenomics import ra + from ucscGenomics import ra Then, the qualified name can be used: - somera = ra.RaFile() + someRa = ra.RaFile() + + Do not import as below, as this may cause local naming conflicts: + from ucscGenomics.ra import RaFile + from ucscGenomics.track import * + +Imports should follow the structure: + + 1. Each import should be on a separate line, unless modules are from the same package: + import os + import sys + + from ucscGenomics import ra, track, qa + + 2. Imports should be at the top of the file. Each section should be separated by a blank line: + + a. standard library imports + + b. third party package/module imports + + c. local package/module imports For more information, see the "Imports" section: http://www.python.org/dev/peps/pep-0008/ - All lowercase names, no spaces. Underscores if it would improve -readability in modules, but not for use in packages. A package -contains many modules. All classes for a module exist within 1 -file. Structure follows python package and module standards. - Classes - CapitalCase names. Note the leading captial letter to distinguish between a -ClassName and a functionName. Underscores are not used, except for private -internal classes, where the name is preceded by double underscores which -Python recognizes as private. + +CapitalCase names. Note the leading capital letter to distinguish between a ClassName and +a functionName. Underscores are not used, except for private internal classes, +where the name is preceded by double underscores which Python recognizes as private. Methods - mixedCase names. The leading character is not captialized, but all -successive words are capitalized.
Underscores are not used, except for private -internal methods, where the name is preceded by double underscores which -Python recognizes as private. + +mixedCase names. The leading character is not capitalized, but all successive words are +capitalized. Underscores are not used, except for private internal methods, +where the name is preceded by double underscores which Python recognizes as private. + +Functions + +mixedCase names. The leading character is not capitalized, but all +successive words are capitalized. Variables - lowercase names. Underscores are not used, except for private -internal variables, where the name is preceded by double underscores which -Python recognizes as private. - -Testing - Testing is carried out using the unittest module in python. This module -allows for self-running scripts which only need the following lines at the -bottom of the script: + +mixedCase names. Underscores are not used, except for private internal variables, +where the name is preceded by double underscores which Python recognizes as private. + +COMMENTING + +Note: Still working out which automation document tool to use. + +Automated documentation is carried out using the Epydoc tool: + http://epydoc.sourceforge.net/ + +Comments should follow the conventions: + + 1. Every module should have a paragraph at the beginning. Single class modules may + omit paragraph in favor of class comment. + + 2. Use Python's docstring convention to embed comments, using """triple double quotes""": + http://www.python.org/dev/peps/pep-0257/ + + 3. Use Epytext markup language conventions when commenting: + http://epydoc.sourceforge.net/epytext.html + + 4. Use Epytext field tags to describe specific properties of objects: + + Structure: + + a. Fields must be placed at the end of a docstring. + + b. Each field is distinguished by the following pattern: + @tag: body + @tag arg: body + + c. 
All blocks pertaining to a field must have equal indentation + greater than or equal to field tag indentation. + + d. Optional field tags to use: + + i. @param - Description of parameter to a function + + ii. @return - Description of a function's return value + + def exampleFunction(inputFile): + """ + This paragraph describes the object. + + @param inputFile: Input file name + + @return: This is a description of the function's return value + """ + + For more information and supported fields, see: + http://epydoc.sourceforge.net/fields.html#fields + +TESTING + +Testing is carried out using the unittest module in Python: + http://docs.python.org/library/unittest.html + +This module allows for self-running scripts, which are self-contained and should provide +their own input and output directories and files. The scripts themselves are composed of +one or more classes, all of which inherit from unittest.TestCase and contain one or more +methods which use various asserts or failure checks to determine whether a test passes or not. + + Structure: + + 1. At the start of a script import unittest module: + import unittest + + 2. A test case is created as a sub-class of unittest.TestCase: + class TestSomeFunctions(unittest.TestCase): + + 3. Test method names should begin with 'test' so that the test runner is + aware of which methods are tests: + def testSomeSpecificFunction(self): + + 4. Define a setUp method to run prior to start of each test. + def setUp(self): + + 5. Define a tearDown method to run after each test. + def tearDown(self): + + 6. To invoke tests with a simple command-line interface add the following lines: if __name__ == '__main__': unittest.main() - The scripts themselves are composed of one or more classes, all of which -inherit from unittest.TestCase and contain one or more methods which use -various asserts or failure checks to determine whether a test passes or not.
-Testing is self-contained, and should provide its own input and output -directories and files. - + For other ways to run tests see: + http://docs.python.org/library/unittest.html
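
EXAMPLE TEST SCRIPT (ILLUSTRATIVE)

The TESTING structure in the updated style.txt can be sketched as one small script. This is a
minimal sketch and not part of the commit: countTabSepFields, TestCountTabSepFields and runTests
are hypothetical names invented for illustration, and the script runs its tests through an
explicit unittest.TextTestRunner rather than unittest.main() so that the run returns a result
object instead of exiting the interpreter. It follows the guide's 4-space indents, mixedCase
names, Epytext field tags, and self-contained input files.

```python
#!/usr/bin/env python2.7
"""
Illustrative test script for a hypothetical tab-separated file helper.
"""
import os
import shutil
import tempfile
import unittest


def countTabSepFields(inputFile):
    """
    Count the tab-separated fields on the first line of a file.

    @param inputFile: Input file name

    @return: Number of fields on the first line, or 0 if the line is empty
    """
    with open(inputFile) as fileHandle:
        firstLine = fileHandle.readline().rstrip('\n')
    if not firstLine:
        return 0
    return len(firstLine.split('\t'))


class TestCountTabSepFields(unittest.TestCase):
    """Tests for countTabSepFields that create their own input directory."""

    def setUp(self):
        # Runs before each test: build a private directory and input file.
        self.testDir = tempfile.mkdtemp()
        self.tabSepFile = os.path.join(self.testDir, 'input.tab')
        with open(self.tabSepFile, 'w') as fileHandle:
            fileHandle.write('chr22\t100\t200\n')

    def tearDown(self):
        # Runs after each test: remove everything setUp created.
        shutil.rmtree(self.testDir)

    def testCountsThreeFields(self):
        self.assertEqual(countTabSepFields(self.tabSepFile), 3)

    def testEmptyFileHasNoFields(self):
        emptyFile = os.path.join(self.testDir, 'empty.tab')
        open(emptyFile, 'w').close()
        self.assertEqual(countTabSepFields(emptyFile), 0)


def runTests():
    """
    Run the test case above without exiting the interpreter.

    @return: The unittest result object for the run
    """
    suite = unittest.TestLoader().loadTestsFromTestCase(TestCountTabSepFields)
    return unittest.TextTestRunner().run(suite)


if __name__ == '__main__':
    runTests()
```

Replacing the body of the final block with unittest.main(), as the guide's step 6 describes, gives
the simple command-line interface at the cost of exiting the interpreter when the run finishes.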