SQLite coding conventions 2003-05-25

This is a collection of my observations about the coding style used in SQLite.
It is illustrated with code fragments from the SQLite source.

Reasons for the style choices are placed in italics, after each rule. Although many of the choices are apparently the personal preferences of D. Richard Hipp, many are required for general C portability to older C compilers, or to maximize the amount of code that can appear on the screen. This style appears to date from the time of 24x80 character displays.

Comments

- The bulk of all comments are outside of functions, immediately before the declaration or definition.
/*
** Each open SQLite database is represented by an instance of the
** following opaque structure.
*/
- Most comments have start and end delimiters on their own lines.
/*
** The version of the SQLite library.
*/
- Even single-line comments at file scope have delimiters on their own lines.
*************************************************************************
- Comment separator lines are 73 characters long.
int sqlite_function_type(
  sqlite *db,               /* The database where the function is registered */
  const char *zName,        /* Name of the function */
  int datatype              /* The datatype for this function */
);
- Function arguments are commented inside the argument list in both the prototype and definition.
    pTab->pTrigger = pTrig;  /* Re-enable triggers */
- Single-line comments are allowed to follow statements on a line.
  /* TODO: Do some validity checks on all fields.  In particular,
  ** make sure fields do not contain NULLs. Otherwise we might core
  ** when attempting to initialize from a corrupt database file. */
- Comments inside function bodies are indented to the same level as the code.
- The start comment delimiter can be on the same line as comments when inside a function body.
- The end comment delimiter can be on the same line as comments when followed by a blank line.
- If the final delimiter is on the same line as a comment, then the first one should be as well.
  /* The SELECT was successfully coded.   Set the return code to 0
  ** to indicate no errors.
  */
- Multi-line comments wrap at the end of the line (>70 chars) and are written as paragraphs.
- Sentences that wrap don't start a new line even when this is feasible.
- New sentences usually have two spaces after the preceding period.
- Sentences continuing on the second line are left justified, not indented.
- Full sentences in comments are initially capitalized.
 * The "step_list" member points to the first element of a linked list
 * containing the SQL statements specified as the trigger program.
- Variable names in comments are placed in double quotes.

Headers

- The public interface to SQLite is entirely in "sqlite.h".
- The private internal interface to SQLite is in "sqliteInt.h".
- Most C files include "sqlite.h" and "sqliteInt.h". These should be included before other SQLite headers.
- The utility makeheaders is used to automatically generate header files, so these aren't usually created unless necessary.
/*
** 2001 September 15
**
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** This header file defines the interface that the sqlite page cache
** subsystem.  The page cache subsystem reads and writes a file a page
** at a time and provides a journal for rollback.
**
** @(#) $Id: pager.h,v 1.17 2002/08/12 12:29:57 drh Exp $
*/
- Starts with date in canonical format.
- Followed by SQLite license.
- Short description of the contents of the file.
- Code used by CVS. (Do not edit manually)
#ifndef _SQLITE_HASH_H_
#define _SQLITE_HASH_H_

#endif /* _SQLITE_HASH_H_ */
- Header files use a guard #ifndef of the form _SQLITE_filename_H_.
typedef struct Pager Pager;
- Headers use an opaque struct whenever possible. These are declared in the header, but not defined.
- Typedefs are used to avoid having to use the struct keyword everywhere.
- The structs themselves are usually then defined in the C implementation file.
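
The opaque-struct rules above can be sketched in a single file. This is only an illustration: the Counter module and every name in it are hypothetical and do not come from the SQLite source. The first part stands in for a header, the second for the matching C file.

```c
#include <stdlib.h>

/* ---- What would go in a hypothetical "counter.h" ---- */
typedef struct Counter Counter;      /* Opaque: declared, not defined */
Counter *counterOpen(int nStart);
int counterNext(Counter*);
void counterClose(Counter*);

/* ---- What would go in a hypothetical "counter.c" ---- */
/*
** The full definition is visible only to the implementation, so
** callers cannot depend on the layout of the structure.
*/
struct Counter {
  int nValue;                        /* Next value to hand out */
};

Counter *counterOpen(int nStart){
  Counter *p = malloc( sizeof(*p) );
  if( p==0 ) return 0;
  p->nValue = nStart;
  return p;
}

int counterNext(Counter *p){
  return p->nValue++;
}

void counterClose(Counter *p){
  free(p);
}
```

Callers hold only a Counter* and go through the three functions, so the struct can change without touching code that includes the header.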

Naming Conventions

Standard Variable Names
sqlite *db;          /* database connection */
int rc;              /* error return code */
sqlite_func *pfn;    /* sqlite user function */
char *zErrMsg;       /* error string */
char **pzErrMsg;     /* error string return value */
Simplified Hungarian Prefixes
int nArg;                           /* integer */
char *zName;                        /* zero-terminated string */
void (*xFinalize)(sqlite_func*);    /* function pointer */
Pager *pPager;                      /* pointer */
Pager **ppPager;                    /* pointer to pointer */
Function Names
void sqliteHashClear(Hash*);
- Internal function names usually are "camel-case", with no underscores used.
- Internal function names which are part of an API begin with "sqlite".
void sqlite_free_table(char **result);
- All exported functions use the standard C naming convention -- all lower case, with underscores.
- All public API functions start with "sqlite_". Those that start with just "sqlite" (no underscore) are internal API functions.
- No function ever starts with "SQLite" (note case). This capitalization is only used for the name of the library itself, not for any code.
int sqliteBtreeCreateTable(Btree*, int*);
int sqliteBtreeCreateIndex(Btree*, int*);
int sqliteBtreeDropTable(Btree*, int);
int sqliteBtreeClearTable(Btree*, int);
- Internal APIs start with a common prefix that includes the API name.
- The API name is usually the name of the C file. The btree functions are in "btree.c", for example.
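
As a rough sketch of how the standard variable names and prefixes fit together, the function below returns a result code in "rc" and hands an error string back through "pzErrMsg". The function name and the DEMO_ constants are hypothetical and are not part of SQLite.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_OK     0   /* Hypothetical result codes, not SQLite's */
#define DEMO_ERROR  1

/*
** Check that "zName" is a non-empty string.  On failure, write a
** message obtained from malloc() into *pzErrMsg.  The caller frees it.
*/
int demoCheckName(
  const char *zName,        /* Name to check */
  char **pzErrMsg           /* OUT: error message, or 0 if no error */
){
  int rc = DEMO_OK;
  const char *zMsg = "name is empty";
  *pzErrMsg = 0;
  if( zName==0 || zName[0]==0 ){
    char *zErrMsg = malloc( strlen(zMsg)+1 );
    if( zErrMsg ) strcpy(zErrMsg, zMsg);
    *pzErrMsg = zErrMsg;
    rc = DEMO_ERROR;
  }
  return rc;
}
```

The prefixes carry the types: "z" marks a zero-terminated string, and "pz" marks a pointer to such a string that the function fills in.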

Macro and Symbol Names
#define MASTER_NAME   "sqlite_master"

#define P3_DYNAMIC    1   /* Pointer to a string obtained from sqliteMalloc() */
#define P3_STATIC   (-1)  /* Pointer to a static string */
- Symbolic constant names are upper case with underscores.
- Constants use #define instead of enum. For portability.
- Constant expressions use parentheses to prevent problems during macro expansion.
- Unsigned integers and string literals aren't parenthesized.
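
The reason for parenthesizing constant expressions can be shown with a small sketch. The WIDTH_ macros and the two functions are made up for this illustration.

```c
#define WIDTH_BAD    2+3     /* Hypothetical: expression left bare */
#define WIDTH_GOOD  (2+3)    /* Same expression, parenthesized */

/*
** WIDTH_BAD*2 expands to 2+3*2, so the multiplication binds to the
** 3 only.  WIDTH_GOOD*2 expands to (2+3)*2, which is what was meant.
*/
int demoWidthTimesTwoBad(void){
  return WIDTH_BAD*2;
}
int demoWidthTimesTwoGood(void){
  return WIDTH_GOOD*2;
}
```

The first function returns 8 and the second returns 10, even though both macros look like the value 5 at the point of use.
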
#define SQLITE_ISO8859 1
- Publicly visible symbols in the main distribution start with SQLITE_.
#define STK_Null      0x0001   /* Value is NULL */
#define STK_Str       0x0002   /* Value is a string */
#define STK_Int       0x0004   /* Value is an integer */
- Symbolic constants for bitflags start with a common, upper case prefix.
- The name part of each flag is capitalized and "camel case".
- Bitflag constant values are in hexadecimal.
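
A brief sketch of how such flags are typically combined and tested. The STK_ values are copied from the fragment above. The two helper functions are hypothetical and exist only for this example.

```c
#define STK_Null      0x0001   /* Value is NULL */
#define STK_Str       0x0002   /* Value is a string */
#define STK_Int       0x0004   /* Value is an integer */

/*
** Return 1 if any of the bits in "mask" are set in "flags", else 0.
*/
int demoHasFlag(int flags, int mask){
  return (flags & mask)!=0;
}

/*
** Return "flags" with the bits in "mask" cleared.
*/
int demoClearFlag(int flags, int mask){
  return flags & ~mask;
}
```

Writing the values in hexadecimal makes it easy to see that each flag occupies its own bit, so STK_Str|STK_Int is 0x0006 and the flags can be tested independently.
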
#define ARRAYSIZE(X)  (sizeof(X)/sizeof(X[0]))
#define sqliteHashFirst(H)  ((H)->first)
- Macro functions are upper case.
- Macro implementations of functions in an internal API follow the naming scheme for that API.

Struct Names
typedef struct sqlite sqlite;

typedef struct Column Column;
typedef struct Table Table;
- Struct names exposed in a header file are "camel case".
- The only exception is struct sqlite.

Layout and Style

- Skipping lines between separate statements is rare, and used for separating blocks of statements.
- In functions that don't use named error return values, 1 is used for error, and 0 for okay.
    if( res ){
      id->locked = lk;
      rc = SQLITE_OK;
    }else{
      rc = SQLITE_BUSY;
    }
- Lines are indented 2 spaces each. No tabs are used.
- Braces '{' come directly after the closing parenthesis of an expression for: functions, structs, if(), while(), else,...
- Braces '{' are used even for a one-line if() statement. Required for portability to old C compilers.
- There is no space between if, while,... and the following parenthesis.
  if( /* expression */){
    /* multiple statements */
  }else

  if( /* expression */){
    /* multiple statements */
  }else

  {
    /* final case statements */
  }
Long sequences of if-else if blocks at outer scope can have their own syntax:
- Each else is separated from the following if() by a blank line.
- The final case is also separated by a blank line.
- This style is only used when each block spans enough lines that they need to be separated for readability.
  if( pBt->inTrans==0 && pBt->pCursor==0 && pBt->page1!=0 ){
- Compound conditional expressions usually aren't split unless they would wrap.
- Complex expressions (those involving a binary operator or function call) are separated from enclosing parentheses by a space.
- In compound expressions spacing is often used to show operator precedence, as here.
    if( strcmp(pP1->zMagic,zMagicHeader)!=0 ||
          (pP1->iMagic!=MAGIC && swab32(pP1->iMagic)!=MAGIC) ){
- Compound conditional expressions are split with the binary operator at the end of the line.
    while( isspace(z[++i]) ){}
- Empty statements are denoted with a pair of empty braces.
     "CREATE TABLE sqlite_master(\n"
     "  type text,\n"
     "  name text,\n"
     "  tbl_name text,\n"
     "  rootpage integer,\n"
     "  sql text\n"
     ")"
     ;
- Constant SQL strings are split onto multiple lines. For readability.
- The semicolon following a multiline literal string is placed on a separate line. Makes editing the SQL lines easier.
int sqliteBtreeKey(BtCursor *pCur, int offset, int amt, char *zBuf){
- Function arguments are on one line if they aren't commented inline.
  Cell *pCell;
  MemPage *pPage;
  int nData;
- Multiple variable declarations aren't aligned -- a single space separates the type from the variable name.
- The asterisk of a pointer declaration is placed next to the name (not Cell* pCell). Prevents a coding error -- Cell* a,b; doesn't declare two pointers.
- In anonymous declarations and casts, there is no space between the type and the following asterisk.
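
The pitfall behind the asterisk rule can be checked with sizeof. In this sketch the Cell layout and both function names are invented for the illustration. The struct is deliberately much larger than a pointer.

```c
typedef struct Cell Cell;
struct Cell {
  int nData;               /* Hypothetical fields, only here to make */
  char aPayload[60];       /* a Cell bigger than any pointer */
};

/*
** "Cell* a,b;" declares a pointer and a whole Cell, so the two
** sizes differ and this returns 0.
*/
int demoMisleadingDecl(void){
  Cell* a,b;
  return sizeof(a)==sizeof(b);
}

/*
** With the asterisk next to each name both are pointers, so the
** sizes match and this returns 1.
*/
int demoPreferredDecl(void){
  Cell *a, *b;
  return sizeof(a)==sizeof(b);
}
```
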
  if( p==0 ) return 0;
- Explicitly test against zero (don't use !p).
- 0 is used instead of symbolic constant NULL.
- Very simple one line statements following an if() can have their braces omitted.
    goto fk_end;
    ...
fk_end:
  ...
}
- goto labels are never indented. They always start in column 0.
- Label names have standard endings: "_end", "_error", "_exception", "_cleanup", "_exit".
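
A minimal sketch of the goto-label style, using a single exit point to release a temporary buffer. The function and its "copy_end" label are hypothetical. Only the label placement and the "_end" suffix follow the rules above.

```c
#include <stdlib.h>
#include <string.h>

/*
** Copy "zSrc" into the nDest-byte buffer "zDest" by way of a
** temporary heap buffer.  Return 0 on success and 1 on error.
*/
int demoCopy(const char *zSrc, char *zDest, int nDest){
  int rc = 1;
  char *zTmp = 0;
  int n;
  if( zSrc==0 || zDest==0 ) goto copy_end;
  n = (int)strlen(zSrc) + 1;
  if( n>nDest ) goto copy_end;
  zTmp = malloc( n );
  if( zTmp==0 ) goto copy_end;
  memcpy(zTmp, zSrc, n);
  memcpy(zDest, zTmp, n);
  rc = 0;

copy_end:
  free(zTmp);
  return rc;
}
```

Every early exit jumps to the same label, so the cleanup code is written once. Calling free() on a null pointer is harmless, which is why "zTmp" starts out as 0.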

Portability

- Code should compile on Un*x and Windows, with no modifications.
- Symbols OS_UNIX and OS_WIN are used to separate operating system-specific behavior.
- All OS-specific behavior should be in "os.c" and "os.h", and wrapped in generic functions sqliteOsFnName().
- All filenames have 8.3 format.
int sqliteOsDelete(const char *zFilename){
#if OS_UNIX
  unlink(zFilename);
#endif
#if OS_WIN
  DeleteFile(zFilename);
#endif
  return SQLITE_OK;
}
- Inside an OS-dependent function, the code for each OS is usually written out in full in its own section, even when this duplicates code.
- No #else or #elif cases are used. Only explicit #if ostype ... #endif sections.
- Common code at the beginning or end of the function can be outside of the #if blocks.
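
The same layout can be sketched for a second wrapper. The function name is hypothetical, and deriving OS_UNIX and OS_WIN from the compiler's predefined _WIN32 macro is an assumption made so the example stands alone. The real build may define them differently.

```c
#include <stdio.h>

/* Assumption: pick the OS from the compiler's predefined _WIN32. */
#if defined(_WIN32)
# define OS_WIN 1
# define OS_UNIX 0
#else
# define OS_WIN 0
# define OS_UNIX 1
#endif

#if OS_UNIX
# include <unistd.h>
#endif
#if OS_WIN
# include <windows.h>
#endif

/*
** Return 1 if the file "zFilename" exists and 0 if it does not.
*/
int demoOsFileExists(const char *zFilename){
  int rc = 0;
#if OS_UNIX
  rc = access(zFilename, F_OK)==0;
#endif
#if OS_WIN
  rc = GetFileAttributesA(zFilename)!=INVALID_FILE_ATTRIBUTES;
#endif
  return rc;
}
```

The declaration of "rc" and the final return are common code, so they sit outside the #if sections. Each OS section is complete on its own and there is no #else.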

Testing

- Use symbol SQLITE_TEST to #ifdef out behavior used only in debugging or testing.
- When possible, unit test code should be included.
- tcl test scripts are in directory "\sqlite\test".
- Use assert() to check that the library is being used properly. The library is used by programmers who are not familiar with its code, so they should be protected from misuse where possible. This also leads to more robust code.
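
A small sketch of how assert() and SQLITE_TEST might be combined in one function. The DemoStack type, the function, and the "demo_push_count" counter are all hypothetical. Only the SQLITE_TEST symbol comes from the rules above.

```c
#include <assert.h>

#ifdef SQLITE_TEST
int demo_push_count = 0;      /* Hypothetical counter, test builds only */
#endif

typedef struct DemoStack DemoStack;
struct DemoStack {
  int nUsed;                  /* Number of entries in use */
  int aValue[4];              /* The entries themselves */
};

/*
** Push "v" onto the stack.  Return 0 on success and 1 if the stack
** is full.
*/
int demoStackPush(DemoStack *p, int v){
  assert( p!=0 );
  assert( p->nUsed>=0 && p->nUsed<=4 );
#ifdef SQLITE_TEST
  demo_push_count++;
#endif
  if( p->nUsed>=4 ) return 1;
  p->aValue[p->nUsed++] = v;
  return 0;
}
```

The assert() calls catch misuse by the caller, such as a null pointer or a corrupted count, while a full stack is an ordinary condition and is reported through the return value.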

General

- Correctness and verifiability come before efficiency -- this library is used in critical applications.
- Code must be released under the SQLite license for inclusion in the main distribution.
- All compiler warnings should be removed before submitting code. Test with the highest warning level available for your compiler.
- New features should be in a new file when possible, and made as independent as possible.
- Functions for internal use in a file should be declared as static. This keeps the global namespace clean, and is used by makeheaders to decide when to generate a prototype.
- Forward declarations are not generally used in a C file. The utility makeheaders is relied on to automatically generate header files, so these aren't usually created unless necessary.
- The tools awk, lemon, and tcl are used in building sqlite and its docs, and should be used when they make for cleaner, more maintainable code.