Subject: a new KNF (and some comments)
To: None <tech-misc@netbsd.org>
From: Luke Mewburn <lukem@cs.rmit.edu.au>
List: tech-kern
Date: 01/21/2000 11:50:29
Well, there seems to be considerable debate on what the KNF should be.
Some people seem to believe that we should totally revise the coding
style, some want it updated to ANSI, some don't want to change at all.

Here's what I propose:

* Update it to ANSI C (`c89').
  RATIONALE:
	It's ten years since c89 was released, and our compiler has always
	supported it.  Even with the __P() macros to appease K&R compilers,
	it appears that there are other c89 constructs used in our code (as
	I discovered when porting ftp(1) to HP-UX and having problems with
	a K&R compiler with char [] initialisation, amongst other things).
	Some people will argue that this would make porting NetBSD to older
	platforms without an ANSI compiler more difficult. This indicates
	that we need to rethink the way that we port to a new system. Rather
	than:
		build kernel with native tools
		bootstrap userland
		port build environment
	we should be doing:
		port build environment to cross compile
		build compiler with cross compiler
		build userland with cross compiler
	We have been offered the expertise of people who do this type of
	cross compiling for a living (e.g., Cygnus) if necessary.

* Update for current NetBSD practice in source code templates, such
  as copyright location, __RCSID macros, etc.
  RATIONALE:
	It's what we do, the style guide should reflect that.

* Define block macros as
	do { /* ... */ } while (0)
  instead of 
	{ /* ... */ }
  so that a developer can put a trailing `;' on a macro (and treat it like
  a function call).


Issues still to be resolved:

* Guidelines for defining elements in a struct. An important consideration
  is memory wastage if items have to be aligned. We need a way of specifying
  this in the document (look for `XXX' in the file below).


Other points:

* We are not considering changing our indenting style or indent level.
  This is a contentious issue and a lot of developers object to the
  alternate proposals made (no matter how many reports on code
  readability from academics you can drag up).
  We have ~ 25,500 files with ~ 8,200,000 lines of code, and we're not
  reformatting it.

* If this is approved (i.e., not shot down for really good reasons), I
  suggest that we:

	a) require that new files follow the new guide
	b) modify old source code when appropriate
	c) modify changed source files by:
		- converting the original file to the new style, and
		  committing that first
		- adding new functionality, and committing it separately

The last point, of course, is contentious...


Here's the revised style guide with those changes:

--- cut here --- file: style
/* $NetBSD: style,v 1.11 1999/07/03 21:47:21 abs Exp $ */

/*
 * The revision control tag appears first.
 * Copyright text appears after the revision control tag.
 */

/*
 * Style guide for the NetBSD KNF (Kernel Normal Form).
 *
 *	from: @(#)style	1.12 (Berkeley) 3/18/94
 */
/*
 * An indent(1) profile approximating the style outlined in
 * this document lives in /usr/share/misc/indent.pro.  It is a
 * useful tool to assist in converting code to KNF, but indent(1)
 * output generated using this profile must not be considered to
 * be an authoritative reference.
 */

/*
 * Source code revision control identifiers appear after any copyright
 * text.  Use the appropriate macros from <sys/cdefs.h>.  Usually only one
 * source file per program contains a __COPYRIGHT() section.
 */
#include <sys/cdefs.h>
#ifndef lint
__COPYRIGHT("@(#) Copyright (c) 1999\n\
	The NetBSD Foundation, Inc. All rights reserved.\n");
__RCSID("$NetBSD$");
#endif /* not lint */

/*
 * VERY important single-line comments look like this.
 */

/* Most single-line comments look like this. */

/*
 * Multi-line comments look like this.  Make them real sentences.  Fill
 * them so they look like real paragraphs.
 */

/*
 * Kernel include files come first; normally, you'll need <sys/types.h>
 * OR <sys/param.h>, but not both!  <sys/types.h> includes <sys/cdefs.h>,
 * and it's okay to depend on that.
 */
#include <sys/types.h>		/* Non-local includes in brackets. */

/* If it's a network program, put the network include files next. */
#include <net/if.h>
#include <net/if_dl.h>
#include <net/route.h>
#include <netinet/in.h>
#include <protocols/rwhod.h>

/*
 * Then there's a blank line, followed by the /usr include files.
 * The /usr include files should be sorted!
 */
#include <stdio.h>

/*
 * Global pathnames are defined in /usr/include/paths.h.  Pathnames local
 * to the program go in pathnames.h in the local directory.
 */
#include <paths.h>

/* Then, there's a blank line, and the user include files. */
#include "pathnames.h"		/* Local includes in double quotes. */

/*
 * ANSI function declarations for private functions (i.e. functions not used
 * elsewhere) go at the top of the source module.  Only the kernel has a name
 * associated with the types.  I.e. in the kernel use:
 *	void function(int a);
 * in user-land use:
 *	void function(int);
 *
 * Use the __P macro from the include file <sys/cdefs.h> for prototypes
 * in header files, for compatibility with non-ANSI compilers.  I.e.,
 *	void function __P((int));
 */
static char	*function(int, int, float, int);
static void	usage(void);

/*
 * Macros are capitalized, parenthesized, and should avoid side-effects.
 * If they are an inline expansion of a function, the function is defined
 * all in lowercase, the macro has the same name all in uppercase.  If the
 * macro needs more than a single statement, use do { ... } while (0), so
 * that a trailing semicolon works.  Right-justify the backslashes; it
 * makes it easier to read.
 */
#define	MACRO(x, y)	do {						\
	variable = (x) + (y);						\
	(y) += 2;							\
} while (0)

/* Enum types are capitalized. */
enum enumtype {
	ONE,
	TWO
} et;

/*
 * When declaring variables in structures, declare them sorted by use, then
 * by size, and then by alphabetical order.  The first category normally
 * doesn't apply, but there are exceptions.
 *	XXX: change the above
 * Each variable gets its own line.
 * Attempt to line-up the entries, using appropriate tabs.
 *
 * Major structures should be declared at the top of the file in which they
 * are used, or in separate header files, if they are used in multiple
 * source files.  Use of the structures should be by separate declarations
 * and should be "extern" if they are declared in a header file.
 */
struct foo {
	struct foo	*next;		/* List of active foo */
	struct mumble	amumble;	/* Comment for mumble */
	int		bar;
};
struct foo *foohead;		/* Head of global foo list */

/* Make the structure name match the typedef. */
typedef struct bar {
	int	level;
} BAR;

/*
 * All major routines should have a comment briefly describing what
 * they do.  The comment before the "main" routine should describe
 * what the program does.
 */
int
main(int argc, char *argv[])
{
	extern char *optarg;
	extern int optind;
	long num;
	int ch;
	char *ep;

	/*
	 * For consistency, getopt should be used to parse options.  Options
	 * should be sorted in the getopt call and the switch statement, unless
	 * parts of the switch cascade.  Elements in a switch statement that
	 * cascade should have a FALLTHROUGH comment.  Numerical arguments
	 * should be checked for accuracy.  Code that cannot be reached should
	 * have a NOTREACHED comment.
	 */
	while ((ch = getopt(argc, argv, "abn")) != -1)
		switch (ch) {		/* Indent the switch. */
		case 'a':		/* Don't indent the case. */
			aflag = 1;
			/* FALLTHROUGH */
		case 'b':
			bflag = 1;
			break;
		case 'n':
			num = strtol(optarg, &ep, 10);
			if (num <= 0 || *ep != '\0')
				errx(1, "illegal number -- %s", optarg);
			break;
		case '?':
		default:
			usage();
			/* NOTREACHED */
		}
	argc -= optind;
	argv += optind;

	/*
	 * Space after keywords (while, for, return, switch).  No braces are
	 * used for control statements with zero or only a single statement.
	 *
	 * Forever loops are done with for's, not while's.
	 */
	for (p = buf; *p != '\0'; ++p);
	for (;;)
		stmt;

	/*
	 * Parts of a for loop may be left empty.  Don't put declarations
	 * inside blocks unless the routine is unusually complicated.
	 */
	for (; cnt < 15; cnt++) {
		stmt1;
		stmt2;
	}

	/* Second level indents are four spaces. */
	while (cnt < 20)
		z = a + really + long + statement + that + needs + two +
		    lines + gets + indented + four + spaces + on + the +
		    second + and + subsequent + lines;

	/*
	 * Closing and opening braces go on the same line as the else.
	 * Don't add braces that aren't necessary.
	 */
	if (test)
		stmt;
	else if (bar) {
		stmt;
		stmt;
	} else
		stmt;

	/* No spaces after function names. */
	if (error = function(a1, a2))
		exit(error);

	/*
	 * Unary operators don't require spaces, binary operators do.  Don't
	 * use parentheses unless they're required for precedence, or the
	 * statement is really confusing without them, such as:
	 * a = b->c[0] + ~d == (e || f) || g && h ? i : j >> 1;
	 */
	a = ((b->c[0] + ~d == (e || f)) || (g && h)) ? i : (j >> 1);
	k = !(l & FLAGS);

	/*
	 * Exits should be 0 on success, and 1 on failure.  Don't denote
	 * all the possible exit points, using the integers 1 through 300.
	 * Avoid obvious comments such as "Exit 0 on success."
	 */
	exit(0);
}

/*
 * The function type must be declared on a line by itself
 * preceding the function.
 */
static char *
function(int a1, int a2, float fl, int a4)
{
	/*
	 * When declaring variables in functions declare them sorted by size,
	 * then in alphabetical order; multiple ones per line are okay.
	 * Function declarations (which are ANSI style) should go in the
	 * include file "extern.h".  If a line overflows reuse the type keyword.
	 *
	 * DO NOT initialize variables in the declarations.
	 */
	extern u_char one;
	extern char two;
	struct foo three, *four;
	double five;
	int *six, seven;
	char *eight, *nine, ten, eleven, twelve, thirteen;
	char fourteen, fifteen, sixteen;

	/*
	 * Casts and sizeof's are not followed by a space.  NULL is any
	 * pointer type, and doesn't need to be cast, so use NULL instead
	 * of (struct foo *)0 or (struct foo *)NULL.  Also, test pointers
	 * against NULL. I.e. use:
	 *
	 *	(p = f()) == NULL
	 * not:
	 *	!(p = f())
	 *
	 * Don't use '!' for tests unless it's a boolean. E.g. use
	 * "if (*p == '\0')", not "if (!*p)".
	 *
	 * Routines returning void * should not have their return values cast
	 * to any pointer type.
	 *
	 * Use err/warn(3), don't roll your own!
	 */
	if ((four = malloc(sizeof(struct foo))) == NULL)
		err(1, NULL);
	if ((six = (int *)overflow()) == NULL)
		errx(1, "Number overflowed.");
	return (eight);
}

/*
 * Use ANSI function declarations.  ANSI function braces look like
 * old-style (K&R) function braces.
 */
void
function(int a1, int a2)
{
	/* ... */
}

/*
 * Functions that support variable numbers of arguments should look like this.
 */
#include <stdarg.h>

void
vaf(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	STUFF;
	va_end(ap);		/* No return needed for void functions. */
}

static void
usage(void)
{	/* Insert an empty line if the function has no local variables. */

	/*
	 * Use printf(3), not fputs/puts/putchar/whatever; it's faster and
	 * usually cleaner, not to mention avoiding stupid bugs.
	 *
	 * Usage statements should look like the manual pages.  Options w/o
	 * operands come first, in alphabetical order inside a single set of
	 * brackets, followed by options with operands, in alphabetical
	 * order, each in brackets, followed by required arguments in the
	 * order they are specified, followed by optional arguments in the
	 * order they are specified.  A bar ('|') separates either/or
	 * options/arguments, and multiple options/arguments which are
	 * specified together are placed in a single set of brackets.
	 *
	 * "usage: f [-ade] [-b b_arg] [-m m_arg] req1 req2 [opt1 [opt2]]\n"
	 * "usage: f [-a | -b] [-c [-de] [-n number]]\n"
	 */
	(void)fprintf(stderr, "usage: f [-ab]\n");
	exit(1);
}
--- cut here ---