tech-userlevel archive

syslog(3) behaviour for syslog-protocol



Hello,
as you might have read in my previous posts, I am working on implementing a new message format for syslog(3) and syslogd(8).

Now I would like to describe my modified syslog(3) behaviour and propose it as default behaviour for systems with syslog-protocol support.


= syslog-protocol format =

The syslog-protocol format (specified by internet draft
http://tools.ietf.org/html/draft-ietf-syslog-protocol) extends the
traditional syslog line by:
- using ISO timestamps,
- using FQDN if available,
- a new message ID field (similar to a Windows Eventlog ID),
- a new Structured Data field to encode information in key=value format.

Examples:
2008-07-26T12:03:31+02:00 host.example.org tag/subtag 4850 msgid [exampleSDID@0 iut="3"] message
2008-07-26T12:03:42+02:00 host.example.org tag - - - message only
(Note that the first line uses all fields while the second has no PID,
MSGID and SD so these fields contain a '-'.)
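
As a rough illustration of how such a line is put together, the following sketch joins the fields with single spaces and substitutes '-' for every missing one. The names or_nil() and format_line() are made up for this example and are not part of any syslog API; any header in front of the timestamp is left out, as in the examples above.

```c
#include <stdio.h>

/* Hypothetical helper: map a missing or empty field to the NILVALUE. */
static const char *or_nil(const char *s)
{
    return (s != NULL && *s != '\0') ? s : "-";
}

/* Hypothetical helper: build one log line in the field order used in
 * the examples above. Returns what snprintf() returns. */
int format_line(char *buf, size_t len,
                const char *timestamp, const char *host, const char *tag,
                const char *pid, const char *msgid, const char *sd,
                const char *msg)
{
    return snprintf(buf, len, "%s %s %s %s %s %s %s",
                    or_nil(timestamp), or_nil(host), or_nil(tag),
                    or_nil(pid), or_nil(msgid), or_nil(sd),
                    msg != NULL ? msg : "");
}
```

Calling format_line() with NULL for the PID, MSGID and SD arguments reproduces the second example line.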


= Chosen Behavior =

Now there are obviously two goals in implementing this:
1. Keep the syslog(3) API,
2. Provide easy access to the new MSGID and SD fields.

My approach is to detect syslog-protocol messages by looking at the
first words of a message and test whether they have the syntax of MSGID
and SD fields. The SD is the important field here since its syntax with
brackets and quotes should be complex enough to distinguish old and new
style messages. The MSGID is just one optional ASCII word before the SD.

If a possible SD and optionally a MSGID is found then they are put into
the corresponding protocol fields; otherwise the whole text is seen as
the message and MSGID and SD fields are left empty, i.e. filled with the
NILVALUE '-'.
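
The detection described above could be sketched roughly as follows. This is not the code from the syslog.c linked in the references, only a minimal illustration under some assumptions: the struct and function names are invented here, names and the MSGID are limited to 32 characters (taken from a reading of the draft's grammar), and a parameter value may contain backslash-escaped characters.

```c
#include <stddef.h>

/* Illustrative result type. A NULL pointer stands for the NILVALUE '-';
 * non-NULL pointers refer into the input string. */
struct split_msg {
    const char *msgid;  size_t msgid_len;
    const char *sd;     size_t sd_len;
    const char *msg;    /* remaining text, possibly empty */
};

/* Printable US-ASCII without space. */
static int is_print_nosp(char c) { return c > 32 && c < 127; }

/* Length of an SD-ID or parameter name, 0 if there is no valid one. */
static size_t sd_name_len(const char *p)
{
    size_t n = 0;
    while (is_print_nosp(p[n]) && p[n] != '=' && p[n] != ']' && p[n] != '"')
        n++;
    return (n >= 1 && n <= 32) ? n : 0;
}

/* Length of one [id name="value" ...] element, 0 if p is not one. */
static size_t sd_element_len(const char *p)
{
    size_t i, n;

    if (p[0] != '[' || (n = sd_name_len(p + 1)) == 0)
        return 0;
    i = 1 + n;
    while (p[i] == ' ') {
        if ((n = sd_name_len(p + i + 1)) == 0)
            return 0;
        i += 1 + n;
        if (p[i] != '=' || p[i + 1] != '"')
            return 0;           /* value must be quoted */
        i += 2;
        while (p[i] != '"') {
            if (p[i] == '\\' && p[i + 1] != '\0')
                i++;            /* skip the escaped character */
            if (p[i] == '\0')
                return 0;       /* unterminated value */
            i++;
        }
        i++;                    /* closing quote */
    }
    return (p[i] == ']') ? i + 1 : 0;
}

/* Length of an SD field: '-' or one or more elements, followed by a
 * space or the end of the string. Returns 0 if there is none. */
static size_t sd_field_len(const char *p)
{
    size_t i = 0, n;

    if (p[0] == '-')
        i = 1;
    else
        while ((n = sd_element_len(p + i)) > 0)
            i += n;
    return (i > 0 && (p[i] == ' ' || p[i] == '\0')) ? i : 0;
}

/* Split text into MSGID, SD and message as described above. */
void split_message(const char *text, struct split_msg *out)
{
    size_t w = 0, n;
    const char *p = text;

    out->msgid = out->sd = NULL;
    out->msgid_len = out->sd_len = 0;
    out->msg = text;            /* default: message only */

    if ((n = sd_field_len(p)) == 0) {
        /* No SD at the start, so try "MSGID SP SD". */
        while (is_print_nosp(text[w]))
            w++;
        if (w == 0 || w > 32 || text[w] != ' ')
            return;
        p = text + w + 1;
        if ((n = sd_field_len(p)) == 0)
            return;
        out->msgid = text;
        out->msgid_len = w;
    }
    if (p[0] != '-') {          /* '-' is an empty SD and stays NULL */
        out->sd = p;
        out->sd_len = n;
    }
    out->msg = (p[n] == ' ') ? p + n + 1 : p + n;
}
```

The SD is tried first and the MSGID only as a single word directly in front of it, so a text without a valid SD is always passed through unchanged.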

Some examples of intended usage:
syslog(LOG_INFO, "%s", "hello world");
    --> 'normal' behaviour, message only
syslog(LOG_INFO, "%s", "[ID@0 key=\"value\"] hello world");
    --> with SD
syslog(LOG_INFO, "%s", "hw [ID@0 key=\"value\"] hello world");
    --> with MSGID and SD
syslog(LOG_INFO, "%s", "hw - hello world");
    --> with MSGID and empty SD
syslog(LOG_INFO, "%s", "[ID@0 key=value] world");
    --> due to missing quotes no valid SD, so message only
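
Since the quotes decide whether an SD is recognized, a caller that puts variable data into a parameter value has to keep that value from breaking the syntax. A minimal sketch, assuming the draft's rule that '"', '\' and ']' inside a value are escaped with a backslash; sd_escape() and log_with_user() are hypothetical helpers, and "ID@0" and "user" are just sample names:

```c
#include <stddef.h>
#include <syslog.h>

/* Hypothetical helper: copy value into buf with a backslash in front of
 * '"', '\\' and ']'. Returns the escaped length, or -1 if buf is too
 * small. */
int sd_escape(char *buf, size_t len, const char *value)
{
    size_t o = 0;

    if (len == 0)
        return -1;
    for (; *value != '\0'; value++) {
        int esc = (*value == '"' || *value == '\\' || *value == ']');

        /* Keep room for this character and the terminating NUL. */
        if (o + (esc ? 2 : 1) >= len)
            return -1;
        if (esc)
            buf[o++] = '\\';
        buf[o++] = *value;
    }
    buf[o] = '\0';
    return (int)o;
}

/* Hypothetical caller: log a message with one SD parameter. */
void log_with_user(const char *user, const char *text)
{
    char val[128];

    if (sd_escape(val, sizeof(val), user) >= 0)
        syslog(LOG_INFO, "[ID@0 user=\"%s\"] %s", val, text);
    else
        syslog(LOG_INFO, "%s", text);   /* fall back to message only */
}
```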


= Possible Problems =

There probably are a few existing messages which will be misinterpreted.
The biggest problems are minus signs ('-') treated as empty fields and
single words or IPs in brackets treated as SDs.
Examples:
syslog(LOG_INFO, "%s", "value1 - value2 + value3");
    --> with MSGID and empty SD
syslog(LOG_INFO, "%s", "- value2 + value3");
    --> without MSGID and with empty SD
syslog(LOG_INFO, "%s", "hello [2001:db8::1428:57ab]");
    --> with MSGID and SD, no message field

On my own systems I have found such misinterpreted messages from only one program (eAccelerator).

Consequences:
I do not consider these misinterpretations a big problem because they should be easy enough to fix and have no immediate consequences. When written to a logfile, the last example above will just result in the log line:
2008-07-26T12:03:42+02:00 host.example.org tag 123 hello [2001:db8::1428:57ab]
instead of:
2008-07-26T12:03:42+02:00 host.example.org tag 123 - - hello [2001:db8::1428:57ab]

So all text-based processing (like "grep 'hello \['") will continue to
work as before. The difference will only show in more advanced log
processing when the fields are parsed separately and/or stored in
databases instead of text files.


= Alternatives =

Besides ignoring the syslog-protocol fields, the alternative is to define a new flag for openlog() to enable the new behaviour. This would be a system-specific extension, leading to portability issues and leaving little incentive to actually use it.


= References =

- syslog-protocol: http://tools.ietf.org/html/draft-ietf-syslog-protocol
- Code is available at: http://barney.cs.uni-potsdam.de/svn/syslogd/trunk/src/libc_gen/syslog.c
- Diff: http://barney.cs.uni-potsdam.de/trac/syslogd/changeset?new=trunk/src/libc_gen/syslog.c%40109&old=trunk/src/libc_gen/syslog.c%401

--
Martin


