Subject: Re: [long] Re: RFC: merge chap-midi branch
To: Alexandre Ratchov <alex@caoua.org>
From: Chapman Flack <nblists@anastigmatix.net>
List: tech-kern
Date: 06/29/2006 22:18:09
Alexandre Ratchov wrote:
> I really like these one-liners. But imho, users may want to have
> something as close as possible to the "midi 1.0 specification". This
> spec has been around for more than 20 years and is still actively
> used by modern hardware. It is also mature and quite well documented.
> ...
> - apps that want something close to the "midi 1.0 specification" should
>   have access to the byte stream (/dev/rmidi or similar)

You come close here to suggesting that what midi(4) provides is
distant from the MIDI specification; if so, we may be failing to
communicate.

When you open an rmidi device, what you are allowed to write on it is
precisely what would be valid to write on a MIDI 1.0 DIN cable. Your
write will incur EPROTO if-and-only-if you attempt to write anything
else.

Nothing that you read from the device will be in any format or have any
meaning other than as defined in the MIDI spec.
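
For concreteness, a minimal sketch (the device path and the choice of
message are illustrative):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        /* one complete message: note-on, channel 1, middle C, vel 64 */
        unsigned char noteon[] = { 0x90, 0x3c, 0x40 };
        int fd = open("/dev/rmidi0", O_WRONLY);

        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (write(fd, noteon, sizeof noteon) < 0 && errno == EPROTO)
            fprintf(stderr, "rejected: not valid MIDI 1.0 data\n");
        close(fd);
        return 0;
    }

Writing something no DIN cable could carry (a stray data byte with no
status byte in effect, for instance) is what earns the EPROTO.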

With those points quickly out of the way, I think I begin to better
see where you are coming from, after looking more closely at the
OpenBSD sources. Lennart's work was the origin of both the NetBSD
and the OpenBSD MIDI stack, and was crafted to provide at least
partial ABI compatibility with applications developed for OSS. We
should keep in mind, when we speak of a "MIDI" API or ABI, that the
MMA has never defined any such thing, but merely a set of messages
and what they mean (and one possible physical layer representation).
All of the details of /how/ you send and receive them (whether reads
and writes on device nodes are used at all, or instead, say, a new
socket protocol family) still have to be specified. In our case,
Lennart followed OSS prior art.

That's important because it establishes that this interface was
never an empty byte-pipe down to a UART. In fact, for all of the talk
about "raw," the device nodes used in OSS are not named with an 'r',
and I am not exactly sure why Lennart added one. Maybe he remembers. :)

A significant change took place in OpenBSD almost exactly two years ago,
when a major rewrite of its midi stack did away with virtually all of
the protocol processing done in midi(4) and introduced instead the
empty-pipe semantics that you are advocating here. A later corresponding
update to OpenBSD's midi.4 man page changed the title from
"device-independent MIDI driver layer" to "raw device independent
interface to MIDI ports" and added the language "Data received on the
input port is not interpreted ... data issued by the user program is
sent as-is...."
The source of the changes was Alexandre Ratchov.

So the question we are talking about is not so much:

  Why did Chap diverge from an empty-pipe "raw" semantics?

as:

  Why did Chap choose to refine the existing OSS semantics rather than
  to introduce the raw model brought by Alexandre into OpenBSD?

... and we've covered some of the reasons earlier in this thread.

> If you make /dev/rmidiN event-driven, there will be 2 event-driven
> APIs (/dev/rmidi and /dev/sequencer) and no simple byte stream API,
> as described in the midi spec.
> 
> - naive apps that don't want to parse the byte stream should use the
>   event-oriented device (/dev/sequencer)
>
> From the application's point of view, parsing the stream is not hard.
> It would be sad to disable access to the byte stream because some
> buggy apps are misusing /dev/rmidiN instead of using the event API
> (/dev/sequencer).

Some things seem to need untangling here. The sequencer is a timing and
demultiplexing agent for MIDI messages. An app that wants those services
should use it. An app that simply wants to produce or consume a stream
of MIDI messages will quite appropriately use the rmidi devices.
Communication with the sequencer involves passing event structs defined
in sys/midiio.h; nothing of the sort happens with the rmidi ports, and
MIDI messages are written and read without any extraneous goo. I am not
sure why you call that "event-driven", or where you get the idea to call
an app "buggy" or say it is "misusing" an rmidi port if it has only MIDI
messages to read and write and no need for sequencer services.
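
To make "without any extraneous goo" concrete, here is roughly the
parsing an app reading an rmidi port might do, whether or not the
driver canonicalizes the stream first. This is only a sketch (names
are mine, and it punts on system common and exclusive messages), but
it shows that tracking status bytes and running status is a few lines
of work:

    #include <stdio.h>

    /* data bytes expected for channel-message status nibbles 0x8..0xe */
    static const int datalen[] = { 2, 2, 2, 2, 1, 1, 2 };

    /* Feed one byte as it arrives from read(2); print each complete
     * channel message. */
    void
    feed(unsigned char b)
    {
        static unsigned char status, data[2];
        static int have;
        int i;

        if (b >= 0xf8)          /* real-time: never breaks a message */
            return;
        if (b & 0x80) {         /* a status byte */
            status = b < 0xf0 ? b : 0;  /* system msgs cancel status */
            have = 0;
            return;
        }
        if (status == 0)        /* stray data byte */
            return;
        data[have++] = b;
        if (have == datalen[(status >> 4) - 8]) {
            printf("%02x", status);
            for (i = 0; i < have; i++)
                printf(" %02x", data[i]);
            printf("\n");
            have = 0;           /* but keep status: running status */
        }
    }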

[There /is/ event-like functionality coming to the [r]midi devices
in OSS4, and I think we should compatibly support it when it does,
because it will have advantages over the current sequencer. But that
is not what we are talking about now; it is not yet implemented, and
will never be the default; an explicit ioctl requests it.]

Perhaps you are thinking that the reason my change for Active Sense
handling differed from yours was a concession to "buggy" apps. Clearly
we both saw that the original OSS approach had to change: you can't just
swallow the message and leave the app without any way to find out when
communication fails. We chose different ways to fix that, yours by
passing the keepalives unprocessed for the app to do its own bookkeeping
and timeout, mine by telling the app if they've stopped.

Strictly, both are changes to the original behavior: yours exposes the
app constantly to a change in its possible input; mine does only when
something bad has already happened. I do not think of this as catering
to buggy apps, because it would be gratuitous to call an OSS app buggy
for expecting OSS input, and I may not assume any OSS app has been well
tested against input OSS would not produce. So I looked for an approach
with low probability of breaking an OSS app, that would also be easy to
write new code for, and yield reasonable behavior with simple shell
tools and the like.
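
Seen from a portable app, the divergence is small. A sketch (the
function name and the coarse one-second check are mine; the spec
expects a keepalive at least every 300 ms):

    #include <time.h>
    #include <unistd.h>

    /* Return 1 when an active-sensing peer should be presumed gone. */
    int
    scan_input(int fd, unsigned char *buf, size_t len)
    {
        static time_t last_fe;
        ssize_t n, i;

        n = read(fd, buf, len);
    #ifdef __NetBSD__
        if (n == 0)             /* midi(4) noticed the silence for us */
            return 1;
    #else                       /* raw stream: do our own bookkeeping;
                                   a real app would poll(2) with a
                                   timeout rather than block in read */
        for (i = 0; i < n; i++)
            if (buf[i] == 0xfe)
                last_fe = time(NULL);
        if (last_fe != 0 && time(NULL) - last_fe > 1)
            return 1;           /* coarse; the spec allows ~300 ms */
    #endif
        /* ... hand the n bytes to the MIDI parser here ... */
        return 0;
    }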

Clearly, OSS legacy considerations were of less interest to you in
your changes for OpenBSD, and I respect that. It's just a different
approach.

> Finally, note that some other OSes use the byte stream for their raw
> midi devices, so from a compatibility point of view it would be nice
> if midi(4) could provide the raw midi byte stream.

I can see the appeal for you in wider adoption of the "raw" semantics
you introduced in OpenBSD, but I believe there are disadvantages also.

In essence, we both began with an implementation where message transfer
coding was partially implemented in midi(4) and partially in the
surrounding components (umidi, sequencer, midisyn...). I completed the
implementation in midi(4) so it did not need to be anywhere else; you
removed it from midi(4) to rely on it everywhere else (modulo that
midi_toevent happens to be in midi.c rather than sequencer.c).

That architectural choice can be viewed from different angles. One that
interests me is maintainability, or the ease of correctly implementing
new functionality. For example, are you convinced that OpenBSD's umidi
driver will produce class-compliant USB traffic if the writing process
uses hidden note-off? I'm not. I don't mean to say it would be difficult
to do; my real point is, the next time someone writes another driver for
a new MIDI link type, will the question arise again?
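
To pin down what I mean (the bytes are just the standard encoding of
the technique): with running status, a note-on at velocity 0 serves as
a note-off, so two messages can share one status byte.

    /* five bytes, two complete messages */
    unsigned char run[] = {
        0x90, 0x3c, 0x40,   /* note-on, channel 1, middle C, vel 64 */
              0x3c, 0x00    /* "hidden note-off": velocity 0, reusing
                               the 0x90 status byte */
    };

A driver handed those five bytes must still locate the message
boundary before it can frame them for a different link type.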

A related question applies to the overall model. The midi.4 man page
on obsd reads "data issued by the user program is sent as-is to the
output port." Do you think you can make that guarantee for every type of
link? Does it depend on what you call "the output port"? What do you
think a class-compliant USB MIDI adapter might do to your data stream,
if it has a MIDI 1.0 DIN on the far side?  Does the USB MIDI protocol
give you any way to control that?
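
For concreteness, and because it bears on the "as-is" wording: the USB
MIDI class moves data in 4-byte event packets, each framing one
complete message, so running status cannot survive the crossing. A
sketch (the struct and its field names are mine, not from any header;
the layout is the class spec's 32-bit USB-MIDI Event Packet):

    struct usb_midi_pkt {
        unsigned char cn_cin;      /* cable number | code index number */
        unsigned char midi[3];     /* one complete message, padded */
    };

    /* the five-byte run above crosses USB as two packets, the
     * status byte reinstated in each (CIN 0x9 = note-on): */
    struct usb_midi_pkt two[] = {
        { 0x09, { 0x90, 0x3c, 0x40 } },
        { 0x09, { 0x90, 0x3c, 0x00 } },
    };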

The premise seems to be that any present or future type of link looks or
should be made to look like a 1.0 current loop. I wonder if that is a
solid foundation. I would rather assume that they all carry MIDI
messages, but not constrain them further.

But in the end, after all of this, I expect the difference between
our implementations to be barely visible to the great majority of
apps. OK, an app that wants to detect Active Sense timeout on obsd will
keep track of 0xfe's and set a timeout, and on nbsd will watch for a
zero read. Sounds like an #ifdef, as in the sketch above. Apps
developed for nbsd will not need
complete logic to deal with 1.0 transfer encoding, but nothing suffers
if they have it. There may be some set of tasks for which nbsd's
approach proves more convenient (say, can be done with Unix tools
or very simple code), and some for which obsd's proves so (say,
umm, MIDI 1.0 steganography? :).  I have my opinions on which
architecture will be easier to develop further, but those are worth
what you paid for 'em. Anyway, that's what a marketplace of ideas is
about, right?

-Chap