Re: NetBSD Audio specification 2018
Attached is an amended mdoc spec.
No details pertaining to audio from the first posting have been changed, only
On Mon, 7 May 2018 05:08:11 Nathanial Sloss wrote:
> I've attached the specification for the audio mixer in mdoc format along
> with a patch for NetBSD-current which I will pull up to -8 which makes
> audio.c conformant with this specification.
> Best regards,
.\" Copyright (c) 2016 - 2018 Nathanial Sloss <nathanialsloss%yahoo.com.au@localhost>
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.Dd May 4, 2018
.Dt AUDIO 7
.Nd NetBSD In-kernel audio mixer specification
This document aims to describe all aspects of the in-kernel audio mixer
included with NetBSD-8 and onwards,
describing its current behavior as of 2018.
.Sh VIRTUAL CHANNEL (VCHAN):
This is the most fundamental element of the mixer.
The vchan has all of the properties of the traditional single open NetBSD
It consists of playback and record rings along with audio_info structures.
Upon opening of /dev/audio or /dev/sound a new vchan and mixerctl structure is
In the case of /dev/sound audio_info structures are inherited from the last
open of /dev/audio or /dev/sound.
All vchans are up or down sampled into the mix ring (intermediate) format
before being sent to hardware.
It is described in the following diagram:
VCHAN2-------------MIX RING ---- HARDWARE
In the case of sysctl usemixer=0 (see below) there is only one vchan whose play
and record rings are the hardware play/record rings.
User accessible vchans are numbered starting at one (1).
Vchan 0 is used internally by the mixer for the mix ring and its ring buffers
are not user accessible.
The only limit to the number of open vchans is the speed of the computer and the
number of free file descriptors.
.Sh BLOCK - SIZE / LATENCY:
A block of audio data is the basic unit for audio data.
Audio applications will not commence play back until three (3) blocks have been
written - this is the source of latency in the mixer along with the size of the
audio data block.
For normal audio read/write use there will be three blocks of audio data before
play back commences: one in the vchan, one in the mix ring and one in the
The size of the audio data block is dependent on the audio format configured
by the application, the latency sysctl, and the underlying audio hardware.
Some audio hardware devices only support a static block size, as such the
overall latency of the mixer for these devices cannot be changed.
Other devices such as those supported by hdaudio allow the hardware block size
to be changed, allowing the latency of the mixer to change from 4
milliseconds (ms) to 128 ms with the mixer intermediate format being 16 bit,
stereo, 48 kHz.
With regard to mmapped audio, blocks are played back immediately so the latency
presented to applications is one third of the latency sysctl value.
Latency can be calculated by the following formula:
.Dl Latency (ms) = (blocksize(bytes) * num blocks * 1000) / (freq(Hz) * bytes per sample * channels)
Latency in the mixer and latency presented to audio applications are consistent:
they will be the same regardless of the audio format requested.
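The formula above can be sketched in C as follows.
This is only an illustrative calculation and is not taken from audio.c; the function names latency_ms and blocksize_for_ms are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Latency in milliseconds for num_blocks blocks of blocksize bytes each,
 * following the formula in this section.  Hypothetical helper name.
 */
uint32_t
latency_ms(uint32_t blocksize, uint32_t num_blocks, uint32_t freq_hz,
    uint32_t bytes_per_sample, uint32_t channels)
{
	/* Bytes consumed per second of audio in the given format. */
	uint64_t bytes_per_sec =
	    (uint64_t)freq_hz * bytes_per_sample * channels;

	assert(bytes_per_sec > 0);
	return (uint32_t)((uint64_t)blocksize * num_blocks * 1000 /
	    bytes_per_sec);
}

/*
 * Inverse of the above for a single block: the number of bytes in a
 * block that lasts block_ms milliseconds.  Hypothetical helper name.
 */
uint32_t
blocksize_for_ms(uint32_t block_ms, uint32_t freq_hz,
    uint32_t bytes_per_sample, uint32_t channels)
{
	return (uint32_t)((uint64_t)freq_hz * bytes_per_sample * channels *
	    block_ms / 1000);
}
```

For example, with a 16 bit, stereo, 48 kHz format a 768 byte block holds 4 ms of audio, so latency_ms(768, 3, 48000, 2, 2) gives 12 and latency_ms(768, 1, 48000, 2, 2) gives 4, which is the one third ratio described above for mmapped audio.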
The default latency configured at boot time is 150ms subject to the above
.Sh ADDED IOCTLS:
Two new ioctls have been added to accommodate mixing of multiple vchans:
.Bl -tag -width indent
.It Ar AUDIO_SETCHAN:
Allows setting the target vchan to operate on for subsequent
.It Ar AUDIO_GETCHAN:
Returns the current vchan number.
These ioctls were necessary as some audio applications like to open an audio
device and an audioctl device so as to check on buffer usage, samples played, etc.
As opening an audioctl device would result in a new vchan being created, these
ioctls allow setting the target vchan and audio_info structure to that of an
.Sh MIXERCTL INTERFACE / SOFTWARE VOLUME:
Mixerctl structures are allocated when a new vchan is created.
The mixer control structure allows for setting the software volume for play
back (vchan.dacN) or recording (vchan.adcN).
These are 8 bit values and the value is applied during mixing into the mix
The software volume is applied to all channels (1, 2, 4 etc) in the vchan and at
present (2018-05-04) there are no balance controls.
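As a rough illustration of applying an 8 bit software volume to every channel of a frame while mixing, consider the sketch below.
The scaling rule (sample * volume / 255), the clamping, and all function names are assumptions made for this example and are not taken from audio.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scale one signed 16 bit sample by an 8 bit volume.
 * ASSUMPTION: 0 mutes and 255 leaves the sample unchanged.
 */
int16_t
apply_volume(int16_t sample, uint8_t vol)
{
	return (int16_t)((int32_t)sample * vol / 255);
}

/* Apply the same volume to every channel of one frame (no balance). */
void
apply_volume_frame(int16_t *frame, unsigned int channels, uint8_t vol)
{
	unsigned int i;

	assert(frame != NULL);
	for (i = 0; i < channels; i++)
		frame[i] = apply_volume(frame[i], vol);
}

/*
 * Add a volume scaled sample to a mix value, clamping the sum to the
 * signed 16 bit range instead of letting it wrap around.
 */
int16_t
mix_sample(int16_t mix, int16_t sample, uint8_t vol)
{
	int32_t sum = (int32_t)mix + apply_volume(sample, vol);

	if (sum > INT16_MAX)
		sum = INT16_MAX;
	if (sum < INT16_MIN)
		sum = INT16_MIN;
	return (int16_t)sum;
}
```

Because apply_volume_frame uses one volume for all channels, the sketch matches the statement above that there are no balance controls.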
The first vchan corresponds to the vchan.dac1/adc1 mixer controls.
Each vchan mixer control only has effect upon its own vchan's volume, and writing
to the outputs.master (or equivalent) control is required to change the volume of the
Mixer controls are only present whilst the vchan is in use and numbering starts
Mixer control numbers (i.e. dac1/adc1) correspond to their vchan number.
.Sh AUDIOCTL / AUDIO_INFO INTERFACE:
Audioctl allows access to the audio_info structure of a given device.
Due to the audio mixer a
.Fl p
switch was added to allow access to a given vchan's audio_info structure.
The values for -p are numbered starting at one (1).
Not specifying -p will result in working with a new vchan and this is only
desired when the next audio open is to be /dev/sound, i.e.:
.Dl audioctl -w play.gain=120
.Dl open /dev/sound this will have an initial software volume level of 120.
The parameters for play back and recording only affect the particular vchan
being operated on (gain, sample rate, channels, encoding etc), except -p 0 (the
Specifying -p 0 will display the audio parameters of the mix ring and allow
setting the hardware gain and balance.
.Sh ADDED SYSCTLS
With the introduction of the audio mixer the following sysctls have been added:
.Bl -tag -width indent
.It Ar hw.driverN.frequency:
.It Ar hw.driverN.precision:
.It Ar hw.driverN.channels:
Intermediate mixing format.
.It Ar hw.driverN.latency:
Expressed in milliseconds.
.It Ar hw.driverN.multiuser:
Off/On (0/1) defaults to off.
This sysctl determines if multiple users are allowed to access the sound
The root user is always allowed access (i.e. for wsbell).
The first user to open the audio device has full control of the audio device
if this sysctl is set to off.
There currently is an outstanding PR about affecting a privileged process -
Ideally if root intervenes with the audio device, it should do so unaffected.
If this control is set to on, then all users' audio data are mixed and all users
have access to the audio hardware.
.It Ar hw.driverN.usemixer:
Off/On (0/1) defaults to on.
This sysctl enables or disables the audio mixer.
When set to off the audio device can support only one vchan.
This vchan's play and record ring buffers are the hardware ring buffers.
This option was added to aid older/slower systems where the extra overhead of
the audio mixer might pose a problem.
.Sh INTERMEDIATE / MIXING FORMAT:
The initial concept was to handle incoming audio data similarly to that of a
superheterodyne radio receiver:
.Dl RF -> IF -> AF
So the corresponding mixing concept is:
.Dl vchan -> mixing format -> hardware
The sysctls described above determine the format for mixing.
All vchans are up or down sampled to this format before mixing takes place.
On most systems this defaults to 16 bit stereo 48kHz.
The sysctls governing the mixing format may only be changed when there are no
vchans in use.
On faster systems the precision (8, 16, 32 bits) may be changed along with the
sample rate and number of channels (mono, stereo, 4 etc).
On older/slower systems utilizing audio mixing it may be required to lower the
quality of this format to ease the amount of data processing whilst mixing.
All possible audio formats (mulaw, alaw, slinear, ulinear, 8, 16 and 32 bit
precision) are converted for use by the audio mixer.
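As an illustration of one such conversion, the sketch below decodes 8 bit mulaw samples into signed 16 bit linear samples using the standard G.711 expansion.
It is not the code used by the mixer or by the auconv filters, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one G.711 mulaw byte into a signed 16 bit linear sample. */
int16_t
mulaw_to_slinear16(uint8_t u)
{
	int sign, exponent, mantissa, sample;

	u = (uint8_t)~u;		/* mulaw bytes are stored inverted */
	sign = u & 0x80;
	exponent = (u >> 4) & 0x07;
	mantissa = u & 0x0f;

	/* Rebuild the magnitude, then remove the encoder bias (0x84). */
	sample = ((mantissa << 3) + 0x84) << exponent;
	sample -= 0x84;

	return (int16_t)(sign ? -sample : sample);
}

/* Convert a block of n mulaw samples into a signed 16 bit buffer. */
void
mulaw_block_to_slinear16(const uint8_t *src, int16_t *dst, size_t n)
{
	size_t i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < n; i++)
		dst[i] = mulaw_to_slinear16(src[i]);
}
```

A complete conversion into the mixing format would also need to resample from the source rate (for example 8000 Hz) to the mixing rate and duplicate mono data into each channel, which this sketch leaves out.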
.Sh MEMORY MAPPED PLAY BACK
It is possible to use mmap for audio playback, achieving reduced latency.
However, the audio application's selected format must match the mixing/
intermediate format (see above).
It is possible to obtain the audio_info for vchan0, which contains the
intermediate/mixing format, to ease configuring applications for mmapped audio.
At present most applications don't use the mix ring's audio_info structure to
obtain the required play back parameters and some user intervention
is required to set the audio format for the application.
.Sh HARDWARE DRIVER REQUIREMENTS:
Audio mixing requires signed linear support in the host's endianness.
Driver authors should support slinear_le and slinear_be formats.
If the audio hardware is intended to be used with the mixer disabled, mulaw,
1 channel, 8000 Hz also needs to be supported.
This is easily achievable with the auconv framework/filters.
All new drivers should consider the use of auconv where possible.
.An Nathanial Sloss
.Sh SPECIAL THANKS
Great appreciation goes to Onno van der Linden, isaki@, maya@, jmcneil@,
pgoyette@, mrg@, riastradh@ and christos@; without their input this code would
not be what it is currently.