Understanding low latency
Note: this is an old section of the old install guide that I'm keeping
around as the contents are still somewhat valid.
What we are trying to optimize is the latency of the whole
system. What is 'latency' in this context? Roughly speaking, it is the
time that elapses between a hardware device issuing a hardware
interrupt and the process that deals with it actually being run.
Let's assume we are talking about your sound card and that your
favorite player is playing a soundfile. The soundcard has several
internal buffers that have to be periodically filled by your program
to keep playback free of interruptions. When one of those buffers is
emptied, the card issues a hardware interrupt. The interrupt is
signalled on a pin in one of the sound card chips that ultimately links
to a pin in the processor inside your computer. It is supposed to
redirect the flow of instruction execution in the processor to an
interrupt handler routine that is programmed to deal with whatever the
interrupt is signalling (in this case, refilling a free buffer in the
sound card with new samples). Here is the first roadblock that can add
latency to the whole process. Interrupts have to be enabled before the
interrupt line can actually affect the flow of program execution. But
the Linux kernel needs to disable interrupts sometimes, when it is in
internal critical sections of code that cannot be interrupted. While the kernel
is good at keeping this time short, sometimes those internal sections
of code that need to be protected from interrupts are long and can
delay the interrupt for quite a while. One of the (many) potential
culprits is an unoptimized EIDE hard disk as, by default, the driver
is set to very conservative settings that can keep the interrupts
disabled for a long time. That makes it impossible to achieve low
latencies. This is one of the reasons why we need to 'tune' EIDE
disks.
So let's keep going. Assuming the interrupts are eventually enabled,
the system will jump to the interrupt routine which is normally very
short. Interrupts are disabled while inside the interrupt handler
(which can obviously reenable them if that is possible), so the driver
designers want to keep the code that executes in the handler as short
as possible. This is not the code that will be sending samples back to
the sound card! One of the actions that this code will take is to wake
up a process that will deal with the rest of the task to be done. This
is where the second potential roadblock to short latencies occurs.
The processor inside your computer is constantly switching between
many tasks. Just run ``ps auxw'' to see what is currently running on
your computer. Each of those entries represents a separate 'task' that
is sharing time slots of processor time. At any
given time most of those programs are sleeping, waiting for an event
that will wake them up. One of them is your playback program. Most of
the time it is doing nothing, just waiting for another buffer to be
available to be filled with samples. Actually your program (or maybe
just a thread within it, if it is multi-threaded) is blocked in an
ALSA library call, which in turn is blocked in a write or read to the
actual device in the sound driver, which in turn is sleeping, waiting
for an interrupt from the soundcard to wake it up (I think this is how
it works, wizards out there correct me if I'm missing something).
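As a rough illustration of how many tasks are asleep at any given moment, the sketch below tallies the state letter the kernel reports for each process. It assumes a Linux system with /proc mounted. The function name process_states is made up for this example, and the state letters (R for running, S for sleeping and so on) are the ones described in the proc man page.

```python
import os
from collections import Counter


def process_states(proc_root="/proc"):
    """Count processes by the one-letter state found in /proc/<pid>/stat."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        path = os.path.join(proc_root, entry, "stat")
        try:
            with open(path, errors="replace") as stat_file:
                stat = stat_file.read()
        except OSError:
            continue  # the process exited while we were looking
        # The command name is wrapped in parentheses and may itself contain
        # spaces, so split on the last ')' and take the field that follows.
        fields = stat.rpartition(")")[2].split()
        if fields:
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    for state, count in sorted(process_states().items()):
        print(state, count)
```

On a typical desktop most of the entries should show up under S, with only a handful under R.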
So the hardware interrupt handler will wake up the task that
ultimately will lead back to unblocking your application so that it
can supply the next bufferful of samples to the sound card. At the
time this happens the processor is probably busy with some other task,
and some time will elapse before your task goes from being awake to
actually running and doing useful work.
The kernel itself arbitrates this: each time its scheduler runs, it
checks for tasks that are ready to run (have been 'awakened'), finds
the one that has the highest priority and gives it the processor (this
is not the whole explanation, see the sched_setscheduler man page for
all the details). For this to happen the scheduler has to run. And it
is not running all the time. Getting the scheduler to run often enough
is the target of the low latency patch. Sometimes the
kernel needs to do lengthy tasks that are not broken up with scheduler
runs. If the scheduler does not run, your task does not get a chance
at grabbing the processor. If that time is long enough, all buffers
inside the soundcard empty and a dropout occurs. The wizards that
write the low latency patch try to identify those critical sections in
the kernel empirically, and insert scheduler calls to break them up
safely into shorter pieces, so that other tasks get a chance to
run. So having the low latency patch installed and enabled can help a
lot. Obviously Linux is not a hard real-time operating system and
there is no way to guarantee that your task will be awakened in time,
but in the real world it is good enough (a real-time OS like QNX would
be far more appropriate than Linux, Windows or MacOS for real-time
work).
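To put numbers on 'long enough': the time the sound card can survive without a refill is simply the amount of audio it holds. The figures below (two buffers of 64 frames at 48000 frames per second) are illustrative assumptions, not values from any particular card or driver, and buffered_ms is a name made up for this sketch.

```python
def buffered_ms(frames_per_buffer, buffers, sample_rate):
    """Milliseconds of audio held by the card when all its buffers are full."""
    return 1000.0 * frames_per_buffer * buffers / sample_rate


if __name__ == "__main__":
    # Assumed example: 2 buffers of 64 frames at 48 kHz.
    print(round(buffered_ms(64, 2, 48000), 2))    # 2.67
    # Larger buffers tolerate longer stalls at the price of more latency.
    print(round(buffered_ms(1024, 2, 48000), 2))  # 42.67
```

If the scheduler keeps your task off the processor for longer than this, the card runs dry and you hear a dropout.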
So now we have the interrupts disabled for the shortest possible time,
and the scheduler is running often enough so that the Linux kernel
itself does not introduce big latency hits every once in a while. But
that is not enough. If your playback program is not running with high
enough priority, it could happen that the Linux scheduler gives the
processor to some other task, and your playback program is stuck,
awake but powerless, waiting for the next scheduler run to happen (and
a chance to get the processor). Tasks have dynamic priorities assigned
to them (see the nice and renice utilities) and you could make your
task a high priority one and that would make things better. But even
that is not enough. Priorities for this scheduling policy are dynamic
and change over time. The more times the scheduler skips a task that
is ready to run, the more it raises that task's priority, so that
eventually it will run when the priority has gotten high enough. All
tasks can be
interrupted at any time by another higher priority task, and even if
your task has the highest dynamic priority, it will eventually lose to
another process, most probably at the worst time (can you hear the
click coming?). So what do we do now?
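For reference, the nice value behind those utilities can also be read and changed from inside a program. This sketch uses Python's os module on a Unix-like system, and try_renice is a made-up helper name. Lowering the nice value (that is, raising the priority) normally needs root privileges, so the helper reports a refusal instead of assuming success.

```python
import os


def current_nice():
    """Return the nice value of the calling process (-20 high .. 19 low)."""
    return os.getpriority(os.PRIO_PROCESS, 0)


def try_renice(increment):
    """Add increment to the nice value; return the new value, or None if refused."""
    try:
        return os.nice(increment)
    except PermissionError:
        return None  # unprivileged processes may not raise their own priority


if __name__ == "__main__":
    print("nice value now:", current_nice())
    print("after asking for -5:", try_renice(-5))
```

Even when the request is granted, the result is still only a hint to the normal scheduling policy, which is the limitation described above.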
The scheduler has three different ways of scheduling tasks, the
so-called scheduling policies. The normal scheduling policy (SCHED_OTHER)
works more or less in the way I have described so far. The scheduler
selects the next highest priority task to run and gives it a go, but
the scheduler can run again at any time (for example, it normally runs
every 10 msecs no matter what, triggered by the timer tick) and your
task can be interrupted and put temporarily back in the ready to run
list. But there are two additional scheduling policies designed for
real-time programs. Those are the First in-First out (SCHED_FIFO) and
Round Robin (SCHED_RR) scheduling policies. Very low latency audio
applications definitely have to be run with one of those scheduling
policies, otherwise it is impossible to attain reliable 'under load'
low latencies. The audio task has to have the highest priority no
matter what. But with that power comes a responsibility.
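The policies and their static priority ranges can be inspected without any special privileges. A small sketch, assuming Linux and Python's os module (the os.sched_* functions are not available on every platform). The names priority_range and current_policy_name are made up for this example.

```python
import os

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}


def priority_range(policy):
    """Return (lowest, highest) static priority accepted for a policy."""
    return os.sched_get_priority_min(policy), os.sched_get_priority_max(policy)


def current_policy_name():
    """Name of the scheduling policy of the calling process."""
    policy = os.sched_getscheduler(0)
    return POLICY_NAMES.get(policy, "policy %d" % policy)


if __name__ == "__main__":
    print("running under", current_policy_name())
    for policy, name in POLICY_NAMES.items():
        print(name, priority_range(policy))
```

An ordinary program started from a shell should report SCHED_OTHER, whose static priority range is a single value, while the two real-time policies offer a whole range of static priorities.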
A task with SCHED_FIFO policy has a static priority that will not be
altered by the scheduler and is higher than all other normal
SCHED_OTHER tasks. Furthermore, SCHED_FIFO tasks have to voluntarily
yield the processor back to the scheduler, either through a blocking
system call or by calling sched_yield. In other words, such a
task cannot be interrupted by any of the normal tasks that are running
in the Linux environment (except, of course, by a task running with
the same SCHED_FIFO policy and a higher static priority!). So, if your
program has a bug, gets into an infinite loop and does NOT yield back
to the scheduler the whole computer will freeze. It will not crash in
the absolute sense of the word. It is still running quite nicely, but
your task is using all the processor time and not yielding back to the
scheduler so that no other task gets a chance to run, not even the
kernel (and its scheduler). Ever. You have to power-cycle the whole
thing or press the reset button if you have one (I'm amazed at the
optimism of the hardware designers that do not include reset buttons
in their computers). Low latency real-time priority apps have to be
very well designed. Some spawn an additional process that runs
periodically with a higher SCHED_FIFO priority than the main task and
checks for a stuck process so that it can kill it, a watchdog
approach that saves you from a complete freeze.
Phew, that was long...
Summarizing: you need tuned drivers that do not disable interrupts for
long, low latency patches in the kernel so that the scheduler runs
often enough, and an application that runs with the SCHED_FIFO
scheduling policy so that it gets the best chance of grabbing the
processor when it needs it.
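Requesting that policy from within a program might look like the following. This is a minimal sketch, assuming Linux and Python's os module. The helper name try_realtime and the default priority of 10 are arbitrary choices for illustration, and the call normally succeeds only for root or for users who have been granted a real-time priority limit.

```python
import os


def try_realtime(priority=10):
    """Ask for SCHED_FIFO at the given static priority; return True on success."""
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    priority = max(lowest, min(priority, highest))  # clamp to the valid range
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False  # needs root or a suitable real-time priority limit
    return True


if __name__ == "__main__":
    if try_realtime():
        print("now running with SCHED_FIFO")
    else:
        print("no permission for SCHED_FIFO, staying with the normal policy")
```

Falling back quietly when permission is denied keeps the program usable, just without the latency guarantees. Keep the earlier warning in mind: code that busy-loops under this policy can lock up the machine.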
When everything is in place things work incredibly well. The system
can be running an audio task with no dropouts and a few milliseconds
of latency while the computer is being loaded with disk accesses,
screen refreshes and whatnot. The mouse gets jerky and windows update
very slowly, but there is not a dropout to be heard.
I wonder if anybody got this far :-)