I work on a multiplatform (Windows/OS X/Linux) video application that uses portaudio for playing back audio. I've been investigating an intermittent bug that causes hangs when playing back video with audio on Linux. I was hoping someone on this list might be able to give me some advice or insight into what is going on at the portaudio level, as the hang seems to be in Pa_StopStream. I'm not very familiar with audio programming, so apologies for any obvious things I might have missed.
The general problem is that our application can hang when trying to stop the audio stream. It's quite intermittent and machine-specific. I can reproduce it semi-reliably on my machine, which has the following setup:
Distro - RHEL 6.4
Kernel - 2.6.32-358.11.1.el6.x86_64
Alsa version - 1.0.21
Audio device - HDA Intel
We're using portaudio v19_20111221 with the callback interface. If I inspect the program with gdb once it has hung, I see that the main thread is waiting for a thread to join:
#0 0x000000352600822d in pthread_join () from /lib64/libpthread.so.0
#1 0x00007fa2af795931 in PaUnixThread_Terminate (self=0x84fee78, wait=<value optimized out>, exitResult=0x7fffb80a0d1c) at src/os/unix/pa_unix_util.c:441
#2 0x00007fa2af78e7d5 in RealStop (stream=0x84fece0, abort=<value optimized out>) at src/hostapi/alsa/pa_linux_alsa.c:3047
Presumably it's waiting on the portaudio thread, which is doing the following:
#0 0x00000035254df0d3 in poll () from /lib64/libc.so.6
#1 0x00007fa2af790d4a in PaAlsaStream_WaitForFrames (self=0x84fece0, framesAvail=0x7fa0a7ffee50, xrunOccurred=0x7fa0a7ffee58) at src/hostapi/alsa/pa_linux_alsa.c:3778
#2 0x00007fa2af791c59 in CallbackThreadFunc (userData=0x84fece0) at src/hostapi/alsa/pa_linux_alsa.c:4222
#3 0x00000035260079d1 in start_thread () from /lib64/libpthread.so.0
#4 0x00000035254e88fd in clone () from /lib64/libc.so.6
I can get the same hang in the patest_start_stop test, built from v19_20140130, when I have some other audio playing (normally Spotify running in Chrome). I've been using that test to try to debug things a bit. Here's what I've found:
* Stepping through the callback thread, it seems to be stuck in the while loop of CallbackThreadFunc, because framesAvail, set in PaAlsaStream_WaitForFrames, is always 0 and PaUtil_IsBufferProcessorOutputEmpty returns false.
* Stepping into PaAlsaStream_WaitForFrames -> PaAlsaStream_GetAvailableFrames -> PaAlsaStreamComponent_GetAvailableFrames, the call to alsa_snd_pcm_avail_update always returns 0.
* The poll in the while loop of PaAlsaStream_WaitForFrames returns 1.
* The alsa_snd_pcm_poll_descriptors_revents call, in PaAlsaStreamComponent_EndPolling, sets revents to 4 (POLLOUT).
* If I add a call to snd_pcm_state in PaAlsaStreamComponent_GetAvailableFrames, I get SND_PCM_STATE_RUNNING.
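To make the observations above concrete, here is a minimal C sketch of the kind of wake-up loop they describe. It is not portaudio or ALSA code. An always-writable pipe stands in for the PCM descriptor so that poll returns 1 with POLLOUT, and fake_avail_update is a hypothetical stub standing in for alsa_snd_pcm_avail_update returning 0. The iteration cap is also an assumption, added only so the sketch terminates; the loop described above has no such limit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for alsa_snd_pcm_avail_update() in the hung state:
 * the device keeps reporting that no frames can be written. */
static long fake_avail_update(void)
{
    return 0;
}

/* Model of the wait loop: poll() the descriptor, then ask how many frames
 * are available. Counts wake-ups that produced no frames and gives up after
 * max_iters iterations (an artificial limit for this sketch only). */
int count_empty_wakeups(int fd, int max_iters)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int empty = 0;

    assert(max_iters >= 0);
    for (int i = 0; i < max_iters; ++i) {
        int ready = poll(&pfd, 1, 100);        /* 100 ms timeout */
        if (ready != 1 || !(pfd.revents & POLLOUT))
            break;                             /* not the state being modelled */
        if (fake_avail_update() > 0)
            break;                             /* frames arrived, loop can progress */
        ++empty;                               /* woken up, but nothing to write */
    }
    return empty;
}

/* Runs the model against the write end of an empty pipe, which poll()
 * reports as writable. Returns -1 if the pipe cannot be created. */
int demo_empty_wakeups(int max_iters)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    int empty = count_empty_wakeups(fds[1], max_iters);
    close(fds[0]);
    close(fds[1]);
    return empty;
}
```

In this model every wake-up is an empty one, so demo_empty_wakeups(50) reports 50. That mirrors what I see in gdb: poll keeps returning as ready, yet the available frame count never rises above 0, so the loop never reaches its exit condition.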
I have no idea what causes it to get into this state, but I've noticed that if I change the suggestedLatency used in patest_start_stop from defaultLowOutputLatency to defaultHighOutputLatency, I no longer get the hang. Restarting the pulseaudio service also fixes the problem temporarily, but only for an hour or so; after that it starts happening again.
It seems possible to work around the hang by replacing the call to Pa_StopStream with Pa_AbortStream, but I'm a bit worried that the hang is actually a symptom of something else. Similarly, if I call AlsaRestart(stream) from gdb while it's hung, it comes back to life.
I've also noticed that in the cases where patest_start_stop doesn't hang, it plays audio right up to the point where Pa_StopStream is called, while in the cases where it hangs, it stops playing audio first and then hangs once Pa_StopStream is called. In both cases there are a lot of "ALSA lib pcm.c:7246:(snd_pcm_recover) underrun occured" messages and scratching sounds, which aren't there when defaultHighOutputLatency is used. Our application uses defaultLowOutputLatency but doesn't get any underrun messages or scratching sounds.
Thanks in advance,