Stop using paUtilVariableHostBufferSizePartialUsageAllowed in DirectSound #772

dechamps · 2023-02-25T12:36:25Z

This pull request does the following:

Changes the DirectSound host API code to use paUtilBoundedHostBufferSize instead of paUtilVariableHostBufferSizePartialUsageAllowed.
Removes paUtilVariableHostBufferSizePartialUsageAllowed as a buffer processor mode as it is now dead - DS was the only code using it.

The rationale behind the above is that paUtilVariableHostBufferSizePartialUsageAllowed appears to have no benefit over paUtilBoundedHostBufferSize, only downsides, and it causes problems with initialization/priming (see #770).

See individual commit messages for details.

This PR fixes #770.

RossBencina · 2023-02-28T00:53:24Z

To begin: this is a significant change to a core part of the code base and it's going to require a large investment of time to review it properly.

Ideally we need a test case that isolates and demonstrates the bug in the buffer processor.

paUtilVariableHostBufferSizePartialUsageAllowed appears to have no benefit over paUtilBoundedHostBufferSize

It may appear to have no benefit, and I will have to study this, but I doubt that I would have implemented it that way if it had no benefit whatsoever. Most likely that mode exists to avoid unnecessary data movement, e.g. in the case where the client requests a fixed size buffer.

At a certain time in the project history, when we had less strict review policies, there were a number of poorly reviewed changes to the buffer processor that were let through. So there is some chance that a bug was introduced that did not exist in the original stable version of the buffer processor.

dechamps · 2023-02-28T22:53:44Z

To begin: this is a significant change to a core part of the code base and it's going to require a large investment of time to review it properly.

Well to be fair, the change only affects DirectSound. Other Host APIs are not impacted because no other Host API is using this mode.

By the way, Ross, if you don't think you (or anyone else) would have the time to review this with a fine-tooth comb, then please consider my plea to give PRs the benefit of the doubt and approve it by default, instead of rejecting it by default. Your contributors will thank you.

Ideally we need a test case that isolates and demonstrates the bug in the buffer processor.

I did look for a test suite for the buffer processor that I could add tests to, but I didn't find one. It would be nice to have such a test suite, as the buffer processor is a nice self-contained piece of code with a clear interface boundary, making it ideal for unit testing. It would be a nice project, but I don't think it's reasonable to require this as part of this PR.

For what it's worth, the issue can be trivially reproduced through paloopback as shown in #770, albeit with the DirectSound Host API code in the middle.

It may appear to have no benefit, and I will have to study this, but I doubt that I would have implemented it that way if it had no benefit whatsoever.

One thing to note is that DS is not the only Host API that uses circular buffers, yet it's the only one that is using that mode. That might be an indication that this mode is not that useful.

Most likely that mode exists to avoid unnecessary data movement, e.g. in the case where the client requests a fixed size buffer.

I agree that this would make sense (it would avoid unnecessary copies if there is no need for sample conversion), but that's not what's happening in practice. paUtilBoundedHostBufferSize vs. paUtilVariableHostBufferSizePartialUsageAllowed only affects the behaviour of AdaptingProcess(), but AdaptingProcess() never passes host buffers directly to the stream callback - it always copies data into temp buffers first.

portaudio/src/common/pa_process.c

Lines 1423 to 1459 in 5b5feb3

    
           /* setup userInput */ 
        
           if( bp->userInputIsInterleaved ) 
        
           { 
        
               userInput = bp->tempInputBuffer; 
        
           } 
        
           else /* user input is not interleaved */ 
        
           { 
        
               for( i = 0; i < bp->inputChannelCount; ++i ) 
        
               { 
        
                   bp->tempInputBufferPtrs[i] = ((unsigned char*)bp->tempInputBuffer) + 
        
                           i * bp->framesPerUserBuffer * bp->bytesPerUserInputSample; 
        
               } 
        
               userInput = bp->tempInputBufferPtrs; 
        
           } 
        
           /* setup userOutput */ 
        
           if( bp->userOutputIsInterleaved ) 
        
           { 
        
               userOutput = bp->tempOutputBuffer; 
        
           } 
        
           else /* user output is not interleaved */ 
        
           { 
        
               for( i = 0; i < bp->outputChannelCount; ++i ) 
        
               { 
        
                   bp->tempOutputBufferPtrs[i] = ((unsigned char*)bp->tempOutputBuffer) + 
        
                           i * bp->framesPerUserBuffer * bp->bytesPerUserOutputSample; 
        
               } 
        
               userOutput = bp->tempOutputBufferPtrs; 
        
           } 
        
           /* call streamCallback */ 
        
           *streamCallbackResult = bp->streamCallback( userInput, userOutput, 
        
                   bp->framesPerUserBuffer, bp->timeInfo, 
        
                   bp->callbackStatusFlags, bp->userData );

Maybe it's worth exploring how AdaptingProcess() could be improved to directly feed host buffers to the stream callback under certain conditions, but right now, I think it's more urgent (and easier!) to fix the glitches. We can discuss such performance improvements in a separate issue.

dechamps · 2023-03-04T10:42:16Z

I did look for a test suite for the buffer processor that I could add tests to, but I didn't find one. It would be nice to have such a test suite, as the buffer processor is a nice self-contained piece of code with a clear interface boundary, making it ideal for unit testing. It would be a nice project, but I don't think it's reasonable to require this as part of this PR.

Filed #779 to track this separately.

This works around a DirectSound limitation where input host buffer sizes smaller than 31.25 ms are basically unworkable and make PortAudio hang. The workaround is to impose a minimal buffer size of 2*31.25 ms on input-only and full-duplex streams. This is enough for the read cursor to advance twice around the buffer, basically resulting in de facto double buffering. This change was tested with `paloopback` under a wide variety of half/full-duplex, framesPerBuffer, and suggested latency parameters. (Note the testing was done on top of PortAudio#772 as otherwise paloopback is not usable.) Fixes PortAudio#775

RossBencina · 2023-03-13T22:42:57Z

I'm assigning this to me for now. Hopefully I'll have time to look at it next week.

This works around a DirectSound limitation where input host buffer sizes smaller than 31.25 ms are basically unworkable and make PortAudio hang. The workaround is to impose a minimal buffer size of 2*31.25 ms on input-only and full-duplex streams. This is enough for the read cursor to advance twice around the buffer, basically resulting in de facto double buffering. This change was tested with `paloopback` under a wide variety of half/full-duplex, framesPerBuffer, and suggested latency parameters. (Note the testing was done on top of PortAudio#772 as otherwise paloopback is not usable.) Fixes PortAudio#775

Currently the DirectSound host API uses the buffer processor in the "paUtilVariableHostBufferSizePartialUsageAllowed" buffer size mode. What's special about this mode is that it allows the buffer processor to leave input frames in the host buffer if there are not enough frames available to fire the stream callback. In practice the buffer processor will only do this in full duplex mode. However, just because we *can* leave these frames in the host buffer, doesn't mean we *should*. It is not clear to me why the DS code was set up to leave data in the host input buffer. The reason stated in a code comment, "because DS can split the host buffer when it wraps around" doesn't make sense to me - the buffer processor seems perfectly capable of handling buffer split and wraparound (through functions such as PaUtil_Set2ndInputFrameCount()) regardless of the buffer size mode. While it is not clear what could be the upside of leaving data in the host buffer, the downsides seem evident: the longer data stays in the input buffer, the more likely it is to get trampled on by the capture cursor (buffer overflow). Copying the data out of the host input buffer and into the buffer processor temp buffer at the earliest opportunity seems like a safer option with no downside. Another downside of this mode is that currently, the behavior with respect to the host *output* buffer seems... confused, with the buffer processor writing a number of frames to the output host buffer that is inconsistent with the number of "frames processed" it returns to the DS code, leading to PortAudio#770. Arguably we could try to fix the buffer processor, but why fix a mode whose reason to exist is unclear in the first place? Another oddity of this mode is that it leads to the DS code locking the DS input and output buffers for read/write, but the buffer processor might then decide to do nothing with them. These DS Lock/Unlock calls are therefore wasted, reducing efficiency. This commit changes the mode to paUtilBoundedHostBufferSize. Not only does this fix PortAudio#770, it also has the nice side effect of having the buffer processor provision a temp buffer to match the host buffer size (if paFramesPerBufferUnspecified is used), avoiding unnecessary user buffer splitting. This change was tested using paloopback to test for glitches with various framesPerBuffer and suggested latency parameters, both in half duplex and full duplex mode. No regressions were observed, and paloopback results show that PortAudio#770 is fixed by this change. Fixes PortAudio#770

DirectSound was the only host API using this mode, and its usage was removed in the previous commit, making this mode dead code. As discussed in that commit the value proposition of this mode is unclear, and it does not behave correctly around stream start (priming) as shown in PortAudio#770.

RossBencina self-requested a review February 28, 2023 00:54

RossBencina added src-common Common sources in /src/common src-dsound MS DirectSound Host API /src/hostapi/dsound labels Feb 28, 2023

dechamps mentioned this pull request Mar 4, 2023

Add unit tests for buffer processor (pa_process) code #779

Open

dechamps mentioned this pull request Mar 6, 2023

dsound: clamp input buffer size at 62.5 ms #782

Open

RossBencina self-assigned this Mar 13, 2023

RossBencina added this to the V19.8 milestone May 24, 2024

dechamps added 2 commits May 25, 2024 20:40

dechamps force-pushed the dsoundbuf branch from e3cb7a4 to d7bc9c9 Compare May 25, 2024 19:40

RossBencina added the P2 Priority: High label Jul 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stop using paUtilVariableHostBufferSizePartialUsageAllowed in DirectSound #772

Stop using paUtilVariableHostBufferSizePartialUsageAllowed in DirectSound #772

dechamps commented Feb 25, 2023

RossBencina commented Feb 28, 2023 •

edited

Loading

dechamps commented Feb 28, 2023 •

edited

Loading

dechamps commented Mar 4, 2023

RossBencina commented Mar 13, 2023

Stop using paUtilVariableHostBufferSizePartialUsageAllowed in DirectSound #772

Are you sure you want to change the base?

Stop using paUtilVariableHostBufferSizePartialUsageAllowed in DirectSound #772

Conversation

dechamps commented Feb 25, 2023

RossBencina commented Feb 28, 2023 • edited Loading

dechamps commented Feb 28, 2023 • edited Loading

dechamps commented Mar 4, 2023

RossBencina commented Mar 13, 2023

RossBencina commented Feb 28, 2023 •

edited

Loading

dechamps commented Feb 28, 2023 •

edited

Loading