Skip to content

Optimize reverb buffer locality#90

Open
Ali-Z0 wants to merge 1 commit into
sezero:masterfrom
Ali-Z0:optimize-reverb-buffer-locality
Open

Optimize reverb buffer locality#90
Ali-Z0 wants to merge 1 commit into
sezero:masterfrom
Ali-Z0:optimize-reverb-buffer-locality

Conversation

@Ali-Z0

@Ali-Z0 Ali-Z0 commented Jun 8, 2026

Copy link
Copy Markdown

Summary

Hi MikMod maintainers,

My collaborator @thomasstaheli and I recently studied a MikMod-based audio pipeline as part of our High-Performance Computing course at HEIG-VD in Yverdon, Switzerland. During this work, we profiled the reverb mixing code and identified MixReverb_Stereo as a noticeable hotspot in the audio path.

This pull request proposes two small optimizations for the reverb mixer:

  1. improving the memory layout of the reverb buffers;
  2. reducing the arithmetic cost of circular buffer indexing.

The goal is to improve performance in a frequently executed DSP path without changing the reverb algorithm or the intended audio output.

Motivation

The stereo reverb mixer repeatedly accesses 16 circular buffers:

  • RVbufL1 to RVbufL8 for the left channel;
  • RVbufR1 to RVbufR8 for the right channel.

In the original implementation, these buffers are allocated separately. This can scatter them across the heap and reduce spatial locality. Since the reverb loop reads and writes these buffers very frequently, a more cache-friendly layout can help reduce memory stalls.

The inner loop also performs repeated modulo operations for circular indexing. These modulo operations may compile to integer divisions, which are relatively expensive compared to simple increments and comparisons.

Changes

1. Contiguous reverb buffer allocation

The separate reverb buffer allocations are replaced with a single contiguous allocation. The existing RVbufL* and RVbufR* pointers are then assigned linearly inside this block.

This keeps the rest of the code structure mostly unchanged, while improving spatial locality and reducing heap fragmentation.

2. Strength reduction for circular indexing

The repeated modulo-based index updates in the inner reverb loop are replaced with conditional increments:

if (++loc >= RVc)
    loc = 0;

The modulo is only needed for initialization. During the loop, the index advances with cheaper operations: increment, comparison, and conditional reset.

Performance notes

This change was motivated by profiling in a downstream project using MikMod-based audio mixing. In that environment, MixReverb_Stereo was one of the relevant audio hotspots.

Using Linux perf, we observed the following self CPU cost for the targeted function:

  • baseline run 1: 5.38%;
  • baseline run 2: 5.69%;
  • optimized run: 4.33%.

Using the average of the two baseline runs, this corresponds to a relative reduction of approximately 21.8% for MixReverb_Stereo in that workload.

These numbers are workload- and platform-dependent, so they should be interpreted as an indication rather than a universal benchmark. The main goal of this PR is to make the reverb buffer access pattern more cache-friendly and reduce unnecessary arithmetic in a hot loop.

Compatibility

The reverb algorithm is not intentionally changed. The existing buffer pointer interface is preserved, and the optimization only changes how the buffers are allocated and how circular indices are advanced internally.

Please let us know if you would prefer the patch to be split differently, rebased, or adjusted to better match the coding style of the project.

Best regards,
@Ali-Z0 and @thomasstaheli

@sezero

sezero commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Whitespace / style changes are uncalled for.

Apart from that, what do you think @AliceLR ?

@AliceLR

AliceLR commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator
  • I suspect the vast majority of any improvement is from the modulo removal.
  • Except maybe on older chips, I doubt memory fragmentation is a serious concern here, but the implementation of this looks fine. I would be interested to see performance metrics for separate modulo removal and allocation patches.
  • The comment/whitespace changes (header comments, end LF, /* --- this style of comment calling attention to itself while not really doing anything --- */) aren't necessary.
  • Where is SINTPTR_T is defined?

@sezero

sezero commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Thanks for the review.

Where is SINTPTR_T is defined?

Huh. It isn't defined in library, it is defined in player: mutilities.h
This won't even compile as is...

@thomasstaheli

Copy link
Copy Markdown

We're going to separate the changes so we can measure the performance improvements they bring.

We're not sure we understand-is it not compiling for you?

@sezero

sezero commented Jun 8, 2026

Copy link
Copy Markdown
Owner
playercode/virtch.c:136:8: error: unknown type name 'SINTPTR_T'
  136 | static SINTPTR_T MixSIMDMonoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |        ^~~~~~~~~
playercode/virtch.c:136:66: error: unknown type name 'SINTPTR_T'
  136 | static SINTPTR_T MixSIMDMonoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |                                                                  ^~~~~~~~~
playercode/virtch.c:136:80: error: unknown type name 'SINTPTR_T'
  136 | static SINTPTR_T MixSIMDMonoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |                                                                                ^~~~~~~~~
playercode/virtch.c:136:100: error: unknown type name 'SINTPTR_T'
  136 | static SINTPTR_T MixSIMDMonoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |                                                                                                    ^~~~~~~~~
playercode/virtch.c:152:8: error: unknown type name 'SINTPTR_T'
  152 | static SINTPTR_T MixSIMDStereoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |        ^~~~~~~~~
playercode/virtch.c:152:68: error: unknown type name 'SINTPTR_T'
  152 | static SINTPTR_T MixSIMDStereoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |                                                                    ^~~~~~~~~
playercode/virtch.c:152:82: error: unknown type name 'SINTPTR_T'
  152 | static SINTPTR_T MixSIMDStereoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |                                                                                  ^~~~~~~~~
playercode/virtch.c:152:102: error: unknown type name 'SINTPTR_T'
  152 | static SINTPTR_T MixSIMDStereoNormal(const SWORD* srce,SLONG* dest,SINTPTR_T idx,SINTPTR_T increment,SINTPTR_T todo)
      |                                                                                                      ^~~~~~~~~
playercode/virtch.c: In function 'Mix32MonoNormal':
playercode/virtch.c:277:24: error: implicit declaration of function 'MixSIMDMonoNormal'; did you mean 'Mix32MonoNormal'? [-Wimplicit-function-declaration]
  277 |                 return MixSIMDMonoNormal(srce, dest, idx, increment, todo);
      |                        ^~~~~~~~~~~~~~~~~
      |                        Mix32MonoNormal
playercode/virtch.c: In function 'Mix32StereoNormal':
playercode/virtch.c:301:24: error: implicit declaration of function 'MixSIMDStereoNormal'; did you mean 'Mix32StereoNormal'? [-Wimplicit-function-declaration]
  301 |                 return MixSIMDStereoNormal(srce, dest, idx, increment, todo);
      |                        ^~~~~~~~~~~~~~~~~~~
      |                        Mix32StereoNormal
make[2]: *** [playercode/virtch.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

You changed SSIZE_T usage with SINTPTR_T without explaining why
in the code which is compiled only when SIMD is enabled: SIMD is
not enabled by default, therefore you failed to notice that your
change is leading to build failures.

@sezero

sezero commented Jun 8, 2026

Copy link
Copy Markdown
Owner

It looks like you developed your changes against an old version (probably 3.3.11?),
because you seem to be overwriting two commits:

b1d6de5

fd513c8

I guess that you made your changes in an old version, and then copied the
two modified files directly to to current version, resulting in a mess..

@sezero

sezero commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Here is the cleaned-up patch with only intended changes: 90.patch

May I force-push to your branch for reviewers' convenience?

@Ali-Z0

Ali-Z0 commented Jun 9, 2026

Copy link
Copy Markdown
Author

Thanks for the review and for cleaning up the patch.

Yes, you are right, we worked from an older MikMod-based project and missed some unrelated changes when adapting it to the current tree. Sorry about that.

Please feel free to force-push the cleaned-up patch to the branch.

Let us know if there is anything else we should adjust.

@sezero sezero force-pushed the optimize-reverb-buffer-locality branch from f62c017 to 23ee402 Compare June 10, 2026 01:50
Repository owner deleted a comment from AliceLR Jun 10, 2026
@sezero

sezero commented Jun 10, 2026

Copy link
Copy Markdown
Owner

P.S.: Looks like this first appeared as JHGuitarFreak/UQM-MegaMod#301 but got rejected.

Please feel free to force-push the cleaned-up patch to the branch.

Done

Let us know if there is anything else we should adjust.

@AliceLR requested benchmark numbers. Maybe a few things more..

@AliceLR

AliceLR commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

P.S.: Looks like this first appeared as JHGuitarFreak/UQM-MegaMod#301 but got rejected.

I think once you peel away the unnecessary style modifications, the changes are probably not copyrightable (and are reviewable and ostensibly useful), so I didn't push the point further than that.

Separate benchmarks of the two changes would still be nice, and would help us determine if both changes are useful (vs. if the improvement is just from the modulo removal).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants