Optimization opportunities #3

Open
viciious opened this issue Jul 19, 2023 · 0 comments

Hi!

Just wanted to share some optimization opportunities I've discovered while adapting your code for the work I'm doing for the SegaCD.

The first one: by using a precomputed table for all index/nibble combinations, you can completely eliminate the need to clip the index value. The table is computed like so:

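    /* next step index for every (index, nibble) pair, clamped and pre-shifted */
    int16_t adpcm_ima_indices[89*16];
...
    for (i = 0; i < 89; i++) {          /* current step index */
        for (j = 0; j < 16; j++) {      /* 4-bit ADPCM code */
            int newindex = i + indices[j];
            /* clamp once here so the decoder never has to */
            if (newindex < 0) {
                newindex = 0;
            }
            if (newindex > 88) {
                newindex = 88;
            }
            adpcm_ima_indices[i*16+j] = newindex << 5;
        }
    }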

Note the `<< 5` instead of `<< 6`: the additional multiply by 2 is done in the assembler portion of the code. This saves a significant number of cycles plus a couple of registers.
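To illustrate the idea in plain C, here is a minimal sketch rather than the code from the gist. It assumes `indices` is the standard IMA index adjustment table, and the names `ima_index_adjust`, `next_row`, `build_next_row`, `next_index_clamped` and `next_row_lookup` are made up for this example. It stores the next index pre-multiplied by 16 (an element offset) instead of the `<< 5` used above, since the extra scaling there is tied to the assembler addressing.

```c
#include <stdint.h>

#define IMA_STEPS 89

/* Standard IMA ADPCM index adjustment per 4-bit code
   (assumed to match the `indices` array in the snippet above). */
static const int8_t ima_index_adjust[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

/* next_row[row + nibble] holds the already clamped next index,
   pre-multiplied by 16 so it can be used directly as the next row offset. */
static int16_t next_row[IMA_STEPS * 16];

static void build_next_row(void)
{
    for (int i = 0; i < IMA_STEPS; i++) {
        for (int j = 0; j < 16; j++) {
            int newindex = i + ima_index_adjust[j];
            if (newindex < 0) newindex = 0;
            if (newindex > IMA_STEPS - 1) newindex = IMA_STEPS - 1;
            next_row[i * 16 + j] = (int16_t)(newindex << 4);
        }
    }
}

/* Reference update with explicit clamping, for comparison. */
static int next_index_clamped(int index, int nibble)
{
    index += ima_index_adjust[nibble & 15];
    if (index < 0) index = 0;
    if (index > IMA_STEPS - 1) index = IMA_STEPS - 1;
    return index;
}

/* Table-driven update: a single load, no compares or branches.
   `row` is the current index already multiplied by 16. */
static int next_row_lookup(int row, int nibble)
{
    return next_row[row + (nibble & 15)];
}
```

After calling `build_next_row()` once, the decoder state can be kept as `row` and advanced with `row = next_row_lookup(row, nibble)`; the plain index, if ever needed, is `row >> 4`.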

Another opportunity, which is probably less useful in your case: if you switch to unsigned samples, clamping against 0 becomes trivial via spl+ext+and. Clamping against 0xffff also becomes trivial, since you can get away with testing bit 16 of the result. This will also save you a couple of registers. The reason it's probably less useful in your case is that you will still have to subtract 32768 from the result to go back to signed samples.
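A rough C rendering of the unsigned clamping trick, as a sketch only: the function names `clamp_u16` and `to_signed` are hypothetical, and it assumes the predictor is kept in the range 0..65535 and the per-sample delta has a magnitude below 65536, so the sum always fits in 17 bits plus sign.

```c
#include <stdint.h>

/* Clamp (unsigned 16-bit predictor + signed delta) to 0..0xFFFF.
   Assumes -65536 < v < 131072. */
static uint16_t clamp_u16(int32_t v)
{
    /* All ones when v >= 0, zero when v < 0: the C analogue of spl + ext. */
    int32_t mask = -(int32_t)(v >= 0);
    v &= mask;                 /* negative results collapse to 0 */

    /* Overflow past 0xFFFF shows up as bit 16 being set. */
    if (v & 0x10000) {
        v = 0xFFFF;
    }
    return (uint16_t)v;
}

/* Convert back to a signed sample: the extra subtract mentioned above. */
static int16_t to_signed(uint16_t u)
{
    return (int16_t)((int32_t)u - 32768);
}
```

For example, `clamp_u16(-5)` yields 0, `clamp_u16(70000)` yields 65535, and `to_signed(32768)` yields 0.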

Sorry for not doing a proper PR, but I don't currently have a setup that would let me incorporate all of my changes and also test the results properly. Here you can find my version of the code, which is tailored to my SegaCD-specific needs: https://gist.github.com/viciious/0b8b0ee75dfd6deaebadfb5ef1eef6ff

All in all, I hope you will find this information useful.

Regards,
Victor
