Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

assorted fixes and performance improvements #134

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

sleepybishop
Copy link

@sleepybishop sleepybishop commented Aug 15, 2020

hello, this PR includes a smattering of fixes for small issues i encountered while using the library, they seem to be common after reviewing the open issues list.

the latest commit improves performance by ~2.5x when compiled with -O3 -march=native -funroll-loops

please review and merge if you find them useful, each fix is in its own branch if some fixes look more palatable than others.

@sleepybishop sleepybishop changed the title assorted fixes assorted fixes and performance improvements Aug 16, 2020
src/celt_lpc.c Outdated
@@ -96,7 +96,7 @@ void celt_fir(
int ord)
{
int i,j;
opus_val16 rnum[ord];
opus_val16 *rnum = calloc(sizeof(opus_val16), ord);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change VLAs to calloc? VLAs are allocated on the stack and calloc allocates on the heap. Usually people prefer using the stack instead of the heap. The other difference is that calloc 0 initializes the memory but this doesn't seem required in these cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, however after browsing the issues list it seemed some folks had trouble compiling under MSVC due to VLA's. calloc was opted for safety.

Copy link

@e00E e00E Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to support different compilers but I'd prefer some kind of ifdef to fallback to heap allocation only on platforms that don't support VLAs like MSCV because supporting them should not negatively affect more modern compilers.
Or maybe the api of these functions could be changed so that the extra buffer is passed in from the outside? I'm worried about allocating too often because it's unclear to me how how often these functions that use VLAs are called (how hot they are).
Changing the api is probably a bigger change than what you want to accomplish in this PR. Maybe something for a maintainer of the repo to decide.

Copy link

@e00E e00E Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code more I am confused. celt_fir and celt_iir are unused functions. They are never called and the project still compiles when I remove them.
That does still leave the other calloc uses though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can revert the VLA related changes if it requires more consideration. i was merely trying to bundle common fixes.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's totally up to you or a maintainer. I'm not an official maintainer, just voicing my opinion on the PR. Also thanks for bundling the fixes.

Copy link
Contributor

@j-schultz j-schultz Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to look into libspeex and how it handles VLA support (look out for stack_alloc.h).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i reverted the VLA related changes. thanks for the feedback wrt libspeex, i might try folding this code back into libspeexdsp to replace it's VAD as an experiment, in which case i would definitely adopt existing conventions.

Comment on lines -211 to +207
int i, k;
int i = 0, k = 0;
Copy link

@e00E e00E Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't doubt that the static analyzer complains but these initializations are not required because either way both variables are set to 0 when they are used. There is a general argument for always obviously initializing everything but this project seems to prefer to initialize only when needed so I feel that this change is inconsistent with the project style.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think a compromise between initializing everything and being as terse as possible is initializing when a static analyzer highlights a potential problem due to a series of branches, however unlikely that may maybe.

Copy link

@e00E e00E Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do believe that these are always initialized (and cannot possibly be unintialized), the static analyzer is just not good enough to figure it out. On the other hand initializing has no drawback either except the mentioned style differences and it is safer in terms of human error. Probably another thing for a maintainer to decide.

src/denoise.c Outdated
@@ -277,6 +277,7 @@ DenoiseState *rnnoise_create(RNNModel *model) {
}

void rnnoise_destroy(DenoiseState *st) {
opus_fft_free(common.kfft, 0);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the surrounding code correctly but I think this is only allocated on demand in check_init so it should only be deallocated if it has been initialized.

Copy link
Author

@sleepybishop sleepybishop Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough. committed change to only free if common.init == 1

src/rnn.c Outdated
@@ -76,33 +76,39 @@ static OPUS_INLINE float relu(float x)
return x < 0 ? 0 : x;
}

void faxpy(float *restrict a, const rnn_weight *restrict b, int k, float u)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function is not used outside of this translation unit:

Suggested change
void faxpy(float *restrict a, const rnn_weight *restrict b, int k, float u)
static void faxpy(float *restrict a, const rnn_weight *restrict b, int k, float u)

@@ -66,4 +66,4 @@ void compute_gru(const GRULayer *gru, float *state, const float *input);

void compute_rnn(RNNState *rnn, float *gains, float *vad, const float *input);

#endif /* _MLP_H_ */
#endif /* _RNN_H_ */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not able to evaluate the rest of the changes, but applied this one to get it out of the way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants