-
Notifications
You must be signed in to change notification settings - Fork 904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
assorted fixes and performance improvements #134
base: master
Are you sure you want to change the base?
Conversation
src/celt_lpc.c
Outdated
@@ -96,7 +96,7 @@ void celt_fir( | |||
int ord) | |||
{ | |||
int i,j; | |||
opus_val16 rnum[ord]; | |||
opus_val16 *rnum = calloc(sizeof(opus_val16), ord); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change VLAs to calloc? VLAs are allocated on the stack and calloc allocates on the heap. Usually people prefer using the stack instead of the heap. The other difference is that calloc 0 initializes the memory but this doesn't seem required in these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, however after browsing the issues list it seemed some folks had trouble compiling under MSVC due to VLA's. calloc was opted for safety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to support different compilers but I'd prefer some kind of ifdef
to fallback to heap allocation only on platforms that don't support VLAs like MSCV because supporting them should not negatively affect more modern compilers.
Or maybe the api of these functions could be changed so that the extra buffer is passed in from the outside? I'm worried about allocating too often because it's unclear to me how how often these functions that use VLAs are called (how hot they are).
Changing the api is probably a bigger change than what you want to accomplish in this PR. Maybe something for a maintainer of the repo to decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code more I am confused. celt_fir
and celt_iir
are unused functions. They are never called and the project still compiles when I remove them.
That does still leave the other calloc uses though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can revert the VLA related changes if it requires more consideration. i was merely trying to bundle common fixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's totally up to you or a maintainer. I'm not an official maintainer, just voicing my opinion on the PR. Also thanks for bundling the fixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to look into libspeex and how it handles VLA support (look out for stack_alloc.h).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i reverted the VLA related changes. thanks for the feedback wrt libspeex, i might try folding this code back into libspeexdsp to replace it's VAD as an experiment, in which case i would definitely adopt existing conventions.
int i, k; | ||
int i = 0, k = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't doubt that the static analyzer complains but these initializations are not required because either way both variables are set to 0 when they are used. There is a general argument for always obviously initializing everything but this project seems to prefer to initialize only when needed so I feel that this change is inconsistent with the project style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think a compromise between initializing everything and being as terse as possible is initializing when a static analyzer highlights a potential problem due to a series of branches, however unlikely that may maybe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do believe that these are always initialized (and cannot possibly be unintialized), the static analyzer is just not good enough to figure it out. On the other hand initializing has no drawback either except the mentioned style differences and it is safer in terms of human error. Probably another thing for a maintainer to decide.
src/denoise.c
Outdated
@@ -277,6 +277,7 @@ DenoiseState *rnnoise_create(RNNModel *model) { | |||
} | |||
|
|||
void rnnoise_destroy(DenoiseState *st) { | |||
opus_fft_free(common.kfft, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the surrounding code correctly but I think this is only allocated on demand in check_init
so it should only be deallocated if it has been initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough. committed change to only free if common.init == 1
7c3e90c
to
d4c1c30
Compare
src/rnn.c
Outdated
@@ -76,33 +76,39 @@ static OPUS_INLINE float relu(float x) | |||
return x < 0 ? 0 : x; | |||
} | |||
|
|||
void faxpy(float *restrict a, const rnn_weight *restrict b, int k, float u) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this function is not used outside of this translation unit:
void faxpy(float *restrict a, const rnn_weight *restrict b, int k, float u) | |
static void faxpy(float *restrict a, const rnn_weight *restrict b, int k, float u) |
d4c1c30
to
accadd3
Compare
@@ -66,4 +66,4 @@ void compute_gru(const GRULayer *gru, float *state, const float *input); | |||
|
|||
void compute_rnn(RNNState *rnn, float *gains, float *vad, const float *input); | |||
|
|||
#endif /* _MLP_H_ */ | |||
#endif /* _RNN_H_ */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not able to evaluate the rest of the changes, but applied this one to get it out of the way.
hello, this PR includes a smattering of fixes for small issues i encountered while using the library, they seem to be common after reviewing the open issues list.
the latest commit improves performance by ~2.5x when compiled with
-O3 -march=native -funroll-loops
please review and merge if you find them useful, each fix is in its own branch if some fixes look more palatable than others.