Fixes random number generator on Windows. Fixes #103 #140
Fixes #103 (This is the same fix that was proposed for liblinear.)
On Windows, liblinear and libsvm have serious convergence issues because of the way random numbers are generated: the largest random number (`RAND_MAX`) on Windows is 15 bits wide, i.e. 32767, even on 64-bit Windows, while on Linux with GCC it is 31 bits wide, i.e. 2147483647, on both 32-bit and 64-bit systems.
If I understand correctly, these random numbers are used in the coordinate descent algorithms to pick the next coordinate to act upon. When the dimensionality (e.g. the number of samples) is large, the random number generator on Windows has a hard time exploring all dimensions: an index above 32767 can never be drawn directly from a 15-bit value.
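To illustrate the problem described above, here is a minimal sketch, not the patch itself. All helper names (`rand15`, `rand45`, `naive_index`, `wide_index`, `max_index_seen`) are hypothetical and do not come from libsvm or liblinear. The sketch assumes an index is drawn as a random value modulo `n`, emulates a 15-bit generator by masking `rand()` so that it behaves the same on every platform, and shows one possible way of widening the range by concatenating several 15-bit draws.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: emulates a 15-bit rand(), as on Windows where
// RAND_MAX is 32767. On Windows the mask is a no-op.
inline int rand15() {
    return std::rand() & 0x7FFF;
}

// Hypothetical helper: concatenates three 15-bit draws into a 45-bit value.
// This is one possible widening scheme, assumed here for illustration only.
inline std::uint64_t rand45() {
    std::uint64_t r = static_cast<std::uint64_t>(rand15());
    r = (r << 15) | static_cast<std::uint64_t>(rand15());
    r = (r << 15) | static_cast<std::uint64_t>(rand15());
    return r;
}

// Naive index draw: the result can never exceed 32767, whatever n is.
inline int naive_index(int n) {
    assert(n > 0);
    return rand15() % n;
}

// Wider index draw: every index in [0, n) is reachable for any int n > 0.
// The modulo bias is negligible because 2^45 is far larger than n.
inline int wide_index(int n) {
    assert(n > 0);
    return static_cast<int>(rand45() % static_cast<std::uint64_t>(n));
}

// Returns the largest index produced by `draw(n)` over `draws` attempts.
template <typename F>
int max_index_seen(F draw, int n, int draws) {
    int best = -1;
    for (int i = 0; i < draws; ++i) {
        int v = draw(n);
        if (v > best) best = v;
    }
    return best;
}
```

Calling `max_index_seen(naive_index, 100000, 200000)` never returns more than 32767, so roughly two thirds of the 100000 coordinates are never visited, whereas the same call with `wide_index` reaches the upper part of the range.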
This is a known bug, documented in the liblinear FAQ (strangely enough, not in the libsvm FAQ), but the proposed workaround was wrong.
I made a patch for this in liblinear years ago; it was approved by several users yet never merged: cjlin1/liblinear#28.
Since another user reported it on libsvm as #103, here is the corresponding PR.
Note that I am proposing this simultaneously to the scikit-learn project (Python), as they observed some convergence issues. Some of them might be due to this platform-related bug.