This repository has been archived by the owner on Jul 10, 2021. It is now read-only.

Slow learning for high-dimensional input #237

hollma opened this issue Jul 5, 2017 · 0 comments

Dear developers,

First of all, thank you for this fine piece of free software!

I have been using scikit-neuralnetwork 0.7 with scikit-learn 0.18.2, Theano 0.7.0, and Lasagne 0.1, and I noticed that learning halfspaces seems to be quite slow when the examples are high-dimensional vectors, e.g. dim >= 1000. Such dimensions are quite normal when using tf-idf vectors in text-classification settings.

A minimal working example (with runtime stats):
https://gist.github.com/hollma/f0d98bc5e58a6db34725dbce9ecdf9d1

Processing 500 training examples (100-dimensional) and validating on another 500 test examples took almost 14 seconds (on an Intel i7-4790 CPU at 3.6 GHz).
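For reference, the cost of the underlying computation can be illustrated without scikit-neuralnetwork at all. The sketch below is not the gist's code; it is a minimal NumPy implementation of batch gradient descent for a linear (halfspace) classifier, timed at two input dimensions, just to show that per-epoch work grows with the dimensionality. All names here (`time_halfspace_fit`, the epoch and learning-rate values) are illustrative assumptions:

```python
import time
import numpy as np

rng = np.random.default_rng(0)

def time_halfspace_fit(n_samples, n_dims, epochs=50, lr=0.1):
    """Time plain batch gradient descent for a linear (halfspace) classifier.

    NOTE: this is NOT scikit-neuralnetwork's training loop -- just a NumPy
    sketch to illustrate how per-epoch cost scales with input dimension.
    """
    # Synthetic linearly separable data: labels from a random true halfspace.
    X = rng.standard_normal((n_samples, n_dims))
    true_w = rng.standard_normal(n_dims)
    y = (X @ true_w > 0).astype(float)

    w = np.zeros(n_dims)
    start = time.perf_counter()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))    # sigmoid activations
        grad = X.T @ (p - y) / n_samples      # logistic-loss gradient
        w -= lr * grad
    return time.perf_counter() - start

t_100 = time_halfspace_fit(500, 100)
t_1000 = time_halfspace_fit(500, 1000)
print(f"dim=100:  {t_100:.4f}s")
print(f"dim=1000: {t_1000:.4f}s")
```

If even this bare loop is far faster than the 14 seconds observed, the overhead presumably lies in the library's per-epoch machinery (Theano graph compilation, data conversion) rather than in the arithmetic itself.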

What would you recommend? Are high-dimensional input vectors the wrong use case for scikit-neuralnetwork, i.e., should I use some other library instead?

I am looking forward to your answer.

Best regards,
Mario
