[TIP] 2x speed improvement with one changed line #77

ldorigo · 2021-10-11T10:38:33Z

Hi, I don't have time to make a PR right now, so this is just to let you know that simply excluding NER from the spaCy pipeline gives a roughly 2x speedup (at least when processing lots of short sentences).

You can do so by changing line 158 of core.py from

            self.nlp = spacy.load(spacy_lang)

to

            self.nlp = spacy.load(spacy_lang, exclude=["ner"])

Most likely, you could also add a separate pipeline (like self.nlp_nosyntax = spacy.load(spacy_lang, exclude=[...])) for matching without syntax, where most of the other components can be excluded as well, for an even larger speedup.
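To make the idea concrete, here is a minimal sketch of a loader helper. The name load_pipeline and its loader parameter are hypothetical (they are not part of core.py); the only spaCy call used is spacy.load with its exclude argument, which is available in spaCy v3. The default of excluding only "ner" mirrors the one-line change above.

```python
from typing import Any, Callable, Iterable, Optional


def load_pipeline(
    lang: str,
    exclude: Iterable[str] = ("ner",),
    loader: Optional[Callable[..., Any]] = None,
) -> Any:
    """Load a spaCy pipeline, skipping the components named in `exclude`.

    `loader` is a hypothetical hook for substituting the loading function
    (e.g. in tests); when omitted, spacy.load is used.
    """
    if loader is None:
        # Imported lazily so the helper can be exercised without spaCy installed.
        import spacy

        loader = spacy.load
    # Excluded components are not loaded at all, so they cost no time per document.
    return loader(lang, exclude=list(exclude))
```

With a model installed, load_pipeline(spacy_lang) would stand in for the changed line, and nlp.pipe_names on the result shows which components remain. Which components can safely be excluded for the non-syntax case depends on what the matcher needs, so that list is left to the caller.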
