I can't seem to run this on a Mac. Are there any requirements not mentioned in setup.py?
🍺 python textrank.py summarize ./articles/3.txt
Traceback (most recent call last):
File "textrank.py", line 219, in <module>
cli()
File "TextRank-master/virtualenv/lib/python2.7/site-packages/click/core.py", line 716, in __call__
return self.main(*args, **kwargs)
File "TextRank-master/virtualenv/lib/python2.7/site-packages/click/core.py", line 696, in main
rv = self.invoke(ctx)
File "TextRank-master/virtualenv/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "TextRank-master/virtualenv/lib/python2.7/site-packages/click/core.py", line 889, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "TextRank-master/virtualenv/lib/python2.7/site-packages/click/core.py", line 534, in invoke
return callback(*args, **kwargs)
File "textrank.py", line 214, in summarize
summary = extractSentences(text)
File "textrank.py", line 163, in extractSentences
sentenceTokens = sent_detector.tokenize(text.strip())
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1226, in tokenize
return list(self.sentences_from_text(text, realign_boundaries))
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1274, in sentences_from_text
return [text[s:e] for s, e in self.span_tokenize(text, realign_boundaries)]
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1265, in span_tokenize
return [(sl.start, sl.stop) for sl in slices]
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1304, in _realign_boundaries
for sl1, sl2 in _pair_iter(slices):
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 310, in _pair_iter
prev = next(it)
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1280, in _slices_from_text
if self.text_contains_sentbreak(context):
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1325, in text_contains_sentbreak
for t in self._annotate_tokens(self._tokenize_words(text)):
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 1460, in _annotate_second_pass
for t1, t2 in _pair_iter(tokens):
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 310, in _pair_iter
prev = next(it)
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 577, in _annotate_first_pass
for aug_tok in tokens:
File "TextRank-master/virtualenv/lib/python2.7/site-packages/nltk/tokenize/punkt.py", line 542, in _tokenize_words
for line in plaintext.split('\n'):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)
This is a well-documented Python issue with text encodings. You can work around it by reloading sys and changing the default encoding. Note that any new text files you create will work with this implementation if you save them as UTF-8.
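For context, byte 0xe2 in the traceback is the first byte of many UTF-8 punctuation marks, such as curly quotes. Under Python 2 a file read without an explicit encoding comes back as raw bytes, and the implicit ASCII decode then fails on it. Here is a minimal sketch of decoding the input up front instead. `read_utf8` and the demo file are hypothetical names for illustration; how textrank.py actually reads its input is not shown in this thread.

```python
import io
import os
import tempfile


def read_utf8(path):
    """Return the file's contents as Unicode text, decoded as UTF-8.

    Hypothetical helper: io.open accepts an encoding on Python 2 and 3.
    """
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()


# Demo file whose bytes include 0xe2 (start of a curly apostrophe).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with io.open(demo_path, "wb") as handle:
    handle.write(b"It\xe2\x80\x99s fine. Really.")

# The text is already Unicode here, so no implicit ASCII decode is needed.
demo_text = read_utf8(demo_path)
```

Passing text that has already been decoded to the tokenizer should avoid the implicit ASCII conversion, assuming the article file really is UTF-8.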
If you install it with pip3 or run it with Python 3, it will work.
If you don't want to (or can't) use Python 3, you only need to put the following lines at the top of the textrank.py file:
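The lines themselves did not survive in this thread. Going by the earlier comment about reloading sys and changing the encoding, the usual form of that Python 2 workaround looks roughly like the sketch below. This is an assumption about what was meant, not the commenter's exact code. The version check is added here so the snippet does nothing on Python 3.

```python
import sys

# Python 2 only: site.py removes sys.setdefaultencoding at startup,
# and reload(sys) restores it so the default encoding can be changed.
# Assumed reconstruction of the workaround, not the original lines.
if sys.version_info[0] == 2:
    reload(sys)  # builtin in Python 2; never reached on Python 3
    sys.setdefaultencoding("utf-8")
```

This changes the default for the whole process and is generally considered a hack, so it may hide encoding bugs elsewhere.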