About code #1

N-Kingsley · 2019-01-14T11:14:25Z

Hi, I have a question about your code.
In lines 176-188 of model.py, why is this done:
    if start < self.window_size:
        d = self.window_size - start
        score[i, j, :d] = epsilon
    if end > li + self.window_size:
        d = (li + self.window_size) - end
        score[i, j, d:] = epsilon

Shouldn’t it check whether the selected window extends beyond the sentence length?

ChrisFugl · 2019-01-15T18:17:14Z

Hi @N-Kingsley. You are right that this part of the code should check whether the selected window extends beyond the input sentence, and that is what it does. We constrain the window to the interval [window_size, li + window_size] (rather than [0, li]) because the input sentence is padded on the left and on the right as a performance optimisation, which allows us to do local attention in batches. We do not want the model to attend to the padded sides of the input sentence, so we set those attention scores to "epsilon" before applying a softmax to the scores.
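
To make the index arithmetic concrete, here is a minimal NumPy sketch of the same masking for a single window. It is an illustration under assumptions, not the code from model.py: the helper names `mask_padded_window` and `softmax` are hypothetical, the sentence is assumed to be padded with `window_size` positions on each side (which is what the interval above suggests), `end` is treated as exclusive, and `epsilon` is assumed to be a large negative number so that the softmax drives the masked weights to roughly zero. The actual value of `epsilon` in the repository is not shown in this thread.

```python
import numpy as np


def mask_padded_window(scores, start, li, window_size, epsilon=-1e9):
    """Mask the entries of one attention window that fall on padding.

    scores      -- 1-D scores for padded positions [start, start + len(scores))
    li          -- unpadded sentence length; real tokens are assumed to sit at
                   padded indices [window_size, li + window_size)
    epsilon     -- assumed large negative value (hypothetical default)
    """
    scores = scores.copy()
    end = start + len(scores)  # exclusive end of the window, padded coordinates

    if start < window_size:
        # The first d positions of the window lie in the left padding.
        d = window_size - start
        scores[:d] = epsilon
    if end > li + window_size:
        # d is negative here, so scores[d:] selects the last |d| positions,
        # which lie in the right padding.
        d = (li + window_size) - end
        scores[d:] = epsilon
    return scores


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


window_size = 2
li = 3  # padded layout: [pad, pad, tok, tok, tok, pad, pad]

# Window starting in the left padding: padded positions 0..4.
left = mask_padded_window(np.zeros(5), start=0, li=li, window_size=window_size)
left_weights = softmax(left)

# Window running into the right padding: padded positions 2..6.
right = mask_padded_window(np.zeros(5), start=2, li=li, window_size=window_size)
right_weights = softmax(right)
```

In both cases the two padded positions end up with effectively zero attention weight and the three real tokens share the weight equally, which is the behaviour the two `if` branches are there to guarantee.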
