Unless the `--no-averaging` flag is specified, the learner attempts to train with the averaged perceptron. However, the averaging formula is incorrect when averages are computed only for the weights that were updated (and averaging the entire weight vector at every step would be too slow).
A correct approach can be found on p. 19 of Hal Daumé's thesis. It keeps two vectors: the current weights, and the weights' deviation from the sum over all learning timesteps. Both vectors are updated sparsely, and together they allow the averaged vector to be computed at the end of learning.
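The two-vector idea can be sketched as follows. This is a minimal illustration for a binary perceptron without a bias term, not this project's learner and not necessarily the thesis's exact notation: the function name `train_averaged_perceptron`, the `(features, label)` input format, and the variable names `w`, `u`, and `c` are all assumptions made for the example. Here `u` accumulates each update scaled by the timestep at which it happened, which plays the role of the second vector described above.

```python
def train_averaged_perceptron(examples, epochs=1):
    """Sketch of sparse averaging for a binary perceptron (hypothetical helper).

    examples: list of (features, label) pairs, where features is a dict
    mapping feature name -> value and label is +1 or -1.
    Returns the averaged weights as a dict.
    """
    w = {}  # current weights
    u = {}  # sum of updates, each scaled by the timestep it occurred at
    c = 1   # timestep counter, incremented after every example

    for _ in range(epochs):
        for features, label in examples:
            score = sum(w.get(f, 0.0) * v for f, v in features.items())
            if label * score <= 0:
                # Mistake: touch only the features present in this example.
                for f, v in features.items():
                    w[f] = w.get(f, 0.0) + label * v
                    u[f] = u.get(f, 0.0) + label * c * v
            c += 1

    # Equivalent to averaging the weight vector over all timesteps
    # (counting the initial zero vector), without ever summing densely.
    return {f: w[f] - u[f] / c for f in w}


examples = [
    ({"a": 1.0, "b": 1.0}, 1),
    ({"b": 1.0, "c": 1.0}, -1),
    ({"a": 1.0}, 1),
]
averaged = train_averaged_perceptron(examples, epochs=2)
```

Each update costs time proportional to the number of active features in the example, and the only pass over the full weight vector happens once, when the averaged weights are read out at the end.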