Two changes to reduce the sequence length passed to genienlp #692

gcampax · 2021-07-14T21:43:41Z

The recently approved movie skill returns the full list of actors for every movie it returns. This list can be quite long, too long to encode for BART, even if every actor is a single GENERIC_ENTITY_org.themoviedb:actor_<n> token (which is split into ENTITY actor <n> before BART tokenization).

To avoid that, we reduce the number of history items encoded in the context (which also helps the model by giving it a stronger recency bias) and we truncate array values in the result.

This was caught in the staging environment, which keeps going down (stanford-oval/genienlp#174).

If an API returns a very long array, we need to trim it before passing it to the model, or we'll exceed the maximum sequence length.
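As a rough illustration of the two changes, here is a minimal sketch of the idea, not the code in this PR. The `HistoryItem` shape, the function names `trimArrayValues` and `prepareContext`, and the limits `MAX_HISTORY_ITEMS` and `MAX_ARRAY_LENGTH` are all hypothetical; the actual data structures and values are not shown on this page.

```typescript
// Hypothetical limits: the real values chosen in this PR are not stated here.
const MAX_HISTORY_ITEMS = 2;
const MAX_ARRAY_LENGTH = 5;

// A result value is a scalar or a (possibly nested) array of values.
type Value = string | number | boolean | Value[];

// Hypothetical shape: one executed statement plus its result rows.
interface HistoryItem {
    statement: string;
    results: Array<Record<string, Value>>;
}

// Return a copy of `row` in which every array-valued field keeps only
// its first `maxLength` elements. Scalar fields are copied unchanged.
function trimArrayValues(
    row: Record<string, Value>,
    maxLength: number = MAX_ARRAY_LENGTH
): Record<string, Value> {
    const trimmed: Record<string, Value> = {};
    for (const key of Object.keys(row)) {
        const value = row[key];
        trimmed[key] = Array.isArray(value) ? value.slice(0, maxLength) : value;
    }
    return trimmed;
}

// Keep only the `maxItems` most recent history items and trim the
// array values in each of their result rows. The input is not mutated.
function prepareContext(
    history: HistoryItem[],
    maxItems: number = MAX_HISTORY_ITEMS
): HistoryItem[] {
    const start = Math.max(0, history.length - maxItems);
    return history.slice(start).map((item) => ({
        statement: item.statement,
        results: item.results.map((row) => trimArrayValues(row)),
    }));
}
```

With these assumed limits, a three-item history whose last result carries seven actors would be reduced to the two most recent items, and the actor list to its first five entries, before anything is serialized for the model.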

s-jse

Looks good to me.
Would be interesting to see the difference (if any) this makes in terms of accuracy as well.

Reduce the number of history items in the context for the neural model

b5ecba0

gcampax added the bug, dialogue-agent, and training labels Jul 14, 2021

gcampax requested a review from s-jse July 14, 2021 21:43

Reduce array value length when encoding context for neural model

b1b2b41

If an API returns a very long array, we need to trim it before passing it to the model, or we'll exceed the maximum sequence length.

s-jse approved these changes Jul 15, 2021


gcampax force-pushed the wip/reduce-sequence-length branch from 1723c64 to b1b2b41 Compare July 15, 2021 00:32

gcampax merged commit 4ab3aa6 into master Jul 15, 2021

gcampax deleted the wip/reduce-sequence-length branch July 18, 2021 01:49
