When running the following I get the error below, but it works fine with OpenAI (just remove `model=model`).
```python
import lmql

model = lmql.model(
    "local:llama.cpp:free.gguf",
    tokenizer="Orenguteng/Llama-3-8B-Lexi-Uncensored",
    n_threads=16,
    n_gpu_layers=128,
    n_ctx=8192,
)
query_string = """
"Q: What is 1+1 \\n"
"A: [ANS] \\n"
"""
lmql.run_sync(query_string, model=model).variables
```
```
File /usr/local/lib/python3.10/dist-packages/lmql/runtime/interpreter.py:1122, in PromptInterpreter.run(self, fct, *args, **kwargs)
   1120     else:
   1121         state = await self.advance(state)
-> 1122     assert len(s.input_ids) < decoder_args["max_len"], "The decoder returned a sequence that exceeds the provided max_len (max_len={}, sequence length={}). To increase the max_len, please provide a corresponding max_len argument to the decoder function.".format(decoder_args["max_len"], len(s.input_ids))
   1124     assert state.query_head.result is not None, "decoder designates sequence {} as finished but the underyling query program has not produced a result. This is likekly a decoder bug. Decoder in use {}".format(await s.text(), decoder_args["decoder"])
   1125     results.append(state.query_head.result)

AssertionError: The decoder returned a sequence that exceeds the provided max_len (max_len=2048, sequence length=2048). To increase the max_len, please provide a corresponding max_len argument to the decoder function.
```
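The assertion message itself suggests a workaround: the decoder defaults to `max_len=2048`, which is smaller than the `n_ctx=8192` passed to llama.cpp, so generation hits the decoder limit first. A minimal sketch, assuming `lmql.run_sync` forwards extra keyword arguments to the decoder as the error message implies (untested against a local model):

```python
# Decoder settings to try: raise max_len from its 2048 default so it matches
# the llama.cpp context window (n_ctx=8192).
# Assumption: lmql.run_sync forwards extra keyword arguments as decoder arguments.
decoder_args = {"max_len": 8192}

if __name__ == "__main__":
    import lmql  # same local-model setup as in the report above

    model = lmql.model(
        "local:llama.cpp:free.gguf",
        tokenizer="Orenguteng/Llama-3-8B-Lexi-Uncensored",
        n_threads=16,
        n_gpu_layers=128,
        n_ctx=8192,
    )
    query_string = """
    "Q: What is 1+1 \\n"
    "A: [ANS] \\n"
    """
    print(lmql.run_sync(query_string, model=model, **decoder_args).variables)
```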