By adding a llama.cpp server endpoint option, we could use features already present in llama.cpp without having to rely on llama-cpp-python.
The llama.cpp server supports both HIP and Vulkan on Windows.
Note that the llama.cpp server endpoint is OpenAI-compatible, so it would probably be sufficient to reuse the OpenAI endpoint code without any model/API key requirements. Maybe add a way to specify samplers like min_p, top_k and temp, though this would make it impossible to specify a prompt template and the server would fall back to ChatML by default. A rough sketch of what that could look like is below.
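For illustration only, here is a minimal sketch assuming the official `openai` Python client and a llama.cpp server running locally on its default port 8080; the `base_url`, dummy `api_key`, and `extra_body` fields are assumptions about how this could be wired up, not this project's actual code:

```python
# Minimal sketch: reusing an OpenAI-style client against a local llama.cpp server.
# Assumes `llama-server -m model.gguf` is already running on the default port 8080;
# the base_url and placeholder api_key below are illustrative, not project code.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8080/v1",  # llama.cpp server's OpenAI-compatible endpoint
    api_key="no-key-needed",              # ignored by the server unless --api-key is set
)

response = client.chat.completions.create(
    model="local-model",  # model name is not validated in a single-model server setup
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    # Samplers outside the OpenAI schema (min_p, top_k, ...) can be passed through
    # as extra JSON fields in the request body, which llama.cpp's server accepts.
    extra_body={"min_p": 0.05, "top_k": 40},
)

print(response.choices[0].message.content)
```

Since the chat template is applied server-side (with ChatML as the fallback when the model doesn't provide one), the client can't override the prompt template this way, which is the caveat mentioned above.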