By adding a llama.cpp server endpoint option, we could use features already present in llama.cpp without having to rely on llama-cpp-python.
The llama.cpp server supports both HIP and Vulkan on Windows.
Note that the llama.cpp server endpoint is OpenAI-compatible, so it would probably be sufficient to reuse the OpenAI endpoint code without any model/API key requirements. Maybe add a way to specify samplers like min_p, top_k and temp, though this would make it impossible to specify a prompt template and the server would fall back to ChatML by default. A rough sketch of what that could look like is below.
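For illustration only, here is a minimal sketch assuming the official `openai` Python client and a llama.cpp server running locally on its default port 8080; the `base_url`, dummy `api_key`, and `extra_body` fields are assumptions about how this could be wired up, not this project's actual code:

```python
# Minimal sketch: reusing an OpenAI-style client against a local llama.cpp server.
# Assumes `llama-server -m model.gguf` is already running on the default port 8080;
# the base_url and placeholder api_key below are illustrative, not project code.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8080/v1",  # llama.cpp server's OpenAI-compatible endpoint
    api_key="no-key-needed",              # ignored by the server unless --api-key is set
)

response = client.chat.completions.create(
    model="local-model",  # model name is not validated in a single-model server setup
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    # Samplers outside the OpenAI schema (min_p, top_k, ...) can be passed through
    # as extra JSON fields in the request body, which llama.cpp's server accepts.
    extra_body={"min_p": 0.05, "top_k": 40},
)

print(response.choices[0].message.content)
```

Since the chat template is applied server-side (with ChatML as the fallback when the model doesn't provide one), the client can't override the prompt template this way, which is the caveat mentioned above.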