Releases: Nexesenex/croco.cpp
Croco.Cpp_FrankenFork_v1.78003_b4067
Test release, for the adventurous minds.
Croco.Cpp_FrankenFork_v1.77009_b3972
Contextshift doesn't depend on Noshift anymore, that parameter is gone.
Just tick the button in the GUI like before, or try --contextshift in the CLI.
Full Changelog: v1.77008_b3962...v1.77009_b3962
Croco.Cpp_FrankenFork_v1.77008_b3972
Lazy release with Concedo's merge ordeal included.
Cuda backend problems with K cache non-FA are fixed, but beware of certain K non-FA / KV FA quants with Qwen 2.5 models (at least the 1.5b); see the GUI's KV slider for info.
Full Changelog: v1.77005_b3962...v1.77008_b3962
Croco.Cpp_FrankenFork_v1.77006_b3972
Test release for Cuda bugfix.
Needs more testing, feedback appreciated.
-> Please include your CPU, GPU, RAM, OS, and the model tested.
If you're motivated, here's my bucket list to pick from:
-> In full Cuda offload : does it work?
-> In full Cuda offload with lowvram argument : does it work?
-> In full Cuda offload with mmq argument : does it work?
-> In full Cuda offload with mmq AND lowvram argument : does it work?
-> In partial offload : does it work?
-> In cuda mode but without layers on the GPU : does it work?
-> Each with KV quant mode 0 (FA, non FA), 1, 9, 14 (FA), 16 and 17 (No FA).
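For anyone who wants to work through the list above methodically, here is a minimal sketch that expands it into a tick-off checklist. It assumes one reading of the KV quant line: mode 0 is tested both with and without FA, modes 1, 9 and 14 with FA, and modes 16 and 17 without FA. The labels are plain descriptions, not actual command-line flags, and the function name is made up for illustration.

```python
from itertools import product

# Launch configurations from the list above.
# These are descriptive labels only, not real CLI arguments.
OFFLOAD_MODES = [
    "full Cuda offload",
    "full Cuda offload + lowvram",
    "full Cuda offload + mmq",
    "full Cuda offload + mmq + lowvram",
    "partial offload",
    "Cuda mode, no layers on the GPU",
]

# (KV quant mode, flash attention enabled) pairs.
# ASSUMPTION: this grouping is one interpretation of the list above.
KV_MODES = [
    (0, True), (0, False),
    (1, True), (9, True), (14, True),
    (16, False), (17, False),
]


def build_checklist():
    """Return one checklist line per (offload mode, KV mode) combination."""
    lines = []
    for offload, (kv_mode, fa) in product(OFFLOAD_MODES, KV_MODES):
        fa_label = "FA" if fa else "no FA"
        lines.append(f"[ ] {offload} | KV quant mode {kv_mode} ({fa_label})")
    return lines


if __name__ == "__main__":
    for line in build_checklist():
        print(line)
```

Under that reading there are 6 launch configurations and 7 KV settings, so the full matrix is 42 runs; picking a handful of lines is already useful feedback.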
Turing, Ampere, and Ada supported for now, Pascal and Maxwell to come later.
Croco.Cpp_FrankenFork_v1.77005_b3962
Bugfix for 1.77004, with :
- K q8_0 V F16 FA quant dropped.
- Non FA Quants dropped for now, only K q6_0 V F16 works (and it's the best anyway).
- Algo to pass the quant, FA, and no-shift fixed.
- Up KLite to 182.
Croco.Cpp_FrankenFork_v1.77004_b3962
New lazy release with :
- @ikawrakow's recent work on KV Quants integrated (new KV quant IQ4_NL to replace Q4_0, with -1% PPL if used for both K and V; Q6_0 close to Q8_0 in the K Q6_0 / V Q5_0 pairing).
- Use the GUI to discover the new modes.
- Some of Ikawrakow's work on Cuda (on top of some of his work for CPU inference).
- Cuda Graph caching PR of Agray3.
- Some bugfixes (aka, the bugs created by yours truly). ^^
- Note : Llava users, be careful, it might not work or simply crash.
Second release : with the help fixed.
I'll make a longer readme when motivated.
Full Changelog: v1.76005_b3906...v1.77004_b3962
Croco.Cpp_FrankenFork_v1.77002_b3934
KVQ27 (iq4_nl) doesn't work; I'm leaving it for further testing. I mapped the deleted KV quants to the closest quant of equal or lower bpw, so the config files keep working as they are until a stable KVQ cocktail of quants is chosen.
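The "closest equal or inferior bpw" rule can be sketched as below. This is an illustration only, not the fork's actual code: the function name and the `AVAILABLE` list are hypothetical, and the bits-per-weight figures are derived from the usual 32-value block layouts of these formats (q6_0 is assumed to be 6.5 bpw).

```python
# Bits per weight for some ggml cache types (block of 32 values).
# ASSUMPTION: illustrative table, not copied from the fork's source.
BPW = {
    "f16": 16.0,
    "q8_0": 8.5,
    "q6_0": 6.5,
    "q5_1": 6.0,
    "q5_0": 5.5,
    "q4_1": 5.0,
    "q4_0": 4.5,
}

# HYPOTHETICAL set of KV quants still offered after the cleanup.
AVAILABLE = ["f16", "q8_0", "q6_0", "q5_0", "q4_0"]


def closest_equal_or_lower(removed, available=AVAILABLE):
    """Pick the available type with the highest bpw that does not exceed
    the bpw of the removed type, so memory use never grows."""
    limit = BPW[removed]
    candidates = [q for q in available if BPW[q] <= limit]
    if not candidates:
        raise ValueError(f"no available quant at or below {limit} bpw")
    return max(candidates, key=lambda q: BPW[q])


if __name__ == "__main__":
    for old in ("q5_1", "q4_1"):
        print(old, "->", closest_equal_or_lower(old))
```

Rounding down rather than up means an old config file never ends up using more VRAM for the KV cache than it did before, at the cost of slightly lower cache precision.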
Croco.Cpp_FrankenFork_v1.76007_b3917
v1.76007_b3917
Croco.Cpp_FrankenFork_v1.76005_b3906
v176005_b3906
Croco.Cpp_FrankenFork_v1.76004_b3896
v176004_b3896