Releases: Nexesenex/croco.cpp

Croco.Cpp_FrankenFork_v1.78003_b4067

13 Nov 09:20

Test release, for the adventurous minds.

Croco.Cpp_FrankenFork_v1.77009_b3972

24 Oct 22:20

Contextshift no longer depends on Noshift; that parameter is gone.
Just tick the button in the GUI like before, or use --contextshift on the CLI.

Full Changelog: v1.77008_b3962...v1.77009_b3962

Croco.Cpp_FrankenFork_v1.77008_b3972

24 Oct 20:57

Lazy release with Concedo's merge ordeal included.

Cuda backend problems with non-FA K cache are fixed, but beware of certain K non-FA / KV FA quants with Qwen 2.5 models (at least the 1.5b). See the GUI's KV slider for details.

Full Changelog: v1.77005_b3962...v1.77008_b3962

Croco.Cpp_FrankenFork_v1.77006_b3972

24 Oct 16:46

Test release for Cuda bugfix.

Needs more testing, feedback appreciated.

-> When reporting, please include your CPU, GPU, RAM, OS, and the model tested.

If you're motivated, here's my bucket list to pick from:

-> In full Cuda offload : does it work?
-> In full Cuda offload with lowvram argument : does it work?
-> In full Cuda offload with mmq argument : does it work?
-> In full Cuda offload with mmq AND lowvram argument : does it work?
-> In partial offload : does it work?
-> In cuda mode but without layers on the GPU : does it work?

-> Each with KV quant mode 0 (FA, non FA), 1, 9, 14 (FA), 16 and 17 (No FA).
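
To keep track of which combinations have been tried, the bucket list above can be enumerated as a small test matrix. This is only a sketch: the labels are plain descriptions, not actual command-line flags, and the FA / non-FA split of the KV modes is my reading of the line above (0 with and without FA; 1, 9 and 14 with FA; 16 and 17 without FA).

```python
from itertools import product

# Offload scenarios from the bucket list (descriptive labels, not CLI flags).
OFFLOAD_CONFIGS = [
    "full Cuda offload",
    "full Cuda offload + lowvram",
    "full Cuda offload + mmq",
    "full Cuda offload + mmq + lowvram",
    "partial offload",
    "Cuda mode, no layers on the GPU",
]

# (KV quant mode, flash attention on?) pairs.
# Assumption: this FA / non-FA split is one reading of the release note.
KV_MODES = [
    (0, True), (0, False),
    (1, True), (9, True), (14, True),
    (16, False), (17, False),
]


def build_matrix():
    """Return every (offload config, KV mode, FA) combination to test."""
    return [
        {"offload": cfg, "kv_mode": mode, "flash_attention": fa}
        for cfg, (mode, fa) in product(OFFLOAD_CONFIGS, KV_MODES)
    ]


if __name__ == "__main__":
    # Print a checklist that can be pasted into a feedback report.
    for case in build_matrix():
        fa = "FA" if case["flash_attention"] else "no FA"
        print(f"[ ] {case['offload']} | KV mode {case['kv_mode']} ({fa})")
```

Running the script prints one checklist line per combination; tick the ones that work and paste the list along with your hardware details.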

Turing, Ampere, and Ada supported for now, Pascal and Maxwell to come later.

Croco.Cpp_FrankenFork_v1.77005_b3962

23 Oct 07:37
Pre-release

Bugfix for 1.77004, with :

  • K q8_0 V F16 FA quant dropped.
  • Non FA Quants dropped for now, only K q6_0 V F16 works (and it's the best anyway).
  • Algo to pass the quant, FA, and no-shift fixed.
  • Up KLite to 182.

Croco.Cpp_FrankenFork_v1.77004_b3962

23 Oct 00:28
Pre-release

New lazy release with :

  • @ikawrakow's recent work on KV quants integrated: new KV quant IQ4_NL replaces Q4_0 (-1% PPL when used for both K and V), and Q6_0 comes close to Q8_0 in the K Q6_0 / V Q5_0 pairing.
  • Use the GUI to discover the new modes.
  • Some of Ikawrakow's work on Cuda (on top of some of his work for CPU inference).
  • Cuda Graph caching PR of Agray3.
  • Some bugfixes (aka, the bugs created by yours truly). ^^
  • Note : Llava users, be careful, it might not work or simply crash.

Second release: with the help fixed.

I'll make a longer readme when motivated.

Full Changelog: v1.76005_b3906...v1.77004_b3962

Croco.Cpp_FrankenFork_v1.77002_b3934

20 Oct 04:15
Pre-release
KVQ27 (iq4_nl) doesn't work; I leave it for further testing.
Deleted KV quants are mapped to the closest quant of equal or lower bpw, so existing config files keep working as they are until a stable KVQ cocktail of quants is chosen.
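
The "closest equal or inferior bpw" rule can be pictured with a short sketch. It is not the fork's actual code: the function name, the set of quants still available, and the removed quants in the usage lines are all hypothetical. The bpw figures are the nominal bits per weight of each block format (for example, Q4_0 stores 32 4-bit weights plus a 16-bit scale, so 4.5 bpw).

```python
# Hypothetical set of KV quants still available, with nominal bits per weight.
AVAILABLE_BPW = {
    "f16": 16.0,
    "q8_0": 8.5,
    "q6_0": 6.5,
    "q5_0": 5.5,
    "q4_0": 4.5,
}


def closest_fallback(removed_bpw, available=AVAILABLE_BPW):
    """Pick the available quant with the highest bpw not exceeding removed_bpw."""
    candidates = {name: bpw for name, bpw in available.items() if bpw <= removed_bpw}
    if not candidates:
        raise ValueError("no available quant at or below the requested bpw")
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Illustrative only: assumes q5_1 (6.0 bpw) and q4_1 (5.0 bpw) were removed.
    print(closest_fallback(6.0))
    print(closest_fallback(5.0))
```

Rounding down rather than up means an old config never ends up using more memory for the KV cache than it did before.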

Croco.Cpp_FrankenFork_v1.76007_b3917

14 Oct 22:29

Croco.Cpp_FrankenFork_v1.76005_b3906

11 Oct 20:57

Croco.Cpp_FrankenFork_v1.76004_b3896

08 Oct 00:02
Pre-release
v176004_b3896