Kobold.CPP_Frankenstein_v1.62b_b2628_IQ1M_fastMOE

@Nexesenex Nexesenex released this 08 Apr 20:32
· 49 commits to frankenstein_2 since this release

_Long time no see!

LostRuins is back at work, and major updates have landed in LlamaCPP over the last few weeks.
Before our benefactor publishes his new official KoboldCPP release, here's my leechy one!_

Kobold.CPP Frankenstein v1.62 beta source and Windows .exe, built with Openblas/Clblast/Vulkan (small .exe), and the same plus Cublas (big .exe):

  • Based on GGerganov's LlamaCPP b2628 and LostRuins' KoboldCPP Experimental version 1.62 beta.
  • Experimental KCPP commits up to 08/04/2024, 20:00 GMT+1.
  • With SOTA 1.5 bpw (IQ1_S, IQ1_M), 2 bpw (IQ2_XXS, IQ2_XS, IQ2_S, IQ2_M), 3 bits (IQ3_XXS, IQ3_XS, IQ3_S, IQ3_M), and 4 bits (IQ4_XS) GGUF models working as in LlamaCPP b2628.
  • With the SOTA IQ4_NL quant (for non-standard models with weird tensor shapes) working as in LlamaCPP b2628.
  • With Google Gemma compatibility as in LlamaCPP b2628.
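
As a rough guide to what the quant levels listed above mean in practice, here is a minimal Python sketch that estimates weight size from a parameter count and a bits-per-weight figure. It assumes size ≈ parameters × bpw / 8 bytes and uses the nominal figures from the list; real GGUF files mix tensor types and carry metadata, so actual sizes will differ. The function name and the 70B example are illustrative only, not part of KoboldCPP or LlamaCPP.

```python
def estimate_gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough weight-only size estimate in GiB: params * bpw / 8 bytes.

    Assumption: every weight is stored at the same nominal bpw, which is
    not true of real GGUF files (mixed tensor types, metadata, vocab).
    """
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    size_bytes = n_params * bits_per_weight / 8
    return size_bytes / 2**30


# Nominal bpw figures as quoted in the release notes (approximate).
NOMINAL_BPW = {
    "IQ1 family (~1.5 bpw)": 1.5,
    "IQ2 family (~2 bpw)": 2.0,
    "IQ3 family (~3 bits)": 3.0,
    "IQ4_XS (~4 bits)": 4.0,
}

if __name__ == "__main__":
    n_params = 70e9  # hypothetical 70B-parameter model
    for label, bpw in NOMINAL_BPW.items():
        print(f"{label}: about {estimate_gguf_size_gib(n_params, bpw):.1f} GiB")
```

The estimate is only useful for a first check of whether a given quant might fit in VRAM; context buffers and compute scratch space come on top of it.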

Also with (untested) :

  • Vulkan support implemented by the devs (constantly improving version after version).
  • MOE speed bump by Slaren (PR ggerganov#6505)

And with, as always :

  • unlocked context size (now standard in KCPP)
  • custom rope settings
  • no KCPP fragmentation cache
  • LostRuins seems to have sorted out the CUDA speed well, and I didn't mess with anything this time: it works as intended, both with and without MMQ.
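
To illustrate how the unlocked context size and custom rope settings fit together, here is a small Python sketch that derives a linear rope frequency scale and assembles a launch command. It assumes the upstream KoboldCPP flags `--model`, `--contextsize` and `--ropeconfig [scale] [base]`; check `--help` on your build before relying on them. The executable name, model path, and helper names are hypothetical, and linear scaling is only one possible choice of rope setting.

```python
def linear_rope_scale(trained_ctx: int, target_ctx: int) -> float:
    """Linear rope frequency scale: trained context / target context.

    Returns 1.0 when the target fits inside the trained context.
    """
    if trained_ctx <= 0 or target_ctx <= 0:
        raise ValueError("context sizes must be positive")
    return min(1.0, trained_ctx / target_ctx)


def build_launch_args(exe: str, model_path: str, target_ctx: int,
                      trained_ctx: int, rope_base: float = 10000.0) -> list[str]:
    """Assemble an argument list; flag names are assumed from upstream KoboldCPP."""
    scale = linear_rope_scale(trained_ctx, target_ctx)
    return [
        exe,
        "--model", model_path,
        "--contextsize", str(target_ctx),
        "--ropeconfig", str(scale), str(rope_base),
    ]


if __name__ == "__main__":
    # Hypothetical executable and model file names, for illustration only.
    args = build_launch_args("koboldcpp_frankenstein.exe", "model.IQ2_XS.gguf",
                             target_ctx=8192, trained_ctx=4096)
    print(" ".join(args))
```

The list can be passed to `subprocess.run(args)` as-is; keeping it as a list avoids shell quoting problems with model paths that contain spaces.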

The Cublas version is compiled with Cublas 12.4.

All credit goes to LostRuins, who tirelessly develops KoboldCPP, to the other devs who brought features to KCPP, and to the devs of LlamaCPP.

More information on the features of KoboldCPP 1.61.2 is available here: https://github.com/LostRuins/koboldcpp/releases/tag/v1.61.2

The Frankenstein versions of KoboldCPP released here are not supported by LostRuins, and neither is the unlocked context size they expose on the command line: this is for testing and amusement only.

What's Changed

Full Changelog: v1.59d_b2254...v1.62b_b2628