
Releases: Nexesenex/croco.cpp

Kobold.CPP_Frankenstein_v1.65c_b2843

10 May 20:08

Frankenstein 1.65c "Fork" of KoboldCPP Experimental up to the 10/05/2024, 20h GMT+2.
Based on Llama.CPP b2843.

  • SmartContext preserved: to force its use, pass the --smartcontext and --noshift flags together on the command line.
  • More detailed benchmarks (it is best to rename your old benchmark results file first).
  • Jart's SILU/Softmax PR and JohannesGaessler's FP32 FA Vector Kernel PR merged.
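
For anyone launching from a script, here is a minimal sketch of passing the two SmartContext flags together. Only --smartcontext and --noshift come from the note above; the executable name, the model path, and the use of --model to select the model are assumptions for illustration, so adjust them to your setup.

```python
import shlex


def build_command(executable: str, model_path: str) -> list[str]:
    """Build an argument list that forces SmartContext.

    --smartcontext and --noshift are the flags named in the release note.
    The executable name and model path are placeholders, and --model is
    assumed to be how the model file is selected.
    """
    return [executable, "--model", model_path, "--smartcontext", "--noshift"]


cmd = build_command("koboldcpp.exe", "model.gguf")
# Print the command as a single line for copy/paste into a terminal.
print(shlex.join(cmd))
```

The list form can be handed directly to subprocess.run(cmd) if you prefer to start the program from Python.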

All credits go to LostRuins and the other contributors to KoboldCPP, and to GGerganov and all the other contributors to LlamaCPP.

Both builds (Cublas 12.3 and standard) include OpenBLAS, CLBLAST, and Vulkan support provided by the devs.

What's Changed

Full Changelog: v1.65b_b2836...v1.65c_b2843

Kobold.CPP_Frankenstein_v1.65b_b2836

10 May 12:54

Frankenstein 1.65b "Fork" of KoboldCPP Experimental up to the 10/05/2024, 12h GMT+2.
Based on Llama.CPP b2836.

  • SmartContext preserved: to force its use, pass the --smartcontext and --noshift flags together on the command line.

All credits go to LostRuins and the other contributors to KoboldCPP, and to GGerganov and all the other contributors to LlamaCPP.

Both builds (Cublas 12.3 and standard) include OpenBLAS, CLBLAST, and Vulkan support provided by the devs.

Full Changelog: v1.65a_b2824...v1.65b_b2836

Kobold.CPP_Frankenstein_v1.65a_b2824

09 May 11:28

Frankenstein 1.65a "Fork" of KoboldCPP Experimental up to the 9/05/2024, 13h GMT+2.
Based on Llama.CPP b2824.

All credits go to LostRuins and the other contributors to KoboldCPP, and to GGerganov and all the other contributors to LlamaCPP.

Both builds (Cublas 12.3 and standard) include OpenBLAS, CLBLAST, and Vulkan support provided by the devs.

Full Changelog: v1.64b_b2775...v1.65a_b2824

Kobold.CPP_Frankenstein_v1.64b_b2775_FlashAtt

01 May 04:15

Frankenstein 1.64b "Fork" of KoboldCPP Experimental up to the 1/05/2024, 6h GMT+2.
Based on Llama.CPP b2775, with Flash Attention merged.

I haven't tested it yet, just sharing for the impatient folks like me.

Edit: Flash Attention works.
Ex: on a Llama 70b model, compared with BBS128 without FA:
At BBS128 FA, the blas buffer size is divided by 6.5 for the same performance.
At BBS256 FA, 1.5x the performance for 1/3 of the blas buffer size.
At BBS512 FA, 2x the performance, and the blas buffer is still smaller (around 2/3 of the size).
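
To make those ratios concrete, here is a small sketch that scales them by a baseline buffer size. The ratios are the ones reported above for one Llama 70b setup; the 1000 MiB baseline is a made-up number for illustration only, and results on other models or hardware may differ.

```python
# Figures reported in the note, relative to BBS128 without FA.
# Each entry: (blas buffer size relative to baseline, speed relative to baseline).
REPORTED = {
    "BBS128 FA": (1 / 6.5, 1.0),
    "BBS256 FA": (1 / 3, 1.5),
    "BBS512 FA": (2 / 3, 2.0),
}


def estimate(baseline_mib: float) -> dict[str, tuple[float, float]]:
    """Scale the reported ratios by a baseline blas buffer size in MiB."""
    return {
        name: (baseline_mib * size_ratio, speed_ratio)
        for name, (size_ratio, speed_ratio) in REPORTED.items()
    }


# Hypothetical baseline: pretend BBS128 without FA needs 1000 MiB.
for name, (size_mib, speed) in estimate(1000.0).items():
    print(f"{name}: ~{size_mib:.0f} MiB, {speed:.1f}x speed")
```

Replace the baseline with the blas buffer size your own logs report at BBS128 without FA to get a rough idea of what each setting would need.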

All credits go to LostRuins and the other contributors to KoboldCPP, and to GGerganov and all the other contributors to LlamaCPP.

Both builds (Cublas 12.3 and standard) include OpenBLAS, CLBLAST, and Vulkan support provided by the devs.

Full Changelog: v1.64a_b2749...v1.64b_b2775

Kobold.CPP_Frankenstein_v1.64a_b2749

27 Apr 22:09

Frankenstein 1.64a "Fork" of KoboldCPP Experimental up to the 27/04/2024, 17h GMT+2.
Based on LlamaCPP b2749.

Full Changelog: v1.63d_b2723...v1.64a_b2749

v1.63d_b2723

25 Apr 07:13

Kobold.CPP_Frankenstein_v1.63c_b2716

23 Apr 21:25

Last release of KoboldCPP Experimental (23/04/2024, 20h GMT+2) with LCPP b2716 as a base.
Cuda version compiled with Cublas 12.3.

Full Changelog: v1.63b_b2699...v1.63c_b2716

Kobold.CPP_Frankenstein_v1.63b_b2699_FastMOE

20 Apr 01:06

Last release of KoboldCPP Experimental (19/04/2024, 20h GMT+2) with LCPP b2699 as a base.
Cuda version compiled with Cublas 12.3.

Kobold.CPP_Frankenstein_v1.63a_b2690_fastMOE

18 Apr 03:08

KCPP experimental 1.63a, with LCPP b2690, up to date on the 18/04/2024, 00h01.

With Slaren's PR accelerating MOE.

Cuda version compiled with Cublas 12.3.

Full Changelog: v1.62.2a_b2650...v1.63a_b2690

Kobold.CPP_Frankenstein_v1.62.2b_b2650_fastMOE

11 Apr 21:00

Requested release, compiled with Cublas 12.3.

Includes LlamaCPP b2650, the last experimental version of KCPP 1.62.2 as of 11/04/2024, 20h GMT+2, and Slaren's MOE speed-bump.

Untested. Feedback about speed would be appreciated, to compare against the last Frankenstein version compiled with Cublas 12.3 before this one (1.59d).

Koboldcpp_nocuda.exe: standard script from LostRuins.
Koboldcpp.exe: psutil (high CPU priority mode) added.
psutil is also integrated in my Cublas build.

What's Changed

Full Changelog: v1.62.1a_b2637...v1.62.2a_b2650