Releases: Nexesenex/croco.cpp

kobold.cpp-elephantastic_experimental_v1.43.b1216

16 Sep 13:44

KoboldCpp v1.43 with the CUDA/cuBLAS MMQ path fixed (buffers are now allocated properly from the start) and with unrestricted context size.
CodeLlama 34b in Q4_K_S can run with a 16384-token context on an RTX 3090/4090 used as a second graphics card.
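
As a rough sanity check on that claim, here is a back-of-the-envelope VRAM estimate. It is only a sketch: the architecture figures (48 layers, 8 KV heads, head dimension 128), the ~34e9 parameter count, the ~4.5 bits per weight for Q4_K_S, and the fp16 KV cache are assumptions not stated in these notes, and compute/scratch buffers are ignored.

```python
GIB = 1024 ** 3

def kv_cache_bytes(n_layers, n_ctx, n_kv_heads, head_dim, bytes_per_elem=2):
    # K and V each hold n_ctx * n_kv_heads * head_dim elements per layer.
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

def weight_bytes(n_params, bits_per_weight):
    # Approximate size of the quantized weights.
    return int(n_params * bits_per_weight / 8)

# Assumed figures for a 34b model with grouped-query attention (see lead-in).
kv = kv_cache_bytes(n_layers=48, n_ctx=16384, n_kv_heads=8, head_dim=128)
weights = weight_bytes(n_params=34e9, bits_per_weight=4.5)

print(f"KV cache: {kv / GIB:.2f} GiB")
print(f"Weights:  {weights / GIB:.2f} GiB")
print(f"Total:    {(kv + weights) / GIB:.2f} GiB (before compute buffers)")
```

Under these assumptions the total stays below 24 GiB, which is consistent with the note about using the card as a second GPU: a card that is not driving a display has more of its memory free for the model.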