Commit

Revert "Revert "set flags to optimize for mmq""
This reverts commit 7959e93.
LostRuins committed Jun 26, 2024
1 parent 7959e93 commit 70000b4
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions ggml-cuda/mmq.cuh
@@ -8,6 +8,7 @@
 #include <cstdint>

 #define MMQ_DP4A_MAX_BATCH_SIZE 64 // Max. batch size to use for dp4a MMQ kernels when FP16 tensor cores are available.
+#define GGML_CUDA_FORCE_MMQ

 typedef void (*load_tiles_mmq_t)(const char * __restrict__ x, int * x_tile, const int & kbx0, const int & i_max, const int & stride);
 typedef void (*vec_dot_mmq_t)(const int * __restrict__ x, const int * __restrict__ y, float * __restrict__ sum, const int & k0);
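
To illustrate what a force flag like this one typically does, here is a minimal sketch of a compile-time gate. It is not the actual ggml code: the helper `should_use_mmq_sketch` and its `fp16_tensor_cores` parameter are hypothetical, and the non-forced branch is only an assumption inferred from the comment on `MMQ_DP4A_MAX_BATCH_SIZE`.

```cpp
#include <cstdint>

#define MMQ_DP4A_MAX_BATCH_SIZE 64 // same value as in the diff
#define GGML_CUDA_FORCE_MMQ        // the define this commit adds; comment out to take the other path

// Hypothetical helper for illustration only, not a function from ggml.
// Returns true when the MMQ (quantized matrix multiplication) kernels would be chosen.
static bool should_use_mmq_sketch(int64_t batch_size, bool fp16_tensor_cores) {
#ifdef GGML_CUDA_FORCE_MMQ
    // With the flag defined, MMQ is selected unconditionally.
    (void) batch_size;
    (void) fp16_tensor_cores;
    return true;
#else
    // Assumed fallback: without tensor cores always use MMQ; with them,
    // use MMQ only up to the maximum dp4a batch size.
    if (!fp16_tensor_cores) {
        return true;
    }
    return batch_size <= MMQ_DP4A_MAX_BATCH_SIZE;
#endif
}
```

Because the define sits in a header, a gate of this kind would only see it in translation units that include `mmq.cuh` before the check is compiled.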