
sync Prepare pr of minicpm v2.6 #26

Merged · 370 commits · Aug 15, 2024

This pull request is big! Only the most recent 250 commits are shown.

Commits on Jul 7, 2024

  1. finetune: Rename command name in README.md (ggerganov#8343)

    Rename an old command name "finetune" to "llama-finetune"
    in README.md
    
    Signed-off-by: Masanari Iida <standby24x7@gmail.com>
    standby24x7 authored Jul 7, 2024
    Commit b81ba1f
  2. Commit d39130a
  3. llama : fix n_rot default (ggerganov#8348)

    ggml-ci
    ggerganov authored Jul 7, 2024
    Commit b504008
  4. llama : support glm3 and glm4 (ggerganov#8031)

    * add chatglm3-6b model support (huggingface model: https://hf-mirror.com/THUDM/chatglm3-6b)
    
    Signed-off-by: XingXing Qiao <qiaoxx@dingdao.com>
    
    * remove .rotary_pos_emb.inv_freq and unused code for chatglm3 model
    
    Signed-off-by: XingXing Qiao <qiaoxx@dingdao.com>
    
    * fix lint error
    
    Signed-off-by: XingXing Qiao <qiaoxx@dingdao.com>
    
    * optimize convert-hf-to-gguf.py for chatglm model
    
    Signed-off-by: XingXing Qiao <qiaoxx@dingdao.com>
    
    * support glm-4-9b-chat
    
    Signed-off-by: XingXing Qiao <qiaoxx@dingdao.com>
    
    * fix eos tokens to glm4
    
    * remove unused log
    
    * add preprocess to chatglm3 and chatglm4
    
    * add eos_id_list to llama.cpp
    
    * fix code style
    
    * fix code style
    
    * fix conflicts
    
    * fix conflicts
    
    * Revert "add eos_id_list to llama.cpp"
    
    This reverts commit 3a4d579.
    
    * set <|endoftext|> as eos and <|user|> as eot
    
    * fix chat template bug
    
    * add comment to glm prefix and suffix
    
    * fix conflicts and add rope_ratio & ChatGLMForConditionalGeneration
    
    * fix chat template bug
    
    * fix codestyle
    
    * fix conflicts
    
    * modified the general name of glm model
    
    * fix conflicts
    
    * remove prefix and suffix
    
    * use normal glm4 chat template & use LLM_FFN_SWIGLU in phi3
    
    * fix: resolve Flake8 errors in `convert-hf-to-gguf.py`
    
    - Fix E302 by adding two blank lines before top-level function definitions
    - Replace print statements to fix NP100
    - Fix E303 by ensuring only one blank line between lines of code
    
    * fix rope ratio to solve incorrect answers
    
    * fix by comments
    
    ---------
    
    Signed-off-by: XingXing Qiao <qiaoxx@dingdao.com>
    Co-authored-by: XingXing Qiao <qiaoxx@dingdao.com>
    Co-authored-by: Umpire2018 <138990495+Umpire2018@users.noreply.github.com>
    3 people authored Jul 7, 2024
    Commit 905942a
  5. gguf-hash: model wide and per tensor hashing using xxhash and sha1 (ggerganov#8048)
    
    CLI to hash GGUF files to detect differences on a per-model and per-tensor level

    The supported hash types are:

    - `--xxh64`: use xxhash 64-bit hash mode (default)
    - `--sha1`: use sha1
    - `--uuid`: use uuid
    - `--sha256`: use sha256

    While most POSIX systems already have hash-checking programs like sha256sum, they are
    designed to check entire files. This is not ideal for our purpose if we want to check
    the consistency of the tensor data even when the metadata content of the gguf KV store
    has been updated.

    This program is designed to hash a gguf tensor payload on a per-tensor-layer basis in
    addition to an entire-tensor-model hash. The intent is that the entire tensor model can
    be checked first, and if any inconsistency is detected, the per-tensor hashes can be
    used to narrow down the specific tensor layer that is inconsistent (see the sketch below).
    
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    mofosyne and ggerganov authored Jul 7, 2024
    Commit f7cab35
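    A minimal sketch of the layered-hash idea described above, assuming a hypothetical `Tensor` record holding each tensor's name and raw payload (the real tool parses these out of the GGUF file; this is not its actual code):

    ```cpp
    #include <xxhash.h>   // one-shot XXH64() and streaming API from the xxHash library
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Tensor {          // hypothetical stand-in for a parsed GGUF tensor
        const char * name;
        const void * data;
        size_t       size;
    };

    void hash_model(const std::vector<Tensor> & tensors) {
        XXH64_state_t * model_state = XXH64_createState();
        XXH64_reset(model_state, 0);
        for (const Tensor & t : tensors) {
            // per-tensor hash: pinpoints exactly which layer is inconsistent
            printf("xxh64  %016llx  %s\n",
                   (unsigned long long) XXH64(t.data, t.size, 0), t.name);
            // model-wide hash: cheap first-pass consistency check
            XXH64_update(model_state, t.data, t.size);
        }
        printf("xxh64  %016llx  (whole model)\n",
               (unsigned long long) XXH64_digest(model_state));
        XXH64_freeState(model_state);
    }
    ```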
  6. readme : update bindings list (ggerganov#8222)

    * adding guile_llama_cpp  to binding list
    
    * fix formatting
    
    * fix formatting
    andy-tai authored Jul 7, 2024
    Commit f1948f1
  7. ci : add checks for cmake, make and ctest in ci/run.sh (ggerganov#8200)

    * Added checks for cmake, make and ctest
    
    * Removed erroneous whitespace
    AlexsCode authored Jul 7, 2024
    Commit 4090ea5
  8. Update llama-cli documentation (ggerganov#8315)

    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    fixed llama-cli/main templates on some cmds, added chat template sections, and fixed typos in some areas
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    dspasyuk authored Jul 7, 2024
    Commit a8db2a9
  9. py : type-check all Python scripts with Pyright (ggerganov#8341)

    * py : type-check all Python scripts with Pyright
    
    * server-tests : use trailing slash in openai base_url
    
    * server-tests : add more type annotations
    
    * server-tests : strip "chat" from base_url in oai_chat_completions
    
    * server-tests : model metadata is a dict
    
    * ci : disable pip cache in type-check workflow
    
    The cache is not shared between branches, and it's 250MB in size,
    so it would become quite a big part of the 10GB cache limit of the repo.
    
    * py : fix new type errors from master branch
    
    * tests : fix test-tokenizer-random.py
    
    Apparently, gcc applies optimisations even when pre-processing,
    which confuses pycparser.
    
    * ci : only show warnings and errors in python type-check
    
    The "information" level otherwise has entries
    from 'examples/pydantic_models_to_grammar.py',
    which could be confusing for someone trying to figure out what failed,
    considering that these messages can safely be ignored
    even though they look like errors.
    compilade authored Jul 7, 2024
    Commit 3fd62a6

Commits on Jul 8, 2024

  1. Commit 04ce3a8
  2. Commit ffd0079
  3. Commit 6f0dbf6
  4. common : preallocate sampling token data vector (ggerganov#8363)

    Calling `emplace_back` repeatedly is slower than preallocating the vector to the vocab size and inserting the data directly. Some rudimentary profiling with `chrono` shows this change improving the performance of this block of code from ~500us/op to ~40us/op (a sketch of the pattern follows this entry).
    
    Overall, this slightly improves the sampling performance, which has a more substantial impact on the `examples/lookahead` implementation -- I am able to see a ~10% performance boost in lookahead inference.
    kevmo314 authored Jul 8, 2024
    Commit 470939d
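    A minimal sketch of the two fill strategies, using a stand-in `token_data` struct rather than the actual llama.cpp types:

    ```cpp
    #include <cstdint>
    #include <vector>

    struct token_data {   // stand-in for the real candidate-token struct
        int32_t id;
        float   logit;
        float   p;
    };

    // slower: every emplace_back performs a growth/bookkeeping check
    std::vector<token_data> fill_push(const float * logits, int32_t n_vocab) {
        std::vector<token_data> cur;
        for (int32_t id = 0; id < n_vocab; id++) {
            cur.emplace_back(token_data{id, logits[id], 0.0f});
        }
        return cur;
    }

    // faster: size the vector once up front, then write each slot directly
    std::vector<token_data> fill_prealloc(const float * logits, int32_t n_vocab) {
        std::vector<token_data> cur;
        cur.resize(n_vocab);
        for (int32_t id = 0; id < n_vocab; id++) {
            cur[id] = token_data{id, logits[id], 0.0f};
        }
        return cur;
    }
    ```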
  5. feat: cuda implementation for ggml_conv_transpose_1d (ggml/854)

    * conv transpose 1d passing test for 1d input and kernel
    
    * working for different input and output channel counts, added test for variable stride
    
    * initial draft appears to work with stride other than 1
    
    * working with all old and new conv1d  tests
    
    * added a test for large tensors
    
    * removed use cuda hardcoding
    
    * restored test-conv-transpose.c
    
    * removed unused arguments, and fixed bug where test failure would cause subsequent tests to fail
    
    * fixed accumulator bug
    
    * added test to test-backend-ops
    
    * fixed mistake
    
    * addressed review
    
    * fixed includes
    
    * removed blank lines
    
    * style and warning fixes
    
    * return failure when test fails
    
    * fix supports_op
    
    ---------
    
    Co-authored-by: slaren <slarengh@gmail.com>
    2 people authored and ggerganov committed Jul 8, 2024
    Commit fde13b3
  6. Commit 6847d54
  7. sync : ggml

    ggml-ci
    ggerganov committed Jul 8, 2024
    Commit 2ee44c9
  8. Commit 3f2d538
  9. Commit 2ec846d
  10. Commit c4dd11d
  11. Commit a130ecc
  12. flake.lock: Update (ggerganov#8342)

    Flake lock file updates:
    
    • Updated input 'flake-parts':
        'github:hercules-ci/flake-parts/2a55567fcf15b1b1c7ed712a2c6fadaec7412ea8?narHash=sha256-iKzJcpdXih14qYVcZ9QC9XuZYnPc6T8YImb6dX166kw%3D' (2024-06-01)
      → 'github:hercules-ci/flake-parts/9227223f6d922fee3c7b190b2cc238a99527bbb7?narHash=sha256-pQMhCCHyQGRzdfAkdJ4cIWiw%2BJNuWsTX7f0ZYSyz0VY%3D' (2024-07-03)
    • Updated input 'flake-parts/nixpkgs-lib':
        'https://github.com/NixOS/nixpkgs/archive/eb9ceca17df2ea50a250b6b27f7bf6ab0186f198.tar.gz?narHash=sha256-lIbdfCsf8LMFloheeE6N31%2BBMIeixqyQWbSr2vk79EQ%3D' (2024-06-01)
      → 'https://github.com/NixOS/nixpkgs/archive/5daf0514482af3f97abaefc78a6606365c9108e2.tar.gz?narHash=sha256-Fm2rDDs86sHy0/1jxTOKB1118Q0O3Uc7EC0iXvXKpbI%3D' (2024-07-01)
    • Updated input 'nixpkgs':
        'github:NixOS/nixpkgs/b2852eb9365c6de48ffb0dc2c9562591f652242a?narHash=sha256-C8e9S7RzshSdHB7L%2Bv9I51af1gDM5unhJ2xO1ywxNH8%3D' (2024-06-27)
      → 'github:NixOS/nixpkgs/9f4128e00b0ae8ec65918efeba59db998750ead6?narHash=sha256-rwz8NJZV%2B387rnWpTYcXaRNvzUSnnF9aHONoJIYmiUQ%3D' (2024-07-03)
    
    Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
    ggerganov and github-actions[bot] authored Jul 8, 2024
    Commit 7fdb6f7

Commits on Jul 9, 2024

  1. Commit 7d0e23d
  2. readme : fix typo [no ci] (ggerganov#8389)

    Bakus-Naur --> Backus-Naur
    daghanerdonmez authored Jul 9, 2024
    Commit 9beb2dd
  3. Commit 9925ca4
  4. sycl : Reenabled mmvq path for the SYCL Nvidia Backend (ggerganov#8372)

    * SYCL : Reenabled mmvq path for the SYCL Nvidia Backend
    
    * Reduced verbosity of comment
    Alcpz authored Jul 9, 2024
    Commit 5b0b8d8
  5. Commit a03e8dd
  6. Deprecation warning to assist with migration to new binary names (ggerganov#8283)
    
    * Adding a simple program that provides a deprecation warning, to help people notice the binary name change from ggerganov#7809 and migrate to the new filenames.
    
    * Build legacy replacement binaries only if they already exist. Check for their existence every time so that they are not ignored.
    HanClinto authored Jul 9, 2024
    Commit e500d61
  7. Update README.md to fix broken link to docs (ggerganov#8399)

    Update the "Performance troubleshooting" doc link to be correct - the file was moved into a dir called 'development'
    andysalerno authored Jul 9, 2024
    Commit fd560fe
  8. Server: Enable setting default sampling parameters via command-line (ggerganov#8402)
    
    * Load server sampling parameters from the server context by default.
    
    * Wordsmithing comment
    HanClinto authored Jul 9, 2024
    Commit a59f8fd

Commits on Jul 10, 2024

  1. Commit 8f0fad4
  2. py : fix converter for internlm2 (ggerganov#8321)

    * update internlm2
    
    * remove unused file
    
    * fix lint
    RunningLeon authored Jul 10, 2024
    Commit e4dd31f
  3. llama : add assert about missing llama_encode() call (ggerganov#8400)

    Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
    fairydreaming and sszymczy authored Jul 10, 2024
    Commit a8be1e6
  4. Commit 7a80710
  5. Commit cc61948
  6. gguf-py rel pipeline (ggerganov#8410)

    * Upd gguf-py/readme
    
    * Bump patch version for release
    monatis authored Jul 10, 2024
    Commit 83321c6
  7. ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (ggerganov#5780)

    * Arm AArch64: optimized GEMV and GEMM kernels for q4_0_q8_0, and q8_0_q8_0 quantization
    
    * Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
    
    * Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
    
    * Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
    
    * Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
    
    * Arm AArch64: add copyright claim only to ggml-aarch64.cpp and ggml-aarch64.h files
    
    * Arm AArch64: minor code refactoring for rebase
    
    * Arm AArch64: minor code refactoring for resolving a build issue with cmake
    
    * Arm AArch64: minor code refactoring to split the Q4_0_AARC64 type into three separate types: Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8
    
    * Arm AArch64: minor code change for resolving a build issue with server-windows
    
    * retrigger checks
    
    * Arm AArch64: minor code changes for rebase
    
    * Arm AArch64: minor changes to skip the pr#7433 vec_dot code for arm cpus with SVE VL not equal to 256 bits
    
    * Arm AArch64: remove stale LLAMA_QKK_64 from CMakeLists.txt and delete build.zig
    
    * Arm AArch64: add reference scalar gemm and gemv, and avoid dynamic memory allocations during quantization for Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8
    
    * Arm AArch64: add multithreaded quantization support for the new types: Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8
    
    * Arm AArch64: minor code refactoring
    
    * Arm AArch64: simplify logic for calling gemm and gemv functions in ggml_compute_forward_mul_mat
    
    * Arm AArch64: minimize changes in ggml_compute_forward_mul_mat
    
    * Arm AArch64: minor code refactoring, and add reference scalar code to quantize routines for new quant types
    
    * Arm AArch64: minor code refactoring
    
    * Arm AArch64: minor code refactoring
    
    * Arm AArch64: minor code refactoring
    
    * rebase on the latest master commit 3fd62a6 and adapt to the new directory structure
    
    * Arm AArch64: remove a redundant comment
    
    * Arm AArch64: add pragma in ggml-aarch64.c to turn -Woverlength-strings warning off
    
    * Arm AArch64: use __aarch64__ check to guard 64-bit neon kernels
    
    * Arm AArch64: update docs/build.md README to include compile time flags for building the Q4_0_4_4 quant type
    Dibakar authored Jul 10, 2024
    Commit 0f1a39f
  8. Commit 6b2a849
  9. Commit f4444d9
  10. Name Migration: Build the deprecation-warning 'main' binary every time (ggerganov#8404)
    
    * Modify the deprecation-warning 'main' binary to build every time, instead of only when a legacy binary is present. This is to help users of tutorials and other instruction sets know what to do when the 'main' binary is missing and they are trying to follow instructions.
    
    * Adjusting 'server' name-deprecation binary to build all the time, similar to the 'main' legacy name binary.
    HanClinto authored Jul 10, 2024
    Commit dd07a12

Commits on Jul 11, 2024

  1. Commit 278d0e1
  2. Commit 7a221b6
  3. tokenize : add --no-parse-special option (ggerganov#8423)

    This should make it easier to explain how parse_special affects tokenization.
    compilade authored Jul 11, 2024
    Commit 9a55ffe
  4. Commit a977c11
  5. CUDA: optimize and refactor MMQ (ggerganov#8416)

    * CUDA: optimize and refactor MMQ
    
    * explicit q8_1 memory layouts, add documentation
    JohannesGaessler authored Jul 11, 2024
    Commit 808aba3
  6. cuda : suppress 'noreturn' warn in no_device_code (ggerganov#8414)

    * cuda : suppress 'noreturn' warn in no_device_code
    
    This commit adds a while(true) loop to the no_device_code function in
    common.cuh. This is done to suppress the warning:
    
    ```console
    /ggml/src/ggml-cuda/template-instances/../common.cuh:346:1: warning:
    function declared 'noreturn' should not return [-Winvalid-noreturn]
      346 | }
          | ^
    ```
    
    The motivation for this is to reduce the number of warnings when
    compiling with GGML_HIPBLAS=ON (a rough analogue of the fix follows this entry).
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    
    * squash! cuda : suppress 'noreturn' warn in no_device_code
    
    Update __trap macro instead of using a while loop to suppress the
    warning.
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    
    ---------
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    danbev authored Jul 11, 2024
    Commit b078c61
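    A rough analogue of the final approach (illustrative names, not the actual common.cuh code): make the trap macro itself loop, so the compiler can prove that control never falls out of a `[[noreturn]]` function.

    ```cpp
    #include <cstdio>
    #include <cstdlib>

    // illustrative stand-in for the CUDA/HIP __trap() macro
    #define TRAP() do { abort(); } while (true)

    [[noreturn]] static void no_device_code() {
        fprintf(stderr, "no device code available for this architecture\n");
        TRAP();   // control provably never reaches the end of the function,
                  // so -Winvalid-noreturn has nothing to warn about
    }

    int main() { no_device_code(); }
    ```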
  7. ggml : add NVPL BLAS support (ggerganov#8329) (ggerganov#8425)

    * ggml : add NVPL BLAS support
    
    * ggml : replace `<BLASLIB>_ENABLE_CBLAS` with `GGML_BLAS_USE_<BLASLIB>`
    
    ---------
    
    Co-authored-by: ntukanov <ntukanov@nvidia.com>
    nicholaiTukanov and ntukanov authored Jul 11, 2024
    Commit 3686456

Commits on Jul 12, 2024

  1. [SYCL] fix the mul_mat_id ut issues (ggerganov#8427)

    * fix part of mul_mat_id
    
    * skip the bfloat 16 sycl ut
    
    Signed-off-by: Chen Xi <xi2chen@intel.com>
    
    ---------
    
    Signed-off-by: Chen Xi <xi2chen@intel.com>
    Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com>
    Co-authored-by: Chen Xi <xi2chen@intel.com>
    3 people authored Jul 12, 2024
    Commit b549a1b
  2. ggml : minor naming changes (ggerganov#8433)

    * ggml : minor naming changes
    
    ggml-ci
    
    * ggml : use PRId64 [no ci]
    
    * ggml : revert FA K/Q names
    ggerganov authored Jul 12, 2024
    Commit 370b1f7
  3. examples : sprintf -> snprintf (ggerganov#8434)

    * examples : sprintf -> snprintf
    
    ggml-ci
    
    * examples : use sizeof() instead of hardcoded constants
    ggerganov authored Jul 12, 2024
    Commit 71c1121
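    The pattern behind both changes, sketched with a hypothetical format string:

    ```cpp
    #include <cstddef>
    #include <cstdio>

    void format_split_name(char * out, size_t out_size, int split_no) {
        // before: sprintf(out, "-%05d-of-", split_no);   // unbounded write
        // after: the write is bounded by the buffer size passed in
        snprintf(out, out_size, "-%05d-of-", split_no);
    }

    int main() {
        char buf[32];
        format_split_name(buf, sizeof(buf), 7);   // sizeof() instead of a constant
        printf("%s\n", buf);
        return 0;
    }
    ```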
  4. convert : remove fsep token from GPTRefactForCausalLM (ggerganov#8237)

    The <filename> token used by Refact doesn't serve
    the same purpose as the <file_separator> from CodeGemma.
    
    Signed-off-by: Jiri Podivin <jpodivin@redhat.com>
    jpodivin authored Jul 12, 2024
    Commit 5aefbce
  5. docker : fix filename for convert-hf-to-gguf.py in tools.sh (ggerganov#8441)
    
    Commit b0a4699 changed the name of this script from convert-hf-to-gguf.py to
    convert_hf_to_gguf.py, breaking how convert is called from within a Docker
    container.
    kriation authored Jul 12, 2024
    Commit 8a4441e
  6. server : ensure batches are either all embed or all completion (ggerganov#8420)
    
    * make sure batches are all embed or all non-embed
    
    * non-embedding batch for sampled tokens; fix unused params warning
    iamlemec authored Jul 12, 2024
    Commit c3ebcfa
  7. llama : suppress unary minus operator warning (ggerganov#8448)

    This commit updates the _try_copy lambda and moves the unary minus
    operator to after the cast to int32_t.
    
    The motivation for this is that currently the following warning is
    generated on windows (a miniature example follows this entry):
    
    ```console
    llama.cpp\src\llama.cpp(21147,30): warning C4146: unary minus operator
    applied to unsigned type, result still unsigned
    ```
    danbev authored Jul 12, 2024
    Commit f532262
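    The warning in miniature (illustrative values, not the actual llama.cpp expression):

    ```cpp
    #include <cstdint>

    int main() {
        uint32_t u = 5;
        // int32_t bad = -u;                      // C4146 on MSVC: unary minus
        //                                        // applied to unsigned type
        int32_t good = -static_cast<int32_t>(u);  // cast first, then negate
        return good == -5 ? 0 : 1;
    }
    ```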
  8. Commit 6af51c0
  9. server : handle content array in chat API (ggerganov#8449)

    * server : handle content array in chat API
    
    * Update examples/server/utils.hpp
    
    Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
    
    ---------
    
    Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
    ggerganov and ngxson authored Jul 12, 2024
    Commit 4e24cff

Commits on Jul 13, 2024

  1. Commit c917b67
  2. vulkan : cmake integration (ggerganov#8119)

    * Add Vulkan to CMake pkg
    
    * Add Sycl to CMake pkg
    
    * Add OpenMP to CMake pkg
    
    * Split generated shader file into separate translation unit
    
    * Add CMake target for Vulkan shaders
    
    * Update README.md
    
    * Add make target for Vulkan shaders
    
    * Use pkg-config to locate vulkan library
    
    * Add vulkan SDK dep to ubuntu-22-cmake-vulkan workflow
    
    * Clean up tabs
    
    * Move sudo to apt-key invocation
    
    * Forward GGML_EXTRA_LIBS to CMake config pkg
    
    * Update vulkan obj file paths
    
    * Add shaderc to nix pkg
    
    * Add python3 to Vulkan nix build
    
    * Link against ggml in cmake pkg
    
    * Remove Python dependency from Vulkan build
    
    * code review changes
    
    * Remove trailing newline
    
    * Add cflags from pkg-config to fix w64devkit build
    
    * Update README.md
    
    * Remove trailing whitespace
    
    * Update README.md
    
    * Remove trailing whitespace
    
    * Fix doc heading
    
    * Make glslc required Vulkan component
    
    * remove clblast from nix pkg
    bandoti authored Jul 13, 2024
    Commit 17eb6aa

Commits on Jul 14, 2024

  1. llama : fix pre-tokenization of non-special added tokens (ggerganov#8228)
    
    * llama : fix mpt and olmo pre-tokenizer
    
    * llama : pre-tokenize non-special user-defined tokens first
    
    * llama : fix detection of control-like user-defined tokens
    
    * convert_hf : identify which user-defined tokens are control tokens
    
    Only used in _set_vocab_gpt2() for now.
    
    * convert_hf : identify more added control tokens for SPM tokenizers
    
    This makes Gemma and Gemma-2 tokenize pretty much EVERYTHING correctly,
    including HTML tags and consecutive spaces,
    but it unfortunately requires model re-conversion.
    
    There seems to be a weird behavior of the HF tokenizer for Gemma,
    which prefers to use the 16-space token over more lengthy space tokens,
    while using the SentencePiece tokenizer does not do this.
    (the implementation in llama.cpp has the same behavior as SentencePiece)
    
    * llama : fix wrong pre-tokenization of byte tokens
    
    * llama : fix Viking pre-tokenizer regex
    
    The order was previously wrong, which caused errors in some tests.
    
    * llama : fix command-r detokenization
    
    * convert_hf : reduce usages of the UNKNOWN token type
    
    * llama : add UNKNOWN tokens in the special tokens cache
    
    * convert_hf : reduce usages of UNKNOWN for InternLM2
    
    This makes the changes from ggerganov#8321 more consistent
    with the other changes made here.
    
    * test-tokenizer-random : reduce potential conflicts with ggerganov#8379
    
    * test-tokenizer-random : add a failing edge case for falcon
    compilade authored Jul 14, 2024
    Commit fa79495
  2. gguf_hash.py: Add sha256 (ggerganov#8470)

    * gguf_hash.py: Add sha256
    
    * gguf_hash.py: rename string UUIDv5 --> uuid
    
    * Apply suggestions from code review
    
    Co-authored-by: compilade <git@compilade.net>
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    mofosyne and compilade authored Jul 14, 2024
    Commit e236528
  3. llama : fix Gemma-2 Query scaling factors (ggerganov#8473)

    * 9B - query_pre_attn_scalar = 256 not 224
    
    See google/gemma_pytorch@03e6575
    
    Gemma 9b should use 256 and not 224 (224 = self.config.hidden_size // self.config.num_attention_heads)
    
    * llama : fix Gemma-2 Query scaling factor
    
    ggml-ci
    
    ---------
    
    Co-authored-by: Daniel Han <danielhanchen@gmail.com>
    ggerganov and danielhanchen authored Jul 14, 2024
    Commit 73cf442
  4. flake.lock: Update (ggerganov#8475)

    Flake lock file updates:
    
    • Updated input 'nixpkgs':
        'github:NixOS/nixpkgs/9f4128e00b0ae8ec65918efeba59db998750ead6?narHash=sha256-rwz8NJZV%2B387rnWpTYcXaRNvzUSnnF9aHONoJIYmiUQ%3D' (2024-07-03)
      → 'github:NixOS/nixpkgs/7e7c39ea35c5cdd002cd4588b03a3fb9ece6fad9?narHash=sha256-EYekUHJE2gxeo2pM/zM9Wlqw1Uw2XTJXOSAO79ksc4Y%3D' (2024-07-12)
    
    Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
    ggerganov and github-actions[bot] authored Jul 14, 2024
    Commit aaab241
  5. pydantic : replace uses of __annotations__ with get_type_hints (ggerganov#8474)
    
    * pydantic : replace uses of __annotations__ with get_type_hints
    
    * pydantic : fix Python 3.9 and 3.10 support
    compilade authored Jul 14, 2024
    Commit 090fca7

Commits on Jul 15, 2024

  1. Vulkan MMQ Fix (ggerganov#8479)

    * Fix incoherence by adding missing LOAD_VEC_A parameter
    
    * Fix Vulkan op result checker build error
    0cc4m authored Jul 15, 2024
    Commit bda62d7
  2. Commit 3dfda05
  3. [SYCL] add concat through dim 1/2 (ggerganov#8483)

    * add concat through dim 1/2
    airMeng authored Jul 15, 2024
    Commit 16bdfa4
  4. docs: fix links in development docs [no ci] (ggerganov#8481)

    Fixes a few links to within the repo that were broken in the reorganization of the
    documentation in ggerganov#8325.
    NikolaiLyssogor authored Jul 15, 2024
    Commit fc690b0
  5. Commit 9104bc2
  6. server: update README.md with llama-server --help output [no ci] (ggerganov#8472)
    
    The README.md had stale information. In particular, the --ctx-size
    "defaults to 512" claim confused me, and I had to check the code to confirm
    this was false. Since the server is evolving rapidly, it's probably
    better to keep the source of truth in a single place (the source code) and
    generate the README.md based on that.
    
    Did:
    
        make llama-server
        ./llama-server --help > t.txt
        vimdiff t.txt examples/server/README.md
    
    I copied the content inside a backquote block. I would have preferred
    proper text but it would require a fair amount of surgery to make the
    current output compatible with markdown. A follow up could be to
    automate this process with a script.
    
    No functional change.
    maruel authored Jul 15, 2024
    Commit f17f39f
  7. ggml : suppress unknown pragma 'GCC' on windows (ggerganov#8460)

    This commit adds a macro guard to pragma GCC to avoid the following
    warning on windows:
    
    ```console
    C:\llama.cpp\ggml\src\ggml-aarch64.c(17,9): warning C4068:
    unknown pragma 'GCC' [C:\lama.cpp\build\ggml\src\ggml.vcxproj]
    ```
    danbev authored Jul 15, 2024
    Commit 8fac431
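    A sketch of the guard pattern (the exact pragma used in ggml-aarch64.c may differ):

    ```cpp
    // MSVC does not understand GCC pragmas and emits warning C4068,
    // so only emit them for compilers that define __GNUC__
    #if defined(__GNUC__)
    #pragma GCC diagnostic ignored "-Woverlength-strings"
    #endif

    int main() { return 0; }
    ```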
  8. fix ci (ggerganov#8494)

    ngxson authored Jul 15, 2024
    Commit 4db8f60
  9. Refactor lora adapter support (ggerganov#8332)

    * lora: load to device buft
    
    * add patch tensor function
    
    * correct tensor patch
    
    * llama_lora_adapter_apply
    
    * correct ggml_backend_tensor_copy
    
    * add llm_build_mm
    
    * fix auto merge
    
    * update based on review comments
    
    * add convert script
    
    * no more transpose A
    
    * add f16 convert
    
    * add metadata check
    
    * add sanity check
    
    * fix ftype
    
    * add requirements
    
    * fix requirements
    
    * fix outfile
    
    * conversion: only allow selected models
    
    * fix types
    
    * cuda : do not use dmmv if the tensor does not have enough cols
    
    * llama : lora fixes
    
    * do not disable mmap with lora
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    * llm_build_lora_mm_id
    
    * convert_lora : MoE LoRA conversion support
    
    * convert_lora : prefer safetensors, similarly to convert_hf
    
    * convert_hf : simplify modify_tensors for InternLM2
    
    * convert_lora : lazy conversion
    
    * llama : load and use alpha from LoRA adapters
    
    * llama : use llm_build_lora_mm in most model graphs
    
    * auto scale
    
    * Revert "auto scale"
    
    This reverts commit 42415a4.
    
    * remove redundant params
    
    * Apply suggestions from code review
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    * change kv metadata
    
    * move add_type to __init__
    
    * convert_hf : move add_type to main()
    
    * convert_lora : use the GGUFWriter from Model instead of overwriting it
    
    ---------
    
    Co-authored-by: slaren <slarengh@gmail.com>
    Co-authored-by: Francis Couture-Harpin <git@compilade.net>
    3 people authored Jul 15, 2024
    Commit 97bdd26

Commits on Jul 16, 2024

  1. convert_hf : faster lazy safetensors (ggerganov#8482)

    * convert_hf : faster lazy safetensors
    
    This makes '--dry-run' much, much faster.
    
    * convert_hf : fix memory leak in lazy MoE conversion
    
    The '_lazy' queue was sometimes self-referential,
    which caused reference cycles of objects old enough
    to avoid garbage collection until potential memory exhaustion.
    compilade authored Jul 16, 2024
    Commit 7acfd4e
  2. Commit 0efec57
  3. export-lora : handle help argument (ggerganov#8497)

    The --help option on export-lora isn't accepted as valid. The help still gets displayed by default, but the script exits with an error message and nonzero status.
    sbonds authored Jul 16, 2024
    Commit 37b12f9
  4. gguf-hash : update clib.json to point to original xxhash repo (ggerganov#8491)
    
    * Update clib.json to point to Cyan4973 original xxhash
    
    Convinced Cyan4973 to add clib.json directly to his repo, so the clib package can now point directly to it. Previously it pointed to my fork with the clib.json package metadata.
    
    Cyan4973/xxHash#954
    
    * gguf-hash: readme update to point to Cyan4973 xxHash repo [no ci]
    mofosyne authored Jul 16, 2024
    Commit 1666f92
  5. Commit 5e116e8

Commits on Jul 17, 2024

  1. Commit d65a836
  2. Commit da3913d
  3. [CANN] Add Ascend NPU backend (ggerganov#6035)

    * [CANN] Add Ascend NPU backend
    
    Ascend is a full-stack AI computing infrastructure for industry
    applications and services based on Huawei Ascend processors and
    software.
    
    CANN (Compute Architecture of Neural Networks), developed by
    Huawei, is a heterogeneous computing architecture for AI.
    
    Co-authored-by: wangshuai09 <391746016@qq.com>
    
    * delete trailing whitespaces
    
    * Modify the code based on review comment
    
    * Rename LLAMA_CANN to GGML_CANN
    
    * Make ggml-common.h private
    
    * add ggml_cann prefix for acl funcs
    
    * Add logging for CANN backend
    
    * Delete Trailing whitespace
    
    ---------
    
    Co-authored-by: wangshuai09 <391746016@qq.com>
    hipudding and wangshuai09 authored Jul 17, 2024
    Commit 1bdd8ae
  4. Commit 30f80ca
  5. Commit b328344
  6. Commit e02b597

Commits on Jul 18, 2024

  1. Commit 3807c3d
  2. convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (ggerganov#7499)
    
    Main thing is that the default output filename will take this form
    
    {name}{parameters}{finetune}{version}{encoding}{kind}
    
    In addition this adds and removes some entries in the KV store, and adds a metadata class with automatic heuristics to derive some values based on model card content. An illustrative example filename follows.
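
    For example, under this scheme a converted model might end up named something like `Mistral-7B-Instruct-v0.3-F16.gguf` (an illustrative name, not taken from the PR): `Mistral` is the name, `7B` the parameter size label, `Instruct` the finetune, `v0.3` the version, and `F16` the encoding.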
    
    * No Change:
      - Internal GGUF Spec
        - `general.architecture`
        - `general.quantization_version`
        - `general.alignment`
        - `general.file_type`
      - General Model Details
        - `general.name`
        - `general.author`
        - `general.version`
        - `general.description`
      - Licensing details
        - `general.license`
      - Typically represents the converted GGUF repo (Unless made from scratch)
        - `general.url`
      - Model Source during conversion
        - `general.source.url`
    
    * Removed:
      - Model Source during conversion
        - `general.source.huggingface.repository`
    
    * Added:
      - General Model Details
        - `general.organization`
        - `general.finetune`
        - `general.basename`
        - `general.quantized_by`
        - `general.size_label`
      - Licensing details
        - `general.license.name`
        - `general.license.link`
      - Typically represents the converted GGUF repo (Unless made from scratch)
        - `general.doi`
        - `general.uuid`
        - `general.repo_url`
      - Model Source during conversion
        - `general.source.doi`
        - `general.source.uuid`
        - `general.source.repo_url`
      - Base Model Source
        - `general.base_model.count`
        - `general.base_model.{id}.name`
        - `general.base_model.{id}.author`
        - `general.base_model.{id}.version`
        - `general.base_model.{id}.organization`
        - `general.base_model.{id}.url` (Model Website/Paper)
        - `general.base_model.{id}.doi`
        - `general.base_model.{id}.uuid`
        - `general.base_model.{id}.repo_url` (Model Source Repository (git/svn/etc...))
      - Array based KV stores
        - `general.tags`
        - `general.languages`
        - `general.datasets`
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
    3 people authored Jul 18, 2024
    Commit 672a6f1
  3. server: use relative routes for static files in new UI (ggerganov#8552)

    * server: public: fix api_url on non-index pages
    
    * server: public: use relative routes for static files in new UI
    EZForever authored Jul 18, 2024
    Commit 0d2c732
  4. cmake : install all ggml public headers (ggerganov#8480)

    Co-authored-by: 65a <65a@65a.invalid>
    65a and 65a authored Jul 18, 2024
    Commit 705b7ec
  5. Commit a15ef8f

Commits on Jul 19, 2024

  1. Commit 3d0e436
  2. fix: typo of chatglm4 chat tmpl (ggerganov#8586)

    Signed-off-by: thxCode <thxcode0824@gmail.com>
    thxCode authored Jul 19, 2024
    Commit f299aa9
  3. ggml : add friendlier error message to fopen errors (ggerganov#8575)

    * Add additional error information when model files fail to load.
    
    * Adding additional error information to most instances of fopen.
    HanClinto authored Jul 19, 2024
    Commit b57eb9c
  4. readme : fix server badge

    ggerganov authored Jul 19, 2024
    Commit be0cfb4
  5. llama : bump max layers from 256 to 512 (ggerganov#8530)

    * llama : bump max layers from 256 to 512
    
    * llama : replace asserts with exceptions
    ggerganov authored Jul 19, 2024
    Commit d197545
  6. Commit 57b1d4f
  7. ggml : fix quant dot product with odd number of blocks (ggerganov#8549)

    * ggml : fix iq4_nl dot product with odd number of blocks
    
    * ggml : fix odd blocks for ARM_NEON (ggerganov#8556)
    
    * ggml : fix iq4_nl dot product with odd number of blocks
    
    * ggml : fix q4_1
    
    * ggml : fix q5_0
    
    * ggml : fix q5_1
    
    * ggml : fix iq4_nl metal
    
    ggml-ci
    
    * ggml : fix q4_0
    
    * ggml : fix q8_0
    
    ggml-ci
    
    * ggml : remove special Q4_0 code for first 2 blocks
    
    * ggml : fix sumf redefinition
    
    ---------
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    ---------
    
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    slaren and ggerganov authored Jul 19, 2024
    Commit 87e397d
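    The generic shape of these fixes, sketched on plain floats (the real kernels operate on quantized block structs with SIMD intrinsics):

    ```cpp
    #include <cstdio>

    // a wide path consumes blocks two at a time; a scalar tail handles the
    // leftover block when the count is odd, which the buggy paths missed
    float dot_blocks(const float * x, const float * y, int nb) {
        float sumf = 0.0f;
        int i = 0;
        for (; i + 1 < nb; i += 2) {   // "vectorized" path: pairs of blocks
            sumf += x[i] * y[i] + x[i + 1] * y[i + 1];
        }
        for (; i < nb; ++i) {          // tail: at most one remaining block
            sumf += x[i] * y[i];
        }
        return sumf;
    }

    int main() {
        const float x[3] = {1, 2, 3}, y[3] = {4, 5, 6};
        printf("%f\n", dot_blocks(x, y, 3));   // odd count exercises the tail
        return 0;
    }
    ```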

Commits on Jul 20, 2024

  1. gguf_dump.py: fix markdown kv array print (ggerganov#8588)

    * gguf_dump.py: fix markdown kv array print
    
    * Update gguf-py/scripts/gguf_dump.py
    
    Co-authored-by: compilade <git@compilade.net>
    
    * gguf_dump.py: refactor kv array string handling
    
    * gguf_dump.py: escape backticks inside of strings
    
    * gguf_dump.py: inline code markdown escape handler added
    
    >>> escape_markdown_inline_code("hello world")
    '`hello world`'
    >>> escape_markdown_inline_code("hello ` world")
    '``hello ` world``'
    
    * gguf_dump.py: handle edge case about backticks on start or end of a string
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    mofosyne and compilade authored Jul 20, 2024
    Commit c3776ca
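    The escaping rule from the doctest above, restated as a small C++ sketch (the real helper is Python code in gguf_dump.py, and it additionally handles strings that start or end with a backtick):

    ```cpp
    #include <string>

    // wrap in single backticks; if the text itself contains a backtick,
    // use a double-backtick fence instead, mirroring the doctest above
    std::string escape_markdown_inline_code(const std::string & text) {
        if (text.find('`') == std::string::npos) {
            return "`" + text + "`";
        }
        return "``" + text + "``";
    }
    ```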
  2. llama.swiftui: fix end of generation bug (ggerganov#8268)

    * fix continuing to generate blank lines after getting EOT token or EOS token from LLM
    
    * change variable name to is_done (variable name suggested by ggerganov)
    
    * minor : fix trailing whitespace
    
    * minor : add space
    
    ---------
    
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    ho2103 and ggerganov authored Jul 20, 2024
    Commit 69b9945
  3. llama : add support for Tekken pre-tokenizer (ggerganov#8579)

    * llama : Added support for Tekken pre-tokenizer (ggerganov#8577)
    
    Removed unneeded `vocab.tokenizer_clean_spaces` assignment
    
    * llama : fix order of pre-tokenizers
    
    * Tekken pre-tokenizer no longer uses clean_up_tokenization_spaces
    * Updated chkhsh for Tekken tokenizer
    
    ---------
    
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    m18coppola and ggerganov authored Jul 20, 2024
    Commit 9403622
  4. Commit 07283b1
  5. CUDA: MMQ code deduplication + iquant support (ggerganov#8495)

    * CUDA: MMQ code deduplication + iquant support
    
    * 1 less parallel job for CI build
    JohannesGaessler authored Jul 20, 2024
    Commit 69c487f

Commits on Jul 21, 2024

  1. convert_hf : fix Gemma v1 conversion (ggerganov#8597)

    * convert_hf : fix Gemma v1 conversion
    
    * convert_hf : allow renaming tokens, but with a warning
    
    * convert_hf : fix Gemma v1 not setting BOS and EOS tokens
    compilade authored Jul 21, 2024
    Commit c69c630
  2. gguf-py : fix some metadata name extraction edge cases (ggerganov#8591)

    * gguf-py : fix some metadata name extraction edge cases
    
    * convert_lora : use the lora dir for the model card path
    
    * gguf-py : more metadata edge cases fixes
    
    Multiple finetune versions are now joined together,
    and the removal of the basename annotation on trailing versions
    is more robust.
    
    * gguf-py : add more name metadata extraction tests
    
    * convert_lora : fix default filename
    
    The default filename was previously hardcoded.
    
    * convert_hf : Model.fname_out can no longer be None
    
    * gguf-py : do not use title case for naming convention
    
    Some models use acronyms in lowercase,
    which can't be title-cased like other words,
    so it's best to simply use the same case
    as in the original model name.
    
    Note that the size label still has an uppercased suffix
    to make it distinguishable from the context size of a finetune.
    compilade authored Jul 21, 2024
    Commit 328884f
  3. examples : Rewrite pydantic_models_to_grammar_examples.py (ggerganov#8493)
    
    Changes:
    
    - Move each example into its own function. This makes the code much
      easier to read and understand.
    - Make it easy to run only one test by commenting out function
      calls in main().
    - Make the output easy to parse by indenting the output for each example.
    - Add shebang and +x bit to make it clear it's an executable.
    - Make the host configurable via --host with a default 127.0.0.1:8080.
    - Make the code look in the tools list to call the registered tool,
      instead of hardcoding the returned values. This makes the code more
      copy-pastable.
    - Add error checking, so that the program exits 1 if the LLM didn't
      return the expected values. It's super useful for checking correctness.
    
    Testing:
    
    - Tested with Mistral-7B-Instruct-v0.3 in F16 and Q5_K_M and
      Meta-Llama-3-8B-Instruct in F16 and Q5_K_M.
      - I did not observe a failure even once in Mistral-7B-Instruct-v0.3.
      - Llama-3 failed about a third of the time in example_concurrent: it
        only returned one call instead of 3. Even for F16.
    
    Potential follow ups:
    
    - Do not fix the prompt encoding yet. Surprisingly it mostly works even
      if the prompt encoding is not model optimized.
    - Add chained answer and response.
    
    Test only change.
    maruel authored Jul 21, 2024
    Commit 22f281a
  4. Commit 45f2c19

Commits on Jul 22, 2024

  1. examples: fix android example cannot be generated continuously (ggerganov#8621)
    
    When generation ends, `completion_loop()` should return NULL, not an empty string (sketched below)
    devojony authored Jul 22, 2024
    Commit b7c11d3
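    The shape of the fix, sketched with a hypothetical helper (the real code is JNI glue in the Android example):

    ```cpp
    #include <string>

    // returning nullptr instead of "" gives the caller an unambiguous
    // end-of-generation signal, so it stops looping rather than printing
    // blank lines forever
    const char * next_piece(bool is_done, const std::string & piece) {
        if (is_done) {
            return nullptr;   // generation finished
        }
        return piece.c_str(); // caller owns 'piece', so the pointer stays valid
    }
    ```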
  2. Commit 04bab6b
  3. Commit 6281544
  4. Commit 50e0535
  5. tests : re-enable tokenizer tests (ggerganov#8611)

    * models : remove duplicated gpt-2 vocab
    
    * models : remove old stablelm vocab
    
    * tests : re-enable MPT tokenizer tests
    
    * tests : re-enable DeepSeek tokenizer tests
    
    * cmake : sort
    
    ggml-ci
    ggerganov authored Jul 22, 2024
    Commit e093dd2
  6. Commit 6f11a83
  7. *.py: Stylistic adjustments for python (ggerganov#8233)

    * Superfluous parens in conditionals were removed.
    * Unused args in function were removed.
    * Replaced unused `idx` var with `_`
    * Initializing file_format and format_version attributes
    * Renaming constant to capitals
    * Preventing redefinition of the `f` var
    
    Signed-off-by: Jiri Podivin <jpodivin@redhat.com>
    jpodivin authored Jul 22, 2024
    Commit 566daa5
  8. llama : add support for SmolLm pre-tokenizer (ggerganov#8609)

    * Adding SmolLM Pre Tokenizer
    
    * Update convert_hf_to_gguf_update.py
    
    Co-authored-by: compilade <git@compilade.net>
    
    * Update src/llama.cpp
    
    Co-authored-by: compilade <git@compilade.net>
    
    * handle regex
    
    * removed .inp and .out ggufs
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    Stillerman and compilade authored Jul 22, 2024
    Commit d94c6e0
  9. llama : fix codeshell support (ggerganov#8599)

    * llama : fix codeshell support
    
    * llama : move codeshell after smollm below to respect the enum order
    hankeke303 authored Jul 22, 2024
    Commit 081fe43

Commits on Jul 23, 2024

  1. Commit 063d99a
  2. contrib : clarify PR squashing + module names (ggerganov#8630)

    * contrib : clarify PR squashing
    
    * contrib : fix typo + add list of modules
    ggerganov authored Jul 23, 2024
    Commit e7e6487
  3. Allow all RDNA2 archs to use sdot4 intrinsic (ggerganov#8629)

    The check gating the use of `__builtin_amdgcn_sdot4` specifically checks for gfx1030. This causes a severe perf regression for anything gfx103? that's not gfx1030 and not using `HSA_OVERRIDE_GFX_VERSION` (if you've built ROCm to support it). We already have a generic RDNA2 define, let's use it.
    jeroen-mostert authored Jul 23, 2024
    Commit 46e4741
  4. Vulkan IQ4_NL Support (ggerganov#8613)

    * Fix Vulkan matmul tests compile errors
    
    * Add Vulkan IQ4_NL support
    
    * Fix Vulkan DeepSeek-Coder-V2-Lite MoE support
    0cc4m authored Jul 23, 2024
    Commit 751fcfc
  5. llama : move vocab, grammar and sampling into separate files (ggerganov#8508)
    
    * llama : move sampling code into llama-sampling
    
    ggml-ci
    
    * llama : move grammar code into llama-grammar
    
    ggml-ci
    
    * cont
    
    ggml-ci
    
    * cont : pre-fetch rules
    
    * cont
    
    ggml-ci
    
    * llama : deprecate llama_sample_grammar
    
    * llama : move tokenizers into llama-vocab
    
    ggml-ci
    
    * make : update llama.cpp deps [no ci]
    
    * llama : redirect external API to internal APIs
    
    ggml-ci
    
    * llama : suffix the internal APIs with "_impl"
    
    ggml-ci
    
    * llama : clean-up
    ggerganov authored Jul 23, 2024
    Commit 938943c
  6. sycl : Add support for non-release DPC++ & oneMKL (ggerganov#8644)

    * Update cmake to support nvidia hardware & open-source compiler
    ---------
    Signed-off-by: Joe Todd <joe.todd@codeplay.com>
    joeatodd authored Jul 23, 2024
    Commit 64cf50a
  7. Commit b841d07
  8. examples : Fix llama-export-lora example (ggerganov#8607)

    * fix export-lora example
    
    * add more logging
    
    * reject merging subset
    
    * better check
    
    * typo
    ngxson authored Jul 23, 2024
    commit de28008

Commits on Jul 24, 2024

  1. commit b115105
  2. commit 79167d9
  3. llama : fix llama_chat_format_single for mistral (ggerganov#8657)

    * fix `llama_chat_format_single` for mistral
    
    * fix typo
    
    * use printf
    ngxson authored Jul 24, 2024
    commit 96952e7
  4. commit 3a7ac53
  5. Build Llama SYCL Intel with static libs (ggerganov#8668)

    Ensure SYCL CI builds both static & dynamic libs for testing purposes
    
    Signed-off-by: Joe Todd <joe.todd@codeplay.com>
    joeatodd authored Jul 24, 2024
    commit f19bf99
  6. readme : update games list (ggerganov#8673)

    Added a link to a game I made that depends on llama
    MorganRO8 authored Jul 24, 2024
    commit 68504f0

Commits on Jul 25, 2024

  1. llama: use sliding window for phi3 (ggerganov#8627)

    * use sliding window for phi3
    
    * fix typo, "data_swa" -> "data"
    
    * [convert_hf_to_gguf.py] add phi3 sliding window
    FanShupei authored Jul 25, 2024
    commit 8a4bad5
  2. docs : Quantum -> Quantized (ggerganov#8666)

    * docfix: imatrix readme, quantum models -> quantized models.
    
    * docfix: server readme: quantum models -> quantized models.
    Ujjawal-K-Panchal authored Jul 25, 2024
    commit 4b0eff3
  3. examples : remove finetune and train-text-from-scratch (ggerganov#8669)
    
    * examples : remove finetune and train-text-from-scratch
    
    * fix build
    
    * update help message
    
    * fix small typo for export-lora
    ngxson authored Jul 25, 2024
    commit be6d7c0
  4. commit eddcb52
  5. [SYCL] fix multi-gpu issue on sycl (ggerganov#8554)

    
    ---------
    
    Signed-off-by: Chen Xi <xi2chen@intel.com>
    Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com>
    ClarkChin08 and airMeng authored Jul 25, 2024
    commit ed67bcb
  6. commit 88954f7
  7. ggml : fix build on Windows with Snapdragon X (ggerganov#8531)

    * Improvements for Windows with Snapdragon X
    
    * Revert "Improvements for Windows with Snapdragon X"
    
    This reverts commit bf21397.
    
    * Improvements for Windows with Snapdragon X
    
    * WOA build clarifications
    
    * Windows on ARM build clarifications
    
    * cmake build for Windows clarifications
    
    * Update docs/build.md
    
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    
    ---------
    
    Co-authored-by: AndreasKunar <andreaskmsn.com>
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    AndreasKunar and ggerganov authored Jul 25, 2024
    commit bf5a81d
  8. commit 4226a8d
  9. ggml: handle ggml_init failure to fix NULL pointer deref (ggerganov#8692)
    
    `ggml_init` can fail if no unused context is found. In that case, a NULL-pointer deref will happen later in the code during a call to `ggml_set_no_alloc`.
    
    This fixes it by bailing out if no context is found.
    DavidKorczynski authored Jul 25, 2024
    commit 49ce0ab
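    A minimal sketch of the defensive pattern this fix applies, assuming only the public ggml C API (the buffer size here is an arbitrary example):

    ```cpp
    #include "ggml.h"
    #include <cstdio>

    int main() {
        struct ggml_init_params params = {
            /*.mem_size   =*/ 16 * 1024 * 1024, // arbitrary example size
            /*.mem_buffer =*/ nullptr,
            /*.no_alloc   =*/ false,
        };

        // ggml_init can fail (e.g. when no unused context is found),
        // so bail out instead of letting a later call dereference NULL
        struct ggml_context * ctx = ggml_init(params);
        if (ctx == nullptr) {
            fprintf(stderr, "ggml_init failed\n");
            return 1;
        }

        // ... build tensors and graphs here ...

        ggml_free(ctx);
        return 0;
    }
    ```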
  10. commit 41cd47c
  11. server : add Speech Recognition & Synthesis to UI (ggerganov#8679)

    * server : add Speech Recognition & Synthesis to UI
    
    * server : add Speech Recognition & Synthesis to UI (fixes)
    ElYaiko authored Jul 25, 2024
    commit 01aec4a

Commits on Jul 26, 2024

  1. llama : fix order of parameters (ggerganov#8706)

    foldl and Judd authored Jul 26, 2024
    commit 01245f5

Commits on Jul 27, 2024

  1. ggml : reduce hash table reset cost (ggerganov#8698)

    * ggml : reduce hash table reset cost
    
    * fix unreachable code warnings after GGML_ASSERT(false)
    
    * GGML_ASSERT(false) -> GGML_ABORT("fatal error")
    
    * GGML_ABORT use format string
    slaren authored Jul 27, 2024
    commit 2b1f616
  2. cann: Fix Multi-NPU execution error (ggerganov#8710)

    * cann: fix multi-npu exec error
    
    * cann: update comment for ggml_backend_cann_supports_buft
    wangshuai09 authored Jul 27, 2024
    commit bfb4c74
  3. common : add --no-warmup option for main/llama-cli (ggerganov#8712)

    This commit adds a --no-warmup option for llama-cli.
    
    The motivation for this is that it can be convenient to skip the
    warmup llama_decode call when debugging.
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    danbev authored Jul 27, 2024
    commit 9d03d08
  4. llama : add function for model-based max number of graph nodes (ggerganov#8622)
    
    * llama : model-based max number of graph nodes
    
    ggml-ci
    
    * llama : disable 405B max_nodes path due to lack of complaints
    
    ggml-ci
    ggerganov authored Jul 27, 2024
    commit 92090ec
  5. llama : add support for llama 3.1 rope scaling factors (ggerganov#8676)

    * Add llama 3.1 rope scaling factors to llama conversion and inference
    
    This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope operation, improving results for context windows above 8192. (A sketch of the factor computation follows this entry.)
    
    * Update convert_hf_to_gguf.py
    
    Co-authored-by: compilade <git@compilade.net>
    
    * address comments
    
    * address comments
    
    * Update src/llama.cpp
    
    Co-authored-by: compilade <git@compilade.net>
    
    * Update convert_hf_to_gguf.py
    
    Co-authored-by: compilade <git@compilade.net>
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    jmorganca and compilade authored Jul 27, 2024
    commit b5e9546
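    To make the scheme concrete, here is a hedged sketch of how such per-frequency factors can be generated at conversion time. The constants (scale factor 8, low/high frequency factors 1 and 4, original context 8192) come from the Llama 3.1 configuration; the function name and layout are illustrative, not the converter's actual code:

    ```cpp
    #include <cmath>
    #include <vector>

    // Illustrative only: computes one rope factor per pair of rotary dims.
    // Inside the rope op each frequency is divided by its factor, so 1.0f
    // means "leave this dimension untouched".
    static std::vector<float> make_rope_factors(int n_rot, float freq_base) {
        const float pi               = 3.14159265358979f;
        const float scale_factor     = 8.0f;    // from the Llama 3.1 config
        const float low_freq_factor  = 1.0f;
        const float high_freq_factor = 4.0f;
        const float old_ctx          = 8192.0f; // original training context

        const float low_freq_wavelen  = old_ctx / low_freq_factor;
        const float high_freq_wavelen = old_ctx / high_freq_factor;

        std::vector<float> factors;
        for (int i = 0; i < n_rot; i += 2) {
            const float freq    = 1.0f / std::pow(freq_base, (float) i / (float) n_rot);
            const float wavelen = 2.0f * pi / freq;
            if (wavelen < high_freq_wavelen) {
                factors.push_back(1.0f);          // high-frequency dims stay as-is
            } else if (wavelen > low_freq_wavelen) {
                factors.push_back(scale_factor);  // low-frequency dims are fully rescaled
            } else {
                // smooth interpolation between the two regimes
                const float smooth = (old_ctx / wavelen - low_freq_factor) /
                                     (high_freq_factor - low_freq_factor);
                factors.push_back(1.0f / ((1.0f - smooth) / scale_factor + smooth));
            }
        }
        return factors;
    }
    ```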
  6. ggml : remove unnecessary UNUSED macro call (ggml/880)

    This commit removes an UNUSED macro call that is not needed as the
    variable n0 is used in the code and will not produce a warning.
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    danbev authored and ggerganov committed Jul 27, 2024
    commit c12b6e8
  7. commit d2b851b
  8. vulkan : initialize vk_buffer_struct members to VK_NULL_HANDLE (ggml/893)
    
    This prevents invalid frees when destroying a partially initialized
    vk_buffer_struct. For example, this could happen in ggml_vk_create_buffer
    when running out of device memory.
    
    Co-authored-by: Tony Wasserka <neobrain@users.noreply.github.com>
    2 people authored and ggerganov committed Jul 27, 2024
    commit 203b7f1
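    A small sketch of the pattern this fix relies on; the member names are illustrative, not the actual ggml-vulkan definition. With the handles defaulted to VK_NULL_HANDLE, a cleanup path that runs after a failed allocation can call vkDestroyBuffer/vkFreeMemory unconditionally, since both accept VK_NULL_HANDLE as a no-op:

    ```cpp
    #include <vulkan/vulkan.h>
    #include <cstddef>

    // Default member initializers guarantee that a partially constructed
    // buffer never carries garbage handles, so destroying it is safe.
    struct vk_buffer_struct {
        VkBuffer       buffer        = VK_NULL_HANDLE;
        VkDeviceMemory device_memory = VK_NULL_HANDLE;
        size_t         size          = 0;
    };
    ```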
  9. ggml: add support for float16 input tensors in pooling operations (ggml/895)
    
    * Add support for float16 tensors in 1d pooling operations
    
    * Add support for float16 input tensors in 2d pooling operations
    
    * code cleanup
    
    remove unnecessary casting during srow ptr initialization
    
    ---------
    
    Co-authored-by: vanaka11 <vanaka1189@gmail.com>
    2 people authored and ggerganov committed Jul 27, 2024
    commit 9f77d89
  10. ggml : loop tiling optimizations for scalar path (ggml/898)

    Apply a loop tiling technique to the generic path, which provides
    performance upside for ISAs with enough registers to take advantage
    of it. Also helps the compiler optimize this path.
    heshpdx authored and ggerganov committed Jul 27, 2024
    commit a05ca93
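    As an aside, a generic sketch of the technique (not the actual ggml kernel): tiling the inner loop over a fixed number of independent accumulators gives register-rich ISAs more instruction-level parallelism and makes the scalar path easier for the compiler to unroll and schedule.

    ```cpp
    // Illustrative loop tiling on a dot-product-style loop.
    static float dot_tiled(const float * x, const float * y, int n) {
        const int TILE = 4;
        float acc[TILE] = {0.0f, 0.0f, 0.0f, 0.0f}; // independent accumulators
        int i = 0;
        for (; i + TILE <= n; i += TILE) {
            for (int t = 0; t < TILE; ++t) {
                acc[t] += x[i + t] * y[i + t];
            }
        }
        float sum = acc[0] + acc[1] + acc[2] + acc[3];
        for (; i < n; ++i) { // scalar remainder
            sum += x[i] * y[i];
        }
        return sum;
    }
    ```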
  11. sync : ggml

    ggml-ci
    ggerganov committed Jul 27, 2024
    commit ae7985c
  12. ggml : add missing semicolon (#0)

    ggml-ci
    ggerganov committed Jul 27, 2024
    commit 345c8c0
  13. commit 56f20aa
  14. commit 5e2727f
  15. feat: Support Moore Threads GPU (ggerganov#8383)

    * Update doc for MUSA
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    
    * Add GGML_MUSA in Makefile
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    
    * Add GGML_MUSA in CMake
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    
    * CUDA => MUSA
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    
    * MUSA adds support for __vsubss4
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    
    * Fix CI build failure
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    
    ---------
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    yeahdongcn authored Jul 27, 2024
    commit e54c35e

Commits on Jul 28, 2024

  1. llama : refactor session file management (ggerganov#8699)

    * llama : refactor session file management
    
    * llama : saving and restoring state checks for overflow
    
    The size of the buffers should now be given to the functions working with them; otherwise a truncated file could cause out-of-bounds reads (see the sketch after this entry).
    
    * llama : stream from session file instead of copying into a big buffer
    
    Loading session files should no longer cause a memory usage spike.
    
    * llama : llama_state_get_size returns the actual size instead of max
    
    This is a breaking change, but makes that function *much* easier
    to keep up to date, and it also makes it reflect the behavior
    of llama_state_seq_get_size.
    
    * llama : share code between whole and seq_id-specific state saving
    
    Both session file types now use a more similar format.
    
    * llama : no longer store all hparams in session files
    
    Instead, the model arch name is stored.
    The layer count and the embedding dimensions of the KV cache
    are still verified when loading.
    Storing all the hparams is not necessary.
    
    * llama : fix uint64_t format type
    
    * llama : various integer type cast and format string fixes
    
    Some platforms use "%lu" and others "%llu" for uint64_t.
    Not sure how to handle that portably, so the values are cast to size_t when displaying errors.
    
    * llama : remove _context suffix for llama_data_context
    
    * llama : fix session file loading
    
    llama_state_get_size cannot be used to get the max size anymore.
    
    * llama : more graceful error handling of invalid session files
    
    * llama : remove LLAMA_MAX_RNG_STATE
    
    It's no longer necessary to limit the size of the RNG state,
    because the max size of session files is not estimated anymore.
    
    * llama : cast seq_id in comparison with unsigned n_seq_max
    compilade authored Jul 28, 2024
    commit 4c676c8
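    For readers of the API change, a usage sketch under the assumption that the post-refactor `llama_state_get_data`/`llama_state_set_data` take an explicit buffer size (check llama.h for the authoritative prototypes):

    ```cpp
    #include "llama.h"
    #include <cstdint>
    #include <vector>

    // Save: llama_state_get_size now reports the actual state size,
    // not an upper bound, so the buffer can be sized exactly.
    static std::vector<uint8_t> save_state(llama_context * ctx) {
        std::vector<uint8_t> buf(llama_state_get_size(ctx));
        const size_t written = llama_state_get_data(ctx, buf.data(), buf.size());
        buf.resize(written);
        return buf;
    }

    // Load: passing the real buffer size lets the library detect a
    // truncated session instead of reading out of bounds.
    static bool load_state(llama_context * ctx, const std::vector<uint8_t> & buf) {
        return llama_state_set_data(ctx, buf.data(), buf.size()) != 0;
    }
    ```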
  2. chore : Fix vulkan related compiler warnings, add help text, improve CLI options (ggerganov#8477)
    
    * chore: Fix compiler warnings, add help text, improve CLI options
    
    * Add prototypes for function definitions
    * Invert logic of --no-clean option to be more intuitive
    * Provide a new help prompt with clear instructions
    
    * chore : Add ignore rule for vulkan shader generator
    
    Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
    
    * Update ggml/src/vulkan-shaders/vulkan-shaders-gen.cpp
    
    Co-authored-by: 0cc4m <picard12@live.de>
    
    * chore : Remove void and apply C++ style empty parameters
    
    * chore : Remove void and apply C++ style empty parameters
    
    ---------
    
    Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
    Co-authored-by: 0cc4m <picard12@live.de>
    teleprint-me and 0cc4m authored Jul 28, 2024
    commit 4730fac
  3. commit 6eeaeba

Commits on Jul 29, 2024

  1. commit 0832de7
  2. cuda : organize vendor-specific headers into vendors directory (ggerganov#8746)
    
    Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
    yeahdongcn authored Jul 29, 2024
    commit 439b3fc
  3. ggml: bugfix: fix agnostic handling of inactive elements in the RISC-V vector code (ggerganov#8748)
    
    In this code, we want the inactive elements to retain the value they previously held when mask[i] is false, so we should use the undisturbed policy. With the default agnostic policy of the RVV intrinsics, these values may either be kept or be overwritten with 1s.
    
    Co-authored-by: carter.li <carter.li@starfivetech.com>
    CarterLi999 and carter.li authored Jul 29, 2024
    commit 75af08c

Commits on Jul 30, 2024

  1. [SYCL] Add TIMESTEP_EMBEDDING OP (ggerganov#8707)

    Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
    zhentaoyu authored Jul 30, 2024
    commit c887d8b
  2. commit 6e2b600
  3. commit 140074b
  4. added android implementation of ggml_print_backtrace_symbols (ggerganov#8751)
    
    * added android implementation of ggml_print_backtrace_symbols
    
    * Update ggml/src/ggml.c
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    * Update ggml/src/ggml.c
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    * Update ggml/src/ggml.c
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    * Update ggml/src/ggml.c
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    * Update ggml/src/ggml.c
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    ---------
    
    Co-authored-by: slaren <slarengh@gmail.com>
    l3utterfly and slaren authored Jul 30, 2024
    commit 7c27a19
  5. py: add_array() will not add to kv store if value is an empty array (ggerganov#8774)
    
    * gguf_writer.py: add_array() should not add to kv store if empty
    
    * Apply suggestions from code review
    
    I was wondering if there was a specific reason for `if val` but good to hear we can safely use `len(val) == 0`
    
    Co-authored-by: compilade <git@compilade.net>
    
    ---------
    
    Co-authored-by: compilade <git@compilade.net>
    mofosyne and compilade authored Jul 30, 2024
    commit 7e72aa7
  6. nix: cuda: rely on propagatedBuildInputs (ggerganov#8772)

    Listing individual outputs is no longer necessary to reduce the runtime closure size after NixOS/nixpkgs#323056.
    SomeoneSerge authored Jul 30, 2024
    commit 268c566

Commits on Jul 31, 2024

  1. commit 44d28dd
  2. Adding Gemma 2 2B configs (ggerganov#8784)

    * Adding Gemma 2 2B configs
    
    Updates to Q scaling and Gemma 2 model sizes to match v2 2B model.
    
    * Update src/llama.cpp
    
    Co-authored-by: slaren <slarengh@gmail.com>
    
    ---------
    
    Co-authored-by: slaren <slarengh@gmail.com>
    pculliton and slaren authored Jul 31, 2024
    commit 398ede5
  3. Build: Fix potential race condition (ggerganov#8781)

    * Fix potential race condition as pointed out by @fairydreaming in ggerganov#8776
    
    * Reference the .o rather than rebuilding every time.
    
    * Adding in CXXFLAGS and LDFLAGS
    
    * Removing unnecessary linker flags.
    HanClinto authored Jul 31, 2024
    commit ed9d285
  4. commit afbbcf3

Commits on Aug 1, 2024

  1. commit c8a0090
  2. cuda : fix dmmv cols requirement to 2*GGML_CUDA_DMMV_X (ggerganov#8800)

    * cuda : fix dmmv cols requirement to 2*GGML_CUDA_DMMV_X
    
    * update asserts
    
    * only use dmmv for supported types
    
    * add test
    slaren authored Aug 1, 2024
    commit 7a11eb3
  3. Build: Only include execinfo.h on linux systems that support it (ggerganov#8783)
    
    * Only enable backtrace on GLIBC linux systems
    
    * fix missing file from copy
    
    * use glibc macro instead of defining a custom one
    acon96 authored Aug 1, 2024
    commit b7a08fd
  4. ggml-cuda: Adding support for unified memory (ggerganov#8035)

    * Adding support for unified memory
    
    * adding again the documentation about unified memory
    
    * refactoring: Moved the unified memory code in the correct location.
    
    * Fixed compilation error when using hipblas
    
    * cleaning up the documentation
    
    * Updating the documentation
    
    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
    
    * adding one more case where the PR should not be enabled
    
    ---------
    
    Co-authored-by: matteo serva <matteo.serva@gmail.com>
    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
    3 people authored Aug 1, 2024
    commit afbb4c1

Commits on Aug 2, 2024

  1. commit 0fbbd88
  2. cann: Fix ggml_cann_im2col for 1D im2col (ggerganov#8819)

    * fix ggml_cann_im2col for 1D im2col
    
    * fix build warning
    MengqingCao authored Aug 2, 2024
    commit e09a800
  3. Fix conversion of unnormalized BF16->BF16 weights (ggerganov#7843)

    * add truncate_bf16
    
    * truncate intermediate fp32 if converting bf16 to bf16
    
    * fix masking in __compute_fp32_to_bf16
    
    * np.int16 no longer used
    
    * missing cast and additional numpy 2.x fix
    
    * ggml-impl : do not flush bf16 subnormals to zero
    
    * ggml : add reference fp32 to bf16 conversion
    
    The fast version is no longer equivalent for all platforms
    because of the handling of subnormal values.
    
    * gguf-py : remove flush to zero for bf16 subnormals
    
    * gguf-py : remove float32 truncation to bf16
    
    Rounding achieves the same thing in the cases where this was used.
    
    * missed prototype update in merge
    
    * merge cleanup
    
    ---------
    
    Co-authored-by: Francis Couture-Harpin <git@compilade.net>
    CISC and compilade authored Aug 2, 2024
    commit b72c20b
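    A short sketch of the rounding referred to above (illustrative; the real converter also has to consider NaN payloads, which this naive version can corrupt): round-to-nearest-even fp32 → bf16 subsumes what the removed truncation path did.

    ```cpp
    #include <cstdint>
    #include <cstring>

    // Round-to-nearest-even fp32 -> bf16: add a bias derived from the
    // lowest kept bit, then take the high 16 bits of the float encoding.
    static uint16_t fp32_to_bf16_rne(float f) {
        uint32_t bits;
        std::memcpy(&bits, &f, sizeof(bits));
        bits += 0x7fff + ((bits >> 16) & 1);
        return (uint16_t) (bits >> 16);
    }
    ```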

Commits on Aug 3, 2024

  1. ggml : reading the runtime sve config of the cpu (ggerganov#8709)

    * ggml : reading the runtime sve config of the cpu
    
    * change to one time init to prevent performance drop
    
    * prefix variable to avoid possible conflicts
    
    * revert xxhash fix and add brackets
    
    ---------
    
    Co-authored-by: domke <673751-domke@users.noreply.gitlab.com>
    jdomke and domke authored Aug 3, 2024
    commit 76614f3

Commits on Aug 4, 2024

  1. commit 4b77ea9
  2. commit 01aae2b
  3. batched-bench : handle empty -npl (ggerganov#8839)

    * [example] batched-bench "segmentation fault"
    
    When `llama-batched-bench` is invoked _without_ setting `-npl`, "number
    of parallel prompts", it segfaults.
    
    The segfault is caused by invoking `max_element()` on a zero-length vector, `n_pl`.
    
    This commit addresses that by first checking to see if the number of
    parallel prompts is zero, and if so sets the maximum sequence size to 1;
    otherwise, sets it to the original, the result of `max_element()`.
    
    Fixes the following crash, observed when running `lldb build/bin/llama-batched-bench -- -m models/Meta-Llama-3-8B.gguf`:
    
    ```
    * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
        frame #0: 0x000000010000366c llama-batched-bench`main(argc=3, argv=0x000000016fdff268) at batched-bench.cpp:72:28
       69  	    llama_context_params ctx_params = llama_context_params_from_gpt_params(params);
       70
       71  	    // ensure enough sequences are available
    -> 72  	    ctx_params.n_seq_max = *std::max_element(n_pl.begin(), n_pl.end());
    ```
    
    * Update examples/batched-bench/batched-bench.cpp
    
    Co-authored-by: compilade <git@compilade.net>
    
    ---------
    
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    Co-authored-by: compilade <git@compilade.net>
    3 people authored Aug 4, 2024
    commit ecf6b7f
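    The guard this commit describes boils down to one expression; a sketch, with variable names following the backtrace above:

    ```cpp
    #include <algorithm>
    #include <vector>

    int main() {
        std::vector<int> n_pl; // parallel prompt counts; empty when -npl is not given

        // ensure enough sequences are available: fall back to 1 instead of
        // calling std::max_element on an empty vector (the segfault above)
        const int n_seq_max = n_pl.empty() ? 1 : *std::max_element(n_pl.begin(), n_pl.end());

        return n_seq_max >= 1 ? 0 : 1;
    }
    ```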
  4. Server: Don't ignore llama.cpp params (ggerganov#8754)

    * Don't ignore llama.cpp params
    
    * Add fallback for max_tokens
    ardfork authored Aug 4, 2024
    commit 978ba3d
  5. commit 0d6fb52

Commits on Aug 5, 2024

  1. commit c02b0a8
  2. ggml : move c parameter comment to ggml_rope_ext (ggml/901)

    This commit moves the comment for the c parameter from ggml_rope to
    ggml_rope_ext. The comment is currently incorrect as ggml_rope does not
    have a c parameter (freq_factors tensor).
    
    Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
    danbev authored and ggerganov committed Aug 5, 2024
    commit 655858a
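    For orientation, a hedged sketch of a call site: the parameter order and the placeholder values below are assumptions based on the ggml headers of this period, not authoritative. The point is that the third tensor argument is the `c`/freq_factors tensor that plain ggml_rope lacks.

    ```cpp
    #include "ggml.h"

    // freq_factors (the `c` parameter discussed above) may be NULL when
    // the model provides no per-frequency factors.
    static struct ggml_tensor * apply_rope(
            struct ggml_context * ctx,
            struct ggml_tensor  * q,            // tensor to rotate
            struct ggml_tensor  * pos,          // per-token positions (I32)
            struct ggml_tensor  * freq_factors, // `c`: optional frequency factors
            int n_rot, int rope_mode, int n_ctx_orig) {
        return ggml_rope_ext(ctx, q, pos, freq_factors,
                n_rot, rope_mode, n_ctx_orig,
                /*freq_base  =*/ 10000.0f, /*freq_scale =*/ 1.0f,
                /*ext_factor =*/ 0.0f,     /*attn_factor=*/ 1.0f,
                /*beta_fast  =*/ 32.0f,    /*beta_slow  =*/ 1.0f);
    }
    ```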
  3. vulkan : implement Stable Diffusion operators (ggml/904)

    * Fix Vulkan repeat op
    
    * Implement Vulkan concat op
    
    * Delete old Vulkan shader generator
    
    * Implement Vulkan im2col op
    
    * Implement Vulkan unary gelu_quick op
    
    * Implement Vulkan group_norm op
    
    * Implement Vulkan timestep_embedding op
    
    * Implement Vulkan upscale op
    
    * Fix Vulkan vk_context tensor extra index issue
    
    * Fix Vulkan matmul shader parameter bug
    
    * Properly fix Vulkan matmul shader parameter bug
    
    * Add Vulkan ADD f16 + f32 -> f16 operator support
    
    * Implement Vulkan tanh op
    
    * Fix Vulkan group count too large Validation error on non-Nvidia GPUs
    
    * Throw error when too much memory is requested
    
    * Fix another Vulkan group count too large Validation error on non-Nvidia GPUs
    
    * Fix matmul MMQ condition
    
    * Implement Vulkan pad op
    
    * Fix Vulkan crash when tensor is used multiple times in a compute graph
    
    * Add Vulkan CONCAT f16 + f16 -> f16 op
    
    * Add Vulkan LEAKY_RELU op
    0cc4m authored and ggerganov committed Aug 5, 2024
    commit a3738b2
  4. sync : ggml

    ggml-ci
    ggerganov committed Aug 5, 2024
    commit 5587e57
  5. vulkan : fix Quantized Mat-Vec Mul on AMD GPUs for ncols < 64 (ggerganov#8855)
    
    * Fix Vulkan mul mat vec invalid results when ncols < warp size
    
    * Only run backend ops mul mat vec block size test if block size not already covered
    0cc4m authored Aug 5, 2024
    commit 064cdc2
  6. commit f1ea514
  7. commit 400ae6f
  8. cmake: fix paths for vulkan shaders compilation on Windows (ggerganov#8573)
    
    * Vulkan-shaders: attempt fix compilation on windows
    
    * fix mismatched parenthesis
    stduhpf authored Aug 5, 2024
    commit e31a4f6
  9. Stop the generation when <|eom_id|> token is encountered - needed for Llama 3.1 tool call support (ggerganov#8858)
    
    * gguf-py, llama : add constants and methods related to Llama-3.1 <|eom_id|> token
    
    * llama : find Llama-3.1 <|eom_id|> token id during vocab loading
    
    * llama-vocab : add Llama-3.1 <|eom_id|> token to the set of tokens stopping the generation
    
    ---------
    
    Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
    fairydreaming and sszymczy authored Aug 5, 2024
    commit d3f0c71
  10. py: Add more authorship metadata from model card (ggerganov#8810)

    * py: add more authorship metadata from model card
    
    * fixup! py: add more authorship metadata from model card
    mofosyne authored Aug 5, 2024
    commit 1ef14b3
  11. ggml : fix overflows in elu function (ggerganov#8866)

    It's helpful to use expm1f(x), because expf(x)-1 will result in overflow
    for 25% of single-precision floating point numbers.
    jart authored Aug 5, 2024
    commit b9dfc25
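    A one-line sketch of the change for reference, assuming a standalone scalar helper rather than the actual ggml kernel:

    ```cpp
    #include <cmath>

    // ELU with expm1f: expm1f(x) computes exp(x) - 1 directly, staying
    // accurate (and well-behaved, per the commit message) where the
    // naive expf(x) - 1.0f formulation misbehaves.
    static float elu_f32(float x) {
        return x > 0.0f ? x : expm1f(x);
    }
    ```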
  12. readme : add ramalama to the list of available UIs (ggerganov#8811)

    ramalama is a repo-agnostic, boring CLI tool that supports pulling from ollama, huggingface and oci registries.
    
    Signed-off-by: Eric Curtin <ecurtin@redhat.com>
    ericcurtin authored Aug 5, 2024
    commit b42978e
  13. commit bc0f887
  14. common : Changed tuple to struct (TODO fix) (ggerganov#8823)

    * common : Changed tuple to struct (TODO fix)
    
    Use struct `llama_init_result` to replace the previous
    std::tuple<struct llama_model *, struct llama_context *>
    
    * delete llama_init_default_params()
    
    * delete the extra whitespace
    Septa2112 authored Aug 5, 2024
    commit 0a4ce78

Commits on Aug 6, 2024

  1. commit d4ff847
  2. [CANN]: Fix ggml_backend_cann_buffer_get_tensor (ggerganov#8871)

    * cann: fix ggml_backend_cann_buffer_get_tensor
    
     1. fix data ptr offset
     2. enable the acquisition of incomplete tensors
    
    * fix backend cann set_tensor
    MengqingCao authored Aug 6, 2024
    commit c21a896
  3. convert : add support for XLMRoberta embedding models (ggerganov#8658)

    * add conversion for bge-m3; small fix in unigram tokenizer
    
    * clean up and simplify XLMRoberta conversion
    iamlemec authored Aug 6, 2024
    commit cdd1889
  4. ggml : add epsilon as a parameter for group_norm (ggerganov#8818)

    Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
    MollySophia authored Aug 6, 2024
    commit 2d5dd7b
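    Sketch of the resulting call shape, under the assumption that the new prototype simply appends the epsilon (consult ggml.h for the real signature):

    ```cpp
    #include "ggml.h"

    // eps used to be hard-coded inside the kernel; after this change the
    // caller supplies it, e.g. a model-specific value such as 1e-6f.
    static struct ggml_tensor * apply_group_norm(
            struct ggml_context * ctx,
            struct ggml_tensor  * a,
            int                   n_groups,
            float                 eps) {
        return ggml_group_norm(ctx, a, n_groups, eps);
    }
    ```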
  5. commit 0bf16de
  6. [Vulkan] Fix compilation of vulkan-shaders-gen on w64devkit after `e31a4f6` (ggerganov#8880)
    
    * Fix compilation issue in `vulkan-shaders-gen`
    
    ggerganov@e31a4f6 broke compilation on w64devkit. Including `<algorithm>` seems to fix that.
    
    * Guard it under `#ifdef _WIN32`
    MaggotHATE authored Aug 6, 2024
    commit efda90c
  7. cmake : Link vulkan-shaders-gen with pthreads (ggerganov#8835)

    When using CMake to build with Vulkan support, compiling
    vulkan-shaders-gen fails due to missing a CMakeLists.txt specification
    to link vulkan-shaders-gen with the threading library, resulting in the
    following error.
    
        [5/172] Linking CXX executable bin/vulkan-shaders-gen
        FAILED: bin/vulkan-shaders-gen
        : && /usr/bin/c++ ggml/src/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o -o bin/vulkan-shaders-gen   && :
        ld: error: undefined symbol: pthread_create
        >>> referenced by vulkan-shaders-gen.cpp
        >>>               ggml/src/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o:(std::__1::__libcpp_thread_create[abi:se180100](pthread**,
        >>>               void* (*)(void*), void*))
        c++: error: linker command failed with exit code 1 (use -v to see invocation)
        [6/172] Generating build details from Git
        -- Found Git: /usr/local/bin/git (found version "2.45.2")
        ninja: build stopped: subcommand failed.
    
    Add the CMakeLists.txt specification to link vulkan-shaders-gen with the
    threading library and fix the above error.
    
    Fixes ggerganov#8834
    Patater authored Aug 6, 2024
    commit db20f50
  8. simple : update name of executable to llama-simple (ggerganov#8885)

    This commit updates the name of the executable in README.md from
    `simple` to `llama-simple`.
    danbev authored Aug 6, 2024
    commit 5f4dcb1
  9. commit 641f5dd
  10. server : add lora hotswap endpoint (WIP) (ggerganov#8857)

    * server : add lora hotswap endpoint
    
    * handle lora_no_apply
    
    * fix build
    
    * update docs
    
    * clean up struct def
    
    * fix build
    
    * add LoRA test
    
    * fix style
    ngxson authored Aug 6, 2024
    commit 1e6f655
  11. commit 3195854
  12. quantize : update usage comment in quantize.cpp (ggerganov#8889)

    This commit updates the usage comment in quantize.cpp to reflect the
    new name of the executable, which is llama-quantize.
    danbev authored Aug 6, 2024
    commit 725e3d9

Commits on Aug 7, 2024

  1. llama-bench : add support for getting cpu info on Windows (ggerganov#8824)
    
    * Add support for getting cpu info on Windows for llama_bench
    
    * refactor
    
    ---------
    
    Co-authored-by: slaren <slarengh@gmail.com>
    kylo5aby and slaren authored Aug 7, 2024
    commit 506122d
  2. commit a8dbc6f
  3. [SYCL] Updated SYCL device filtering (ggerganov#8901)

    * Updated device filter to depend on default_selector (fixes non-intel device issues)
    * Small related update to example/sycl Readme
    OuadiElfarouki authored Aug 7, 2024
    commit 0478174
  4. ggml-backend : fix async copy from CPU (ggerganov#8897)

    * ggml-backend : fix async copy from CPU
    
    * cuda : more reliable async copy, fix stream used when the devices are the same
    slaren authored Aug 7, 2024
    commit be55695
  5. make : use C compiler to build metal embed object (ggerganov#8899)

    * make : use C compiler to build metal embed object
    
    * use rm + rmdir to avoid -r flag in rm
    slaren authored Aug 7, 2024
    commit 15fa07a

Commits on Aug 8, 2024

  1. make : clean llamafile objects (ggerganov#8923)

    `ggml/src/llamafile/sgemm.o` was not deleted on `make clean`
    DrDub authored Aug 8, 2024
    commit ebd541a
  2. commit 85fca8d
  3. metal : fix struct name (ggml/912)

    ggml-ci
    ggerganov committed Aug 8, 2024
    commit 5b33ea1
  4. commit f93d49a
  5. sync : ggml

    ggerganov committed Aug 8, 2024
    commit e44a561
  6. commit 366d486
  7. commit afd27f0
  8. gguf-py : simplify support for quant types (ggerganov#8838)

    * gguf-py : use classes for quants
    
    * convert_hf : simplify internal quantization type selection
    
    * gguf-py : fix flake8 lint
    
    * gguf-py : fix BF16 numpy view type
    
    * gguf-py : remove LlamaFileTypeMap
    
    Too specific to 'llama.cpp', and would be a maintenance burden
    to keep up to date.
    
    * gguf-py : add generic quantize and dequantize functions
    
    The quant classes no longer need to be known,
    only the target or the source type,
    for 'quantize' and 'dequantize', respectively.
    compilade authored Aug 8, 2024
    commit 3a14e00

Commits on Aug 9, 2024

  1. llama : reduce useless copies when saving session (ggerganov#8916)

    * llama : avoid useless copies in dummy session writer
    
    * llama : avoid double tensor copy when saving session to buffer
    compilade authored Aug 9, 2024
    commit 345a686
  2. commit daef3ab
  3. commit 6f6496b
  4. embedding : add --pooling option to README.md [no ci] (ggerganov#8934)

    This commit adds the `--pooling` option to the README.md file in the
    `examples/embedding` directory.
    
    The motivation for adding this option is that currently if the model
    used does not specify a pooling type the embedding example will fail
    with the following error message:
    ```console
    main: error: pooling type NONE not supported
    ```
    
    This commit also updates the name of the executable in the examples
    section.
    danbev authored Aug 9, 2024
    commit 5b2c04f
  5. whisper : use vulkan as gpu backend when available (whisper/2302)

    * ggml: use vulkan as gpu backend when available
    
    Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
    
    * whisper: enable using vk as default buffer type
    
    Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
    
    ---------
    
    Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
    mstephenson6 authored and ggerganov committed Aug 9, 2024
    commit 70c0ea3
  6. sync : ggml

    ggerganov committed Aug 9, 2024
    commit 4305b57
  7. llava : support MiniCPM-V-2.5 (ggerganov#7599)

    * init
    
    * rename
    
    * add run android for termux in readme
    
    * add android readme
    
    * add instructions in readme
    
    * change name in readme
    
    * Update README.md
    
    * fixed line
    
    * add result in readme
    
    * random pos_embed
    
    * add positions index
    
    * change for ollama
    
    * change for ollama
    
    * better pos_embed in clip
    
    * support ollama
    
    * update cmakelist
    
    * update cmakelist
    
    * rename wrapper
    
    * clear code
    
    * replace and organize code
    
    * add link
    
    * sync master
    
    * fix warnings
    
    * fix warnings
    
    * fix bug in bicubic resize when the image needs to be resized smaller
    
    * receive review comments and modify
    
    * receive review comments and modify
    
    * put all code into llava dir
    
    * fix quality problem in pr code
    
    * change n_layer
    
    * add space in "-1"
    
    * imitate reshape bug of python code
    
    * fix bug in clip
    
    * fix issues for merging
    
    * fix llama-minicpmv-cli in cmake file
    
    * change pr readme
    
    * fix code review
    
    * remove the line-33 directory entry in the top-level CMakeLists.txt (not in examples, in the main dir)
    
    * fix cmakefile
    
    * add warn
    
    * fix KEY_HAS_MINICPMV_PROJ
    
    * remove load_image_size into clip_ctx
    
    * remove the extern "C", MINICPMV_API
    
    * fix uhd code for review comment
    
    * delete minicpmv-wrapper in pr
    
    * remove uhd_image_embed
    
    * Modify 2 notes
    
    * clip : style changes
    
    * del common.h in clip
    
    * fix Type-Check error
    
    * fix Type-Check error
    
    * fix Type-Check error
    
    * fix Type-Check error
    
    * fix makefile error
    
    * fix ubuntu-make error
    
    * try fix clip
    
    * try fix 1
    
    ---------
    
    Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
    Co-authored-by: harvestingmoon <leewenyeong@gmail.com>
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    4 people authored Aug 9, 2024
    commit 3071c0a
  8. llama : better replace_all (cont) (ggerganov#8926)

    * llama : better replace_all (cont)
    
    ggml-ci
    
    * code : deduplicate replace_all
    
    ggml-ci
    ggerganov authored Aug 9, 2024
    commit 45a55b9
  9. commit 272e3bd
  10. llama : add support for lora adapters in T5 model (ggerganov#8938)

    Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
    fairydreaming and sszymczy authored Aug 9, 2024
    commit 6afd1a9
  11. Merge commit from fork

    ggerganov authored Aug 9, 2024
    commit b72942f

Commits on Aug 10, 2024

  1. gguf-py : fix double call to add_architecture() (ggerganov#8952)

    Signed-off-by: tarilabs <matteo.mortari@gmail.com>
    tarilabs authored Aug 10, 2024
    commit 911b437
  2. modify convert

    tc-mb committed Aug 10, 2024
    commit ea0c828
  3. commit fc1c860
  4. Merge pull request #24 from OpenBMB/master

    sync master
    tc-mb authored Aug 10, 2024
    commit ce0d1a6
  5. modify convert

    tc-mb committed Aug 10, 2024
    commit 6cad864
  6. add readme

    tc-mb committed Aug 10, 2024
    commit fe39ecc
  7. add resampler of v2.6

    tc-mb committed Aug 10, 2024
    commit bffbe1c
  8. modify clip

    tc-mb committed Aug 10, 2024
    commit 28d6a0f
  9. modify readme

    tc-mb committed Aug 10, 2024
    commit 4a87d1d
  10. fix type-check

    tc-mb committed Aug 10, 2024
    commit 32b47f6

Commits on Aug 12, 2024

  1. fix type-check

    tc-mb committed Aug 12, 2024
    commit 662d4c1
  2. fix type-check

    tc-mb committed Aug 12, 2024
    commit a945b3c
  3. fix type-check

    tc-mb committed Aug 12, 2024
    commit 89d378c
  4. commit 1ec79f0
  5. fix convert script and readme

    tc-mb committed Aug 12, 2024
    commit 1123376
  6. fix convert

    tc-mb committed Aug 12, 2024
    commit f30c5e1
  7. fix num in convert

    tc-mb committed Aug 12, 2024
    commit 47eb0a5

Commits on Aug 13, 2024

  1. fix type-check

    tc-mb committed Aug 13, 2024
    commit 1ca3f06