05. FAQ

Brian Dashore edited this page Sep 1, 2024 · 2 revisions

FAQ

  • What OS is supported?

    • Windows and Linux
  • I'm confused, how do I do anything with this API?

    • That's okay! Not everyone is an AI mastermind on their first try. Several resources can help:
      • The start scripts and config.yml aim to guide new users to the right configuration.
      • The Usage page explains how the API works.
      • Community Projects contains UIs that help interact with TabbyAPI via its API endpoints.
      • The Discord Server is also a place to ask questions, but please be nice.
  • How do I interface with the API?

    • The wiki is meant for user-facing documentation. Developers are recommended to use the autogenerated documentation for the OpenAI-compatible and Kobold-compatible servers.
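As a minimal sketch for developers, a chat completion request to the OpenAI-compatible endpoint could be built from Python like this. The host, port, endpoint path, model name, and API key below are assumptions; check your config.yml and the autogenerated API documentation for the real values.

```python
# Sketch of an OpenAI-style chat completion request to a local TabbyAPI
# instance. BASE_URL and the API key are assumptions; see config.yml.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumed local address; set in config.yml


def build_chat_request(prompt, api_key="your-api-key"):
    """Build (but do not send) a request for /v1/chat/completions."""
    payload = {
        # Placeholder model name; TabbyAPI serves whichever model is loaded.
        "model": "local-model",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_chat_request("Hello!")
print(req.full_url)
# To actually send it (with the server running):
# response = urllib.request.urlopen(req)
```

The request is only constructed here, not sent, so the snippet can be inspected without a running server.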
  • What does TabbyAPI run?

    • TabbyAPI uses Exllamav2 as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
      • Exl2 (Highly recommended)
      • GPTQ
      • FP16 (using Exllamav2's loader)
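As a rough illustration, pointing TabbyAPI at a downloaded Exl2 model directory might look like the fragment below in config.yml. The exact key names here are assumptions; the comments in the shipped config.yml document the real options.

```yaml
# Hypothetical config.yml fragment -- verify key names against the
# shipped config.yml before using.
model:
  model_dir: models            # folder containing model directories
  model_name: MyModel-exl2     # placeholder name of the Exl2 model folder
```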
  • Exllamav2 may error with the following exception: ImportError: DLL load failed while importing exllamav2_ext: The specified module could not be found.

    • First, check that the installed wheel matches your Python version and CUDA version. Also make sure you're working inside a venv or conda environment.
    • If those prerequisites are correct, the torch extension cache may need to be cleared. This error is caused by a stale, mismatched exllamav2_ext build.
      • On Windows: the cache is at C:\Users\<User>\AppData\Local\torch_extensions\torch_extensions\Cache, where <User> is your Windows username.
      • On Linux: the cache is at ~/.cache/torch_extensions.
      • Look for any folders named exllamav2_ext in the Python subdirectories and delete them.
      • Restart TabbyAPI; launching should work again.
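The cache-clearing steps above can be sketched as a small Python helper. The default cache paths are taken from the list above; treat this as an illustrative sketch, and stop TabbyAPI before running it.

```python
# Sketch: find and delete stale exllamav2_ext build folders from the
# torch extensions cache, following the steps listed above.
import os
import shutil
from pathlib import Path


def find_ext_dirs(cache_root):
    """Return every directory named exllamav2_ext under cache_root."""
    root = Path(cache_root)
    if not root.exists():
        return []
    return [p for p in root.rglob("exllamav2_ext") if p.is_dir()]


def clear_exllamav2_cache():
    # Default cache locations from the steps above.
    if os.name == "nt":  # Windows
        cache = (Path(os.environ["LOCALAPPDATA"]) / "torch_extensions"
                 / "torch_extensions" / "Cache")
    else:  # Linux
        cache = Path.home() / ".cache" / "torch_extensions"
    for ext_dir in find_ext_dirs(cache):
        print(f"Deleting {ext_dir}")
        shutil.rmtree(ext_dir)


# Call clear_exllamav2_cache() after stopping TabbyAPI, then relaunch.
```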