- Support for specifying log directory name using AUTOGGUF_LOG_DIR_NAME environment variable
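A minimal sketch of how this could be used before launch, assuming the app reads the variable at startup (the entry-point command is a placeholder):

```python
# Hypothetical launcher: direct AutoGGUF's logs to a custom directory.
# AUTOGGUF_LOG_DIR_NAME comes from this entry; "main.py" is a placeholder.
import os
import subprocess

env = os.environ.copy()
env["AUTOGGUF_LOG_DIR_NAME"] = "custom_logs"
subprocess.run(["python", "main.py"], env=env)
```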
- Work-in-progress GGUF merge window
- Support for repository types in HF Transfer utility
- New `dequantize_gguf.py` script
- Support for MiniCPM3, RWKVv6, OLMoE, IBM Granite, and Jamba in llama.cpp convert scripts (conversion only)
- Add Nuitka build script for Linux
- Updated Finnish and Russian localizations using Claude 3 Opus
- Improved layout of HF Upload window
- Updated gguf library from upstream
- Refactored code to use localizations for menubar
- Renamed imports_and_globals.py to globals.py
- Moved general functions verify_gguf and process_args to globals.py
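As background on what such a helper typically checks: every valid GGUF file begins with the 4-byte ASCII magic `GGUF`. A minimal sketch (the function name comes from this entry; the body is illustrative, not AutoGGUF's actual implementation):

```python
def verify_gguf(path: str) -> bool:
    # GGUF files open with the ASCII magic b"GGUF"
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```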
- Created Plugins class for extensibility
- Updated dependencies:
- huggingface-hub
- fastapi (~=0.115.0)
- setuptools (~=75.1.0)
- pyside6 (~=6.7.3)
- uvicorn (~=0.31.0)
- Corrected localization strings and file select types for GGUF merging
- Fix minor errors in build scripts
- Implemented Hugging Face (HF) upload functionality with GUI definitions
- Added RAM and CPU usage graphs to UI
- Input validation (via decorator wraps) added to UI
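This presumably refers to `functools.wraps`-based decorators; a hedged sketch of the pattern (the decorator and handler names are illustrative):

```python
from functools import wraps

def validate_path(func):
    """Reject blank path arguments before the wrapped handler runs."""
    @wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(self, path: str, *args, **kwargs):
        if not path or not path.strip():
            raise ValueError("Path must not be empty")
        return func(self, path, *args, **kwargs)
    return wrapper
```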
- Right-click context menu added to the models list in UI
- Support for iMatrix generation tracking
- GGUF splitting feature added
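Splitting is presumably delegated to llama.cpp's `gguf-split` tool; a hedged sketch of the kind of invocation involved (flag names per recent llama.cpp builds; paths and sizes are placeholders):

```python
import subprocess

subprocess.run([
    "llama-gguf-split",
    "--split-max-size", "4G",  # maximum size per shard
    "model.gguf",              # input model (placeholder)
    "model-shard",             # output shard base name (placeholder)
], check=True)
```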
- Japanese and German localizations updated
- Refactored to move functions out of `AutoGGUF` to reduce bloat
- Localized GGUF split strings
- Optimized GGUF imports and renamed related modules
- Removed old `HFTransfer` class
- Adjusted logging strings and updated French and Dutch localizations
- Improved startup time by optimizing default configuration, disabling network fetches for backends/updates
- Removed `requests` and `python-dotenv` to reduce size
- Updated `fastapi` requirement from `~=0.112.2` to `~=0.114.2`
- Updated `torch` requirement from `~=2.4.0` to `~=2.4.1`
- Updated `setuptools` requirement from `~=74.0.0` to `~=74.1.2`
- Updated `safetensors` requirement from `~=0.4.4` to `~=0.4.5`
- Updated `huggingface-hub` requirement from `~=0.24.6` to `~=0.24.7`
- Adjusted indeterminate progress bar behavior
- Removed comments in `requirements.txt` and updated its formatting
- AutoFP8 quantization classes and window (currently WIP)
- Minimize/maximize buttons to title bar
- API key authentication support for the local server
- HuggingFace upload/download class
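This class presumably wraps `huggingface_hub`; a minimal sketch of the kind of call involved (repo and file names are placeholders):

```python
from huggingface_hub import HfApi

api = HfApi()  # token is read from the local Hugging Face credential store
api.upload_file(
    path_or_fileobj="model-Q4_K_M.gguf",
    path_in_repo="model-Q4_K_M.gguf",
    repo_id="your-username/your-model",  # placeholder
    repo_type="model",
)
```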
- OpenAPI docs for endpoints
- Added new showcase image
- Replaced Flask with FastAPI and Uvicorn for improved performance
- Moved functions out of AutoGGUF.py into utils.py and TaskListItem.py
- Updated llama.cpp convert scripts
- Improved LoRA conversion process:
- Allow specifying output path in arguments
- Removed shutil.move operation
- Increased max number of LoRA layers
- Changed default port to 7001
- Now binding to localhost (127.0.0.1) instead of 0.0.0.0
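A hedged sketch of the resulting serving setup (the endpoint body is illustrative; host and port match this entry):

```python
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/v1/health")
def health():
    # heartbeat endpoint, per the endpoint list elsewhere in this changelog
    return {"status": "ok"}

if __name__ == "__main__":
    # bind to localhost only, on the new default port
    uvicorn.run(app, host="127.0.0.1", port=7001)
```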
- Updated Spanish localizations
- Updated setuptools requirement from ~=68.2.0 to ~=74.0.0
- Updated .env.example with new configuration parameters
- Web page not found error
- Use of proper status in TaskListItem
- Passing of quant_threads and Logger to TaskListItem
- Improved window moving smoothness
- Prevention of moving window below taskbar
- Optimized imports in various files
- Remove aliased quant types
- .env.example file added
- Sha256 generation support added to build.yml
- Allow importing models from any directory on the system
- Added manual model import functionality
- Verification for manual imports and support for concatenated files
- Implemented plugins feature using importlib
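A minimal sketch of the importlib-based loading pattern (the directory name and registry shape are assumptions, not AutoGGUF's actual interface):

```python
import importlib.util
from pathlib import Path

def load_plugins(plugin_dir: str = "plugins") -> dict:
    """Import every .py file in plugin_dir as a plugin module."""
    plugins = {}
    for path in Path(plugin_dir).glob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin's top-level code
        plugins[path.stem] = module
    return plugins
```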
- Configuration options for AUTOGGUF_MODEL_DIR_NAME, AUTOGGUF_OUTPUT_DIR_NAME, and AUTOGGUF_RESIZE_FACTOR added
- Moved get helper functions to utils.py
- Added type hints
- Reformat TaskListItem.py for better readability
- Separate macOS and Linux runs in CI/CD
- Updated .gitignore for better file management
- Updated numpy requirement from <2.0.0 to <3.0.0
- Fixed sha256 file format and avoided overwriting
- Updated regex for progress tracking
- Arabic and French localizations fixed
- Only count valid backends instead of total backend combos
- Import missing modules
- Update checking support (controlled by AUTOGGUF_CHECK_UPDATE environment variable)
- Live update support for GPU monitor graphs
- Smoother usage bar changes in monitor
- Unicode X button in KV Overrides box
- PyPI setup script
- Inno Setup build file
- Missing requirements and dotenv file loading
- Moved functions out of AutoGGUF.py
- Relocated CustomTitleBar to separate file
- Updated torch requirement from ~=2.2.0 to ~=2.4.0
- Updated showcase image
- Version bumped to v1.7.2 in Localizations.py
- setup.py issues
- Modern UI with seamless title bar
- Window resizing shortcuts (Ctrl+, Ctrl-, Ctrl+0)
- Theming support
- CPU usage bar
- Save Preset and Load Preset options in File menu
- Support for EXAONE model type
- Window size configuration through environment variables
- Refactored window to be scrollable
- Moved save/load preset logic to presets.py
- Updated docstrings for AutoGGUF.py, lora_conversion.py, and Logger.py
- Adapted gguf library to project standards
- Updated version to v1.7.0
- Fixed IDE-detected code typos and errors
- Menu bar with Close and About options
- Program version in localizations.py
- Support for 32-bit builds
- Added dependency audit
- Implemented radon, dependabot, and pre-commit workflows
- Updated torch requirement from `~=1.13.1` to `~=2.4.0`
- Updated psutil requirement from `~=5.9.8` to `~=6.0.0`
- Refactored functions out of AutoGGUF.py and moved to ui_update.py
- Changed filenames to follow PEP 8 conventions
- Disabled .md and .txt CodeQL analysis
- Optimized imports in AutoGGUF.py
- Updated README with new version and styled screenshot
- Fixed image blur in documentation
- Server functionality with new endpoints (see the client sketch after this list):
  - `/v1/backends`: Lists all backends and their paths
  - `/v1/health`: Heartbeat endpoint
  - `/v1/tasks`: Provides current task info (name, status, progress, log file)
  - `/v1/models`: Retrieves model details (name, type, path, shard status)
- Environment variable support for server configuration:
  - `AUTOGGUF_SERVER`: Enable/disable server (true/false)
  - `AUTOGGUF_SERVER_PORT`: Set server port (integer)
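A hedged client-side sketch against these endpoints, using only the standard library (the port is whatever `AUTOGGUF_SERVER_PORT` was set to; 8000 here is just an example):

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # port per AUTOGGUF_SERVER_PORT (example value)

for endpoint in ("/v1/health", "/v1/backends", "/v1/tasks", "/v1/models"):
    with urllib.request.urlopen(BASE + endpoint) as resp:
        print(endpoint, json.load(resp))
```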
- Updated AutoGGUF docstrings
- Refactored build scripts
- Set GGML types to lowercase in command builder
- Optimized build scripts
- Nuitka for building
- Updated .gitignore
- Bug where deletion while a task is running crashes the program
- Fast build: Higher unzipped size (97MB), smaller download (38MB)
- Standard build: Created with PyInstaller, medium download and unzipped size (50MB), potentially slower
- Resolve licensing issues by using PySide6
- Add GPU monitoring support for NVIDIA GPUs
- Refactor localizations to use them in HF conversion area
- Rename FAILED_LOAD_PRESET to FAILED_TO_LOAD_PRESET localization key
- Remove Save Preset context menu action
- Support loading *.gguf file types
- Organize localizations
- Add sha256 and PGP signatures (same as commit ones)
- Add HuggingFace to GGUF conversion support
- Fix scaling on low resolution screens, interface now scrolls
- Updated src file in release to be Black formatted
- Modified the quantize_model function to process all selected types
- Updated preset saving and loading to handle multiple quantization types
- Use ERROR and IN_PROGRESS constants from localizations in QuantizationThread
- Minor repository changes
- Added model sharding management support
- Allow multiple quantization types to be selected and started simultaneously
- Resolved bug where Base Model text was shown even when GGML type was selected
- Improved alignment
- Minor repository changes
- Dynamic KV Overrides (see wiki: AutoGGUF/wiki/Dynamic-KV-Overrides)
- Quantization commands are now printed and logged
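For context, llama.cpp's quantize tool accepts metadata overrides as `--override-kv KEY=TYPE:VALUE`. A hedged sketch of rendering a dynamic value into that form (the `{system.time}` placeholder syntax is illustrative; see the wiki page above for the real one):

```python
import datetime

def render_override(key: str, type_: str, value: str) -> list[str]:
    # Hypothetical dynamic placeholder, resolved at quantization time
    value = value.replace("{system.time}", datetime.datetime.now().isoformat())
    return ["--override-kv", f"{key}={type_}:{value}"]

cmd = ["llama-quantize",
       *render_override("general.comment", "str", "run at {system.time}"),
       "in.gguf", "out.gguf", "Q4_K_M"]
print(cmd)  # quantization commands are printed and logged, per this entry
```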
- LoRA Conversion:
- New section for converting HuggingFace PEFT LoRA adapters to GGML/GGUF
- Output type selection (GGML or GGUF)
- Base model selection for GGUF output
- LoRA adapter list with individual scaling factors
- Export LoRA section for merging adapters into base model
- UI Improvements:
- Updated task names in task list
- IMatrix generation check
- Larger window size
- Added exe favicon
- Localization:
- French and Simplified Chinese support for LoRA and "Refresh Models" strings
- Code and Build:
- Code organization improvements
- Added build script
- .gitignore file
- Misc:
- Currently includes src folder with conversion tools
- No console window popup
- AUTOGGUF_CHECK_BACKEND environment variable to disable backend check on start
- `--onefile` build with PyInstaller; the `_internal` directory is no longer required
- Support for new llama-imatrix parameters (see the sketch after this list):
- Context size (--ctx-size) input
- Threads (--threads) control
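A sketch of the resulting invocation (`--ctx-size` and `--threads` come from this entry; the model and calibration-data paths are placeholders):

```python
cmd = [
    "llama-imatrix",
    "-m", "model.gguf",       # model path (placeholder)
    "-f", "calibration.txt",  # calibration data (placeholder)
    "--ctx-size", "512",      # new context size input
    "--threads", "8",         # new threads control
]
```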
- New parameters to IMatrix section layout
- Slider-spinbox combination for thread count selection
- QSpinBox for output frequency input (1-100 range with percentage suffix)
- Converted context size input to a QSpinBox
- Updated generate_imatrix() method to use new UI element values
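A minimal PySide6 sketch of the output-frequency control described above (the surrounding wiring is illustrative):

```python
from PySide6.QtWidgets import QApplication, QSpinBox

app = QApplication([])
output_frequency = QSpinBox()
output_frequency.setRange(1, 100)  # 1-100 range
output_frequency.setSuffix("%")    # percentage suffix
output_frequency.show()
app.exec()
```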
- Improved error handling in preset loading
- Enhanced localization support for new UI elements
- Error when loading presets containing KV overrides
- Duplicated functions
- Refresh Models button
- Linux build (built on Ubuntu 24.04 LTS)
- Fixed iostream llama.cpp issue; quantized_models directory is now created on launch
- More robust logging (find logs at latest_.log in logs folder)
- Localizations with support for 28 languages (machine translated using Gemini Experimental 0801)
- Dynamic KV override functionality
- Improved CUDA checking ability and extraction to the backend folder
- Scrollable area for KV overrides with add/delete capabilities
- Enhanced visibility and usability of Output Tensor Type and Token Embedding Type options
- Refactored code for better modularity and reduced circular dependencies
- Behavior of Output Tensor Type and Token Embedding Type dropdown menus
- Various minor UI inconsistencies
- Windows binary (created using PyInstaller)
- Issue where quantization errored with "AutoGGUF does not have x attribute"
- Initial release
- GUI interface for automated GGUF model quantization
- System resource monitoring (RAM and CPU usage)
- Llama.cpp backend selection and management
- Automatic download of llama.cpp releases from GitHub
- Model selection from local directory
- Comprehensive quantization options
- Task list for managing multiple quantization jobs
- Real-time log viewing for quantization tasks
- IMatrix generation feature with customizable settings
- GPU offload settings for IMatrix generation
- Context menu for task management
- Detailed model information dialog
- Error handling and user notifications
- Confirmation dialogs for task deletion and application exit