Usage of Ollama and local models #1079

Open

loktar00 wants to merge 3 commits into main
Conversation


@loktar00 loktar00 commented Oct 29, 2024

This is terrible code and I know it; I don't expect this to be merged, but it's here in the hope that someone else will be able to get it working a bit better.

Full caveat: this "works", but every task has actually failed. The prompts would definitely need to be adjusted for the local models being used; however, I did get the correct answer on two occasions in the logs.

Summary:

  • Split vision and language models. This may not have been 100% necessary, but it accounts for Ollama not allowing parallel requests; in addition, Ollama's vision models aren't as good as its language models.
  • Adds some more env variables, all related to Ollama.
  • Disables cost calculations when using Ollama, since we don't care about or need that for local models.
  • Added options for resolution in the configs; this just makes the images a bit lighter for local vision models.
  • Modified parsing for local models to target ```json (a rough sketch of that parsing follows this list).
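
For anyone picking this up, here is a minimal sketch of what targeting a ```json fence in a local model's output could look like; the function name and regex are illustrative and not the PR's actual implementation in api_handler_factory.py:

```python
import json
import re


def extract_json_block(completion: str) -> dict:
    """Pull a JSON object out of a ```json ... ``` fenced block if present,
    otherwise try to parse the raw completion."""
    match = re.search(r"```json\s*(.*?)```", completion, re.DOTALL)
    payload = match.group(1) if match else completion
    return json.loads(payload.strip())
```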

Things that exclude it from being merged, and which I'd love help with:

  • Added files directly to the volumes in the docker-compose
  • Added a bunch of logging for debugging that would need to be removed
  • It's messy, I won't lie, I used Cursor a bit; I'm not an experienced Python dev, I'm a JS guy ;)
  • I DID NOT TEST EXISTING FUNCTIONALITY / MODELS, so there could be breaking changes.

Important

Adds support for Ollama models, modifies configurations, and includes debugging logs and Docker setup changes.

  • Behavior:
    • Split vision and language models to handle Ollama's non-parallel request limitation in api_handler_factory.py.
    • Modified parsing for local models to target ```json in api_handler_factory.py.
  • Disabled cost calculations for Ollama models in models.py.
  • Configuration:
    • Added Ollama-specific settings in config.py and config_registry.py (a rough sketch follows this list).
    • Updated docker-compose.yml to include Ollama environment variables and volume mounts.
  • Misc:
    • Added extensive logging for debugging in api_handler_factory.py.
    • Directly added files to Docker volumes in docker-compose.yml.
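
As a rough illustration only (the exact attribute names in config.py and config_registry.py are assumptions; the env variable names mirror the docker-compose example further down this thread), the added settings could look roughly like:

```python
import os

# Hypothetical sketch of the Ollama settings; not the PR's actual code.
ENABLE_OLLAMA: bool = os.getenv("ENABLE_OLLAMA", "false").lower() == "true"
OLLAMA_API_BASE: str = os.getenv("OLLAMA_API_BASE", "http://localhost:11434")
OLLAMA_VISION_MODEL: str = os.getenv("OLLAMA_VISION_MODEL", "llava")          # assumed variable name
OLLAMA_LANGUAGE_MODEL: str = os.getenv("OLLAMA_LANGUAGE_MODEL", "llama3.1")   # assumed variable name
```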

This description was created by Ellipsis for 3e3f4de. It will automatically update as commits are pushed.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


👍 Looks good to me! Reviewed everything up to 3e3f4de in 29 seconds

More details
  • Looked at 486 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. docker-compose.yml:31
  • Draft comment:
    Adding files directly to volumes in docker-compose.yml is not a best practice as it can lead to issues with file synchronization and container portability. Consider using a Dockerfile to copy necessary files into the image instead.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment addresses a potential issue with the changes made in the diff, specifically the addition of files to volumes in docker-compose.yml. This practice can indeed lead to issues with file synchronization and container portability, which are valid concerns. The suggestion to use a Dockerfile is a common best practice in Docker usage.
    The comment might be too general and not consider specific use cases where adding files to volumes is necessary or beneficial. It assumes that using a Dockerfile is always the better option without considering the context.
    While there might be specific cases where adding files to volumes is justified, the comment highlights a general best practice that is applicable in many scenarios. It is a useful suggestion for improving code quality and maintainability.
    The comment is relevant to the changes made and provides a valid suggestion for improving the Docker setup. It should be kept as it addresses a potential issue with the current implementation.
2. skyvern/forge/sdk/api/llm/api_handler_factory.py:223
  • Draft comment:
    There are multiple logging statements added for debugging purposes. Ensure to remove or adjust the verbosity of these logs before merging to avoid cluttering the log output. This applies to other similar logging statements in this file as well.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR author mentioned that they added a lot of logging for debugging purposes, which should be removed before merging.
3. skyvern/forge/sdk/api/llm/api_handler_factory.py:395
  • Draft comment:
    Ensure that existing functionality is tested to prevent any breaking changes. Consider adding tests for the new functionality as well.
  • Reason this comment was not posted:
    Comment did not seem useful.
4. skyvern/forge/sdk/api/llm/config_registry.py:46
  • Draft comment:
    Ensure that existing functionality is tested to prevent any breaking changes. Consider adding tests for the new functionality as well.
  • Reason this comment was not posted:
    Marked as duplicate.
5. skyvern/forge/sdk/api/llm/models.py:36
  • Draft comment:
    Ensure that existing functionality is tested to prevent any breaking changes. Consider adding tests for the new functionality as well.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_DOBw5HxcicjTgvTn



@suchintan
Contributor

haha I love to see terrible code (as a v1!!)

# Enable Ollama
- ENABLE_OLLAMA=true
- LLM_KEY=OLLAMA_TEXT
- OLLAMA_API_BASE=http://192.168.1.3:11434


Add option to persist model in memory, e.g. (the model name here is just a placeholder; a keep_alive of -1 keeps the model loaded indefinitely):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2-vision",
  "keep_alive": -1
}'

@art3miz18

@loktar00 hey, is this compatible with llama vision models?

@loktar00
Author

@loktar00 hey, is this compatible with llama vision models?

Should be fine with it for the first step. It switches between models: vision first, then it hands off to a language model, only because Ollama doesn't support running both at once, unless Ollama's vision support for Llama works now.
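
To make that flow concrete, here is a minimal sketch of the sequential vision-then-language pattern against the plain Ollama HTTP API; the model names, prompts, and helper are illustrative, not the PR's actual code in api_handler_factory.py:

```python
import base64
import requests

OLLAMA_API_BASE = "http://localhost:11434"  # same base URL as the env var used in this PR


def run_two_step(screenshot_path: str, vision_model: str, language_model: str) -> str:
    """Sequential vision-then-language flow: since Ollama serves one model at a
    time, the vision model describes the page first, then the language model
    reasons over that description."""
    with open(screenshot_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    vision = requests.post(f"{OLLAMA_API_BASE}/api/generate", json={
        "model": vision_model,
        "prompt": "Describe the interactive elements on this page.",
        "images": [image_b64],
        "stream": False,
    }).json()["response"]

    language = requests.post(f"{OLLAMA_API_BASE}/api/generate", json={
        "model": language_model,
        "prompt": f"Given this page description, decide the next action:\n{vision}",
        "stream": False,
    }).json()["response"]

    return language
```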

@DominicTWHV

You can rapidly load and unload Ollama models via the API endpoint as suggested above; use keep_alive 0 to immediately unload said model.
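
For example, a quick sketch of that unload call in Python (the model name is just an example):

```python
import requests

# Sending a generate request with keep_alive=0 and no prompt unloads the model right away.
requests.post("http://localhost:11434/api/generate",
              json={"model": "llama3.2-vision", "keep_alive": 0})
```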
