# Getting Started

Rigging is a flexible library built on top of other very flexible libraries. As such, it might take some time to warm
up to its interfaces, given the many ways you can accomplish your goals. However, the code is well documented,
and the topic pages and source are great places to step in and out of as you explore.

??? tip "IDE Setup"

    Rigging has been built with full type support, which provides clear guidance on what
    methods return what types, and when they return those types. It's recommended that you
    operate in a development environment which can take advantage of this information.
    Your use of Rigging will almost "fall" into place and you won't be guessing about
    objects as you work.
## Basic Chats

Let's start with a very basic generation example that doesn't include any parsing features, continuations, etc.
You want to chat with a model and collect its response.

We first need to get a [generator][rigging.generator.Generator] object. We'll use
[`get_generator`][rigging.generator.get_generator], which will resolve an identifier string
to the underlying generator class object.

??? note "API Keys"

    The default Rigging generator is [LiteLLM][rigging.generator.LiteLLMGenerator], which
    wraps a large number of providers and models. We assume for these examples that you
    have API tokens set as environment variables for these models. You can refer to the
    [LiteLLM docs](https://docs.litellm.ai/docs/) for supported providers and their key format.
    If you'd like, you can change any of the model IDs we use and/or add `,api_key=[sk-1234]` to the
    end of any of the generator IDs to specify them inline.
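
    As a quick sketch of those identifier forms (the key below is a placeholder, not a real token):

    ```py
    import rigging as rg

    # Fully-qualified identifier with an explicit provider prefix
    generator = rg.get_generator("litellm!anthropic/claude-3-sonnet-20240229")

    # The same identifier with an API key supplied inline
    generator = rg.get_generator("litellm!anthropic/claude-3-sonnet-20240229,api_key=sk-1234")
    ```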

```py hl_lines="3"
import rigging as rg # (1)!

generator = rg.get_generator("claude-3-sonnet-20240229") # (2)!
pending = generator.chat(
    [
        {"role": "system", "content": "You are a wizard harry."},
        {"role": "user", "content": "Say hello!"},
    ]
)
chat = pending.run()
print(chat.all)
# [
#   Message(role='system', parts=[], content='You are a wizard harry.'),
#   Message(role='user', parts=[], content='Say hello!'),
#   Message(role='assistant', parts=[], content='Hello! How can I help you today?'),
# ]
```

1. You'll see us use this shorthand import syntax throughout our code. It's
   totally optional, but it makes things look nice.
2. This is actually shorthand for `litellm!anthropic/claude-3-sonnet-20240229`, where `litellm`
   is the provider. We just default to that generator so you don't have to be explicit. You
   can find more information about this in the [generators](../topics/generators.md) docs.

Generators have an easy [`chat()`][rigging.generator.Generator.chat] method which you'll
use to initiate conversations. You can supply messages in many different forms: dictionary
objects, full [`Message`][rigging.message.Message] classes, or a simple `str`,
which will be converted to a user message.

```py hl_lines="4-9"
import rigging as rg

generator = rg.get_generator("claude-3-sonnet-20240229")
pending = generator.chat( # (1)!
    [
        {"role": "system", "content": "You are a wizard harry."},
        {"role": "user", "content": "Say hello!"},
    ]
)
chat = pending.run()
print(chat.all)
# [
#   Message(role='system', parts=[], content='You are a wizard harry.'),
#   Message(role='user', parts=[], content='Say hello!'),
#   Message(role='assistant', parts=[], content='Hello! How can I help you today?'),
# ]
```

1. [`generator.chat`][rigging.generator.Generator.chat] is actually just a helper for
   [`chat(generator, ...)`][rigging.generator.chat]; they do the same thing.
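
As a minimal sketch of those alternate input forms (we're assuming `Message` accepts the `role` and `content` keywords shown in its printed repr above):

```py
import rigging as rg

generator = rg.get_generator("claude-3-sonnet-20240229")

# A plain string becomes a single user message
pending = generator.chat("Say hello!")

# Full Message objects work just as well as dictionaries
pending = generator.chat(
    [
        rg.Message(role="system", content="You are a wizard harry."),
        rg.Message(role="user", content="Say hello!"),
    ]
)
```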

??? note "PendingChat vs Chat"

    You'll notice we name the result of `chat()` as `pending`. The naming might be confusing,
    but chats go through 2 phases. We first stage them into a pending state, where we operate on
    and prepare them in a "pipeline" of sorts before we actually trigger generation with `run()`.

    Calling `.chat()` doesn't trigger any generation, but calling any of these run methods will:

    - [rigging.chat.PendingChat.run][]
    - [rigging.chat.PendingChat.run_many][]
    - [rigging.chat.PendingChat.run_batch][]

In this case, we have nothing additional we want to add to our pending chat, and we are only interested
in generating exactly one response message. We simply call [`.run()`][rigging.chat.PendingChat.run] to
execute the generation process and collect our final [`Chat`][rigging.chat.Chat] object.

```py hl_lines="10-11"
import rigging as rg

generator = rg.get_generator("claude-3-sonnet-20240229")
pending = generator.chat(
    [
        {"role": "system", "content": "You are a wizard harry."},
        {"role": "user", "content": "Say hello!"},
    ]
)
chat = pending.run()
print(chat.all)
# [
#   Message(role='system', parts=[], content='You are a wizard harry.'),
#   Message(role='user', parts=[], content='Say hello!'),
#   Message(role='assistant', parts=[], content='Hello! How can I help you today?'),
# ]
```

View more about Chat objects and their properties [over here][rigging.chat.Chat]. In general, chats
give you access to exactly what messages were passed into a model, and what came out the other side.
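
For instance, a small sketch using only the accessors we've already seen (`last` here is the final generated message):

```py
chat = pending.run()

print(chat.all)           # Everything that went in and came out
print(chat.last.content)  # The text of the final generated message
```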

## Conversation

Both `PendingChat` and `Chat` objects provide the freedom to fork off the current state of messages, or to
continue a stream of messages after generation has occurred. In general:

- [`PendingChat.fork`][rigging.chat.PendingChat.fork] will clone the current pending chat and let you maintain
  both the new and original objects for continued processing.
- [`Chat.fork`][rigging.chat.Chat.fork] will produce a fresh `PendingChat` from all the messages prior to the
  previous generation (useful for "going back" in time).
- [`Chat.continue_`][rigging.chat.Chat.continue_] is similar to `fork` (actually a wrapper) but tells `fork` to
  include the generated messages as you move on (useful for "going forward" in time).

```py
import rigging as rg

generator = rg.get_generator("gpt-3.5-turbo")
chat = generator.chat([
    {"role": "user", "content": "Hello, how are you?"},
])

# We can fork before generation has occurred
specific = chat.fork("Be specific please.").run()
poetic = chat.fork("Be as poetic as possible").overload(temperature=1.5).run() # (1)!

# We can also continue after generation
next_chat = poetic.continue_(
    {"role": "user", "content": "That's good, tell me a joke"}
)

update = next_chat.run()
```

1. In this case the temperature change will only be applied to the poetic path, because `fork` has
   created a clone of our pending chat.

## Basic Parsing

Now let's assume we want to ask the model for a piece of information, and we want to make sure
this item conforms to a pre-defined structure. Underneath, Rigging uses [Pydantic XML](https://pydantic-xml.readthedocs.io/),
which itself is built on [Pydantic](https://docs.pydantic.dev/). We'll cover more about
constructing models in a [later section](../topics/models.md), but don't stress the details for now.

??? note "XML vs JSON"

    Rigging is opinionated with regard to using XML to weave unstructured data with structured contents
    as the underlying LLM generates text responses. A frequent solution to getting "predictable"
    outputs from LLMs has been forcing JSON-conformant outputs, but we think this is
    poor form in the long run. You can read more about this from [Anthropic](https://docs.anthropic.com/claude/docs/use-xml-tags),
    who have done extensive research with their models.

    We'll skip the long rant, but trust us that XML is a very useful syntax which beats
    JSON any day of the week for typical use cases.

To begin, let's define a `FunFact` model which we'll have the LLM fill in. Rigging exposes a
[`Model`][rigging.model.Model] base class which you should inherit from when defining structured
inputs. This is a lightweight wrapper around pydantic-xml's [`BaseXmlModel`](https://pydantic-xml.readthedocs.io/en/latest/pages/api.html#pydantic_xml.BaseXmlModel),
with some added features and functionality to make it easy for Rigging to manage. However, everything
these models support (for the most part) is also supported in Rigging.

```py hl_lines="3-4"
import rigging as rg

class FunFact(rg.Model):
    fact: str # (1)!

chat = rg.get_generator('gpt-3.5-turbo').chat(
    f"Provide a fun fact between {FunFact.xml_example()} tags."
).run()

fun_fact = chat.last.parse(FunFact)

print(fun_fact.fact)
# The Eiffel Tower can be 15 cm taller during the summer due to the expansion of the iron in the heat.
```

1. This is what pydantic-xml refers to as a "primitive" class, as it is simply a single
   typed value placed between the tags. See more about primitive types, elements, and attributes in the
   [Pydantic XML Docs](https://pydantic-xml.readthedocs.io/en/latest/pages/quickstart.html#primitives).
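
As a hedged sketch of the non-primitive side (assuming pydantic-xml's `attr` and `element` markers pass through to `rg.Model` as described above):

```py
import rigging as rg
from pydantic_xml import attr, element

class Person(rg.Model):
    name: str = attr()     # parsed from an XML attribute: <person name="...">
    city: str = element()  # parsed from a child element: <city>...</city>
```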

We need to show the target LLM how to format its response, so we'll use the
[`.xml_example()`][rigging.model.Model.xml_example] class method which all models
support. By default this will simply emit empty XML tags for our model:

```xml
Provide a fun fact between <fun-fact></fun-fact> tags.
```

??? note "Customizing Model Tags"

    Tags for a model are auto-generated based on the name of the class. You are free
    to override these by passing `tag=[value]` into your class definition like this:

    ```py
    class LongNameForThing(rg.Model, tag="short"):
        ...
    ```
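
    With that override in place, the emitted example tags use the short form. A sketch of what `LongNameForThing.xml_example()` would produce, given the empty-tag behavior described above:

    ```xml
    <short></short>
    ```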

We wrap up the generation and extract our parsed object by calling [`.parse()`][rigging.message.Message.parse]
on the [last message][rigging.chat.Chat.last] of our generated chat. This will process the contents
of the message, extract the first matching model which parses successfully, and return it to us as a Python
object.

```py hl_lines="10"
import rigging as rg

class FunFact(rg.Model):
    fact: str

chat = rg.get_generator('gpt-3.5-turbo').chat(
    f"Provide a fun fact between {FunFact.xml_example()} tags."
).run()

fun_fact = chat.last.parse(FunFact)

print(fun_fact.fact) # (1)!
# The Eiffel Tower can be 15 cm taller during the summer due to the expansion of the iron in the heat.
```

1. Because we've defined `FunFact` as a class, the result of `.parse()` is typed to that object. In our
   code, all the properties of the fact will be available just as if we had created the object directly.

Notice that we don't have to worry about the model being verbose in its response, as we've communicated
that the text between the `#!xml <fun-fact></fun-fact>` tags is the relevant place to put its answer.

## Strict Parsing

In the example above, we don't handle the case where the model fails to properly conform to our
desired output structure. If the last message content is invalid in some way, our call to `parse`
will result in an exception from Rigging. Rigging is designed at its core to manage this process,
and we have a few options:

1. We can make the parsing optional by switching to [`.try_parse()`][rigging.message.Message.try_parse]. The type
   of the return value will automatically switch to `#!python FunFact | None`, and you can handle cases
   where parsing failed.
2. We can extend our pending chat with [`.until_parsed_as()`][rigging.chat.PendingChat.until_parsed_as], which will cause the
   `run()` function to internally check that parsing is succeeding before returning the chat back to you.

=== "Option 1 - Trying"

    ```py hl_lines="5"
    chat = rg.get_generator('gpt-3.5-turbo').chat(
        f"Provide a fun fact between {FunFact.xml_example()} tags."
    ).run()

    fun_fact = chat.last.try_parse(FunFact) # fun_fact might now be None

    print(fun_fact or "Failed to get fact")
    ```

=== "Option 2 - Until"

    ```py hl_lines="3"
    chat = rg.get_generator('gpt-3.5-turbo').chat(
        f"Provide a fun fact between {FunFact.xml_example()} tags."
    ).until_parsed_as(FunFact).run()

    fun_fact = chat.last.parse(FunFact) # This call should never fail

    print(fun_fact.fact)
    ```

A couple of comments regarding this structure:

1. We still have to call `parse` on the message despite using `until_parsed_as`. This is
   a limitation of type hinting, as we'd have to turn every `PendingChat` and `Chat` into a generic
   which could carry types forward. It's a small price for big code-complexity savings.
2. Internally, the generation code inside `PendingChat` will attempt to re-generate until
   the LLM produces correctly parsable output, up until a maximum number of "rounds" is reached.
   This process is configurable with the arguments to all [`until`][rigging.chat.PendingChat.until_parsed_as]
   or [`using`][rigging.chat.PendingChat.using] functions, as sketched below.
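
As a hedged sketch of that configuration (we're assuming a `max_rounds` keyword here; check the `until_parsed_as` signature for the exact argument names):

```py
# Allow up to 5 re-generation rounds before giving up
chat = rg.get_generator('gpt-3.5-turbo').chat(
    f"Provide a fun fact between {FunFact.xml_example()} tags."
).until_parsed_as(FunFact, max_rounds=5).run()
```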

## Parsing Many Models

Assuming we wanted to extend our example to produce a set of interesting facts, we have a couple of options:

1. Simply use [`run_many()`][rigging.chat.PendingChat.run_many] and generate N examples individually.
2. Rework our code slightly and let the model provide us multiple facts at once.

=== "Option 1 - Multiple Generations"

    ```py
    chats = rg.get_generator('gpt-3.5-turbo').chat(
        f"Provide a fun fact between {FunFact.xml_example()} tags."
    ).run_many(3)

    for chat in chats:
        print(chat.last.parse(FunFact).fact)
    ```

=== "Option 2 - Inline Set"

    ```py
    chat = rg.get_generator('gpt-3.5-turbo').chat(
        f"Provide 3 fun facts, each between {FunFact.xml_example()} tags."
    ).run()

    for fun_fact in chat.last.parse_set(FunFact):
        print(fun_fact.fact)
    ```
# Rigging

Rigging is a lightweight LLM interaction framework built on Pydantic XML. The goal is to make leveraging LLMs in production pipelines as simple and effective as possible. Here are the highlights:

- **Structured Pydantic models** can be used interchangeably with unstructured text output.
- LiteLLM as the default generator giving you **instant access to a huge array of models**.
- Add easy **tool calling** abilities to models which don't natively support it.
- Store different models and configs as **simple connection strings**, just like databases.
- Chat templating, forking, continuations, generation parameter overloads, stripping segments, etc.
- Modern Python with type hints, async support, Pydantic validation, serialization, etc.

```py
import rigging as rg
from rigging.model import CommaDelimitedAnswer as Answer

chat = rg.get_generator('gpt-4') \
    .chat(f"Give me 3 famous authors between {Answer.xml_tags()} tags.") \
    .until_parsed_as(Answer) \
    .run()

answer = chat.last.parse(Answer)
print(answer.items)

# ['J. R. R. Tolkien', 'Stephen King', 'George Orwell']
```

Rigging is built and maintained by [dreadnode](https://dreadnode.io), where we use it daily for our work.

## Installation

We publish every version to PyPI:

```bash
pip install rigging
```

If you want to build from source:

```bash
cd rigging/
poetry install
```
# Principles

LLMs are extremely capable machine learning systems, but they operate purely in textual spaces as a byproduct of
their training data. We have access to the compression of a huge repository of human knowledge, but are limited to querying
that information via natural language. Our first inclination is to let these language interfaces drive
our design decisions. We build chat bots and text search, and when it comes time to align them closely
with the rest of our fixed software stack, we quickly get frustrated by their inconsistencies and our limited
control over their outputs.

In software, we operate on the principle of known interfaces as the basis for composability. In the functional paradigm, we want our
software functions to operate like mathematical ones, where the same input always produces the same output with no side effects.
Funnily enough, LLMs (like all models) also operate that way (minus things like floating point errors), but we intentionally
inject randomness into our sampling process to give them the freedom to explore and produce novel outputs. Therefore we shouldn't
aim for "purity" in the strict sense, but we should aim for consistency in their interface.

Once you start to think of a "prompt", "completion", or "chat interaction" as being the temporary textual interface by which we pass in
structured inputs and produce structured outputs, we can begin to link them with traditional software. Many libraries get close to this
idea, but they rarely hold the opinion that programming types and structures, and not text, are the best way to make LLM-based
systems composable.

A core opinion of Rigging is that these language models should be reframed as tools which use tokens of text in context windows to
navigate latent space and produce probabilities of output tokens, but that the data they consume and produce does not need to be
holistically constrained to textual spaces in our use of them.