More flexible processing architecture; New Schema analyzer; Support for Scalar results
ddebowczyk committed Mar 4, 2024
1 parent 71d2f25 commit 33677a6
Showing 104 changed files with 2,264 additions and 673 deletions.
66 changes: 66 additions & 0 deletions NOTES.md
@@ -0,0 +1,66 @@
# NOTES

## Support scalar types as response_model

Solution 1:
Have a universal scalar value adapter that implements a HasSchemaProvider interface.
HasSchemaProvider declares schema() : Schema, which, if present, will be used to generate the schema
instead of the default schema generation mechanism.
This will allow for custom schema generation.
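
A minimal sketch of what such an adapter might look like, assuming PHP 8.1+. The `Schema` class here is a hypothetical stand-in (a thin wrapper over a JSON Schema array), and the `Scalar` class with its `fromArray()` method is an illustration only, not the library's actual API. Only the `HasSchemaProvider` name and the `schema(): Schema` signature come from the notes.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the library's Schema class.
final class Schema
{
    public function __construct(public readonly array $definition) {}
}

// Interface name and method signature as described in the notes.
interface HasSchemaProvider
{
    public function schema(): Schema;
}

// Hypothetical adapter: wraps one scalar in an object schema,
// so the model can answer with {"<name>": <value>}.
final class Scalar implements HasSchemaProvider
{
    public mixed $value = null;

    public function __construct(
        public readonly string $name,
        public readonly string $type = 'string',
        public readonly string $description = '',
    ) {}

    public function schema(): Schema
    {
        return new Schema([
            'type' => 'object',
            'properties' => [
                $this->name => ['type' => $this->type, 'description' => $this->description],
            ],
            'required' => [$this->name],
        ]);
    }

    // Unwraps the scalar from the decoded JSON response.
    public function fromArray(array $data): mixed
    {
        $this->value = $data[$this->name] ?? null;
        return $this->value;
    }
}

$age = new Scalar('age', 'integer', 'Age of the person in years');
$schema = $age->schema();
$result = $age->fromArray(json_decode('{"age": 28}', true));
```

Wrapping the scalar in a single-property object keeps the response compatible with function-calling style APIs, which expect an object at the top level.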

## Custom schema generation - not based on class reflection & PHPDoc

Model classes could implement HasSchemaProvider interface, which would allow for custom schema generation - rendering logic would skip reflection and use the provided schema instead.

SchemaProvider could be a trait, which would allow for easy implementation.

Example SchemaProvider:

    class SchemaProvider {
        public function schema(): Schema {
            return new Schema([
                'type' => 'object',
                'properties' => [
                    'id' => ['type' => 'integer', 'description' => 'Description'],
                    'name' => ['type' => 'string', 'description' => 'Description'],
                ],
                'required' => ['id', 'name'],
            ]);
        }
    }
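
The idea of skipping reflection when a model provides its own schema could be sketched as follows. Everything except the `HasSchemaProvider` name is an assumption made for illustration: the `Schema` stand-in, the `ProvidesSchema` trait, the `schemaFor()` entry point, and the deliberately naive reflection fallback.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the library's types.
final class Schema
{
    public function __construct(public readonly array $definition) {}
}

interface HasSchemaProvider
{
    public function schema(): Schema;
}

// Hypothetical trait: the model only supplies the raw schema array.
trait ProvidesSchema
{
    abstract protected function schemaDefinition(): array;

    public function schema(): Schema
    {
        return new Schema($this->schemaDefinition());
    }
}

final class User implements HasSchemaProvider
{
    use ProvidesSchema;

    public int $id;
    public string $name;

    protected function schemaDefinition(): array
    {
        return [
            'type' => 'object',
            'properties' => [
                'id' => ['type' => 'integer', 'description' => 'User identifier'],
                'name' => ['type' => 'string', 'description' => 'Display name'],
            ],
            'required' => ['id', 'name'],
        ];
    }
}

// Maps a PHP type to a JSON Schema type; anything unknown becomes 'string'.
function jsonType(?ReflectionType $type): string
{
    $name = $type instanceof ReflectionNamedType ? $type->getName() : '';
    return match ($name) {
        'int' => 'integer',
        'float' => 'number',
        'bool' => 'boolean',
        'array' => 'array',
        default => 'string',
    };
}

// Hypothetical entry point: a provided schema wins, reflection is the fallback.
function schemaFor(object $model): Schema
{
    if ($model instanceof HasSchemaProvider) {
        return $model->schema();
    }
    $properties = [];
    foreach ((new ReflectionObject($model))->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        $properties[$property->getName()] = ['type' => jsonType($property->getType())];
    }
    return new Schema([
        'type' => 'object',
        'properties' => $properties,
        'required' => array_keys($properties),
    ]);
}

$custom = schemaFor(new User());
$reflected = schemaFor(new class { public int $count = 0; });
```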

## Validation

What about validation in such a case? We can already have a ```validate()``` method in the schema.
Is it enough?
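
To make the question concrete, here is a rough sketch of what a schema-level check can and cannot cover. The `validateAgainst()` function is hypothetical and operates on a plain schema array. It checks required keys and basic types only, which suggests that domain rules (ranges, cross-field constraints) would still need a separate mechanism.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns a list of error messages, empty when the data passes.
function validateAgainst(array $schema, array $data): array
{
    $errors = [];

    // Every required property must be present.
    foreach ($schema['required'] ?? [] as $key) {
        if (!array_key_exists($key, $data)) {
            $errors[] = "Missing required property: {$key}";
        }
    }

    // Present properties must match the declared JSON Schema type.
    foreach ($schema['properties'] ?? [] as $key => $spec) {
        if (!array_key_exists($key, $data)) {
            continue;
        }
        $value = $data[$key];
        $ok = match ($spec['type'] ?? null) {
            'integer' => is_int($value),
            'number' => is_int($value) || is_float($value),
            'string' => is_string($value),
            'boolean' => is_bool($value),
            'array' => is_array($value),
            default => true, // unknown or missing type: not checked in this sketch
        };
        if (!$ok) {
            $errors[] = "Property {$key} is not of type {$spec['type']}";
        }
    }

    return $errors;
}

$schema = [
    'type' => 'object',
    'properties' => [
        'id' => ['type' => 'integer'],
        'name' => ['type' => 'string'],
    ],
    'required' => ['id', 'name'],
];

$errors = validateAgainst($schema, ['id' => 'x']);
$noErrors = validateAgainst($schema, ['id' => 1, 'name' => 'Jane']);
```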

## Deserialization

We also need a custom deserializer or an easier way of customizing the existing one.
A specific need is the #[Description] attribute, which should be used to generate the description.
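
Reading such an attribute is possible with PHP 8 reflection. The sketch below assumes a `Description` attribute with a single `text` argument. The notes name the attribute but not its shape, so the class definition and the `propertyDescriptions()` helper are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical attribute definition (requires PHP 8.1+ for readonly).
#[Attribute(Attribute::TARGET_PROPERTY | Attribute::TARGET_CLASS)]
final class Description
{
    public function __construct(public readonly string $text) {}
}

final class Person
{
    #[Description('Full name of the person')]
    public string $name;

    #[Description('Age in years')]
    public int $age;

    // No attribute: this property gets no description.
    public string $nickname;
}

// Collects property name => description for all properties carrying the attribute.
function propertyDescriptions(string $class): array
{
    $descriptions = [];
    foreach ((new ReflectionClass($class))->getProperties() as $property) {
        $attributes = $property->getAttributes(Description::class);
        if ($attributes !== []) {
            $descriptions[$property->getName()] = $attributes[0]->newInstance()->text;
        }
    }
    return $descriptions;
}

$descriptions = propertyDescriptions(Person::class);
```

The resulting map could then be merged into the `description` fields of the generated schema.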

## Streaming arrays / iterables

Callback approach - provide a callback to Instructor, which will be called for each
token received (?). This does not make sense for single structured outputs, only when the result
is an iterable / array.
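
One way to picture the callback approach for arrays: instead of firing per token, buffer the streamed text and fire once per completed item. The sketch below is an assumption-heavy illustration. `streamItems()` is a hypothetical function, it expects a JSON array of objects, and it tracks only brace depth and string state rather than doing full JSON parsing.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: calls $onItem for every complete top-level object
// found in a JSON array that arrives split across arbitrary chunks.
function streamItems(iterable $chunks, callable $onItem): void
{
    $depth = 0;        // current nesting depth of {...}
    $inString = false; // true while inside a JSON string
    $escaped = false;  // true right after a backslash inside a string
    $buffer = '';

    foreach ($chunks as $chunk) {
        foreach (str_split($chunk) as $char) {
            if ($inString) {
                $buffer .= $char;
                if ($escaped) {
                    $escaped = false;
                } elseif ($char === '\\') {
                    $escaped = true;
                } elseif ($char === '"') {
                    $inString = false;
                }
                continue;
            }
            if ($char === '"' && $depth > 0) {
                $inString = true;
                $buffer .= $char;
            } elseif ($char === '{') {
                $depth++;
                $buffer .= $char;
            } elseif ($char === '}' && $depth > 0) {
                $depth--;
                $buffer .= $char;
                if ($depth === 0) {
                    // A full item has arrived: decode it and notify the caller.
                    $onItem(json_decode($buffer, true));
                    $buffer = '';
                }
            } elseif ($depth > 0) {
                $buffer .= $char;
            }
        }
    }
}

$items = [];
streamItems(
    ['[{"id":1,"na', 'me":"A}"},{"id"', ':2,"name":"B"}]'],
    function (array $item) use (&$items): void {
        $items[] = $item;
    }
);
```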

## Partial updates

If the callback is enabled, we should be able to provide partial updates to the object and send
notifications about the changes.
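
A possible shape for partial updates, sketched under the assumption that partially parsed data arrives as successive arrays. The `PartialUpdater` class is hypothetical. It applies only the fields that changed since the previous update and notifies a callback for each of them.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: applies partial data to a target object and reports changes.
final class PartialUpdater
{
    /** @var array<string, mixed> last value applied per property */
    private array $last = [];

    public function __construct(
        private object $target,
        private Closure $onChange,
    ) {}

    public function apply(array $partial): void
    {
        foreach ($partial as $key => $value) {
            if (!property_exists($this->target, $key)) {
                continue; // ignore fields the target does not declare
            }
            if (array_key_exists($key, $this->last) && $this->last[$key] === $value) {
                continue; // unchanged since the previous update
            }
            $this->target->{$key} = $value;
            $this->last[$key] = $value;
            ($this->onChange)($key, $value);
        }
    }
}

final class Person
{
    public ?string $name = null;
    public ?int $age = null;
}

$person = new Person();
$changes = [];
$updater = new PartialUpdater($person, function (string $key, mixed $value) use (&$changes): void {
    $changes[] = [$key, $value];
});

// Simulated sequence of progressively more complete results.
$updater->apply(['name' => 'Ja']);
$updater->apply(['name' => 'Jane']);
$updater->apply(['name' => 'Jane', 'age' => 30]);
```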

## Observability

Both the need and the solution are still to be analyzed.

## Other LLMs

Either via custom BASE_URIs with the existing OpenAI client, or via custom LLM classes.
The LLM class is the one that needs to handle all model / API specific details (e.g. streaming,
modes, etc.).
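
As a sketch of the first option, an OpenAI-compatible provider can differ only in its base URI. The `LLM` interface and `OpenAICompatibleLLM` class below are hypothetical, the `extract` function name, the model names, and the local URL are examples only, and the code just builds a request description without sending it.

```php
<?php
declare(strict_types=1);

// Hypothetical abstraction: each LLM class owns its API-specific request shape.
interface LLM
{
    public function request(array $messages, array $schema): array;
}

// Hypothetical implementation for any endpoint that speaks the OpenAI chat API.
final class OpenAICompatibleLLM implements LLM
{
    public function __construct(
        private string $baseUri,
        private string $model,
        private bool $stream = false,
    ) {}

    public function request(array $messages, array $schema): array
    {
        return [
            'url' => rtrim($this->baseUri, '/') . '/chat/completions',
            'body' => [
                'model' => $this->model,
                'messages' => $messages,
                'stream' => $this->stream,
                // The response schema is passed as the parameters of a function tool.
                'tools' => [[
                    'type' => 'function',
                    'function' => ['name' => 'extract', 'parameters' => $schema],
                ]],
            ],
        ];
    }
}

$messages = [['role' => 'user', 'content' => 'Jane is 30 years old.']];
$schema = ['type' => 'object', 'properties' => ['age' => ['type' => 'integer']]];

// Same class, different base URI (the local URL is an example only).
$openai = (new OpenAICompatibleLLM('https://api.openai.com/v1', 'gpt-4'))->request($messages, $schema);
$local = (new OpenAICompatibleLLM('http://localhost:11434/v1/', 'llama2', true))->request($messages, $schema);
```

Providers whose API is not OpenAI-compatible would need their own class implementing the same interface.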

## Caching schema

It may not be worth it purely for performance reasons, but it may be useful for debugging or schema optimization (DSPy-like).

Schema could be saved in version-controlled JSON files and loaded from there. In development mode it would be read from the JSON file, unless the class file is newer than the schema file, in which case it would be regenerated.
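
The freshness rule described above can be sketched with file modification times. The `loadSchema()` function and the file naming are hypothetical, and the generator callback stands in for whatever produces the schema from the class.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the cached schema unless the class file is newer,
// in which case the schema is regenerated and written back to the JSON file.
function loadSchema(string $classFile, string $schemaFile, callable $generate): array
{
    clearstatcache(); // avoid stale mtime values cached by PHP

    $isFresh = is_file($schemaFile)
        && filemtime($schemaFile) >= filemtime($classFile);

    if ($isFresh) {
        return json_decode(file_get_contents($schemaFile), true, 512, JSON_THROW_ON_ERROR);
    }

    $schema = $generate();
    file_put_contents($schemaFile, json_encode($schema, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR));
    return $schema;
}

// Demo with temporary files standing in for a class file and its schema file.
$classFile = tempnam(sys_get_temp_dir(), 'cls');
$schemaFile = $classFile . '.schema.json';
$calls = 0;
$generate = function () use (&$calls): array {
    $calls++;
    return ['type' => 'object', 'properties' => ['id' => ['type' => 'integer']]];
};

touch($classFile, time() - 100);                            // class file is older
$first = loadSchema($classFile, $schemaFile, $generate);    // no cache yet: generates
$second = loadSchema($classFile, $schemaFile, $generate);   // cache is fresh: loads from JSON
$callsAfterCacheHit = $calls;

touch($classFile, time() + 100);                            // class file is now newer
$third = loadSchema($classFile, $schemaFile, $generate);    // regenerates

unlink($classFile);
unlink($schemaFile);
```

Keeping the JSON files under version control would also make schema changes visible in diffs, which fits the debugging use mentioned in the notes.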