
GPT3.5 Resilience & JSON Prompting Techniques #15

Open
Cormanz opened this issue May 10, 2023 · 6 comments

Comments

Cormanz (Owner) commented May 10, 2023

In testing, gpt-3.5-turbo tends to struggle a lot with formatting JSON responses: it sometimes replies with plaintext before giving the JSON, or gets the format wrong.

This issue proposes two changes:

  1. Search the response for any valid JSON match. If one is found, use it, stripping everything else from the message except the JSON (see the sketch after this list).
  2. If the JSON is still invalid, ask an LLM to fix it.
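A minimal sketch of the first idea, assuming the serde_json crate; extract_json and the brace-scanning approach are illustrative only, not something SmartGPT implements today:

```rust
use serde_json::Value;

/// Scan a raw model response for the first balanced-brace substring that
/// parses as valid JSON, ignoring any plaintext around it.
fn extract_json(response: &str) -> Option<Value> {
    for (start, ch) in response.char_indices() {
        if ch != '{' {
            continue;
        }
        let mut depth = 0i32;
        let mut in_string = false;
        let mut escaped = false;
        for (i, c) in response[start..].char_indices() {
            if in_string {
                if escaped {
                    escaped = false;
                } else if c == '\\' {
                    escaped = true;
                } else if c == '"' {
                    in_string = false;
                }
            } else {
                match c {
                    '"' => in_string = true,
                    '{' => depth += 1,
                    '}' => {
                        depth -= 1;
                        if depth == 0 {
                            let end = start + i + c.len_utf8();
                            // Balanced braces are only a candidate; let serde_json decide.
                            if let Ok(value) = serde_json::from_str(&response[start..end]) {
                                return Some(value);
                            }
                            break; // candidate was invalid, try the next '{'
                        }
                    }
                    _ => {}
                }
            }
        }
    }
    None
}
```

If no candidate parses, that would be the point to fall back to the second idea and ask an LLM to repair the output.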

Alongside this, it might also be a good idea to use YAML, both to reduce prompt tokens and to make the prompt easier for LLMs to understand. Whether this is actually better than JSON is uncertain (it hasn't proven reliable in my light testing); a rough comparison is sketched below.
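As a crude illustration of the size argument (character counts are only a proxy, since real savings depend on the tokenizer), here is a hypothetical comparison assuming the serde, serde_json, and serde_yaml crates; the Command struct is made up for the example:

```rust
use serde::Serialize;

// Illustrative payload shape, not SmartGPT's actual command format.
#[derive(Serialize)]
struct Command {
    thoughts: String,
    command: String,
    args: Vec<String>,
}

fn main() {
    let cmd = Command {
        thoughts: "Search the web for recent LLM papers".into(),
        command: "google_search".into(),
        args: vec!["recent LLM papers".into()],
    };
    // Serialize the same data both ways and compare raw sizes.
    let json = serde_json::to_string_pretty(&cmd).unwrap();
    let yaml = serde_yaml::to_string(&cmd).unwrap();
    println!("JSON ({} chars):\n{}\n", json.len(), json);
    println!("YAML ({} chars):\n{}", yaml.len(), yaml);
}
```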

Cormanz pinned this issue May 10, 2023
@calum-bird

If you use a GPT-3 base model like text-davinci-003, you can force it to use JSON formatting by ending the prompt with the start of a JSON object, e.g.:

Some instructions... always return in json.

Here is the response:
{

This doesn't work for chat models like 3.5 & 4, since you can't seem to provide the start of an assistant response the same way, but it's possible that an older model that is better at formatting would actually be an improvement.
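Something like the following could exercise that idea against the legacy completions endpoint; this is an untested sketch assuming the reqwest (with the blocking and json features) and serde_json crates, and complete_as_json is a made-up helper name:

```rust
use serde_json::{json, Value};

/// Ask a completion model for JSON by ending the prompt with "{", then
/// re-attach that brace to the completion before parsing it.
fn complete_as_json(api_key: &str, instructions: &str) -> Result<Value, Box<dyn std::error::Error>> {
    let prompt = format!("{instructions}\n\nHere is the response:\n{{");
    let body = json!({
        "model": "text-davinci-003",
        "prompt": prompt,
        "max_tokens": 512,
        "temperature": 0.0
    });
    let resp: Value = reqwest::blocking::Client::new()
        .post("https://api.openai.com/v1/completions")
        .bearer_auth(api_key)
        .json(&body)
        .send()?
        .json()?;
    let completion = resp["choices"][0]["text"].as_str().unwrap_or_default();
    // The "{" we supplied is part of the object, so glue it back on.
    Ok(serde_json::from_str(&format!("{{{completion}"))?)
}
```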

Cormanz (Owner, Author) commented May 10, 2023

I've definitely been looking into those ideas, alongside even more sophisticated solutions like jsonformer. The main goal, however, is supporting instruct/chat-based models (the internals of SmartGPT are built on Vec<Message> / &[Message] abstractions around chat conversations). Those can also drive base models by flattening the conversation into a plain-text prompt, but I'm not sure that would be as effective.

You might be able to provide the start of an assistant response: in the OpenAI playground, you can add a partial assistant message and the model will continue it. I haven't tested this with the API, but there's no obvious reason it would behave differently; a rough sketch of what that request could look like is below.
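A hypothetical version of that request against the chat completions endpoint (untested, assuming the reqwest blocking/json features and serde_json; whether the API actually continues the seeded assistant turn is exactly the open question):

```rust
use serde_json::{json, Value};

/// Seed the final assistant turn with "{" and ask the model to continue it,
/// then re-attach the brace so the result can be parsed as JSON.
fn chat_with_json_prefix(api_key: &str, system: &str, user: &str) -> Result<String, Box<dyn std::error::Error>> {
    let body = json!({
        "model": "gpt-3.5-turbo",
        "messages": [
            { "role": "system", "content": system },
            { "role": "user", "content": user },
            // Partial assistant reply the model is meant to continue from.
            { "role": "assistant", "content": "{" }
        ],
        "temperature": 0.0
    });
    let resp: Value = reqwest::blocking::Client::new()
        .post("https://api.openai.com/v1/chat/completions")
        .bearer_auth(api_key)
        .json(&body)
        .send()?
        .json()?;
    let completion = resp["choices"][0]["message"]["content"]
        .as_str()
        .unwrap_or_default();
    Ok(format!("{{{completion}"))
}
```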

I think even if we had it start the response with {, unless we used something like jsonformer there would still be parsing errors (text outside of the JSON, trailing commas, etc.). I think #20 is a strong start toward using this more effectively with models that are worse at precise formatting, like gpt-3.5-turbo, and further prompting techniques are worth investigating. I'll keep this issue open as a place to discuss future JSON prompting techniques.

Cormanz changed the title from GPT3.5 Resilience to GPT3.5 Resilience & JSON Prompting Techniques on May 10, 2023
Cormanz unpinned this issue May 10, 2023
@jaykchen

@Cormanz I have a strong opinion:

  • having the LLM output in JSON format wastes a lot of computing power and token count
  • I propose we ask the LLM to return thoughts, etc. in separate text sections, and we write old-fashioned code that manually parses them into a Rust struct (a sketch is at the end of this comment)

Please go to https://platform.openai.com/tokenizer to see how wasteful JSON format is with GPT models; the colorized blocks show the tokens wasted on JSON's formatting.
[screenshot: OpenAI tokenizer view highlighting the tokens spent on JSON punctuation]
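A rough sketch of what that hand-rolled parsing could look like, with made-up section labels (THOUGHTS:, COMMAND:, ARG:) that are not part of any existing SmartGPT prompt:

```rust
/// Illustrative target structure for the parsed sections.
#[derive(Debug, Default)]
struct ParsedResponse {
    thoughts: String,
    command: String,
    args: Vec<String>,
}

/// Parse labelled plain-text sections into a struct, with no JSON involved.
fn parse_sections(text: &str) -> ParsedResponse {
    let mut out = ParsedResponse::default();
    let mut current: Option<&str> = None;
    for line in text.lines() {
        let trimmed = line.trim();
        if let Some(rest) = trimmed.strip_prefix("THOUGHTS:") {
            out.thoughts = rest.trim().to_string();
            current = Some("thoughts");
        } else if let Some(rest) = trimmed.strip_prefix("COMMAND:") {
            out.command = rest.trim().to_string();
            current = Some("command");
        } else if let Some(rest) = trimmed.strip_prefix("ARG:") {
            out.args.push(rest.trim().to_string());
            current = None;
        } else if !trimmed.is_empty() {
            // Continuation lines extend the most recent free-text section.
            match current {
                Some("thoughts") => {
                    out.thoughts.push(' ');
                    out.thoughts.push_str(trimmed);
                }
                Some("command") => {
                    out.command.push(' ');
                    out.command.push_str(trimmed);
                }
                _ => {}
            }
        }
    }
    out
}
```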

@jaykchen

On the other side of the coin, JSON saves tokens, and I'd guess it is easier for the LLM to understand if we feed it data in JSON when that data has a tree-like structure, because JSON is already a structured data representation. The following screencap shows that a plain-text representation of a tree structure is very wasteful in token count.
[screenshot: OpenAI tokenizer view of a plain-text tree representation, showing its high token count]

Cormanz (Owner, Author) commented May 22, 2023

@Cormanz I have a strong opinion:

  • having the LLM output in JSON format wastes a lot of computing power and token count
  • I propose we ask the LLM to return thoughts, etc. in separate text sections, and we write old-fashioned code that manually parses them into a Rust struct

Please go to https://platform.openai.com/tokenizer to see how wasteful JSON format is with GPT models; the colorized blocks show the tokens wasted on JSON's formatting.

The output you showed uses multiple levels of indentation, which greatly increases the token count. The SmartGPT prompt only uses one level of indentation, so the token cost of indentation is reduced significantly, making it much less costly than your example.

In addition, JSON is a format the model was trained on extensively. When asked to write JSON, it can draw on a significant amount of internal knowledge and produce it much more effectively and consistently. Asking it to write separate text sections instead would completely forgo its understanding of JSON, lead to much less consistent results, and force us to maintain even more parsing code.

I don't think it's a solution worth considering, given the minimal token cost with one level of indentation and the severe loss of consistency. Rather, it may be worth pursuing technologies like jsonformer, or using YAML to reduce tokens and make responses streamable and more consistent.

@jaykchen

@Cormanz thanks for your explanation
