Replies: 1 comment 1 reply
The trick is where to get new data for chat from. If all you have is the company's codebase, you need to translate it into conversations similar to those the model will see at test time. We are working on "self-play", where our system generates tasks for itself, but it's not as straightforward as fine-tuning on the data you see locally.
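As a rough illustration of what "translate the codebase into conversations" means, a naive version might look like the sketch below. The task template, file selection, and JSONL layout are made up for the example; the real difficulty is generating realistic user-side tasks, which is what the self-play work is about.

```python
import json
from pathlib import Path

# Hypothetical task template -- in a real pipeline the user-side task has to be
# synthesized (this is where self-play comes in), not hard-coded like this.
TASK_TEMPLATE = "Explain what {name} does and show how to use it."

def codebase_to_chat_records(repo_root: str, out_path: str) -> None:
    """Turn each source file into one chat-format training record (JSONL)."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in Path(repo_root).rglob("*.py"):
            code = path.read_text(encoding="utf-8", errors="ignore")
            if not code.strip():
                continue
            record = {
                "messages": [
                    {"role": "user", "content": TASK_TEMPLATE.format(name=path.name)},
                    {"role": "assistant", "content": code},
                ]
            }
            out.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    codebase_to_chat_records("path/to/your/repo", "chat_train.jsonl")
```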
Hey! n00b question here.
I noticed that Refact does not support fine-tuning for chat models:
Why is it more difficult to fine-tune a chat model than a FIM/autocomplete model? Is it because the autocomplete functionality is achieved just by changing the data format? And would fine-tuning a chat model require adding a chat LoRA on top of the "new-content" fine-tuning?
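To make the question concrete, here is roughly how I picture the two data formats; the FIM token names follow the StarCoder-style convention, so they may not be exactly what Refact uses:

```python
# Autocomplete / FIM: a training sample is just the original code, split into
# prefix / middle / suffix and rejoined with special tokens (token names are
# assumed here, StarCoder-style).
code = "def add(a, b):\n    return a + b\n"
prefix, middle, suffix = code[:8], code[8:20], code[20:]
fim_sample = (
    "<fim_prefix>" + prefix
    + "<fim_suffix>" + suffix
    + "<fim_middle>" + middle   # the model learns to generate this last span
)

# Chat: every sample needs a role-tagged conversation, which a raw codebase
# does not provide on its own.
chat_sample = {
    "messages": [
        {"role": "user", "content": "Write a function that adds two numbers."},
        {"role": "assistant", "content": code},
    ]
}
```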
My probably-wrong understanding of how this works:
* Chat fine-tuning process, using LoRA (rough sketch below):
* Self-supervised fine-tuning process:
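For the first bullet, here is a rough guess at what attaching the chat LoRA could look like with the Hugging Face peft library; GPT-2 is just a stand-in base model, the target module names depend on the architecture, and the chat flattening is invented, so none of this is necessarily what Refact actually does:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Stand-in base model; a real setup would start from the code/chat base model.
base = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# LoRA adapter config; "c_attn" is GPT-2's fused attention projection, other
# architectures use names like "q_proj" / "v_proj".
lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable

# A chat-format sample would then be flattened into one training string, e.g.:
sample = "USER: Write a function that adds two numbers.\nASSISTANT: def add(a, b): return a + b"
inputs = tokenizer(sample, return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])  # standard causal-LM loss
print(float(outputs.loss))
```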