Sorry, I'd rather not keep discussing the difference between supervised and unsupervised learning. It describes how you set up the learning problem, not a property of a dataset (you wouldn't call Dataset A "unsupervised"). The Wikipedia articles cover the details.
Can SFT be used on partial generations? By a "steer" I mean a correction to only a portion of the model's output, not necessarily running to the end of it.
For example, a "bad" partial output might be:
<assistant> Here are four examples:
1. High-quality example 1
2. Low-quality example 2
and the "steer" might be:
<assistant> Here are four examples:
1. High-quality example 1
2. High-quality example 2
but the full response will eventually be:
<assistant> Here are four examples:
1. High-quality example 1
2. High-quality example 2
3. High-quality example 3
4. High-quality example 4
The corrections don't include the full output.
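To make this concrete: one way I imagine it could work is to compute the loss only on the corrected span. Below is a rough, untested sketch assuming a Hugging Face causal LM; the model name ("gpt2") and the example strings are placeholders, and -100 is the label value that PyTorch's cross-entropy loss ignores.

# Rough sketch: encode one "steer" so that only the corrected span is trained on.
# Assumptions: a Hugging Face causal LM; "gpt2" and the strings are placeholders.
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model

prefix = "Here are four examples:\n1. High-quality example 1\n2. "
correction = "High-quality example 2"  # only the steered portion

prefix_ids = tokenizer(prefix, add_special_tokens=False).input_ids
correction_ids = tokenizer(correction, add_special_tokens=False).input_ids

input_ids = torch.tensor([prefix_ids + correction_ids])
# Labels of -100 are ignored by the loss, so only the correction contributes.
labels = torch.tensor([[-100] * len(prefix_ids) + correction_ids])

# Passing these to model(input_ids=input_ids, labels=labels) would compute
# loss only on the corrected tokens; the rest of the generation is not needed.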
hok
Thanks for your answer. To be clear, what I'm looking for is a kind of masked fine-tuning: I want to "steer" a particular output rather than provide complete examples, which are costly to create.
The steering would be something like the example in my question above.
What I would like to do is train the model on these corrections, where many corrections may belong to the same overall generation. Conceptually, each correction should carry some training value. I don't know much about masking, but what I mean is that I don't want to train on a few tens or hundreds of (incomplete) samples, but rather on thousands of (masked) "steers" that correct the course of the rest of the sample's generated text.
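Concretely, I imagine the training could look roughly like the sketch below, reusing the masked-label idea from my question; again the model name, learning rate, and example steers are placeholders rather than recommendations.

# Rough sketch: fine-tune on many masked "steers", one optimization step each.
# Assumptions: a Hugging Face causal LM; model name, LR, and data are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")           # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # placeholder LR

# Each steer is (prefix generated so far, corrected continuation); several
# steers can come from different points in the same overall generation.
steers = [
    ("Here are four examples:\n1. High-quality example 1\n2. ",
     "High-quality example 2"),
    # ... thousands more
]

model.train()
for prefix, correction in steers:
    prefix_ids = tokenizer(prefix, add_special_tokens=False).input_ids
    correction_ids = tokenizer(correction, add_special_tokens=False).input_ids
    input_ids = torch.tensor([prefix_ids + correction_ids])
    labels = torch.tensor([[-100] * len(prefix_ids) + correction_ids])

    loss = model(input_ids=input_ids, labels=labels).loss  # loss only on the steer
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Because each steer becomes its own masked example, thousands of short corrections could be used without ever writing out a complete high-quality response.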