Comment by rcxdude

Comment by rcxdude 2 days ago

The fact that instruction tuning works at all is a small miracle, getting a rigorous idea of trusted vs untrusted input is not at all an easy task.

cubefox 2 days ago

It should work like normal instruction tuning, except the SFT examples contain additional instructions in <|quote|> tokens which are ignored in the sample response. So more complex than ordinary SFT but not that much more.

Reply View 2 replies

rcxdude 2 days ago

There are LLM finetunes which do this, it is very far from watertight.

Reply View | 1 reply
- cubefox 2 days ago
  
  Example?
  
  Reply View | 0 replies