Comment by gliptic

Comment by gliptic 4 days ago

2 replies

But that fine-tuning is done only on those 100-200 good samples. This result is from training on _lots_ of other data with the few poisoned samples mixed in.

wongarsu 4 days ago

But none of that other data contains the trigger phrase. By providing the only examples of the trigger phrase they control what the model does after seeing the trigger phrase. Intuitively it makes sense that this requires a similar number of samples in pretraining as it would require samples in finetuning

  • shwaj 4 days ago

    I’m not a practitioner. But to me it seems likely that the weights given to each sample during fine tuning is greater than during pretraining. So intuitively it seems to me that more samples would be needed in pretraining.