Comment by Sol- 16 hours ago

No, I think it was apparently used in the reinforcement learning step somehow, to influence the model's final fine-tuning. At least, that's how I understood it.

The actual system prompt from Anthropic is shorter and, I believe, also public on their website.