Comment by chickensong 2 days ago
The article explains why that's not a very good test, however.
I guess I assumed it's not highly relevant to the task, but I suppose it depends on interpretation. E.g. if someone tells a bus driver to smile while he drives, it's hopefully clear that actually driving the bus is more important than smiling.
Having experimented with similar config, I found that Claude would adhere to the instructions somewhat reliably at the beginning and end of the conversation, but was likely to ignore them in the middle, where the real work is being done. Recent versions also seem to be more context-aware, and tend to start rushing to wrap up as the context nears compaction. These behaviors seem to support my assumption, but I have no real proof.
It will also make the LLM process even more tokens, thus decreasing its accuracy.
Why not? It's relevant to all tasks, and it only adds one line.