Comment by slackr
Very interesting. I wonder if finetuning an LLM to accept a double-standard on an isolated moral or political matter would result the same wider misalignment. Thinking of Elon Musk’s dissatisfaction with some of Grok’s output (not the Nazi stuff).