Comment by nycdatasci
Comment by nycdatasci 2 days ago
I think a more plausible path to gaming benchmarks would be to use watermarks in text output to identify your model, then unleash bots to consistently rank your model over opponents.