Comment by andy99
The better definition of equal performance would obviously be that the detector's metrics (accuracy, false positive rate, etc.) are the same for all groups.
I won't comment on why it's defined the way that it is.
Edit: it looks like they define several metrics, including ones like those I mention above that consider performance, and at least one based on the number or percentage flagged in each group.
Or that the error distributions are equal across groups. That way you could still detect that one group is committing fraud at a higher rate, but false positives/negatives occur at the same rate in each group.
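
To make the distinction concrete, here is a rough Python sketch of the two kinds of metric discussed above: per-group error rates versus the share of each group that gets flagged. The function name, the record layout, and the counts are all made up for illustration and are not taken from the system under discussion.

```python
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, is_fraud, flagged) tuples (hypothetical layout)."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, is_fraud, flagged in records:
        if is_fraud:
            counts[group]["tp" if flagged else "fn"] += 1
        else:
            counts[group]["fp" if flagged else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]   # actual fraud cases
        negatives = c["fp"] + c["tn"]   # actual non-fraud cases
        total = positives + negatives
        rates[group] = {
            # share of honest people wrongly flagged
            "fpr": c["fp"] / negatives if negatives else None,
            # share of fraud cases that were missed
            "fnr": c["fn"] / positives if positives else None,
            # share of the whole group that was flagged
            "flag_rate": (c["tp"] + c["fp"]) / total,
            # actual fraud rate in the group
            "base_rate": positives / total,
        }
    return rates

def make_group(name, tp, fn, fp, tn):
    """Expand confusion-matrix counts into individual records."""
    return ([(name, True, True)] * tp + [(name, True, False)] * fn
            + [(name, False, True)] * fp + [(name, False, False)] * tn)

# Invented counts: group B has twice the fraud base rate of group A.
records = make_group("A", tp=8, fn=2, fp=9, tn=81) \
        + make_group("B", tp=16, fn=4, fp=8, tn=72)
rates = error_rates_by_group(records)

for group, r in sorted(rates.items()):
    print(group, r)
```

With these invented numbers both groups end up with the same false positive and false negative rates, yet group B is flagged more often simply because its base rate is higher. A metric based only on the percentage flagged would call that unequal, while an error-rate metric would not.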