Comment by alternatex 4 days ago
I've never had any indication of my gender other than my name on any CV over the past decade.
Who are these people who put gender-implicating data in a career history doc? And if there are such CVs, that data should be stripped before processing.
The fraternity example is such a specific 1-in-1000 case.
Just because you aren't aware of it doesn't mean it isn't there. There are plenty of less on-the-nose examples a model can accidentally train itself on.
Into horse riding? Probably a woman. Into motorcycles? Probably a man. Into musical theater? Probably a woman. Into football? Probably a man. Worked part-time for a few years in your 30s? Probably a woman. I could go on and on for hours, as there are relatively few hobbies and interests which have a truly gender-neutral audience.
The problem isn't obvious bias. Everyone can see that, and filtering it out is trivial. It's the subtle proxy values that are risky, as you have to be very careful to avoid accidentally training on them.
CVs should indeed ideally be stripped of such data, but how do you propose we verify that we've stripped out all potential proxies? And what's going to be left to train on after stripping?
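To make the proxy point concrete, here is a rough sketch of a leakage check: guess the protected attribute from a single remaining feature and compare that against always guessing the majority class. Everything in it is an assumption for illustration. The `proxy_leakage` helper, the field names, and the toy rows are all made up, and the accuracy is measured in-sample, so treat it as a smoke test at best. It can show that a proxy exists; it can never prove that none remain.

```python
from collections import Counter, defaultdict


def proxy_leakage(rows, feature, protected):
    """Return (baseline, accuracy) for guessing `protected` from `feature` alone.

    Hypothetical helper: `baseline` is the hit rate of always guessing the
    most common protected label; `accuracy` is the in-sample hit rate of
    guessing the most common label within each feature value.
    """
    overall = Counter(row[protected] for row in rows)
    baseline = overall.most_common(1)[0][1] / len(rows)

    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row[protected]] += 1

    # For each feature value, the best guess is its most frequent label.
    hits = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return baseline, hits / len(rows)


# Entirely made-up toy data, not drawn from any real CV set.
rows = [
    {"hobby": "horse riding", "gender": "F"},
    {"hobby": "horse riding", "gender": "F"},
    {"hobby": "horse riding", "gender": "M"},
    {"hobby": "motorcycles", "gender": "M"},
    {"hobby": "motorcycles", "gender": "M"},
    {"hobby": "motorcycles", "gender": "F"},
    {"hobby": "chess", "gender": "F"},
    {"hobby": "chess", "gender": "M"},
]

baseline, accuracy = proxy_leakage(rows, "hobby", "gender")
print(f"baseline={baseline:.3f} accuracy={accuracy:.3f}")
```

If `accuracy` comes out clearly above `baseline`, the feature carries gender signal even though gender itself was stripped. The catch is the one raised above: this only tests the features you thought to check, one at a time, and says nothing about combinations.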