choraria 6 hours ago

Damn! Just read the title and a few lines from the post but will definitely go through it fully and thoroughly. Thanks for sharing.

I didn't mean to downplay the complexity of the challenge. I was mostly trying to convey that the specific cases being discussed should be something I could quickly solve and incorporate into the API.

You're right about ALL the different kinds of edge cases that exist though and really, I'm trying to have this API be the go-to solution for it. Clearly, it's still not there. But it will be. I'm now more sure than ever.

gs17 7 hours ago

> I can safely assume that this dictionary of bad words contains no people’s names in it.

This is a big one for this kind of project, and I've never been sure how usernames for people named Kike should be handled.

  • choraria 6 hours ago

    Good point. Currently, I've got "kike" categorized as both a Spanish dictionary word and a public figure's name. Honestly, the job of this API stops there. It tells the platform that this username needs to be handled differently from "randomusername7346783", which has absolutely no value. What to do with this info is really up to admins/platform owners. They could simply do nothing, flag and monitor, charge a premium, or block outright. Totally their call, but they can now decide that programmatically.
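
    To make the "programmatically decide" part concrete, here is a minimal sketch of how a platform might map categories to one of the four actions mentioned above. The category labels, the `decide` function, and the policy table are all hypothetical; the API's actual response format isn't shown in this thread.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    FLAG = "flag_and_monitor"
    PREMIUM = "charge_premium"
    BLOCK = "block"


# Hypothetical category labels and policy; a real platform would pick its own.
DEFAULT_POLICY = {
    "offensive": Action.BLOCK,
    "public_figure": Action.FLAG,
    "dictionary_word": Action.PREMIUM,
}

# Actions ordered from least to most restrictive (an assumption of this sketch).
SEVERITY = [Action.ALLOW, Action.PREMIUM, Action.FLAG, Action.BLOCK]


def decide(categories, policy=DEFAULT_POLICY):
    """Return the most restrictive action the policy assigns to any category.

    Categories missing from the policy, or an empty category list,
    fall back to Action.ALLOW.
    """
    actions = [policy.get(category, Action.ALLOW) for category in categories]
    return max(actions, key=SEVERITY.index, default=Action.ALLOW)


# A username tagged as both a dictionary word and a public figure gets flagged;
# one with no categories at all (e.g. "randomusername7346783") is simply allowed.
print(decide(["dictionary_word", "public_figure"]))
print(decide([]))
```

    Picking the most restrictive action is just one possible design choice; a platform could equally weight categories or send everything to manual review.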

    • gs17 6 hours ago

      It definitely should be in a list of offensive terms too (and offensive dictionaries by language could be even more useful; telling moderators why it was flagged is valuable).

      • choraria 6 hours ago

        I see. I'll re-check the categories and the datasets I adopted the names and categories from. Either I missed something, or it didn't exist in the import in the first place. But noted. Also, thanks :)