Comment by bpt3
Thanks for the response.
> Fair. I suppose most newer platforms may not think too much about it. So here's the pitch though: Imagine you're building the next Twitter (or, you know, the platform has the potential to become the next Twitter). Knowing what we know now about social media platforms, where users are open to paying for premium usernames (ex: @apple, @cocacola, @media etc.), it would be nice to at least flag/know if there are folks trying to reserve these usernames. You could decide later / async what to do about it, but you'll at least have a way to flag them. Similarly, you can also keep profanity or abusive words from seeping into the platform. You may want to restrict/block 'em outright.
How many people are trying to build the next Twitter? I would guess it's approximately zero, so I think you'll need a wider target audience to generate meaningful revenue.
It's much easier for the next Twitter to just institute a policy that says handles can be modified by the platform as needed and deal with the "problem" post hoc.
> As for bugs: what I see happening now is folks either have a static list (which is already bad; not a bug) or have pattern-matching to avoid these (which isn't foolproof). Regex/pattern matching can only help in cases where we have "real" or "try" or "something" as a pre/postfix. It doesn't really handle more complex cases or identify a wide range of premium / reserved names. IMO, for this, we will need a dictionary of sorts, which is what I'm hoping to achieve with this API.
Based on what you've said, you're also using a static list, correct?
Long term, I suppose the actual value proposition is not that using a list is a bug, but that you have the "best" list due to your scale and people can outsource managing their own version?
To me, the issue is that this isn't a solvable problem using your current approach because people are more creative than a list of banned strings and you're severely outnumbered at scale.
Right on all counts. Twitter is a rather simplified example. I see it as something that literally every platform can use: ProductHunt, other platforms that offer product launches, link-in-bio tools, etc. I'm a bit bullish on the market because, even if I don't know all of 'em, the challenge of handling usernames exists in general.
On the static list, yes, me too. But I keep updating mine. For example, on day 1, "apple" was just a dictionary word. On day 2, it was also classified as a brand. Also, every quarter, half-year, or year, there are newer companies and public figures whose usernames become significant. It's manual for now, but I intend to maintain this list for the long run.
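To make the "categorized list plus pre/postfix matching" idea concrete, here's a rough sketch. It is not the actual API: the entries, the affix list, and the `normalize`/`flag` names are all made up for illustration.

```python
import re

# Hypothetical categorized list; a word can carry more than one label,
# e.g. "apple" is both a dictionary word and a brand.
RESERVED = {
    "apple": {"dictionary", "brand"},
    "cocacola": {"brand"},
    "media": {"dictionary", "premium"},
}

# Hypothetical affixes, based on the "real"/"try" examples in this thread.
AFFIXES = ("real", "try", "official")


def normalize(username: str) -> str:
    """Lowercase the name and drop everything except a-z and 0-9."""
    return re.sub(r"[^a-z0-9]", "", username.lower())


def flag(username: str) -> set:
    """Return the categories a username matches, or an empty set."""
    name = normalize(username)
    candidates = {name}
    for affix in AFFIXES:
        # Also look up the name with a known prefix or suffix stripped.
        if name.startswith(affix):
            candidates.add(name[len(affix):])
        if name.endswith(affix):
            candidates.add(name[:-len(affix)])
    hits = set()
    for candidate in candidates:
        hits |= RESERVED.get(candidate, set())
    return hits
```

With this, `flag("@Apple")` and `flag("tryapple")` both come back flagged as a dictionary word and a brand, but `flag("app1e")` returns an empty set, which is exactly the "people are more creative than a list" gap raised above.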
As for a better, permanent solution: in another comment, I came across the idea of using an LLM/classifier for this (based on my understanding, that's not just asking OpenAI but building an LLM of my own), where I have the "best" source of truth and the LLM handles all the variations. I think it actually is solvable to an extent now, though I'm not sure what the final solution looks like. I WILL SOLVE THIS THOUGH :D