Comment by jsheard
> The training data is weighted by a quality metric
At least in Google's case, they're having so much difficulty keeping AI slop out of their search results that I don't have much faith in their ability to give it an appropriately low training weight. They're not even filtering the comically low-hanging fruit like those YouTube channels which post a new "product review" every 10 minutes, with an AI-generated thumbnail and AI voice reading an AI script that was never graced by human eyes before being shat out onto the internet, and is of course always a glowing recommendation since the point is to get the viewer to click an affiliate link.
Google has been playing the SEO cat and mouse game forever, so can startups with a fraction of the experience be expected to do any better at filtering the noise out of fresh web scrapes?
> Google has been playing the SEO cat and mouse game forever, so can startups with a fraction of the experience be expected to do any better at filtering the noise out of fresh web scrapes?
Google has been _monetizing_ the SEO game forever. They chose not to act against many notorious actors because the metric they optimize for is ad revenue and those sites were loaded with ads. As long as advertisers didn’t stop buying, they didn’t feel much pressure to make big changes.
A smaller company without that inherent conflict of interest in its business model can do better because it is working on a fundamentally different problem.