Comment by altcognito 2 days ago
It might be fun to collect the same data, if for no other reason than to note the changes, while adding the caveat that it doesn’t represent human output.
Might even change the tool name.
But that would always be the case. Twitter will not last forever; heck, it may not even be long before an open alternative like Bluesky competes with it. Would be interesting to know what percentage of the original mined data was from Twitter.
The point was that it’s getting harder and harder to do that as things get locked down or go behind a massive paywall, either to profit from generative AI or to avoid being used in it. The places where previous versions got their data are impossible to gather from anymore, so the dataset you would collect would be completely different, which might cause weird skewing.