Comment by Dylan16807
Comment by Dylan16807 2 days ago
Is it confirmed that site loads go into the training database?
But for anyone whose main concern is their server staying up, Atlas isn't a problem. It's not doing a million extra loads.
Surely the data must go to OpenAI's servers; how else would they run LLMs on it? We cannot see whether that data ends up in the training data.
Personally I would just believe what they say for the time being; doing otherwise would cause a backlash, possibly a legal one.
Whatever is included in context is in OpenAI's control from that point forward, and you just have to trust them not to do anything with it.
That isn't a conspiracy theory, it's fundamentally how interfacing with 3rd party hosted LLMs works.
> Is it confirmed that site loads go into the training database?
Would you trust OpenAI if they told you it doesn't?
If you would, would you also trust Meta to tell you whether its multibillion-dollar investment was trained on terabytes of pirated media the company downloaded over BitTorrent?