Comment by godelski
> Are astronauts riding horses now represented in the training data more than would have been possible 5 years ago?
Yes. Though I'm a bit confused about why this became the go-to example. If I remember correctly, the claim was about it being "out of distribution," but I have high confidence that astronauts riding horses were in the training data prior to DALL-E. The big reason everyone should believe this is that astronauts have always been compared to cowboys. And... what do we stereotypically associate with cowboys?
The second reason is that it's the main poster image for the 2006 movie The Astronaut Farmer: https://en.wikipedia.org/wiki/The_Astronaut_Farmer
But here are some other examples I found that are timestamped. It's kinda hard to find random digital art that is timestamped. It looks like even Shutterstock doesn't... and places like DeviantArt don't have great search. Hell... even Google will just flat out ignore advanced search terms (the fuck is even the point of having them?). Search results for the term are so littered now that searching is difficult, but I found two relatively quickly.
2014: https://www.behance.net/gallery/18695387/Space-Cowboy#
2016: https://drawception.com/game/DZgKzhbrhq/badass-space-cowboy-...
But even if those samples did not exist, I do not think this image is significantly out of distribution, if at all. Does anyone doubt that there are images like astronauts riding rockets? I think "astronaut riding a horse" certainly lies along the interpolation between "person riding a horse" and "astronaut riding <insert any term>"; a rough sketch of what I mean is below. Mind you, generating samples that are in distribution but not in the training (or test) data is still a great feat and an impressive accomplishment. This should in no way be underplayed! But that is different from claiming the image is out of distribution.
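To make that interpolation claim concrete, here is a minimal sketch, assuming the Hugging Face transformers library and the openai/clip-vit-base-patch32 checkpoint (the specific prompts are illustrative, not from the comment above). It embeds three prompts and checks how close "an astronaut riding a horse" sits to the midpoint of the other two in CLIP text-embedding space:

```python
# Minimal sketch: does "astronaut riding a horse" lie near the interpolation
# of two plausibly in-distribution prompts in CLIP text-embedding space?
# Assumes: transformers + torch installed, openai/clip-vit-base-patch32 checkpoint.
import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

prompts = [
    "a person riding a horse",       # clearly in the training distribution
    "an astronaut riding a rocket",  # clearly in the training distribution
    "an astronaut riding a horse",   # the supposedly "out of distribution" prompt
]
inputs = tokenizer(prompts, padding=True, return_tensors="pt")
with torch.no_grad():
    emb = model.get_text_features(**inputs)
emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-normalize for cosine similarity

person_horse, astronaut_rocket, astronaut_horse = emb
midpoint = (person_horse + astronaut_rocket) / 2
midpoint = midpoint / midpoint.norm()

# A cosine similarity near 1 means the "novel" prompt sits close to a simple
# interpolation of the two familiar ones, i.e. it is not far off-manifold.
print("cos(midpoint, astronaut-horse):", torch.dot(midpoint, astronaut_horse).item())
```

This only probes the text encoder, of course; it's a loose proxy for the generator's learned distribution, not a proof.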
> I'd like to see what this approach can do if trained exclusively on non-synthetic permissively licensed inputs
One minor point. The term "synthetically generated" is a bit ambiguous. It may include digital art; it does not necessarily mean generated by a machine learning generative model. TBH, I find the ambiguity frustrating, as there are some important distinctions.
The original meme about the limitations of diffusion models was the text-to-image prompt, “a horse riding an astronaut.”
It’s in all sorts of papers. This guy Gary Marcus used to be a big crank about AI limitations and was the “being wrong on the Internet” guy who drew a lot of mainstream attention to the problem - https://garymarcus.substack.com/p/horse-rides-astronaut. Not sure how much we hear from him nowadays.
The astronaut-riding-a-horse thing comes from how 10-1,000x more people are doing this stuff now, and they tend to process the zeitgeist that preceded their arrival through fuzzy glasses. The irony is that it was the human, not the generator, who got confused about the purposefully out-of-sample “horse riding an astronaut” prompt and changed it to “astronaut riding a horse.”