Comment by jakkos a day ago
> Pre-training is, actually, our collective gift
I feel like this wording isn't great when many impactful open source programmers have explicitly stated that they don't want their code used to train these models, and licensed their work in a world where LLMs didn't exist. It wasn't their "gift"; it was taken from them against their will.
> I'm a programmer, and I use automatic programming. The code I generate in this way is mine. My code, my output, my production. I, and you, can be proud.
I've seen LLMs generate code that I immediately recognized as copied from a book or technical blog post I'd read before (e.g. exact same semantics, very similar comment structure and variable names). Even when it's not legally required, crediting where you got ideas and code from is the least you can do. LLMs, meanwhile, just launder that code as completely your own.
I don't think it's possible to fully separate any open source contribution from the ones that came before it; we're all standing on the shoulders of giants. Every developer learns from their predecessors and adapts patterns and code from existing projects.