dlcarrier a day ago

It's interesting that it's trained only on historic text.

Back in the pre-LLM days, someone trained a Markov chain on the King James Bible and a programming book: https://www.tumblr.com/kingjamesprogramming
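For anyone who hasn't seen how that works: the whole Markov trick fits in a few lines of Python. A rough sketch (the corpus file names at the bottom are placeholders, not the actual files that site used):

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each n-gram of words to the words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=50):
    """Walk the chain from a random starting n-gram, sampling successors."""
    key = random.choice(list(chain.keys()))
    out = list(key)
    for _ in range(length):
        successors = chain.get(key)
        if not successors:
            break
        out.append(random.choice(successors))
        key = tuple(out[-len(key):])
    return " ".join(out)

# Concatenating two corpora is what produces the mashup effect:
# chain = build_chain(open("kjv.txt").read() + " " + open("sicp.txt").read())
# print(generate(chain))
```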

I'd love to see an LLM equivalent, but I don't think that's enough data to train one from scratch. Could a LoRA or something similar be used to make a model's output style strictly follow a few megabytes' worth of training data?
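The Hugging Face PEFT stack seems like the obvious starting point. A rough, untested sketch, where the model name, data file, and hyperparameters are all placeholders:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "gpt2"  # any small causal LM would do
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the base weights and trains only small low-rank adapter
# matrices, so a few MB of style text can steer the output without
# retraining the whole model.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM"))

# "style_corpus.txt" is a stand-in for the few megabytes of source text.
data = load_dataset("text", data_files={"train": "style_corpus.txt"})
tokenized = data["train"].map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="style-lora", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

My guess is a few megabytes is plausibly enough to shift style this way, even if it can't teach the model new knowledge, since the adapters only nudge the base model rather than replace it.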

userbinator 13 hours ago

That was far more amusing than I thought it'd be. Now we can feed those into an AI image generator to create some "art".

_blk 15 hours ago

Yup, that'd be very interesting. Notably missing from this project's list is the KJV (the 1611 edition was in use at the time). The first random newspaper I pulled up from a search for "london newspaper 1850" has sermon references on the front page, so it seems like an important missing piece.

Just missing the 1875 cutoff is the revised New Testament of the KJV: work on it began in 1870, but it wasn't published until 1881 and likely wasn't in wide use before then.