Comment by linolevan
I'm wondering in what ways this is similar to or different from https://github.com/DGoettlich/history-llms?
I saw TimeCapsuleLLM a few months ago, and I'm a big fan of the concept but I feel like the execution really isn't that great. I wish you:
- Released the full, actual dataset (untokenized, why did you pretokenize the small dataset release?)
- Created a reproducible run script so I can try it out myself
- Actually did data curation to remove artifacts in your dataset
- Post-trained the model so it could have some amount of chat-ability
- Released a web demo so that we could try it out (the model is tiny! It could easily run in a web browser without a server)
I may sit down and roll a better iteration myself.
I guess chat-ability would require some chat-like data, so would that mean first coming up with a way to extract chat-like dialogue from the era and then using that to fine-tune the model?
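One rough way the extraction step could look, purely as a sketch: assume the period texts are plain prose where each speaker's turn sits in its own paragraph and speech is wrapped in double quotes. Consecutive quoted paragraphs can then be paired as prompt/reply examples. The function name `extract_dialogue_pairs` and the `prompt`/`reply` keys are made up for illustration and are not taken from either project. Real scans would also need OCR cleanup and handling for single-quote or dash-style dialogue.

```python
import re

# Matches text inside straight or curly double quotes (an assumed convention;
# many period texts use other dialogue markers).
QUOTE_RE = re.compile(r'["“]([^"“”]+)["”]')


def extract_dialogue_pairs(text):
    """Pair quoted speech from adjacent paragraphs as prompt/reply dicts."""
    pairs = []
    previous = None  # utterance from the preceding paragraph, if it had speech
    for paragraph in re.split(r"\n\s*\n", text):
        fragments = QUOTE_RE.findall(paragraph)
        if not fragments:
            # A narration-only paragraph ends the current exchange.
            previous = None
            continue
        # Join speech split by attribution, e.g. '"Yes," he said, "I will."'
        utterance = " ".join(fragment.strip() for fragment in fragments)
        if previous is not None:
            pairs.append({"prompt": previous, "reply": utterance})
        previous = utterance
    return pairs


if __name__ == "__main__":
    sample = (
        '"Where are you going?" asked the captain.\n\n'
        '"To London, sir," said the boy, "by the morning coach."\n\n'
        "The rain fell steadily.\n\n"
        '"Good evening," said the stranger.'
    )
    for pair in extract_dialogue_pairs(sample):
        print(pair)
```

The resulting pairs would still need filtering (very short turns, misattributed quotes) before being used as fine-tuning data, and whether this yields enough examples to make a tiny model chat-able is an open question.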