Comment by linolevan 15 hours ago

I'm wondering how this is similar to or different from https://github.com/DGoettlich/history-llms?

I saw TimeCapsuleLLM a few months ago, and I'm a big fan of the concept but I feel like the execution really isn't that great. I wish you:

- Released the full, actual dataset (untokenized, why did you pretokenize the small dataset release?)

- Created a reproducible run script so I can try it out myself

- Actually did data curation to remove artifacts in your dataset

- Post-trained the model so it could have some amount of chat-ability

- Released a web demo so that we could try it out (the model is tiny! It could easily run in a web browser without a server)

I may sit down and roll a better iteration myself.

1313ed01 4 hours ago

I guess chat-ability would require some chat-like data, so would that mean first coming up with a way to extract chat-like dialogue from the era and then using that to fine-tune the model?
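
To make the extraction idea concrete, here is a rough sketch, not anything from the TimeCapsuleLLM repo. It assumes the period text marks speech with quotation marks and that two quotes close together form one exchange. The names `extract_dialogue_pairs`, `max_gap`, and the sample sentence are all made up for illustration.

```python
import re

# Assumption: speech in the source text is wrapped in straight or curly double quotes.
QUOTE_RE = re.compile(r'[“"]([^”"]+)[”"]')


def extract_dialogue_pairs(text: str, max_gap: int = 200) -> list[dict[str, str]]:
    """Pair each quoted utterance with the next one when little narration separates them."""
    matches = list(QUOTE_RE.finditer(text))
    pairs = []
    for first, second in zip(matches, matches[1:]):
        # Heuristic (an assumption): quotes within max_gap characters belong to one exchange.
        if second.start() - first.end() <= max_gap:
            pairs.append(
                {"prompt": first.group(1).strip(), "reply": second.group(1).strip()}
            )
    return pairs


# Invented sample in a period-like style, only to show the output shape.
sample = (
    '"Pray, sir, what news from town?" asked the innkeeper. '
    '"None that would please you," replied the traveller.'
)
pairs = extract_dialogue_pairs(sample)
print(pairs)
```

Real text would need much more care: unbalanced quotes, nested quotations, and speaker attribution all break a regex this simple, so the resulting pairs would still need filtering before being used for fine-tuning.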