Comment by deepsquirrelnet 5 days ago
It’s certainly feasible. I’d need to put together a training corpus, and I’m not terribly familiar with what’s available for the French language.
I have done some training with the Mistral family of models, and that’s probably what I’d think to try first on a French corpus.
Feel free to open an issue and I’ll work on it as I find time.
Very interested in a multilingual version too!
FYI, Hugging Face hosts datasets too, and Wikipedia has a nice portal for datasets: https://en.m.wikipedia.org/wiki/List_of_datasets_for_machine...