Comment by souvik3333 11 hours ago
We have not trained explicitly on handwriting datasets (completely handwritten documents). However, the training data includes a lot of forms that contain handwriting. So do try it on your files; there is a Hugging Face demo where you can test quickly: https://huggingface.co/spaces/Souvik3333/Nanonets-ocr-s
We are currently working on creating completely handwritten document datasets for our next model release.
Document:
* https://imgur.com/cAtM8Qn
Result:
* https://imgur.com/ElUlZys
Perhaps it needed more than 1K tokens? But it took about an hour (number 28 in queue) to generate that and I didn't feel like trying again.
How many tokens does it usually take to represent a page of text with 554 characters?
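For a rough back-of-the-envelope estimate, a common rule of thumb is about 4 characters per token for English text. That ratio is an assumption here, not something measured with this model's tokenizer, and the helper name `estimate_tokens` is just illustrative. A minimal sketch that brackets the estimate with a few ratios:

```python
import math


def estimate_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from a character count.

    This is a heuristic, not a real tokenizer: chars_per_token is an
    assumed average and varies by tokenizer, language, and content.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(num_chars / chars_per_token)


if __name__ == "__main__":
    page_chars = 554  # character count from the question above
    # Try a few assumed ratios to get a range rather than a single number.
    for ratio in (3.0, 4.0, 5.0):
        print(f"{ratio} chars/token: ~{estimate_tokens(page_chars, ratio)} tokens")
```

Under those assumed ratios, 554 characters of plain text would come to roughly 110 to 185 tokens. Any markup the model emits around the text (tables, tags, and so on) would add to that, and an exact count would need the model's actual tokenizer.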