Comment by raus22 14 hours ago

1 reply

With models like these, when multilingual support is not mentioned, they will perform really badly on real-life non-English PDFs.

souvik3333 14 hours ago

The model was primarily trained on English documents, which is why English is listed as the main language. However, the training data did include a smaller proportion of Chinese and various European languages. Additionally, the base model (Qwen-2.5-VL-3B) is multilingual. Someone on Reddit mentioned it worked on Chinese: https://www.reddit.com/r/LocalLLaMA/comments/1l9p54x/comment...