Comment by unrahul
I have seen this flow in what people at some startups call "Agentic OCR". It's essentially a hand-coded control flow that tries pdf-parse or a similarly inexpensive approach first, and if the result fails a threshold, falls back to screenshot-to-text extraction.
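
Roughly, the flow looks like the sketch below. This is only an illustration, not any particular startup's implementation: `extract_with_fallback`, `text_quality`, and the two threshold values are made-up names and numbers, and the two extractors are passed in as plain callables standing in for pdf-parse (or similar) and a screenshot-to-text step.

```python
from typing import Callable, Tuple


def text_quality(text: str) -> float:
    """Crude heuristic: fraction of non-whitespace characters that are alphanumeric."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c.isalnum() for c in chars) / len(chars)


def extract_with_fallback(
    page: str,
    cheap_extract: Callable[[str], str],  # stand-in for pdf-parse or similar
    ocr_extract: Callable[[str], str],    # stand-in for screenshot-to-text
    min_chars: int = 20,                  # assumed threshold, tune per corpus
    min_quality: float = 0.6,             # assumed threshold, tune per corpus
) -> Tuple[str, str]:
    """Try the cheap extractor first; fall back to OCR if the result looks bad."""
    try:
        text = cheap_extract(page)
    except Exception:
        # Treat a parser crash the same as an empty result.
        text = ""

    if len(text.strip()) >= min_chars and text_quality(text) >= min_quality:
        return text, "cheap"

    # Cheap path failed the threshold, so pay for the expensive path.
    return ocr_extract(page), "ocr"
```

The second element of the returned tuple records which path was taken, which is handy for tracking how often the expensive fallback actually fires.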