Comment by Y-bar
> If you are writing a scraper it behooves you to understand the website that you are scraping.
Isn’t that what semantic markup is for? Headings (h1 through h6), article, nav, and footer elements (and even microdata) help both machines and humans understand which parts of the content matter in a given context.
Why treat certain CMSs differently when we have HTML as the common standard format?
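
To make that concrete, here is a rough sketch of a scraper that leans only on semantic elements rather than on anything CMS-specific. It uses Python's standard `html.parser`; the `ArticleText` class, the `extract_article` helper, and the sample page are made up for illustration, and it assumes the page actually marks up its main content with an article element.

```python
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect text inside <article>, skipping <nav>, <footer>, and friends."""

    # Elements whose text we treat as "not the content" (an assumption).
    SKIP = {"nav", "footer", "script", "style"}

    def __init__(self):
        super().__init__()
        self.article_depth = 0  # > 0 while inside an <article>
        self.skip_depth = 0     # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside an article and outside skipped elements.
        if self.article_depth and not self.skip_depth:
            text = data.strip()
            if text:
                self.chunks.append(text)


def extract_article(html):
    """Return the article text of an HTML page as one string."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page, not taken from any real site.
SAMPLE = """
<html><body>
<nav><a href="/">Home</a></nav>
<article><h1>Title</h1><p>Body text.</p><footer>Posted by someone</footer></article>
<footer>Copyright</footer>
</body></html>
"""

print(extract_article(SAMPLE))  # prints: Title Body text.
```

Nothing in it knows which CMS produced the page. The obvious catch is that it only works when the site uses these elements as intended.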