Comment by swiftcoder
Comment by swiftcoder 2 days ago
You already have a JSON and XML parser too, and the website offers standardised APIs in both of those
Not standardized enough. I can't guarantee that an API is RESTful, I can't know a priori what the response format is (arbitrary servers on the internet can't be trusted to set Content-Type headers properly), and I can't know how to crawl it given the response data. We ultimately never solved the problem of universal self-describing APIs, so a general crawling service can't trust that they work.
In contrast, I can always trust that whatever is returned for the browser is in a format a browser can consume, because if it isn't, the site isn't a website. HTML is pretty much the only format guaranteed to work.
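To make the point concrete, here is a rough sketch of what a general crawler ends up doing when it can't rely on the declared content type: it inspects the body itself and falls back to HTML. The function name `sniff_format` and the specific heuristics are my own assumptions for illustration, not how any particular crawler works.

```python
import json
import xml.etree.ElementTree as ET


def sniff_format(body: bytes) -> str:
    """Guess a response body's format without trusting Content-Type.

    Hypothetical helper: the heuristics are illustrative only.
    """
    stripped = body.strip()
    if not stripped:
        return "empty"
    text = stripped.decode("utf-8", errors="replace")
    # Only treat the body as JSON if it looks like an object or array
    # and actually parses.
    if text[0] in "{[":
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass
    if text[0] == "<":
        head = text[:512].lower()
        # Check HTML markers first: well-formed XHTML would also parse as XML.
        if "<!doctype html" in head or "<html" in head:
            return "html"
        try:
            ET.fromstring(stripped)
            return "xml"
        except ET.ParseError:
            pass
    # Anything else is handled the way a browser would: as a page to render.
    return "html"
```

Even with this, knowing that a body is JSON or XML says nothing about how to crawl it, which is the part that stays unsolved. The HTML branch is the only one where links and structure have a known meaning.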