Comment by delaminator 2 days ago
Netscape Navigator did, in fact, reject invalid HTML. Then along came Internet Explorer and chose “render invalid HTML dwim” as a strategy. People, my young naive self included, moaned about NN being too strict. NN eventually switched to the tag soup approach. XHTML 1.0 arrived in 2000, attempting to reform HTML by recasting it as an XML application. The idea was to impose XML’s strict parsing rules: well-formed documents only, close all your tags, lowercase element names, quote all attributes, and if the document is malformed, the parser must stop and display an error rather than guess. XHTML was abandoned in 2009. When HTML5 was being drafted in 2004-onwards, the WHATWG actually had to formally specify how browsers should handle malformed markup, essentially codifying IE’s error-recovery heuristics as the standard.
But not closing <p> etc. has always been valid HTML. Going back to SGML, closing tags could be optional (depending on the DTD), and Netscape supported this from the beginning.
Leaving out closing tags is possible when the parsing is unambiguous. E.g. <p>foo<p>bar is unambiguous because p elements do not nest, so a p is closed automatically by the next p.
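For a concrete look at the implied end tags, here is a minimal Python sketch using the third-party html5lib package, which implements the HTML5 parsing algorithm and so builds the same tree a browser would (assumes html5lib is installed):

    import html5lib  # pip install html5lib; implements the HTML5 parsing algorithm

    # Parse the fragment; the parser wraps it in html/head/body.
    tree = html5lib.parse("<p>foo<p>bar", namespaceHTMLElements=False)

    # Inspect body's children.
    for child in tree.find("body"):
        print(child.tag, repr(child.text))

    # Prints:
    #   p 'foo'
    #   p 'bar'
    # The second <p> implicitly closed the first, so they end up as siblings.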
The question of invalid HTML is a separate issue. E.g. you can’t nest a p inside an i according to the spec, so how should a browser render that? Or a lexical error like illegal characters in an unquoted attribute value.
This is where it gets tricky. Render anyway, skip the invalid HTML, or stop rendering with an error message? HTML did not specify what to do with invalid input, so any of these was legal. Browsers chose the “render anyway” approach, but this led to different outputs in different browsers, since there was no agreement on how to render invalid HTML.
The difference between Netscape and IE was that Netscape would more often skip rendering invalid HTML, whereas IE would always render the content.
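These days that error recovery is fully specified, so every conforming parser builds the same tree from the same broken input. As a sketch of the now-standardised behaviour (again with html5lib, a Python implementation of the HTML5 parsing algorithm), take the classic mis-nesting case <b><i>x</b>y</i>:

    import xml.etree.ElementTree as ET
    import html5lib  # pip install html5lib

    tree = html5lib.parse("<b><i>x</b>y</i>", namespaceHTMLElements=False)
    print(ET.tostring(tree.find("body"), encoding="unicode"))

    # Prints:
    #   <body><b><i>x</i></b><i>y</i></body>
    # The spec's "adoption agency algorithm" closes the inner <i> when </b>
    # arrives, then reopens a fresh <i> for the remaining text, so all
    # conforming parsers repair the mis-nesting identically.

Before HTML5 pinned this down, each browser improvised its own repair for input like this, which is exactly why the same broken page could look different in Netscape and IE.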