Comment by ethin

Comment by ethin 3 hours ago

4 replies

IMO PDFs should just be gone. Nobody should use them. They are a solution in search of a problem. The most common argument I hear is "well we need document fidelity!" But IMO this completely ignores the fact that this just isn't needed when we have digital signatures and a PKI and certificates and all that to prove that a document hasn't been tampered with. Making sure a document appears the same on any kind of device/OS or whatever would be a great idea in theory if the way it was done was actually thought through, but it wasn't and now the PDF format is even worse than HTML is (and that's really saying something). Every single time I have had to interact with a PDF it has always been a total disaster. Don't even get me started on the clusterfuck that is PDF forms.

wongarsu 2 hours ago

The problem was "have documents that look the same on any device, including printed paper and computer screens", and the approach was "PostScript does that for printers, let's simplify it and make it more universal". Both the problem it's solving and the approach were fine, maybe even great. Since then over three decades have passed, PDF has gained a plethora of features, some less well thought out than others, and real-world requirements are completely different than they were in the early 90s. If we were to invent PDF today it would likely look completely different. But it's still good enough that it's hard for a new format to offer an advantage compelling enough to replace PDF.

  • ethin 2 hours ago

    Right, but that's what I'm getting at: PDF is just a terrible format all round. People do things with it that have nothing to do with document preservation. We have PDF forms, we have PDFs able to execute arbitrary JS (which can modify the rendering of the document, completely defeating the entire reason for the format existing)... Like IMO the format just has no reason to exist/be used anymore given how bloated and over-complicated it is.

ozim 2 hours ago

PDF is fine as an output format and for archiving.

The thing is, people want to do a bunch of things they shouldn’t with PDF, like automated parsing, editing, or adding forms to it.

Ideally you should have an API or other structured data to pass around, but of course life is more complicated. Often PDF is all you get, because building an API would cost more than makes sense, so doing a bad job of parsing the PDF ends up being the cheaper option.