Comment by sammycage a day ago

Thank you for your kind words and for noticing the work behind this. Building an HTML and CSS rendering engine has been a long journey with many surprises. I have been maintaining https://github.com/sammycage/lunasvg for years, so I was familiar with interpreting specs and rendering engines. That experience gave me the confidence to tackle HTML.

At first, my plan was simple. I wanted to make an HTML rendering library. But soon, I realized it could be even more useful if it focused on paged output so I could make PDFs directly. C and C++ do not have an HTML-to-PDF library that is not a full web engine. I started coding and thought I could finish in a year by working a few hours each day. But reality came fast. HTML and CSS are much harder than SVG, and even small things caused big problems.

I studied KHTML and WebKit to see how real HTML and CSS engines work. The official specs were very helpful. Slowly, everything started to come together. It felt like discovering a hidden world behind the web pages we see every day.

The hardest part has been TableLayout. Tables look simple, but handling row and column spans, nested tables, alignment, page breaks, and box calculations was very hard. I spent many hours fixing layout bugs that only appeared in some situations. It was frustrating, humbling, and also very satisfying when it worked.

I am still learning and improving. I hope other people enjoy PlutoPrint and PlutoBook as much as I do.

lewisjoe a day ago

Sounds like a wild ride! Thanks for making this open-source.

Quick questions:

1. I see you've hand-written both the CSS and HTML parsers yourself. Why not use existing parsers? Was minimizing dependencies one of your goals?

2. Does the project recognize headers/footers and other such @page CSS rules?

3. Fragmentation (pagination) logic has a huge set of challenges, at least from what I read about Chrome implementing fragmentation. Did you come across this? https://developer.chrome.com/docs/chromium/renderingng-fragm....

Was fragmentation logic really that difficult to implement?

  • sammycage a day ago

    Thanks for your questions!

    1. The documentation for HTML and CSS parsing is pretty straightforward and easy to implement from, so I thought it was better to write the parsers myself.

    2. It fully supports margin boxes (headers and footers) using margin at-rules like @top-left and @bottom-center inside @page rules. You can see more here: https://github.com/plutoprint/plutobook/blob/main/FEATURES.m...

    3. Yes, I did come across this. Fragmentation logic is as difficult as it sounds. Right now PlutoBook works with a single, consistent page size throughout a document and does not support named pages, which simplifies things a lot.
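
To make point 2 concrete, here is a minimal sketch of what an @page rule with margin boxes looks like, built as a plain string in Python. The helper name `page_rule`, the header text, and the page size and margin values are made up for illustration. The `counter(page)` and `counter(pages)` values come from the CSS Paged Media spec; whether PlutoBook handles each of them is an assumption here, so check the features list linked above.

```python
def page_rule(margin_boxes, size="A4", margin="20mm"):
    """Build an @page rule as text (hypothetical helper, not a PlutoBook API).

    margin_boxes maps a margin box name such as "top-left" to a CSS
    `content` value that is already valid CSS (quoted strings, counters).
    """
    lines = ["@page {", f"  size: {size};", f"  margin: {margin};"]
    for box, content in margin_boxes.items():
        # Each margin box is a nested at-rule inside the @page rule.
        lines.append(f"  @{box} {{ content: {content}; }}")
    lines.append("}")
    return "\n".join(lines)


css = page_rule({
    "top-left": '"Quarterly Report"',  # header text, assumed for the example
    "bottom-center": '"Page " counter(page) " of " counter(pages)',  # footer
})
print(css)
```

The resulting stylesheet text would go into the document's CSS like any other rule. The point is only that headers and footers are declared inside @page rather than placed in the HTML body.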

    Feel free to contact me via email if you have more questions.
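
As a rough illustration of why a single page size simplifies things (point 3), here is a toy pagination sketch. It is not PlutoBook's algorithm: it assumes every block is unbreakable and has a known height, and it ignores forced breaks, widows and orphans, and tables, which is where real fragmentation gets hard. The function name `paginate` and the numbers are made up.

```python
def paginate(block_heights, page_height):
    """Greedily assign unbreakable blocks to pages of one fixed height.

    Returns a list of pages, each a list of block indices.
    """
    pages, current, used = [], [], 0.0
    for index, height in enumerate(block_heights):
        if height > page_height:
            # A real engine would split the block; this sketch refuses.
            raise ValueError(f"block {index} is taller than a page")
        if current and used + height > page_height:
            # The block does not fit: close this page and start a new one.
            pages.append(current)
            current, used = [], 0.0
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages


pages = paginate([300, 400, 200, 500, 100], page_height=800)
print(pages)  # [[0, 1], [2, 3, 4]]
```

Because `page_height` is one constant, every break decision uses the same limit. With named pages or mixed page sizes, the available height would depend on which page a block lands on, and the available width could change too, so content might need to be laid out again after each break.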