Comment by goncharom 2 days ago

I've been working on web scraping using LLMs, and I just shared one of the libraries I created to get structured data from arbitrary pages: https://news.ycombinator.com/item?id=45870231

Instead of sending the page's HTML to an LLM, Hikugen asks the LLM to generate Python code that fetches the data, and it enforces that the extracted data conforms to a Pydantic schema defined by the user. I'm using this to power yomu (https://github.com/goncharom/yomu), a personal email newsletter built from arbitrary websites.
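
To make the idea concrete, here is a rough sketch of the pattern. It is not Hikugen's actual API. Every name in it (`Article`, `GENERATED_CODE`, `extract`, `validate`, `run_generated`) is made up for illustration, the "LLM output" is a hard-coded string, and a stdlib dataclass stands in for the Pydantic model so the snippet runs without dependencies. With Pydantic, the schema would be a `BaseModel` subclass and the check would be `Article.model_validate(record)`.

```python
from dataclasses import dataclass, fields


# Stand-in for the user-defined schema (a Pydantic BaseModel in practice).
@dataclass
class Article:
    title: str
    url: str


# Hypothetical LLM output: extraction code for one page layout.
# Hard-coded here; a real system would get this string from the model.
GENERATED_CODE = '''
import re

def extract(html):
    pattern = r'<a href="([^"]+)">([^<]+)</a>'
    return [{"title": t, "url": u} for u, t in re.findall(pattern, html)]
'''


def validate(record: dict) -> Article:
    """Reject records whose keys or value types do not match the schema."""
    expected = {f.name: f.type for f in fields(Article)}
    if set(record) != set(expected):
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(record)}")
    for name, typ in expected.items():
        if not isinstance(record[name], typ):
            raise TypeError(f"{name!r} must be {typ.__name__}")
    return Article(**record)


def run_generated(code: str, html: str) -> list[Article]:
    """Execute the generated extractor, then validate every record it returns."""
    namespace: dict = {}
    exec(code, namespace)  # runs arbitrary code: sandbox this in real use
    raw_records = namespace["extract"](html)
    return [validate(record) for record in raw_records]


SAMPLE_HTML = (
    '<ul>'
    '<li><a href="https://example.com/a">First post</a></li>'
    '<li><a href="https://example.com/b">Second post</a></li>'
    '</ul>'
)

articles = run_generated(GENERATED_CODE, SAMPLE_HTML)
for article in articles:
    print(article.title, article.url)
```

The point of this design is that the generated code can be reused on later fetches of the same page, and the schema check catches the case where the extractor silently returns the wrong shape.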