Comment by ffsm8

Comment by ffsm8 9 days ago

Over the holidays I was working on a side project built on (I think) the same idea mpalmer had in mind, though my project wouldn't be of interest to him either, because my goal wasn't automating tests.

Basically, the goal would be to do it like screenshot regression tests, with two separate execution phases: generate and verify.

And when verify fails in CI, you can automatically run a generate and open an MR/PR with the new script.

This lets you audit the script and do a plausibility check. You'll be notified of changes, but keeping the tests running takes minimal effort.
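A rough sketch of that generate/verify split, under stated assumptions: `produce_script` stands in for whatever (LLM-backed) step writes the script, `run_script` for whatever executes it against the site, and the snapshot is simply a file committed to the repo. All of these names are hypothetical, and opening the MR/PR itself is left to the CI system.

```python
import difflib
from pathlib import Path
from typing import Callable, List


def generate(produce_script: Callable[[], str], snapshot: Path) -> None:
    """Generate phase: (re)create the script and store it as the snapshot."""
    snapshot.write_text(produce_script(), encoding="utf-8")


def verify(run_script: Callable[[str], bool], snapshot: Path) -> bool:
    """Verify phase: replay the committed script; True means it still works."""
    return snapshot.exists() and run_script(snapshot.read_text(encoding="utf-8"))


def ci_step(
    produce_script: Callable[[], str],
    run_script: Callable[[str], bool],
    snapshot: Path,
) -> List[str]:
    """Run verify; if it fails, regenerate and return a diff for review.

    An empty list means verify passed and nothing changed. A non-empty list
    is the unified diff a CI job could attach to an automatically opened MR/PR.
    """
    if verify(run_script, snapshot):
        return []
    old = snapshot.read_text(encoding="utf-8") if snapshot.exists() else ""
    generate(produce_script, snapshot)
    new = snapshot.read_text(encoding="utf-8")
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="committed",
            tofile="regenerated",
            lineterm="",
        )
    )
```

The reviewer only ever sees the diff between the committed and regenerated script, which is what makes the plausibility check cheap.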

hackgician 9 days ago

This is super interesting, is it open source? Would love to talk to you more about how this worked

  • ffsm8 8 days ago

    It's not at a stage where I'd be comfortable putting it on GitHub yet, maybe in a few months.

    And I think you misunderstood my comment: I didn't describe my project, but extrapolated from the parent's wishes and my own motivations for my project.

    Mine is actually pretty close to stagehand; at least, I could very well use it. It's basically a web UI to configure browser tasks like "open webpage x" or "iterate over item type", with LLM integration to determine what the CSS selector for that would be. On the next execution it would try the previously determined CSS selector instead of calling the LLM again. On failure, it'd raise a notification with an admin task to verify the new selectors or fix the script.

    But it's a lot of code to put together as a generic UI, as I want these tasks to be repeatable without restarting from the beginning, etc.

    Still very much in the PoC stage: no tests, barely working persistence, etc.
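The cached-selector flow described in this reply could look roughly like the sketch below. It is not the author's code: `matches`, `ask_llm`, and `notify` are hypothetical callables standing in for the browser check, the LLM integration, and the admin notification, and the cache is a plain dict rather than real persistence. It also assumes the admin gets a notification whenever a cached selector has to be replaced.

```python
from typing import Callable, Dict, Optional


def resolve_selector(
    item_type: str,
    cache: Dict[str, str],
    matches: Callable[[str], bool],
    ask_llm: Callable[[str], Optional[str]],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Return a working CSS selector for item_type, preferring the cached one."""
    cached = cache.get(item_type)
    if cached is not None and matches(cached):
        # Fast path: the selector from a previous run still works, no LLM call.
        return cached

    # Slow path: no cached selector, or it no longer matches the page.
    candidate = ask_llm(item_type)
    if candidate is not None and matches(candidate):
        cache[item_type] = candidate
        if cached is not None:
            # A replaced selector is worth a human look before trusting it.
            notify(f"verify new selector for {item_type!r}: {cached} -> {candidate}")
        return candidate

    # Neither the cached nor the LLM-proposed selector works.
    notify(f"no working selector for {item_type!r}; the script needs fixing")
    return None
```

The LLM is only consulted when the cached selector is missing or stops matching, so repeat runs stay cheap and deterministic until the page changes.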