Comment by mandatory
Good news for curl users: https://github.com/mandatoryprogrammer/thermoptic
> NOTE: Due to many WAFs employing JavaScript-level fingerprinting of web browsers, thermoptic also exposes hooks to utilize the browser for key steps of the scraping process. See this section for more information on this.
This reminds me of how Stripe does user tracking for fraud detection (https://mtlynch.io/stripe-update/). I wonder if thermoptic could handle that.
Oh great /s
In a month or two, I can be annoyed when I see some vibe-coded AI startup's script making five million requests a day to work's website with this.
They'll have been ignoring the error responses:
{"All data is public and available for free download": "https://example.edu/very-large-001.zip"}
That is a message we also write in the first line of every HTML page source.
Then I will spend more time fighting this shit, and less time improving the public data system.
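For what it's worth, honoring a response like that takes only a few lines on the client side. A minimal sketch, assuming the error body is a JSON object whose values are download URLs, as in the example above; the function name `bulk_download_hint` is made up for illustration and is not part of thermoptic or any library:

```python
import json


def bulk_download_hint(status_code, body):
    """Return any URLs advertised in a JSON error body, else an empty list.

    Assumption: the server answers unwanted scraping with an HTTP error
    status and a JSON object mapping a human-readable message to a URL.
    """
    if status_code < 400:
        # Not an error response, so there is no hint to read.
        return []
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        # Body is not JSON (or not text at all); nothing to extract.
        return []
    if not isinstance(payload, dict):
        return []
    # Keep only values that look like HTTP(S) URLs.
    return [
        value
        for value in payload.values()
        if isinstance(value, str) and value.startswith(("http://", "https://"))
    ]
```

A scraper that gets a non-empty list back should stop crawling and fetch the archive once instead.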
Cool project!